Item response theoryIn psychometrics, item response theory (IRT) (also known as latent trait theory, strong true score theory, or modern mental test theory) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is a theory of testing based on the relationship between individuals' performances on a test item and the test takers' levels of performance on an overall measure of the ability that item was designed to measure.
Likert scaleA Likert scale (ˈlɪkərt ,) is a psychometric scale named after its inventor, American social psychologist Rensis Likert, which is commonly used in research questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term (or more fully the Likert-type scale) is often used interchangeably with rating scale, although there are other types of rating scales. Likert distinguished between a scale proper, which emerges from collective responses to a set of items (usually eight or more), and the format in which responses are scored along a range.
ExamAn examination (exam or evaluation) or test is an educational assessment intended to measure a test-taker's knowledge, skill, aptitude, physical fitness, or classification in many other topics (e.g., beliefs). A test may be administered verbally, on paper, on a computer, or in a predetermined area that requires a test taker to demonstrate or perform a set of skills. Tests vary in style, rigor and requirements. There is no general consensus or invariable standard for test formats and difficulty.
IntelligenceIntelligence has been defined in many ways: The capacity for abstraction, logic, understanding, self-awareness, learning, emotional knowledge, reasoning, planning, creativity, critical thinking, and problem-solving. More generally, it can be described as the ability to perceive or infer information, and to retain it as knowledge to be applied towards adaptive behaviors within an environment or context. Intelligence is most often studied in humans but has also been observed in both non-human animals and in plants despite controversy as to whether some of these forms of life exhibit intelligence.
Construct validityConstruct validity concerns how well a set of indicators represent or reflect a concept that is not directly measurable. Construct validation is the accumulation of evidence to support the interpretation of what a measure reflects. Modern validity theory defines construct validity as the overarching concern of validity research, subsuming all other types of validity evidence such as content validity and criterion validity.
Psychological testingPsychological testing is the administration of psychological tests. Psychological tests are administered by trained evaluators. A person's responses are evaluated according to carefully prescribed guidelines. Scores are thought to reflect individual or group differences in the construct the test purports to measure. The science behind psychological testing is psychometrics. According to Anastasi and Urbina, psychological tests involve observations made on a "carefully chosen sample [emphasis authors] of an individual's behavior.
Standardized testA standardized test is a test that is administered and scored in a consistent, or "standard", manner. Standardized tests are designed in such a way that the questions and interpretations are consistent and are administered and scored in a predetermined, standard manner. Any test in which the same test is given in the same manner to all test takers, and graded in the same manner for everyone, is a standardized test. Standardized tests do not need to be high-stakes tests, time-limited tests, or multiple-choice tests.
Latent and observable variablesIn statistics, latent variables (from Latin: present participle of lateo, “lie hidden”) are variables that can only be inferred indirectly through a mathematical model from other observable variables that can be directly observed or measured. Such latent variable models are used in many disciplines, including political science, demography, engineering, medicine, ecology, physics, machine learning/artificial intelligence, bioinformatics, chemometrics, natural language processing, management, psychology and the social sciences.
Level of measurementLevel of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables. Psychologist Stanley Smith Stevens developed the best-known classification with four levels, or scales, of measurement: nominal, ordinal, interval, and ratio. This framework of distinguishing levels of measurement originated in psychology and has since had a complex history, being adopted and extended in some disciplines and by some scholars, and criticized or rejected by others.
Test validityTest validity is the extent to which a test (such as a chemical, physical, or scholastic test) accurately measures what it is supposed to measure. In the fields of psychological testing and educational testing, "validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests". Although classical models divided the concept into various "validities" (such as content validity, criterion validity, and construct validity), the currently dominant view is that validity is a single unitary construct.
Rasch modelThe Rasch model, named after Georg Rasch, is a psychometric model for analyzing categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between the respondent's abilities, attitudes, or personality traits, and the item difficulty. For example, they may be used to estimate a student's reading ability or the extremity of a person's attitude to capital punishment from responses on a questionnaire.
Reliability (statistics)In statistics and psychometrics, reliability is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions:"It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are precise, reproducible, and consistent from one testing occasion to another.
Validity (statistics)Validity is the main extent to which a concept, conclusion or measurement is well-founded and likely corresponds accurately to the real world. The word "valid" is derived from the Latin validus, meaning strong. The validity of a measurement tool (for example, a test in education) is the degree to which the tool measures what it claims to measure. Validity is based on the strength of a collection of different types of evidence (e.g. face validity, construct validity, etc.) described in greater detail below.
Trait theoryIn psychology, trait theory (also called dispositional theory) is an approach to the study of human personality. Trait theorists are primarily interested in the measurement of traits, which can be defined as habitual patterns of behavior, thought, and emotion. According to this perspective, traits are aspects of personality that are relatively stable over time, differ across individuals (e.g. some people are outgoing whereas others are not), are relatively consistent over situations, and influence behaviour.
Classical test theoryClassical test theory (CTT) is a body of related psychometric theory that predicts outcomes of psychological testing such as the difficulty of items or the ability of test-takers. It is a theory of testing based on the idea that a person's observed or obtained score on a test is the sum of a true score (error-free score) and an error score. Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological tests. Classical test theory may be regarded as roughly synonymous with true score theory.
Big Five personality traitsThe Big Five personality traits is a suggested taxonomy, or grouping, for personality traits, developed from the 1980s onward in psychological trait theory. Starting in the 1990s, the theory identified five factors by labels, for the US English population, typically referred to as: openness to experience (inventive/curious vs. consistent/cautious) conscientiousness (efficient/organized vs. extravagant/careless) extraversion (outgoing/energetic vs. solitary/reserved) agreeableness (friendly/compassionate vs.
Factor analysisFactor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in six observed variables mainly reflect the variations in two unobserved (underlying) variables. Factor analysis searches for such joint variations in response to unobserved latent variables.
Criterion validityIn psychometrics, criterion validity, or criterion-related validity, is the extent to which an operationalization of a construct, such as a test, relates to, or predicts, a theoretical representation of the construct—the criterion. Criterion validity is often divided into concurrent and predictive validity based on the timing of measurement for the "predictor" and outcome. Concurrent validity refers to a comparison between the measure in question and an outcome assessed at the same time.
Quantitative researchQuantitative research is a research strategy that focuses on quantifying the collection and analysis of data. It is formed from a deductive approach where emphasis is placed on the testing of theory, shaped by empiricist and positivist philosophies. Associated with the natural, applied, formal, and social sciences this research strategy promotes the objective empirical investigation of observable phenomena to test and understand relationships.
Norm-referenced testA norm-referenced test (NRT) is a type of test, assessment, or evaluation which yields an estimate of the position of the tested individual in a predefined population, with respect to the trait being measured. Assigning scores on such tests may be described as relative grading, marking on a curve (BrE) or grading on a curve (AmE, CanE) (also referred to as curved grading, bell curving, or using grading curves).