1.

40,000 memories in young teenagers: Psychometric properties of the Autobiographical Memory Test in a UK cohort study

Jon Heron, Catherine Crane, David Gunnell, Glyn Lewis, Jonathan Evans, J. Mark G. Williams 《Memory (Hove, England)》2013,21(3):300-320

Although the Autobiographical Memory Test (AMT) is widely used, its psychometric properties have rarely been investigated. This paper utilises data gathered from a 10-item written version of the AMT, completed by 5792 adolescents participating in the Avon Longitudinal Study of Parents and Children, to examine the psychometric properties of the measure. The results show that the scale derived from responses to the AMT operates well over a wide range of scores, consistent with the aim of deriving a continuous measure of over-general memory. There was strong evidence of group differences in terms of gender, low negative mood, and IQ, and these were in agreement when comparing an item response theory (IRT) approach with that based on a sum score. One advantage of the IRT model is the ability to assess and consequently allow for differential item functioning. This additional analysis showed evidence of response bias for both gender and mood, resulting in attenuation in the mean differences in AMT across these groups. Implications of the findings for the use of the AMT measure in different samples are discussed.

2.

An item factor analysis and item response theory-based revision of the Everyday Discrimination Scale

Stucky BD, Gottfredson NC, Panter AT, Daye CE, Allen WR, Wightman LF 《Cultural diversity & ethnic minority psychology》2011,17(2):175-185

The Everyday Discrimination Scale (EDS), a widely used measure of daily perceived discrimination, is purported to be unidimensional, to function well among African Americans, and to have adequate construct validity. Two separate studies and data sources were used to examine and cross-validate the psychometric properties of the EDS. In Study 1, an exploratory factor analysis was conducted on a sample of African American law students (N = 589), providing strong evidence of local dependence, or nuisance multidimensionality within the EDS. In Study 2, a separate nationally representative community sample (N = 3,527) was used to model the identified local dependence in an item factor analysis (i.e., bifactor model). Next, item response theory (IRT) calibrations were conducted to obtain item parameters. A five-item, revised-EDS was then tested for gender differential item functioning (in an IRT framework). Based on these analyses, a summed score to IRT-scaled score translation table is provided for the revised-EDS. Our results indicate that the revised-EDS is unidimensional, with minimal differential item functioning, and retains predictive validity consistent with the original scale.

3.

Establishing Measurement Equivalence and Invariance in Longitudinal Data With Item Response Theory

《International Journal of Testing》2013,13(3):279-300

If measurement invariance does not hold over 2 or more measurement occasions, differences in observed scores are not directly interpretable. Golembiewski, Billingsley, and Yeager (1976) identified 2 types of psychometric differences over time as beta change and gamma change. Gamma change is a fundamental change in thinking about the nature of a construct over time. Beta change can be described as respondents' change in calibration of the response scale over time. Recently, researchers have had considerable success establishing measurement invariance using confirmatory factor analytic (CFA) techniques. However, the use of item response theory (IRT) techniques for assessing item parameter drift can provide additional useful information regarding the psychometric equivalence of a measure over time that is not attainable with traditional CFA techniques. This article marries the terminology commonly used in CFA and IRT techniques and illustrates real advantages for identifying beta change over time with IRT methods rather than typical CFA methods, utilizing a longitudinal assessment of job satisfaction as an example.

4.

The factor structure of the Autobiographical Memory Test in recent trauma survivors

Griffith JW, Kleim B, Sumner JA, Ehlers A 《Psychological Assessment》2012,24(3):640-646

The objective of this study was to examine the psychometric properties of the Autobiographical Memory Test (AMT), which is widely used to measure overgeneral autobiographical memory in individuals with depression and a trauma history. Its factor structure and internal consistency have not been explored in a clinical sample. This study examined the psychometric properties of the AMT in a sample of recent trauma survivors (N = 194), who completed the AMT 2 weeks after a trauma. Participants were also assessed with structured clinical interviews for current acute stress disorder and current and past major depressive disorder. Confirmatory factor analysis and item response theory were used to analyze the AMT in the whole sample. The factor structure of the AMT was also compared for (a) individuals with and without lifetime major depressive disorder and (b) individuals with current (posttrauma) major depressive disorder and/or acute stress disorder versus those with neither disorder. In all of these analyses, the AMT with cues of positive and negative valence had a 1-factor structure, which replicates work in nonclinical samples. Based on analyses of the whole sample, scores from the AMT had a reliability estimate of .72, and standard error of measurement was lowest for people who scored low on memory specificity. In conclusion, the AMT measures 1 factor of memory specificity in a clinical sample and can yield reliable scores for memory specificity. More psychometric studies of the AMT are needed to replicate these results with similar and other clinical populations.

5.

Advances in Clinical Personality Measurement: An Item Response Theory Analysis of the MMPI-2 PSY-5 Scales

《Journal of personality assessment》2013,95(2):282-307

Item response theory (IRT) provides valuable methods for the analysis of the psychometric properties of a psychological measure. To date, however, these methods have not been used frequently by personality assessment researchers, in part because many researchers have not been introduced to the methods and in part because most of the development of IRT has taken place in applied education assessment settings, resulting in terminology that is ability focused rather than trait focused. The purpose of this article is twofold. First, an overview of IRT is presented, highlighting the concepts of the three-parameter IRT model, item and test information, and conditional standard error of measurement. Second, the psychometric properties of the MMPI-2 PSY-5 scales are examined to demonstrate IRT's value.

6.

Measurement of alcohol-related consequences among high school and college students: application of item response models to the Rutgers Alcohol Problem Index

Neal DJ, Corbin WR, Fromme K 《Psychological Assessment》2006,18(4):402-414

The Rutgers Alcohol Problem Index (RAPI; H. R. White & E. W. Labouvie, 1989) is a frequently used measure of alcohol-related consequences in adolescents and college students, but psychometric evaluations of the RAPI are limited and it has not been validated with college students. This study used item response theory (IRT) to examine the RAPI on students (N = 895; 65% female, 35% male) assessed in both high school and college. A series of 2-parameter IRT models were computed, examining differential item functioning across gender and time points. A reduced 18-item measure demonstrating strong clinical utility is proposed, with scores of 8 or greater implying greater need for treatment.

7.

Item response theory and the measurement of clinical change

Reise SP, Haviland MG 《Journal of personality assessment》2005,84(3):228-238

An instrument's sensitivity to detect individual-level change is an important consideration for both psychometric and clinical researchers. In this article, we develop a cognitive problems measure and evaluate its sensitivity to detect change from an item response theory (IRT) perspective. After illustrating assumption checking and model fit assessment, we detail 4 features of IRT modeling: (a) the scale information curve and its relation to the bandwidth of measurement precision, (b) the scale response curve and how it is used to link the latent trait metric with the raw score metric, (c) content-based versus norm-based score referencing, and (d) the level of measurement of the latent trait scale. We conclude that IRT offers an informative, alternative framework for understanding an instrument's psychometric properties and recommend that IRT analyses be considered prior to investigations of change, growth, or the effectiveness of clinical interventions.
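
Features (a) and (b) in this abstract can be sketched in a few lines. This is an illustrative simplification, not the authors' analysis: it assumes dichotomous items under a two-parameter logistic (2PL) model, and the function names and item parameters are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing a dichotomous item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def expected_raw_score(theta, items):
    """Scale response curve: expected summed score at trait level theta."""
    return sum(p_2pl(theta, a, b) for a, b in items)

def scale_information(theta, items):
    """Scale information curve: sum of a^2 * P * (1 - P) over items."""
    total = 0.0
    for a, b in items:
        p = p_2pl(theta, a, b)
        total += a * a * p * (1.0 - p)
    return total

# Hypothetical (discrimination, difficulty) pairs, for illustration only.
HYPOTHETICAL_ITEMS = [(1.0, -1.0), (1.5, 0.0), (1.0, 1.0)]

for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(theta,
          round(expected_raw_score(theta, HYPOTHETICAL_ITEMS), 3),
          round(scale_information(theta, HYPOTHETICAL_ITEMS), 3))
```

The expected raw score rises monotonically with theta, which is what lets it map the latent trait metric onto the raw score metric. The range of theta over which the information stays high corresponds to the bandwidth of measurement precision.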

8.

The Spiritual Transcendence Index: An Item Response Theory Analysis

Alexis D. Abernethy, Seong-Hyeon Kim 《The International journal for the psychology of religion》2013,23(4):240-256

In an attempt to measure understudied dimensions of spirituality, recent efforts have focused on the transcendent dimension of spirituality. The Spiritual Transcendence Index (STI) was developed to assess a perceived experience of the sacred that affects one’s ability to transcend life’s difficulties. The main focus of the current study was to investigate the psychometric properties of the STI by utilizing the microscopic item-level examination tools unique in item response theory (IRT), as well as its scale-level exploration devices for psychometric properties of an assessment measure. IRT analyses were conducted to investigate the STI’s psychometric properties across samples (N = 712) including how well the measure assesses the latent construct, spiritual transcendence, from the low to high range of the construct. The findings confirm that the 8-item index is a single factor that assesses the latent construct, spiritual transcendence. Instead of the original 6-category version, these findings support a 4-category response version; the 3 categories of disagreement may be collapsed into a single category. These findings not only inform the refinement of the STI but also highlight an important psychometric approach for the refinement of spirituality/religiousness measures, especially those with ceiling effects.

9.

An item response theory analysis of the Impulsive Behaviors Checklist for Adolescents

You J, Leung F, Lai CM, Fu K 《Assessment》2011,18(4):464-475

This study used item response theory (IRT) to examine the Impulsive Behaviors Checklist for Adolescents (IBCL-A) among 6,276 (67.7% girls) Chinese secondary school students. The IBCL-A included 15 maladaptive impulsive behaviors adapted from the Revised Diagnostic Interview for Borderlines. The authors obtained the severity and discrimination parameters for each item in the IBCL-A, examined differential item functioning across gender and age groups, and tested reliability and concurrent validity of the IBCL-A IRT-scaled score. Most items in the IBCL-A were the most accurate in assessing moderate to high levels of impulsivity and discriminated well among adolescents with varied levels of impulsivity. Differential item functioning emerged in several items across gender. The IRT-scaled score showed good construct validity and incremental predictive validity. Findings demonstrate the sound psychometric properties of the IBCL-A and support the clinical utility of this scale.

10.

Generating items during testing: Psychometric issues and models

Susan E. Embretson 《Psychometrika》1999,64(4):407-433

On-line item generation is becoming increasingly feasible for many cognitive tests. Item generation seemingly conflicts with the well established principle of measuring persons from items with known psychometric properties. This paper examines psychometric principles and models required for measurement from on-line item generation. Three psychometric issues are elaborated for item generation. First, design principles to generate items are considered. A cognitive design system approach is elaborated and then illustrated with an application to a test of abstract reasoning. Second, psychometric models for calibrating generating principles, rather than specific items, are required. Existing item response theory (IRT) models are reviewed and a new IRT model that includes the impact on item discrimination, as well as difficulty, is developed. Third, the impact of item parameter uncertainty on person estimates is considered. Results from both fixed content and adaptive testing are presented. Editor's note: This article is based on the Presidential Address Susan E. Embretson gave on June 26, 1999 at the 1999 Annual Meeting of the Psychometric Society held at the University of Kansas in Lawrence, Kansas.

11.

A differential item functioning analysis of the PSDQ with Turkish and New Zealand/Australian adolescents

F. Hülya Aşçı, Richard B. Fletcher, Emine Çağlar 《Psychology of sport and exercise》2009,10(1):12-18

12.

An examination of the psychometric properties of the physical self-description questionnaire using a polytomous item response model

《Psychology of sport and exercise》2004,5(4):423-446

13.

Evaluating the Psychometric and Measurement Characteristics of a Measure of Sexual Orientation Harassment

Armando X. Estrada, Tahira M. Probst, Jeremiah Brown, Maja Graso 《Military psychology》2013,25(2):220-236

We use classical test theory (CTT) and item response theory (IRT) methodologies to examine the psychometric and measurement properties of an instrument designed to assess sexual orientation harassment among military personnel (N = 71,989). CTT analyses indicated that items were unidimensional and exhibited adequate levels of reliability. IRT analyses demonstrated that the items functioned similarly and exhibited appropriate levels of item discrimination. However, the analyses also suggested that the sensitivity of the items may be limited. Differential test functioning analyses provided evidence of the measurement equivalence of the instrument across male and female respondents. The findings provide support for the psychometric properties and measurement equivalence of the instrument for measuring sexual orientation harassment among male and female military personnel. We discuss the implications of our findings for future research on sexual orientation harassment in the workplace.

14.

A generalized longitudinal mixture IRT model for measuring differential growth in learning environments

Damazo T. Kadengye, Eva Ceulemans, Wim Van den Noortgate 《Behavior research methods》2014,46(3):823-840

This article describes a generalized longitudinal mixture item response theory (IRT) model that allows for detecting latent group differences in item response data obtained from electronic learning (e-learning) environments or other learning environments that result in large numbers of items. The described model can be viewed as a combination of a longitudinal Rasch model, a mixture Rasch model, and a random-item IRT model, and it includes some features of the explanatory IRT modeling framework. The model assumes the possible presence of latent classes in item response patterns, due to initial person-level differences before learning takes place, to latent class-specific learning trajectories, or to a combination of both. Moreover, it allows for differential item functioning over the classes. A Bayesian model estimation procedure is described, and the results of a simulation study are presented that indicate that the parameters are recovered well, particularly for conditions with large item sample sizes. The model is also illustrated with an empirical sample data set from a Web-based e-learning environment.

15.

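An analysis of the correspondence between test item response mechanisms and the assumptions of psychometric models

Yang Xiangdong (杨向东) 《Advances in Psychological Science (心理科学进展)》2010,18(8):1349-1358

From the perspective of the cognitive processes involved in solving test items, this article analyzes the basic assumptions of measurement models under different test theory frameworks. It argues that a measurement model is a concrete representation of the test developer's theoretical assumptions about the item response mechanism, and a statistical framework for systematically testing measurement assumptions and processes. However, whether in classical test theory, generalizability theory, or early item response theory models, the relevant assumptions are oversimplified and lack the support of a corresponding substantive theory. By contrast, cognitive measurement models emphasize correspondence with the cognitive processes, cognitive strategies, and knowledge structures that individuals bring to responding to test items. They make it possible to define measurement constructs, design test items, and carry out modeling, analysis, and interpretation on the basis of substantive theory, and they lay a foundation for integrating an increasingly marginalized psychometrics with mainstream psychological research.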

16.

Psychometric Properties of the Wisconsin Schizotypy Scales in an Undergraduate Sample: Classical Test Theory, Item Response Theory, and Differential Item Functioning

Beate P. Winterstein, Terry A. Ackerman, Paul J. Silvia, Thomas R. Kwapil 《Journal of psychopathology and behavioral assessment》2011,33(4):480-490

The Wisconsin Schizotypy Scales are widely used for assessing schizotypy in nonclinical and clinical samples. However, they were developed using classical test theory (CTT) and have not had their psychometric properties examined with more sophisticated measurement models. The present study employed item response theory (IRT) as well as traditional CTT to examine psychometric properties of four of the schizotypy scales on the item and scale level, using a large sample of undergraduate students (n = 6,137). In addition, we investigated differential item functioning (DIF) for sex and ethnicity. The analyses revealed many strengths of the four scales, but some items had low discrimination values and many items had high DIF. The results offer useful guidance for applied users and for future development of these scales.

17.

Psychometric Approaches Help Resolve Competing Cognitive Models: When Less Is More Than It Seems

《Cognition and Instruction》2013,31(4):503-521

Simple arithmetic word problems are often featured in elementary school education. One type of problem, "compare with unknown reference set," ranks among the most difficult to solve. Differences in item difficulty for compare problems with unknown reference set are observed depending on the direction of the relational statement (more than vs. less than). Various cognitive models have been proposed to account for these differences. We employed item response theory (IRT) to compare competing cognitive models of student performance. The responses of 100 second-grade students to a series of compare problems with unknown reference set, along with other measures of individual differences, were fit to IRT models. Results indicated that the construction integration model (Kintsch, 1988, 1998) provided the best fit to the data. We discuss the potential contribution of psychometric approaches to the study of thinking.

18.

Automatic Generation of Rasch-Calibrated Items: Figural Matrices Test GEOM and Endless-Loops Test EC

《International Journal of Testing》2013,13(3):197-224

The future of test construction for certain psychological ability domains that can be analyzed well in a structured manner may lie, at the very least for reasons of test security, in the field of automatic item generation. In this context, a question that has not been explicitly addressed is whether it is possible to embed an item response theory (IRT) based psychometric quality control procedure directly into the process of automatic item generation. Research in this area was conducted using 2 item generators (for the 2 domains of reasoning and spatial ability) that were developed and based on relevant models of cognitive psychology. During the course of the 4 studies reported here, those parts of the generators that check for possible violations of psychometric quality ("constraints") were improved. The main findings indicate that quality control procedures can be embedded in automatic item generators depending on (a) the degree to which the domain to be measured can be structured; (b) item-specific, content-based analyses; and (c) the degree to which the constraints can be implemented in software. Furthermore, beyond the global check of given model fit via IRT, the content-based analysis of items may be valuable in terms of finding such item properties that may lead to violations of psychometric quality.

19.

Testing Differential Item Functioning in Small Samples

William C. M. Belzak 《Multivariate behavioral research》2020,55(5):722-747

Differential item functioning (DIF) is a pernicious statistical issue that can mask true group differences on a target latent construct. A considerable amount of research has focused on evaluating methods for testing DIF, such as using likelihood ratio tests in item response theory (IRT). Most of this research has focused on the asymptotic properties of DIF testing, in part because many latent variable methods require large samples to obtain stable parameter estimates. Much less research has evaluated these methods in small sample sizes despite the fact that many social and behavioral scientists frequently encounter small samples in practice. In this article, we examine the extent to which model complexity (the number of model parameters estimated simultaneously) affects the recovery of DIF in small samples. We compare three models that vary in complexity: logistic regression with sum scores, the 1-parameter logistic IRT model, and the 2-parameter logistic IRT model. We expected that logistic regression with sum scores and the 1-parameter logistic IRT model would more accurately estimate DIF because these models yielded more stable estimates despite being misspecified. Indeed, a simulation study and empirical example of adolescent substance use show that, even when data are generated from (or assumed to follow) a 2-parameter logistic IRT model, using parsimonious models in small samples leads to more powerful tests of DIF while adequately controlling for Type I error. We also provide evidence for minimum sample sizes needed to detect DIF, and we evaluate whether applying corrections for multiple testing is advisable. Finally, we provide recommendations for applied researchers who conduct DIF analyses in small samples.

20.

Analysis of the Reliability of the Leadership Practices Inventory in the Item Response Theory Framework

Hugo Zagorsek, Stanley J. Stough, Marko Jaklic 《International Journal of Selection & Assessment》2006,14(2):180-191

The paper examines the psychometric properties of the leadership practices inventory (LPI) in the framework of item response theory (IRT). The LPI assesses five dimensions (i.e. leadership practices) of transformational leadership and consists of 30 items. IRT is a model-based theory that relates the characteristics of questionnaire items (item parameters) and characteristics of individuals (latent variables) to the probability of choosing each of the response categories. The theory does not assume that the instrument is equally reliable for all levels of the latent variable examined. Samejima's graded response model was used to estimate LPI item characteristics, such as item difficulty and item discrimination power. The results show that some items are redundant in the sense they contribute little to the overall precision of the instrument. Moreover, the LPI seems to be most precise and reliable for respondents with low to medium leadership competence, whereas it becomes increasingly unreliable for high-quality leaders. These findings suggest that the LPI is best used for training and development purposes, but not for leader selection purposes.
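
To make concrete how item parameters and a latent variable determine "the probability of choosing each of the response categories", here is a minimal sketch of the standard logistic form of Samejima's graded response model. It is not taken from the paper: the function name and the parameter values are hypothetical, and the thresholds are assumed to be in increasing order.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one graded item.

    theta      : latent trait level of the respondent
    a          : item discrimination
    thresholds : ordered category boundaries b_1 < b_2 < ... < b_{m-1}
    Returns a list of m probabilities, one per response category.
    """
    # Cumulative curves: probability of responding in category k or higher.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Hypothetical 4-category item, for illustration only.
probs = grm_category_probs(theta=0.0, a=1.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 4) for p in probs])
```

A larger `a` makes the cumulative curves steeper, so the item separates respondents near its thresholds more sharply. Items with small `a` add little information, which is one way an item can turn out to be redundant in the sense the abstract describes.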