Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Cross-informant syndromes (CISs) were derived and cross-validated with sample-dependent methodology by surveying children in 10 nations with the parent, teacher, and Youth Self-Report (YSR) forms of the Child Behavior Checklist (CBCL). Generalizing CBCL syndromes and norms to nations excluded from its normative sample is problematic. This study used confirmatory factor analyses (CFAs) to test factor model fit for CISs on the YSR responses of 625 Jamaican children ages 11 to 18 years. Item response theory (IRT), a sample-independent methodology, was used to estimate the psychometric properties of individual items on each dimension. CFAs indicated poor to moderate model-to-data fit. Across all syndromes, IRT analyses revealed that more than three quarters of the cross-informant items yielded little information. Eliminating such items could be cost effective in terms of administration time yet improve the measure's discrimination across syndrome severity levels.

2.
The present study examined measurement equivalence of the Satisfaction with Life Scale between American and Chinese samples using multigroup structural equation modeling (SEM), the multiple indicators multiple causes (MIMIC) model, and item response theory (IRT). Whereas SEM and MIMIC identified only one biased item across cultures, the IRT analysis revealed that four of the five items had differential item functioning. According to IRT, Chinese respondents whose latent life satisfaction scores were quite high did not endorse items such as “So far I have gotten the important things I want in life” and “If I could live my life over, I would change almost nothing.” The IRT analysis also showed that even when the unbiased items were weighted more heavily than the biased items, the latent mean life satisfaction score of the Chinese sample was substantially lower than that of the American sample. The differences among SEM, MIMIC, and IRT are discussed.

3.
4.
In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.

5.
Most item response theory (IRT) models for dichotomous responses are based on probit or logit link functions, which assume a symmetric relationship between the probability of a correct response and the latent traits of individuals taking a test. This assumption restricts the use of those models to the case in which all items behave symmetrically. On the other hand, asymmetric models proposed in the literature require that all the items in a test behave asymmetrically. This assumption is inappropriate for the great majority of tests, which are, in general, composed of both symmetric and asymmetric items. Furthermore, a straightforward extension of the existing models in the literature would require a prior selection of the items' symmetry/asymmetry status. This paper proposes a Bayesian IRT model that accounts for symmetric and asymmetric items in a flexible but parsimonious way. That is achieved by assigning a finite mixture prior to the skewness parameter, with one of the mixture components being a point mass at zero. This allows for analyses under both model selection and model averaging approaches. Asymmetric item curves are designed through the centred skew normal distribution, which has a particularly appealing parametrization in terms of parameter interpretation and computational efficiency. An efficient Markov chain Monte Carlo algorithm is proposed to perform Bayesian inference, and its performance is investigated in some simulated examples. Finally, the proposed methodology is applied to a data set from a large-scale educational exam in Brazil.

6.
Over the past decade, Mokken scale analysis (MSA) has rapidly grown in popularity among researchers from many different research areas. This tutorial provides researchers with a set of techniques and a procedure for their application, such that the construction of scales that have superior measurement properties is further optimized, taking full advantage of the properties of MSA. First, we define the conceptual context of MSA, discuss the two item response theory (IRT) models that constitute the basis of MSA, and discuss how these models differ from other IRT models. Second, we discuss dos and don'ts for MSA; the don'ts include misunderstandings we have frequently encountered with researchers in our three decades of experience with real‐data MSA. Third, we discuss a methodology for MSA on real data that consist of a sample of persons who have provided scores on a set of items that, depending on the composition of the item set, constitute the basis for one or more scales, and we use the methodology to analyse an example real‐data set.

7.
The main aim of this article is to explicate why a transition to ideal point methods of scale construction is needed to advance the field of personality assessment. The study empirically demonstrated the substantive benefits of ideal point methodology as compared with the dominance framework underlying traditional methods of scale construction. Specifically, using a large, heterogeneous pool of order items, the authors constructed scales using traditional classical test theory, dominance item response theory (IRT), and ideal point IRT methods. The merits of each method were examined in terms of item pool utilization, model-data fit, measurement precision, and construct and criterion-related validity. Results show that adoption of the ideal point approach provided a more flexible platform for creating future personality measures, and this transition did not adversely affect the validity of personality test scores.

8.
Using SAS PROC NLMIXED to fit item response theory models   Cited by: 1 (self-citations: 0; citations by others: 1)
Researchers routinely construct tests or questionnaires containing a set of items that measure personality traits, cognitive abilities, political attitudes, and so forth. Typically, responses to these items are scored in discrete categories, such as points on a Likert scale or a choice out of several mutually exclusive alternatives. Item response theory (IRT) explains observed responses to items on a test (questionnaire) by a person’s unobserved trait, ability, or attitude. Although applications of IRT modeling have increased considerably because of its utility in developing and assessing measuring instruments, IRT modeling has not been fully integrated into the curriculum of colleges and universities, mainly because existing general purpose statistical packages do not provide built-in routines with which to perform IRT modeling. Recent advances in statistical theory and the incorporation of those advances into general purpose statistical software such as the Statistical Analysis System (SAS) allow researchers to analyze measurement data by using a class of models known as generalized linear mixed effects models (McCulloch & Searle, 2001), which include IRT models as special cases. The purpose of this article is to demonstrate the generality and flexibility of using SAS to estimate IRT model parameters. With real data examples, we illustrate the implementations of a variety of IRT models for dichotomous, polytomous, and nominal responses. Since SAS is widely available in educational institutions, it is hoped that this article will contribute to the spread of IRT modeling in quantitative courses.
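As a rough illustration of how IRT links an observed response to an unobserved trait, the following Python sketch evaluates a two-parameter logistic (2PL) item response function. It is not taken from the article, which works in SAS; the function name and the parameter values are assumptions chosen only for illustration.

```python
import math


def irt_2pl_probability(theta: float, a: float, b: float) -> float:
    """Probability of a correct (or endorsed) response under a 2PL model.

    theta: latent trait of the person
    a:     item discrimination (hypothetical value supplied by the caller)
    b:     item difficulty (hypothetical value supplied by the caller)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: discrimination 1.2, difficulty 0.0.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(irt_2pl_probability(theta, a=1.2, b=0.0), 3))
```

When the trait equals the item difficulty the probability is exactly one half, and it rises monotonically with the trait; a larger discrimination makes that rise steeper.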

9.
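Validity of different IRT models in the analysis of Likert scales   Cited by: 4 (self-citations: 1; citations by others: 3)
A 5-point Likert scale can be analyzed directly, converted to 3-category scoring, or converted to dichotomous scoring. The first two can be analyzed with graded (polytomous) IRT models, and the last with dichotomous IRT models. The study shows that, among the dichotomous IRT models, the 2-parameter model is the most suitable. The polytomous scoring models also fit the data well, and the more response categories there are, the higher the measurement precision.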

10.
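Testlets are a common item format in many tests. Because the items within a testlet depend on one another to some degree, they violate the local independence assumption, and estimating parameters with an item response model can then produce substantial bias. Testlet response theory incorporates the interaction between examinee and testlet into the model, which resolves the problem of dependence among items. The author reviews the development, basic principles, and related research of testlet response theory and applies it to a secondary school English examination. Comparison with item response theory showed that: (1) the parameter estimates from the testlet response model and the item response model were strongly correlated, especially for the ability and difficulty parameters; (2) the confidence intervals from the testlet response model were narrower than those from the item response model for every parameter, meaning that the testlet response model estimated the parameters more precisely.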

11.
The use of multidimensional forced-choice (MFC) items to assess non-cognitive traits such as personality, interests and values in psychological tests has a long history, because MFC items show strengths in preventing response bias. Recently, there has been a surge of interest in developing item response theory (IRT) models for MFC items. However, nearly all of the existing IRT models have been developed for MFC items with binary scores. Real tests use MFC items with more than two categories; such items are more informative than their binary counterparts. This study developed a new IRT model for polytomous MFC items based on the cognitive model of choice, which describes the cognitive processes underlying humans' preferential choice behaviours. The new model is unique in its ability to account for the ipsative nature of polytomous MFC items, to assess individual psychological differentiation in interests, values and emotions, and to compare the differentiation levels of latent traits between individuals. Simulation studies were conducted to examine the parameter recovery of the new model with existing computer programs. The results showed that both statement parameters and person parameters were well recovered when the sample size was sufficient. The more complete the linking of the statements was, the more accurate the parameter estimation was. This paper provides an empirical example of a career interest test using four-category MFC items. Although some aspects of the model (e.g., the nature of the person parameters) require additional validation, our approach appears promising.

12.
Item response theory (IRT) analyses have, over the past 3 decades, added much to our understanding of the relationships among and characteristics of test items, as revealed in examinees' response patterns. Assessment instruments used outside the educational context have only infrequently been analyzed using IRT, however. This study demonstrates the relevance of IRT to personality data through analyses of Scale 2 (the Depression Scale) on the revised Minnesota Multiphasic Personality Inventory (MMPI-2). A rich set of hypotheses regarding the items on this scale, including contrasts among the Harris-Lingoes and Wiener-Harmon subscales and differences in the items' measurement characteristics for men and women, are investigated through the IRT analyses.

13.
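Cognitive element response theory: applying IRT directly to polytomously scored items   Cited by: 1 (self-citations: 0; citations by others: 1)
缪源 (Miao Yuan), 李绍珠 (Li Shaozhu). 《心理科学》 (Psychological Science), 2000, 23(2): 196-199
Item response theory for 0-1 scored tests has been widely studied and applied. Many tests, however, contain polytomously scored items, so IRT needs to be extended to this case. From the viewpoint of cognitive theory, each 0-1 scored item and each test point of a polytomously scored item can equally be regarded as a set of knowledge points, called a cognitive element. The relationships among cognitive elements determine the probability that each examinee gives a particular answer to each item, so IRT methods can be applied directly to tests containing polytomously scored items without invoking any additional assumptions. This paper applies the theory to several test samples, and the results show that it is feasible.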

14.
Item response theory (IRT) methods were applied to items from the 80-item Psychological Inventory of Criminal Thinking Styles (PICTS; G. D. Walters, 1995) to determine how well they measure the latent trait of criminal thinking in a group of 2,872 male medium security prison inmates. Preliminary analyses revealed that the 64 PICTS thinking style items, 32 PICTS proactive criminal thinking items, and 24 PICTS reactive criminal thinking items were sufficiently unidimensional to meet the local independence requirements of IRT. The PICTS was fitted to a 2-parameter logistic-graded response IRT model, the results of which showed that the 8 items measuring denial of harm (Sentimentality) displayed weak discrimination (a < 0.5), whereas most of the proactive and reactive items displayed moderate to good discrimination (a > 1.0). Information function analysis revealed that all 3 components of a hierarchical model of criminal thinking--PICTS total scale, PICTS proactive factor, and PICTS reactive factor--displayed greater precision at higher rather than lower levels of the trait dimension. The study findings indicate that items from the PICTS Sentimentality scale do a poor job of measuring general criminal thinking, whereas items from the other 7 PICTS thinking style scales provide their most precise estimates at the upper end of the trait dimension.
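To make the link between discrimination and precision concrete, here is a minimal Python sketch of the item information function. It assumes a dichotomous 2PL item as a simplification; the study itself fitted a graded response model, so this is not the authors' computation. The function names and the difficulty value are hypothetical; only the thresholds a < 0.5 and a > 1.0 echo the abstract.

```python
import math


def p_2pl(theta: float, a: float, b: float) -> float:
    """2PL probability of endorsing an item (dichotomous simplification)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical items sharing difficulty b = 1.0 but differing in discrimination.
weak = item_information(1.0, a=0.4, b=1.0)    # a below the 0.5 mark
strong = item_information(1.0, a=1.5, b=1.0)  # a above the 1.0 mark
print(weak, strong)
```

Under this simplification the information peaks at theta = b with value a^2 / 4, so a weakly discriminating item contributes little precision anywhere on the trait scale, and an item with a high difficulty is most informative at high trait levels.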

15.
Statistical methods designed for categorical data were used to perform confirmatory factor analyses and item response theory (IRT) analyses of the Fear of Negative Evaluation scale (FNE; D. Watson & R. Friend, 1969) and the Brief FNE (BFNE; M. R. Leary, 1983). Results suggested that a 2-factor model fit the data better for both the FNE and the BFNE, although the evidence was less strong for the FNE. The IRT analyses indicated that although both measures had items with good discrimination, the FNE items discriminated only at lower levels of the underlying construct, whereas the BFNE items discriminated across a wider range. Convergent validity analyses indicated that the straightforwardly-worded items on each scale had significantly stronger relationships with theoretically related measures than did the reverse-worded items. On the basis of all analyses, usage of the straightforwardly-worded BFNE factor is recommended for the assessment of fear of negative evaluation.

16.
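Construction of a vocabulary comprehension ability scale for junior high school   Cited by: 4 (self-citations: 2; citations by others: 2)
曹亦薇 (Cao Yiwei). 《心理学报》 (Acta Psychologica Sinica), 1999, 32(2): 215-221
Item response theory was applied to construct vocabulary comprehension tests for each junior high school grade, comprising 143 multiple-choice vocabulary items. After repeated pilot testing and a large-scale formal administration, the scales of the three tests were confirmed to fit the 2PL model: items with well-fitting item characteristic curves made up more than 90% of all items, and the unidimensionality of ability was also confirmed. After equating, the mean discrimination was 0.61 (first year), 0.59 (second year), and 0.55 (third year), and the mean difficulty was -1.61, -1.30, and -0.56, respectively.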

17.
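蔡艳 (Cai Yan), 丁树良 (Ding Shuliang), 涂冬波 (Tu Dongbo), 戴海琦 (Dai Haiqi). 《心理科学》 (Psychological Science), 2012, 35(6): 1497-1501
Traditionally, group assessment has been based on the average of individual assessment results. Group-level IRT, by contrast, can bypass the assessment of individuals and assess groups directly, and it has many advantages that traditional methods can hardly match. This paper applies a group-level IRT (GIRT) model to assess the English reading comprehension ability of 410 schools in the 2007 college entrance examination of one province. The results show that the reading comprehension abilities of almost all 410 schools fall within the interval [-1, 1], with no schools of extremely high or low ability. For these schools, all items in the test were relatively easy and had moderate discrimination. All of the assessment results were significantly correlated with those from the IRT model at the [level not given] level, indicating that the GIRT model is a viable option for group assessment in practice.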

18.
This study used an item response theory (IRT) model, specifically the Rasch model, to construct a new observational scale for the assessment of pain in infants less than 36 months old. Results showed that the Rasch model is well suited to building an assessment instrument for postoperative pain in this particular age group. Indeed, based on this methodology, 21 dichotomous items were selected that suit the assessment of postoperative pain in infants aged three to 24 months. Despite some limitations, this study makes a convincing case for the use of the Rasch model in instrument design and allowed us to highlight some developmental and behavioral dimensions of the postoperative pain experience in infants.

19.
In a broad class of item response theory (IRT) models for dichotomous items the unweighted total score has monotone likelihood ratio (MLR) in the latent trait. In this study, it is shown that for polytomous items MLR holds for the partial credit model and a trivial generalization of this model. MLR does not necessarily hold if the slopes of the item step response functions vary over items, item steps, or both. MLR holds neither for Samejima's graded response model, nor for nonparametric versions of these three polytomous models. These results are surprising in the context of Grayson's and Huynh's results on MLR for nonparametric dichotomous IRT models, and suggest that establishing stochastic ordering properties for nonparametric polytomous IRT models will be much harder. Hemker's research was supported by the Netherlands Research Council, Grant 575-67-034. Junker's research was supported in part by the National Institutes of Health, Grant CA54852, and by the National Science Foundation, Grant DMS-94.04438.
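For readers unfamiliar with the partial credit model named in this abstract, the following Python sketch computes its category probabilities in the standard textbook parametrization. It does not reproduce the paper's MLR proof; the function name and the step difficulties are hypothetical and serve only as an illustration.

```python
import math


def pcm_category_probabilities(theta: float, step_difficulties: list[float]) -> list[float]:
    """Category probabilities of one item under the partial credit model.

    For an item with m steps, the score x in 0..m has probability
    proportional to exp(sum_{k=1..x} (theta - delta_k)), where the empty
    sum for x = 0 is zero.
    """
    exponents = [0.0]
    for delta in step_difficulties:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum exponent before exponentiating for numerical stability.
    largest = max(exponents)
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item with step difficulties -0.5 and 0.8.
print(pcm_category_probabilities(0.0, [-0.5, 0.8]))
```

All steps share a unit slope here, which is the condition under which the abstract reports that MLR holds; letting the slope vary over items or steps leaves the partial credit model.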

20.
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The objective was to provide bounds of the likely DIF effects on these measurement consequences. Five factors were manipulated: test length, percentage of DIF items per form, item type, sample size, and level of group ability difference. Results indicate that the greatest DIF effect was less than 2 points on the 0 to 60 total score scale and about 0.15 on the IRT ability scale. DIF had a limited effect on the ratio of true-score variance to observed-score variance, but its influence on the standard error of estimation for the IRT ability parameter was evident for certain ability values.
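The reliability index mentioned in this abstract, the ratio of true-score variance to observed-score variance, can be sketched in a few lines of Python. The scores below are made up for illustration and are not data from the study; the function name is likewise an assumption.

```python
from statistics import pvariance


def reliability_ratio(true_scores: list[float], observed_scores: list[float]) -> float:
    """Ratio of true-score variance to observed-score variance."""
    return pvariance(true_scores) / pvariance(observed_scores)


# Hypothetical true scores and errors chosen so that the errors are
# uncorrelated with the true scores, as classical test theory assumes.
true_scores = [10.0, 20.0, 30.0, 40.0]
errors = [2.0, -2.0, -2.0, 2.0]
observed_scores = [t + e for t, e in zip(true_scores, errors)]

print(reliability_ratio(true_scores, observed_scores))
```

With uncorrelated errors the observed variance is the sum of the true-score and error variances, so the ratio lies between 0 and 1 and shrinks as error variance grows. In a simulation study such as this one the true scores are known by construction, which is what makes the ratio directly computable.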
