Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
The simultaneous and nonparametric estimation of latent abilities and item characteristic curves is considered. The asymptotic properties of ordinal ability estimation and kernel smoothed nonparametric item characteristic curve estimation are investigated under very general assumptions on the underlying item response theory model as both the test length and the sample size increase. A large deviation probability inequality is stated for ordinal ability estimation. The mean squared error of kernel smoothed item characteristic curve estimates is studied and a strong consistency result is obtained showing that the worst case error in the item characteristic curve estimates over all items and ability levels converges to zero with probability equal to one.
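The two ingredients named in this abstract, ordinal ability estimation and kernel smoothing of item responses, can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the paper: abilities are replaced by mid-ranks of the total score, the kernel is Gaussian, and the function names `ordinal_abilities` and `kernel_smoothed_icc`, the bandwidth, and the simulated data are all hypothetical.

```python
import math
import random
from collections import Counter


def ordinal_abilities(score_matrix):
    """Map each examinee's total score to a mid-rank in (0, 1)."""
    totals = [sum(row) for row in score_matrix]
    counts = Counter(totals)
    below, running = {}, 0
    for score in sorted(counts):
        below[score] = running      # examinees with a strictly lower total
        running += counts[score]
    n = len(totals)
    return [(below[t] + 0.5 * counts[t]) / n for t in totals]


def kernel_smoothed_icc(abilities, responses, grid, bandwidth):
    """Nadaraya-Watson estimate of P(correct | ability) on a grid (Gaussian kernel)."""
    estimates = []
    for point in grid:
        weights = [math.exp(-0.5 * ((point - a) / bandwidth) ** 2) for a in abilities]
        weighted_hits = sum(w * y for w, y in zip(weights, responses))
        estimates.append(weighted_hits / sum(weights))
    return estimates


if __name__ == "__main__":
    # Simulated Rasch data: 300 examinees, 15 items (all values are made up).
    rng = random.Random(0)
    difficulties = [-1.5 + 0.2 * j for j in range(15)]
    data = []
    for _ in range(300):
        theta = rng.gauss(0.0, 1.0)
        data.append([int(rng.random() < 1 / (1 + math.exp(-(theta - b))))
                     for b in difficulties])
    ranks = ordinal_abilities(data)
    first_item = [row[0] for row in data]
    print(kernel_smoothed_icc(ranks, first_item, [0.1, 0.5, 0.9], bandwidth=0.1))
```

The estimated curve is expressed on the rank scale rather than the latent scale, which is why only the ordering of examinees needs to be estimated well.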

2.
The problem of characterizing the manifest probabilities of a latent trait model is considered. The item characteristic curve is transformed to the item passing-odds curve and a corresponding transformation is made on the distribution of ability. This results in a useful expression for the manifest probabilities of any latent trait model. The result is then applied to give a characterization of the Rasch model as a log-linear model for a 2^J contingency table. Partial results are also obtained for other models. The question of the identifiability of “guessing” parameters is also discussed. The research reported here is collaborative in every respect and the order of authorship is alphabetical. Dr. Cressie was a Visiting Research Scientist at ETS during the Fall of 1980. His current address is: School of Mathematical Sciences, The Flinders University of South Australia, Bedford Park SA, 5042, AUSTRALIA. The preparation of this paper was supported, in part, by the Program Statistics Research Project in the Research Statistics Group at ETS.

3.
A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.
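As a numerical illustration of the monotonicity claim, the sketch below computes the expected item score under the graded response model and checks on a grid that the true score τ (the sum of expected item scores) increases with θ. It is a spot check, not the paper's proof, and it assumes positive discriminations and increasing thresholds; the item parameters and function names are hypothetical.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, thresholds):
    """Graded response model: P(X = k | theta) for k = 0..m."""
    # Cumulative curves P(X >= k), padded with P(X >= 0) = 1 and P(X >= m+1) = 0.
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def expected_item_score(theta, a, thresholds):
    """E[X | theta], which equals the sum of the cumulative curves P(X >= k), k >= 1."""
    probs = grm_category_probs(theta, a, thresholds)
    return sum(k * p for k, p in enumerate(probs))


def true_score(theta, items):
    """Test characteristic curve: tau(theta) summed over (a, thresholds) items."""
    return sum(expected_item_score(theta, a, th) for a, th in items)


# Hypothetical item parameters, used only for illustration.
ITEMS = [(1.2, [-1.0, 0.0, 1.5]), (0.8, [-0.5, 0.7]), (1.7, [0.2])]
GRID = [-4.0 + 0.25 * i for i in range(33)]
TAUS = [true_score(t, ITEMS) for t in GRID]
```

Because each cumulative curve is a logistic function with positive slope, their sum is strictly increasing, which is what the grid values in `TAUS` reflect.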

4.
In this study, we contrast results from two differential item functioning (DIF) approaches (manifest and latent class) by the number of items and sources of items identified as DIF using data from an international reading assessment. The latter approach yielded three latent classes, presenting evidence of heterogeneity in examinee response patterns. It also yielded more DIF items with larger effect sizes and more consistent item response patterns by substantive aspects (e.g., reading comprehension processes and cognitive complexity of items). Based on our findings, we suggest empirically evaluating the homogeneity assumption in international assessments because international populations cannot be assumed to have homogeneous item response patterns. Otherwise, differences in response patterns within these populations may be under-detected when conducting manifest DIF analyses. Detecting differences in item responses across international examinee populations has implications on the generalizability and meaningfulness of DIF findings as they apply to heterogeneous examinee subgroups.

5.
Assessment of irrational beliefs by such measures as the Common Beliefs Survey III (CBS) has traditionally relied upon classical test theory assumptions, in which the properties of specific test items are less important than the total test score as the aggregate of all item responses. An alternative approach using item response theory (IRT) methodology allows one to specify the parameters of difficulty and discrimination for each test item. Difficulty levels of CBS items range along a continuum of irrationality, the implied latent trait measured by responses to the questionnaire as a whole. We evaluated the CBS responses of 605 individuals from clinical and college settings, drawing from current and archival data. The original Likert scale ratings were recoded into dichotomous scores. Fourteen of the 54 items were highly or very highly discriminating in distinguishing respondents with high and low irrationality levels. However, discriminating items exhibited a very narrow range of difficulty; most functioned at a point a little above the halfway mark on the continuum of irrationality. Item characteristic curves and test information curves were very similar for female (n = 424) and male (n = 179) respondents. We derived a 4-item screening test for irrationality from our IRT analyses of the 54 CBS items. Further test development, focused on the selection and scaling of items with a much broader range of difficulty, would facilitate evaluation of the hierarchical structure of irrational beliefs. Portions of this paper were presented at the 39th Annual Convention of the Association for Behavioral and Cognitive Therapies, Washington, DC, November, 2005.

6.
The Dutch Identity: A new tool for the study of item response models   Total citations: 1 (self-citations: 0, citations by others: 1)
The Dutch Identity is a useful way to reexpress the basic equations of item response models that relate the manifest probabilities to the item response functions (IRFs) and the latent trait distribution. The identity may be exploited in several ways. For example: (a) to suggest how item response models behave for large numbers of items: they are approximate submodels of second-order loglinear models for 2^J tables; (b) to suggest new ways to assess the dimensionality of the latent trait: principal components analysis of matrices composed of second-order interactions from loglinear models; (c) to give insight into the structure of latent class models; and (d) to illuminate the problem of identifying the IRFs and the latent trait distribution from sample data. This research was supported in part by contract number N00014-87-K-0730 from the Cognitive Science Program of the Office of Naval Research. I realized the usefulness of the identity in Theorem 1 while lecturing in the Netherlands during October, 1986. Because this was in no small part due to the stimulating psychometric atmosphere there, I call the result the Dutch Identity.

7.
Constant latent odds-ratios models and the Mantel-Haenszel null hypothesis   Total citations: 1 (self-citations: 0, citations by others: 1)
In the present paper, a new family of item response theory (IRT) models for dichotomous item scores is proposed. Two basic assumptions define the most general model of this family. The first assumption is local independence of the item scores given a unidimensional latent trait. The second assumption is that the odds-ratios for all item-pairs are constant functions of the latent trait. Since the latter assumption is characteristic of the whole family, the models are called constant latent odds-ratios (CLORs) models. One nonparametric special case and three parametric special cases of the general CLORs model are shown to be generalizations of the one-parameter logistic Rasch model. For all CLORs models, the total score (the unweighted sum of the item scores) is shown to be a sufficient statistic for the latent trait. In addition, conditions under the general CLORs model are studied for the investigation of differential item functioning (DIF) by means of the Mantel-Haenszel procedure. This research was supported by the Dutch Organization for Scientific Research (NWO), grant number 400-20-026.
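For readers unfamiliar with the Mantel-Haenszel procedure mentioned here, the sketch below computes the standard Mantel-Haenszel common odds ratio across strata, where each stratum would typically be one total-score level. The Mantel-Haenszel null hypothesis corresponds to an odds ratio of 1 in every stratum. The tuple layout and the example counts are assumptions made for illustration, not data from the paper.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio over 2x2 tables.

    Each stratum is (a, b, c, d):
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue                  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no stratum has b * c > 0")
    return numerator / denominator


# Invented counts for three total-score strata.
NO_DIF = [(10, 10, 5, 5), (8, 4, 6, 3), (9, 3, 12, 4)]
UNIFORM_DIF = [(20, 10, 10, 10), (30, 10, 15, 10)]
```

In `NO_DIF` the two groups have equal odds of success within every stratum, so the estimate is 1; in `UNIFORM_DIF` the reference group has twice the odds in each stratum, so the estimate is 2.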

8.
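This paper first analyzes the limitations of classical test theory. It then explains the measurement principles of item response theory and its basic models on the basis of two core concepts, latent trait theory and the item characteristic curve, and introduces several basic item response theory models. Finally, it briefly introduces seven current hot topics in applying item response theory to guide the construction of large item banks and the development of various new types of tests.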

9.
In analyzing responses and response times to personality questionnaire items, models have been proposed which include the so-called “inverted-U effect.” These models predict that response times to personality test items decrease as the latent trait value of a given person gets closer to the attractiveness of an item. Initial studies into these models have focused on dichotomous personality items, and more recently, models for Likert-type scale items have been proposed. In all these models, it is assumed that the inverted-U effect is symmetrical around 0, while, as will be explained in this article, there are substantive and statistical reasons to study this assumption. Therefore, in this article, a general inverted-U model is proposed which accommodates two sources of asymmetry between the response times and the attractiveness of the items. The viability of this model is demonstrated in a simulation study, and the model is applied to the responses and response times of the Temperament and Character Inventory–Revised, covering a broad range of personality dimensions.

10.
Edward H. Ip. Psychometrika, 2002, 67(3): 367-386
In this paper, we propose a class of locally dependent latent trait models for responses to psychological and educational tests. Typically, item response models treat an individual's multiple responses to stimuli as conditionally independent given the individual's latent trait. In this paper, instead the focus is on models based on a family of conditional distributions, or kernel, that describes joint multiple item responses as a function of student latent trait, not assuming conditional independence. Specifically, we examine a hybrid kernel which comprises a component for one-way item response functions and a component for conditional associations between items given latent traits. The class of models allows the extension of item response theory to cover some new and innovative applications in psychological and educational research. An EM algorithm for marginal maximum likelihood of the hybrid kernel model is proposed. Furthermore, we delineate the relationship of the class of locally dependent models and the log-linear model by revisiting the Dutch identity (Holland, 1990). The work is supported by a research grant from the Marshall School of Business, University of Southern California. The author thanks the anonymous referees for their suggestions.

11.
Test items are often evaluated and compared by contrasting the shapes of their item characteristic curves (ICCs) or surfaces. The current paper develops and applies three general (i.e., nonparametric) comparisons of the shapes of two item characteristic surfaces: (i) proportional latent odds, (ii) uniform relative difficulty, and (iii) item sensitivity. Two items may be compared in these ways while making no assumption about the shapes of item characteristic surfaces for other items, and no assumption about the dimensionality of the latent variable. Also studied is a method for comparing the relative shapes of two item characteristic curves in two examinee populations. The author is grateful to Paul Holland, Robert Mislevy, Tue Tjur, Rebecca Zwick, the editor and reviewers for valuable comments on the subject of this paper, to Mari A. Pearlman for advice on the pairing of items in the examples, and to Dorothy Thayer for assistance with computing.

12.
A central assumption that is implicit in estimating item parameters in item response theory (IRT) models is the normality of the latent trait distribution, whereas a similar assumption made in categorical confirmatory factor analysis (CCFA) models is the multivariate normality of the latent response variables. Violation of the normality assumption can lead to biased parameter estimates. Although previous studies have focused primarily on unidimensional IRT models, this study extended the literature by considering a multidimensional IRT model for polytomous responses, namely the multidimensional graded response model. Moreover, this study is one of few studies that specifically compared the performance of full-information maximum likelihood (FIML) estimation versus robust weighted least squares (WLS) estimation when the normality assumption is violated. The research also manipulated the number of nonnormal latent trait dimensions. Results showed that FIML consistently outperformed WLS when there were one or multiple skewed latent trait distributions. More interestingly, the bias of the discrimination parameters was non-ignorable only when the corresponding factor was skewed. Having other skewed factors did not further exacerbate the bias, whereas biases of boundary parameters increased as more nonnormal factors were added. The item parameter standard errors recovered well with both estimation algorithms regardless of the number of nonnormal dimensions.

13.
The aim of latent variable selection in multidimensional item response theory (MIRT) models is to identify latent traits probed by test items of a multidimensional test. In this paper the expectation model selection (EMS) algorithm proposed by Jiang et al. (2015) is applied to minimize the Bayesian information criterion (BIC) for latent variable selection in MIRT models with a known number of latent traits. Under mild assumptions, we prove the numerical convergence of the EMS algorithm for model selection by minimizing the BIC of observed data in the presence of missing data. For the identification of MIRT models, we assume that the variances of all latent traits are unity and each latent trait has an item that is only related to it. Under this identifiability assumption, the convergence of the EMS algorithm for latent variable selection in the multidimensional two-parameter logistic (M2PL) models can be verified. We give an efficient implementation of the EMS for the M2PL models. Simulation studies show that the EMS outperforms the EM-based L1 regularization in terms of correctly selected latent variables and computation time. The EMS algorithm is applied to a real data set related to the Eysenck Personality Questionnaire.

14.
A Rasch model for partial credit scoring   Total citations: 24 (self-citations: 0, citations by others: 24)
A unidimensional latent trait model for responses scored in two or more ordered categories is developed. This “Partial Credit” model is a member of the family of latent trait models which share the property of parameter separability and so permit “specifically objective” comparisons of persons and items. The model can be viewed as an extension of Andrich's Rating Scale model to situations in which ordered response alternatives are free to vary in number and structure from item to item. The difference between the parameters in this model and the “category boundaries” in Samejima's Graded Response model is demonstrated. An unconditional maximum likelihood procedure for estimating the model parameters is developed. Preparation of this paper was supported by grants from the Spencer Foundation and the National Institute for Justice. I would like to thank Professor Benjamin D. Wright of the University of Chicago for his very kind help with the various drafts of this paper.
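The category probabilities of the Partial Credit model can be sketched in a few lines. The code below uses the usual textbook form, in which the log-odds of scoring in category k rather than k - 1 equal θ minus the k-th step difficulty; the function name `pcm_probs` and the parameter values are illustrative assumptions, and the estimation procedure described in the abstract is not reproduced.

```python
import math


def pcm_probs(theta, step_difficulties):
    """P(X = k | theta) for k = 0..m under the Partial Credit model."""
    # Exponent for category k is the sum of (theta - delta_j) over j = 1..k;
    # the empty sum for k = 0 is 0.
    exponents = [0.0]
    for delta in step_difficulties:
        exponents.append(exponents[-1] + (theta - delta))
    largest = max(exponents)                      # subtract for numerical stability
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Invented step difficulties for a four-category item.
EXAMPLE_PROBS = pcm_probs(1.0, [-0.5, 0.3, 1.2])
```

With a single step the model reduces to the dichotomous Rasch model, and each step difficulty is the point on the θ scale where two adjacent categories are equally likely. This differs from a graded response model boundary, which is defined through cumulative probabilities.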

15.
Most item response theory (IRT) models for dichotomous responses are based on probit or logit link functions which assume a symmetric relationship between the probability of a correct response and the latent traits of individuals taking a test. This assumption restricts the use of those models to the case in which all items behave symmetrically. On the other hand, asymmetric models proposed in the literature impose that all the items in a test behave asymmetrically. This assumption is inappropriate for the great majority of tests which are, in general, composed of both symmetric and asymmetric items. Furthermore, a straightforward extension of the existing models in the literature would require a prior selection of the items' symmetry/asymmetry status. This paper proposes a Bayesian IRT model that accounts for symmetric and asymmetric items in a flexible but parsimonious way. That is achieved by assigning a finite mixture prior to the skewness parameter, with one of the mixture components being a point mass at zero. This allows for analyses under both model selection and model averaging approaches. Asymmetric item curves are designed through the centred skew normal distribution, which has a particularly appealing parametrization in terms of parameter interpretation and computational efficiency. An efficient Markov chain Monte Carlo algorithm is proposed to perform Bayesian inference and its performance is investigated in some simulated examples. Finally, the proposed methodology is applied to a data set from a large-scale educational exam in Brazil.

16.
The comparative format used in ranking and paired comparisons tasks can significantly reduce the impact of uniform response biases typically associated with rating scales. Thurstone's (1927, 1931) model provides a powerful framework for modeling comparative data such as paired comparisons and rankings. Although Thurstonian models are generally presented as scaling models, that is, stimuli-centered models, they can also be used as person-centered models. In this article, we discuss how Thurstone's model for comparative data can be formulated as item response theory models so that respondents' scores on underlying dimensions can be estimated. Item parameters and latent trait scores can be readily estimated using a widely used statistical modeling program. Simulation studies show that item characteristic curves can be accurately estimated with as few as 200 observations and that latent trait scores can be recovered to a high precision. Empirical examples are given to illustrate how the model may be applied in practice and to recommend guidelines for designing ranking and paired comparisons tasks in the future.

17.
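Luo Fen, Wang Xiaoqing, Cai Yan, Tu Dongbo. Acta Psychologica Sinica (《心理学报》), 2020, 52(12): 1452-1465
The results of a dual-objective CD-CAT can be used for both formative and summative assessment. The Gini index measures the degree of uncertainty of a random variable: the smaller its value, the lower the uncertainty. This paper uses the Gini index to measure changes in the posterior probabilities of examinees' knowledge-state classes and of the confidence interval of the ability estimate, and proposes an item selection strategy based on the Gini index. Monte Carlo experiments show that, compared with existing item selection strategies, the new strategy achieves higher knowledge-state classification accuracy and higher ability estimation accuracy, balances item bank usage effectively, responds quickly in real time, and is little affected by the cognitive diagnosis model or by the distribution of examinees' knowledge states. It can therefore be used in operational tests with mixed item banks containing multiple cognitive diagnosis models.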
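The Gini index referred to in this abstract can be illustrated with a short sketch. The index of a discrete distribution is one minus the sum of squared probabilities, so it is 0 when all mass sits on one knowledge state. The item selection rule shown here, picking the item with the smallest expected posterior Gini index, is a simplified assumption for illustration and should not be read as the authors' exact strategy; all function names and numbers are hypothetical.

```python
def gini_index(probs):
    """Gini index of a discrete distribution: 1 - sum of squared probabilities."""
    return 1.0 - sum(p * p for p in probs)


def posterior(prior, likelihoods):
    """Bayes update of knowledge-state probabilities given one response."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_posterior_gini(prior, p_correct):
    """Expected Gini index after observing the response to one candidate item.

    p_correct[s] is the (assumed known) probability of a correct answer
    for an examinee in knowledge state s.
    """
    p_right = sum(p * q for p, q in zip(prior, p_correct))
    expected = 0.0
    if p_right > 0:
        expected += p_right * gini_index(posterior(prior, p_correct))
    if p_right < 1:
        wrong = [1.0 - q for q in p_correct]
        expected += (1.0 - p_right) * gini_index(posterior(prior, wrong))
    return expected


def select_item(prior, item_bank):
    """Index of the item whose response is expected to leave the least uncertainty."""
    return min(range(len(item_bank)),
               key=lambda i: expected_posterior_gini(prior, item_bank[i]))


# Two knowledge states and two candidate items (made-up response probabilities).
PRIOR = [0.5, 0.5]
ITEM_BANK = [[0.6, 0.6],   # uninformative: same success rate in both states
             [0.9, 0.1]]   # informative: separates the two states
```

In this toy bank the second item is selected, because its response is expected to concentrate the posterior on one knowledge state.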

18.
Higher-order latent trait models for cognitive diagnosis   Total citations: 9 (self-citations: 0, citations by others: 9)
Higher-order latent traits are proposed for specifying the joint distribution of binary attributes in models for cognitive diagnosis. This approach results in a parsimonious model for the joint distribution of a high-dimensional attribute vector that is natural in many situations when specific cognitive information is sought but a less informative item response model would be a reasonable alternative. This approach stems from viewing the attributes as the specific knowledge required for examination performance, and modeling these attributes as arising from a broadly-defined latent trait resembling the ϑ of item response models. In this way a relatively simple model for the joint distribution of the attributes results, which is based on a plausible model for the relationship between general aptitude and specific knowledge. Markov chain Monte Carlo algorithms for parameter estimation are given for selected response distributions, and simulation results are presented to examine the performance of the algorithm as well as the sensitivity of classification to model misspecification. An analysis of fraction subtraction data is provided as an example. This research was funded by National Institute of Health grant R01 CA81068. We would like to thank William Stout and Sarah Hartz for many useful discussions, three anonymous reviewers for helpful comments and suggestions, and Kikumi Tatsuoka and Curtis Tatsuoka for generously sharing data.

19.
A definition of essential independence is proposed for sequences of polytomous items. For items satisfying the reasonable assumption that the expected amount of credit awarded increases with examinee ability, we develop a theory of essential unidimensionality which closely parallels that of Stout. Essentially unidimensional item sequences can be shown to have a unique (up to change-of-scale) dominant underlying trait, which can be consistently estimated by a monotone transformation of the sum of the item scores. In more general polytomous-response latent trait models (with or without ordered responses), an M-estimator based upon maximum likelihood may be shown to be consistent for θ under essentially unidimensional violations of local independence and a variety of monotonicity/identifiability conditions. A rigorous proof of this fact is given, and the standard error of the estimator is explored. These results suggest that ability estimation methods that rely on the summation form of the log likelihood under local independence should generally be robust under essential independence, but standard errors may vary greatly from what is usually expected, depending on the degree of departure from local independence. An index of departure from local independence is also proposed. This work was supported in part by Office of Naval Research Grant N00014-87-K-0277 and National Science Foundation Grant NSF-DMS-88-02556. The author is grateful to William F. Stout for many helpful comments, and to an anonymous reviewer for raising the questions addressed in section 2. A preliminary version of section 6 appeared in the author's Ph.D. thesis.

20.
Usually, methods for detection of differential item functioning (DIF) compare the functioning of items across manifest groups. However, the manifest groups with respect to which the items function differentially may not necessarily coincide with the true source of the bias. It is expected that DIF detection under a model that includes a latent DIF variable is more sensitive to this source of bias. In a simulation study, it is shown that a mixture item response theory model, which includes a latent grouping variable, performs better in identifying DIF items than DIF detection methods using manifest variables only. The difference between manifest and latent DIF detection increases as the correlation between the manifest variable and the true source of the DIF becomes smaller. Different sample sizes, relative group sizes, and significance levels are studied. Finally, an empirical example demonstrates the detection of heterogeneity in a minority sample using a latent grouping variable. Manifest and latent DIF detection methods are applied to a Vocabulary test of the General Aptitude Test Battery (GATB).
