Similar literature
1.
Higher-order latent trait models for cognitive diagnosis (total citations: 9; self-citations: 0; citations by others: 9)
Higher-order latent traits are proposed for specifying the joint distribution of binary attributes in models for cognitive diagnosis. This approach results in a parsimonious model for the joint distribution of a high-dimensional attribute vector that is natural in many situations when specific cognitive information is sought but a less informative item response model would be a reasonable alternative. This approach stems from viewing the attributes as the specific knowledge required for examination performance, and modeling these attributes as arising from a broadly defined latent trait resembling the θ of item response models. In this way a relatively simple model for the joint distribution of the attributes results, which is based on a plausible model for the relationship between general aptitude and specific knowledge. Markov chain Monte Carlo algorithms for parameter estimation are given for selected response distributions, and simulation results are presented to examine the performance of the algorithm as well as the sensitivity of classification to model misspecification. An analysis of fraction subtraction data is provided as an example. This research was funded by National Institutes of Health grant R01 CA81068. We would like to thank William Stout and Sarah Hartz for many useful discussions, three anonymous reviewers for helpful comments and suggestions, and Kikumi Tatsuoka and Curtis Tatsuoka for generously sharing data.
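The abstract above describes binary attributes arising from a single continuous trait θ. A minimal sketch of one common way to write such a model follows, assuming a logistic link for each attribute and conditional independence of the attributes given θ. The function names and the intercept and slope values are hypothetical and are not taken from the paper.

```python
import itertools
import math


def attribute_probs(theta, intercepts, slopes):
    """P(alpha_k = 1 | theta) for each attribute under a logistic link."""
    return [1.0 / (1.0 + math.exp(-(b0 + b1 * theta)))
            for b0, b1 in zip(intercepts, slopes)]


def pattern_prob(alpha, theta, intercepts, slopes):
    """Probability of the attribute pattern alpha given theta, assuming
    the attributes are conditionally independent given theta."""
    prob = 1.0
    for a_k, p_k in zip(alpha, attribute_probs(theta, intercepts, slopes)):
        prob *= p_k if a_k == 1 else 1.0 - p_k
    return prob


# Hypothetical parameters for three attributes (illustration only).
intercepts = [-1.0, 0.0, 1.0]
slopes = [1.5, 1.0, 0.5]

# All 2**3 attribute patterns; their probabilities sum to one for any theta.
patterns = list(itertools.product([0, 1], repeat=3))
total = sum(pattern_prob(a, 0.3, intercepts, slopes) for a in patterns)
```

Under these assumptions the joint distribution of K attributes needs only 2K structural parameters instead of the 2^K − 1 required by an unrestricted distribution, which is the sense in which such a model is parsimonious.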

2.
This paper studies three models for cognitive diagnosis, each illustrated with an application to fraction subtraction data. The objective of each of these models is to classify examinees according to their mastery of skills assumed to be required for fraction subtraction. We consider the DINA model, the NIDA model, and a new model that extends the DINA model to allow for multiple strategies of problem solving. For each of these models the joint distribution of the indicators of skill mastery is modeled using a single continuous higher-order latent trait, to explain the dependence in the mastery of distinct skills. This approach stems from viewing the skills as the specific states of knowledge required for exam performance, and viewing these skills as arising from a broadly defined latent trait resembling the θ of item response models. We discuss several techniques for comparing models and assessing goodness of fit. We then implement these methods using the fraction subtraction data with the aim of selecting the best of the three models for this application. We employ Markov chain Monte Carlo algorithms to fit the models, and we present simulation results to examine the performance of these algorithms. The work reported here was performed under the auspices of the External Diagnostic Research Team funded by Educational Testing Service. Views expressed in this paper do not necessarily represent the views of Educational Testing Service.
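For readers unfamiliar with the DINA model named above, the following is a minimal sketch of its item response function in its usual textbook form: an examinee who has mastered every skill an item requires answers correctly unless they slip, and any other examinee answers correctly only by guessing. The Q-matrix row and the slip and guess values are hypothetical and are not taken from the paper.

```python
def dina_prob(alpha, q_row, slip, guess):
    """P(correct response) for one item under the DINA model.

    alpha  : 0/1 mastery indicators for the examinee's skills
    q_row  : 0/1 indicators of which skills the item requires
    slip   : P(incorrect | all required skills mastered)
    guess  : P(correct | at least one required skill missing)
    """
    # eta is True only if every skill the item requires is mastered.
    eta = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if eta else guess


# Hypothetical item requiring the first two of three skills.
q_row = [1, 1, 0]
p_master = dina_prob([1, 1, 0], q_row, slip=0.1, guess=0.2)
p_nonmaster = dina_prob([1, 0, 1], q_row, slip=0.1, guess=0.2)
```

Mastery of a skill the item does not require (the third skill here) has no effect on the response probability.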

3.
Test items are often evaluated and compared by contrasting the shapes of their item characteristic curves (ICCs) or surfaces. The current paper develops and applies three general (i.e., nonparametric) comparisons of the shapes of two item characteristic surfaces: (i) proportional latent odds, (ii) uniform relative difficulty, and (iii) item sensitivity. Two items may be compared in these ways while making no assumption about the shapes of item characteristic surfaces for other items, and no assumption about the dimensionality of the latent variable. Also studied is a method for comparing the relative shapes of two item characteristic curves in two examinee populations. The author is grateful to Paul Holland, Robert Mislevy, Tue Tjur, Rebecca Zwick, the editor and reviewers for valuable comments on the subject of this paper, to Mari A. Pearlman for advice on the pairing of items in the examples, and to Dorothy Thayer for assistance with computing.

4.
A general latent trait model for response processes (total citations: 1; self-citations: 0; citations by others: 1)
The purpose of the current paper is to propose a general multicomponent latent trait model (GLTM) for response processes. The proposed model combines the linear logistic latent trait model (LLTM) with the multicomponent latent trait model (MLTM). As with both LLTM and MLTM, the general multicomponent latent trait model can be used to (1) test hypotheses about the theoretical variables that underlie response difficulty and (2) estimate parameters that describe test items by basic substantive properties. However, GLTM contains both component outcomes and complexity factors in a single model and may be applied to data that neither LLTM nor MLTM can handle. Joint maximum likelihood estimators are presented for the parameters of GLTM and an application to cognitive test items is described. This research was partially supported by the National Institute of Education grant number NIE-6-7-0156 to Susan Embretson (Whitely), principal investigator. However, the opinions expressed herein do not necessarily reflect the position or policy of the National Institute of Education, and no official endorsement by the National Institute of Education should be inferred.

5.
In categorical data analysis, two-sample cross-validation is used not only for model selection but also to obtain a realistic impression of the overall predictive effectiveness of the model. The latter is of particular importance in the case of highly parametrized models capable of capturing every idiosyncrasy of the calibrating sample. We show that for maximum likelihood estimators or other asymptotically efficient estimators Pearson's $X^2$ is not asymptotically chi-square in the two-sample cross-validation framework due to extra variability induced by using different samples for estimation and goodness-of-fit testing. We propose an alternative test statistic, $X^2_{\text{xval}}$, obtained as a modification of $X^2$, which is asymptotically chi-square with $C-1$ degrees of freedom in cross-validation samples. Stochastically, $X^2_{\text{xval}} \le X^2$. Furthermore, the use of $X^2$ instead of $X^2_{\text{xval}}$ with a $\chi^2_{C-1}$ reference distribution may provide an unduly poor impression of fit of the model in the cross-validation sample. This paper is dedicated to the memory of Michael V. Levine. Requests for reprints should be sent to Albert Maydeu-Olivares, Faculty of Psychology, University of Barcelona, P. Valle de Hebrón, 171, 0835 Barcelona, Spain.

6.
    
This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that, for all models and conditions, the recovery of the latent correlation was best when the test consisted entirely of simple-structure items. When a test measured weakly discriminated dimensions, it became harder to recover the latent correlation. Results also showed that increasing the sample size or test length, or using simpler models (i.e., two-parameter logistic rather than three-parameter logistic, compensatory rather than noncompensatory), could improve the recovery of the latent correlation.

7.
In tailored testing, it is important to determine the optimal difficulty of the next item to present to the examinee. This paper shows that the difference that maximizes information for the three-parameter normal ogive response model is approximately 1.7 times the optimal difference for the three-parameter logistic model. Under the normal model, calculation of the optimal difficulty for minimizing the Bayes risk is equivalent to maximizing an associated information function. The views expressed herein are those of the author and do not necessarily reflect those of the Department of the Navy.
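To make the idea of an information-maximizing difference concrete, here is a minimal sketch for the three-parameter logistic model only. It evaluates the Fisher information on a grid and compares the maximizing value of θ − b with the standard closed-form expression for that model. The parameter values, the function names, and the choice of scaling constant D = 1.0 are assumptions made for illustration, and the normal ogive comparison from the abstract is not reproduced here.

```python
import math


def p3pl(theta, a, b, c, D=1.0):
    """Three-parameter logistic probability of a correct response."""
    logistic = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return c + (1.0 - c) * logistic


def info3pl(theta, a, b, c, D=1.0):
    """Fisher information P'(theta)**2 / (P * (1 - P)) for the 3PL model."""
    logistic = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    p = p3pl(theta, a, b, c, D)
    dp = (1.0 - c) * D * a * logistic * (1.0 - logistic)
    return dp * dp / (p * (1.0 - p))


def optimal_difference(a, c, D=1.0):
    """Closed-form theta - b that maximizes 3PL logistic information."""
    return math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / (D * a)


# Hypothetical item parameters (illustration only).
a, b, c = 1.2, 0.0, 0.2

# Grid search over theta in [-3, 3] with step 0.001.
grid = [i / 1000.0 for i in range(-3000, 3001)]
theta_best = max(grid, key=lambda t: info3pl(t, a, b, c))
```

With no guessing (c = 0) the optimal difference is zero, so the most informative item is the one whose difficulty matches the examinee's ability. With c > 0 the most informative item is somewhat easier than that.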

8.
The paper addresses and discusses whether the tradition of accepting point-symmetric item characteristic curves is justified by uncovering the inconsistent relationship between the difficulties of items and the order of maximum likelihood estimates of ability. This inconsistency is intrinsic in models that provide point-symmetric item characteristic curves, and in this paper focus is put on the normal ogive model for observation. It is also questioned if in the logistic model the sufficient statistic has forfeited the rationale that is appropriate to the psychological reality. It is observed that the logistic model can be interpreted as the case in which the inconsistency in ordering the maximum likelihood estimates is degenerated. The paper proposes a family of models, called the logistic positive exponent family, which provides asymmetric item characteristic curves. A model in this family has a consistent principle in ordering the maximum likelihood estimates of ability. The family is divided into two subsets each of which has its own principle, and includes the logistic model as a transition from one principle to the other. Rationale and some illustrative examples are given.
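The asymmetry described above can be illustrated with a short sketch. It assumes the form in which the logistic positive exponent family is commonly written, namely a logistic curve raised to a positive power ξ, so that ξ = 1 recovers the ordinary logistic model. The function names, parameter values, and the symmetry check are hypothetical illustrations rather than the paper's own notation or examples.

```python
import math


def lpe_icc(theta, a, b, xi, D=1.0):
    """ICC of the form [logistic(D * a * (theta - b))] ** xi, with xi > 0."""
    logistic = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return logistic ** xi


def median_point(a, b, xi, D=1.0):
    """Value of theta at which the ICC equals 0.5."""
    psi = 0.5 ** (1.0 / xi)  # logistic value whose xi-th power is 0.5
    return b + math.log(psi / (1.0 - psi)) / (D * a)


def symmetry_gap(a, b, xi, d=1.0):
    """P(theta0 + d) + P(theta0 - d) - 1, where P(theta0) = 0.5.

    The gap is zero for every d when the curve is point-symmetric.
    """
    theta0 = median_point(a, b, xi)
    return lpe_icc(theta0 + d, a, b, xi) + lpe_icc(theta0 - d, a, b, xi) - 1.0


gap_logistic = symmetry_gap(a=1.0, b=0.0, xi=1.0)    # ordinary logistic model
gap_asymmetric = symmetry_gap(a=1.0, b=0.0, xi=2.0)  # exponent different from 1
```

The check relies on the fact that a curve rising from 0 to 1 can only be point-symmetric about the point where it equals 0.5.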

9.
Established results on latent variable models are applied to the study of the validity of a psychological test. When the test predicts a criterion by measuring a unidimensional latent construct, not only must the total score predict the criterion, but the joint distribution of criterion scores and item responses must exhibit a certain pattern. The presence of this population pattern may be tested with sample data using the stratified Wilcoxon rank sum test. Often, criterion information is available only for selected examinees, for instance, those who are admitted or hired. Three cases are discussed: (i) selection at random, (ii) selection based on the current test, and (iii) selection based on other measures of the latent construct. Discriminant validity is also discussed. This work was supported in part by Grant SES-87-01890 from the Measurement Methods and Data Improvement Program of the U.S. National Science Foundation.

10.
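Item response theory (IRT) is a modern measurement theory for measuring examinees' latent traits, and latent class analysis is a model-based technique for classifying latent traits. Mixture item response theory combines IRT with latent class analysis, making it possible to classify examinees and quantify their latent traits at the same time. After laying out the concepts and principles of mixture IRT, this paper introduces several common mixture models, including MRM, mNRM, and mPCM, together with their parameter estimation methods, and reviews how their applications in psychological testing have developed, covering the classification of psychological and behavioral characteristics, the detection of differential item functioning, and the evaluation of test validity.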

11.
Estimating multiple classification latent class models (total citations: 4; self-citations: 0; citations by others: 4)
E. Maris, Psychometrika, 1999, 64(2): 187-212
This paper presents a new class of models for persons-by-items data. The essential new feature of this class is the representation of the persons: every person is represented by its membership in multiple latent classes, each of which belongs to one latent classification. The models can be considered as a formalization of the hypothesis that the responses come about in a process that involves the application of a number of mental operations. Two algorithms for maximum likelihood (ML) and maximum a posteriori (MAP) estimation are described. They both make use of the tractability of the complete data likelihood to maximize the observed data likelihood. Properties of the MAP estimators (i.e., uniqueness and goodness-of-recovery) and the existence of asymptotic standard errors were examined in a simulation study. Then, one of these models is applied to the responses to a set of fraction addition problems. Finally, the models are compared to some related models in the literature. Thanks are due to Paul De Boeck for creating the intellectually stimulating atmosphere in which this class of models came about, Iven van Mechelen for the one-sided idea, Kikumi Tatsuoka for the use of her data, and Theodoor Bouw for running part of the simulation study.

12.
Normal assumptions have been used in many psychometric methods, to the extent that most researchers do not even question their adequacy. With the rapid advancement of computer technologies in recent years, psychometrics has extended its territory to include intensive cognitive diagnosis, etcetera, and substantive mathematical modeling has become essential. As a natural consequence, it is time to consider departure from normal assumptions seriously. As an example of models that are not based on normality or its approximation, the logistic positive exponent family of models is discussed. These models include the item task complexity as the third parameter, which determines the single principle of ordering individuals on the ability scale.

13.
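In the Economics Professional Qualification Examination, one of the largest qualification examinations in China, the item characteristic curve equating method under the graded response model of item response theory was applied with an anchor-test equating design, in order to ensure the comparability of examinations across years, to build an item bank, and to prepare for computerized adaptive testing. The item parameters and ability parameters from four years of examination data were equated, and an item bank for the economics profession was successfully assembled. On this basis, equating techniques were used to compare the cut scores of the test papers from different years, providing empirical evidence for setting the passing standard of the economics examination and for ensuring the fairness of the examination.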

14.
    
Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full‐information test statistics such as Pearson's $X^2$ and the likelihood ratio statistic $G^2$ suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited‐information fit statistics such as Maydeu‐Olivares and Joe's (2006, Psychometrika, 71, 713) $M_2$ have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu‐Olivares and Joe's (2006, Psychometrika, 71, 713) $M_2$ statistic to diagnostic classification models. Through a series of simulation studies, we found that $M_2$ is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q‐matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, $M_2$ was largely insensitive to misspecifications in the distribution of higher‐order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using $M_2$, we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic $X^2_{LD}$ for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. The $X^2_{LD}$ statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the $M_2$ and $X^2_{LD}$ statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al. (2011, IJT, 11, 144).

15.
Background: There is accumulating evidence that positive mental health and psychopathology should be seen as separate indicators of mental health. This study contributes to this evidence by investigating the bidirectional relation between positive mental health and psychopathological symptoms over time. Methods: Positive mental health (MHC-SF) and psychopathological symptoms (BSI) were longitudinally measured in a representative adult sample (N = 1932) on four measurement occasions in nine months. A cross-lagged panel design was applied and evaluated with a latent growth model combined with an item response theory measurement model. Results: Psychopathological symptoms were longitudinally related to positive mental health and vice versa, controlling for initial levels. The changes over time were even more important than the absolute levels of psychopathological symptoms and positive mental health, respectively. Conclusions: The results underline the need for a comprehensive perspective on mental health, incorporating both the treatment of symptoms and the enhancement of well-being.

16.
A number of models for categorical item response data have been proposed in recent years. The models appear to be quite different. However, they may usefully be organized as members of only three distinct classes, within which the models are distinguished only by assumptions and constraints on their parameters. “Difference models” are appropriate for ordered responses, “divide-by-total” models may be used for either ordered or nominal responses, and “left-side added” models are used for multiple-choice responses with guessing. The details of the taxonomy and the models are described in this paper. The present study was supported in part by two postdoctoral fellowships awarded to Lynne Steinberg: an Educational Testing Service Postdoctoral Fellowship at ETS, Princeton, NJ and an NIMH Individual National Research Service Award at Stanford University, Stanford, CA. Helpful comments by the editor and three anonymous reviewers are gratefully acknowledged.

17.
This paper uses log-linear models with latent variables (Hagenaars, in Loglinear Models with Latent Variables, 1993) to define a family of cognitive diagnosis models. In doing so, the relationship between many common models is explicitly defined and discussed. In addition, because the log-linear model with latent variables is a general model for cognitive diagnosis, new alternatives to modeling the functional relationship between attribute mastery and the probability of a correct response are discussed.

18.
Item response theory posits local independence, or conditional independence of item responses given item parameters and examinee proficiency parameters. The usual definition of local independence, however, addresses the context of fixed tests, and initially appears to yield incorrect response-pattern probabilities in the context of adaptive testing. The paradox is resolved by introducing additional notation to deal with the item selection mechanism. We are grateful to Charlie Lewis, Ming-Mei Wang, and Pao-Kuei Wu for discussions on this topic, and to the Editor, the reviewers, and Howard Wainer for helpful comments on an earlier version of the paper. The first author's work was supported in part by the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Educational Research and Development Program, cooperative agreement number R117G10027 and CFDA catalog number 84.117G, as administered by the Office of Educational Research and Improvement, U.S. Department of Education.

19.
An IRT model with a parameter-driven process for change is proposed. Quantitative differences between persons are taken into account by a continuous latent variable, as in common IRT models. In addition, qualitative interindividual differences and autodependencies are accounted for by assuming within-subject variability with respect to the parameters of the IRT model. In particular, the parameters of the IRT model are governed by an unobserved or “hidden” homogeneous Markov process. The model includes the mixture linear logistic test model (Mislevy & Verhelst, 1990), the mixture Rasch model (Rost, 1990), and the Saltus model (Wilson, 1989) as specific instances. The model is applied to a longitudinal experiment on discontinuity in conservation acquisition (van der Maas, 1993). Frank Rijmen was supported by the Fund for Scientific Research Flanders (FWO), the GOA/2000/02 granted by the Katholieke Universiteit Leuven to Paul De Boeck and Iven Van Mechelen, and the PDM/02/067 granted by the Katholieke Universiteit Leuven to Paul De Boeck.

20.