Found 20 similar documents (search time: 15 ms)
1.
Higher-order latent trait models for cognitive diagnosis (total citations: 9; self-citations: 0; citations by others: 9)
Higher-order latent traits are proposed for specifying the joint distribution of binary attributes in models for cognitive
diagnosis. This approach results in a parsimonious model for the joint distribution of a high-dimensional attribute vector
that is natural in many situations when specific cognitive information is sought but a less informative item response model
would be a reasonable alternative. This approach stems from viewing the attributes as the specific knowledge required for
examination performance, and modeling these attributes as arising from a broadly-defined latent trait resembling the θ of item response models. In this way a relatively simple model for the joint distribution of the attributes results, which
is based on a plausible model for the relationship between general aptitude and specific knowledge. Markov chain Monte Carlo
algorithms for parameter estimation are given for selected response distributions, and simulation results are presented to
examine the performance of the algorithm as well as the sensitivity of classification to model misspecification. An analysis
of fraction subtraction data is provided as an example.
This research was funded by National Institutes of Health grant R01 CA81068. We would like to thank William Stout and Sarah
Hartz for many useful discussions, three anonymous reviewers for helpful comments and suggestions, and Kikumi Tatsuoka and
Curtis Tatsuoka for generously sharing data.
2.
Model Evaluation and Multiple Strategies in Cognitive Diagnosis: An Analysis of Fraction Subtraction Data (total citations: 1; self-citations: 0; citations by others: 1)
This paper studies three models for cognitive diagnosis, each illustrated with an application to fraction subtraction data.
The objective of each of these models is to classify examinees according to their mastery of skills assumed to be required
for fraction subtraction. We consider the DINA model, the NIDA model, and a new model that extends the DINA model to allow
for multiple strategies of problem solving. For each of these models the joint distribution of the indicators of skill mastery
is modeled using a single continuous higher-order latent trait, to explain the dependence in the mastery of distinct skills.
This approach stems from viewing the skills as the specific states of knowledge required for exam performance, and viewing
these skills as arising from a broadly defined latent trait resembling the θ of item response models. We discuss several techniques for comparing models and assessing goodness of fit. We then implement
these methods using the fraction subtraction data with the aim of selecting the best of the three models for this application.
We employ Markov chain Monte Carlo algorithms to fit the models, and we present simulation results to examine the performance
of these algorithms.
The work reported here was performed under the auspices of the External Diagnostic Research Team funded by Educational Testing
Service. Views expressed in this paper do not necessarily represent the views of Educational Testing Service.
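As an illustration of the structure described in entries 1 and 2, the following minimal sketch pairs a higher-order latent trait with a DINA response rule. The abstracts do not give formulas, so the logistic link for skill mastery, the function names, and the parameter names (intercept, slope, slip, guess) are assumptions drawn from the usual textbook formulation and are not the authors' implementation.

```python
import math


def mastery_prob(theta, intercept, slope):
    """Assumed higher-order link: P(skill mastered | theta) is logistic in theta."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * theta)))


def dina_correct_prob(alpha, q_row, slip, guess):
    """DINA rule: an examinee who has mastered every skill the item requires
    answers correctly with probability 1 - slip; otherwise with probability guess.

    alpha : list of 0/1 skill-mastery indicators for one examinee
    q_row : list of 0/1 flags marking which skills the item requires
    """
    has_all_required = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if has_all_required else guess


# Hypothetical values chosen only for illustration.
alpha = [1, 0, 1]
print(mastery_prob(theta=0.0, intercept=0.0, slope=1.0))
print(dina_correct_prob(alpha, q_row=[1, 0, 1], slip=0.1, guess=0.2))
print(dina_correct_prob(alpha, q_row=[1, 1, 0], slip=0.1, guess=0.2))
```

Because every skill's mastery probability depends on the same theta, mastery of distinct skills becomes dependent, which is the parsimony argument the abstracts make.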
3.
Paul R. Rosenbaum 《Psychometrika》1987,52(2):217-233
Test items are often evaluated and compared by contrasting the shapes of their item characteristic curves (ICCs) or surfaces. The current paper develops and applies three general (i.e., nonparametric) comparisons of the shapes of two item characteristic surfaces: (i) proportional latent odds, (ii) uniform relative difficulty, and (iii) item sensitivity. Two items may be compared in these ways while making no assumption about the shapes of item characteristic surfaces for other items, and no assumption about the dimensionality of the latent variable. Also studied is a method for comparing the relative shapes of two item characteristic curves in two examinee populations.
The author is grateful to Paul Holland, Robert Mislevy, Tue Tjur, Rebecca Zwick, the editor and reviewers for valuable comments on the subject of this paper, to Mari A. Pearlman for advice on the pairing of items in the examples, and to Dorothy Thayer for assistance with computing.
4.
A general latent trait model for response processes (total citations: 1; self-citations: 0; citations by others: 1)
Susan Embretson 《Psychometrika》1984,49(2):175-186
The purpose of the current paper is to propose a general multicomponent latent trait model (GLTM) for response processes. The proposed model combines the linear logistic latent trait model (LLTM) with the multicomponent latent trait model (MLTM). As with both LLTM and MLTM, the general multicomponent latent trait model can be used to (1) test hypotheses about the theoretical variables that underlie response difficulty and (2) estimate parameters that describe test items by basic substantive properties. However, GLTM contains both component outcomes and complexity factors in a single model and may be applied to data that neither LLTM nor MLTM can handle. Joint maximum likelihood estimators are presented for the parameters of GLTM and an application to cognitive test items is described.
This research was partially supported by the National Institute of Education grant number NIE-6-7-0156 to Susan Embretson (Whitely), principal investigator. However, the opinions expressed herein do not necessarily reflect the position or policy of the National Institute of Education, and no official endorsement by the National Institute of Education should be inferred.
5.
In categorical data analysis, two-sample cross-validation is used not only for model selection but also to obtain a realistic
impression of the overall predictive effectiveness of the model. The latter is of particular importance in the case of highly
parametrized models capable of capturing every idiosyncrasy of the calibrating sample. We show that for maximum likelihood
estimators or other asymptotically efficient estimators Pearson's $X^2$ is not asymptotically chi-square in the two-sample cross-validation framework due to extra variability induced by using different
samples for estimation and goodness-of-fit testing. We propose an alternative test statistic, $X^2_{\text{xval}}$, obtained as a modification of $X^2$ which is asymptotically chi-square with $C-1$ degrees of freedom in cross-validation samples. Stochastically, $X^2_{\text{xval}} \le X^2$. Furthermore, the use of $X^2$ instead of $X^2_{\text{xval}}$ with a $\chi^2_{C-1}$ reference distribution may provide an unduly poor impression of fit of the model in the cross-validation sample.
This paper is dedicated to the memory of Michael V. Levine.
Requests for reprints should be sent to Albert Maydeu-Olivares, Faculty of Psychology, University of Barcelona, P. Valle de
Hebrón, 171, 0835 Barcelona, Spain.
6.
Yanyan Fu Tyler Strachan Edward H. Ip John T. Willse Shyh-Huei Chen Terry Ackerman 《International Journal of Testing》2020,20(2):169-186
This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and conditions. When a test measured weakly discriminated dimensions, it became harder to recover the latent correlation. Results also showed that increasing the sample size, test length, or using simpler models (i.e., two-parameter logistic rather than three-parameter logistic, compensatory rather than noncompensatory) could improve the recovery of latent correlation.
7.
John H. Wolfe 《Psychometrika》1981,46(4):461-464
In tailored testing, it is important to determine the optimal difficulty of the next item to present to the examinee. This paper shows that the difference that maximizes information for the three-parameter normal ogive response model is approximately 1.7 times the optimal difference θ − b for the three-parameter logistic model. Under the normal model, calculation of the optimal difficulty for minimizing the Bayes risk is equivalent to maximizing an associated information function.
The views expressed herein are those of the author and do not necessarily reflect those of the Department of the Navy.
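A worked sketch of the logistic side of this comparison, assuming the standard three-parameter logistic model with discrimination a, difficulty b, guessing parameter c, and scaling constant D (none of these symbols is defined in the abstract). Under that model the information-maximizing difference has the well-known closed form θ − b = ln((1 + √(1 + 8c)) / 2) / (D·a); the code checks it against a simple grid search. All function names are hypothetical.

```python
import math


def p_3pl(x, a, c, D=1.0):
    """3PL probability of a correct response at difference x = theta - b."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * x))


def info_3pl(x, a, c, D=1.0):
    """Item information of the 3PL model at difference x = theta - b."""
    p = p_3pl(x, a, c, D)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def optimal_difference(a, c, D=1.0):
    """Closed-form difference theta - b that maximizes 3PL information."""
    return math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / (D * a)


def grid_argmax(a, c, D=1.0, lo=-4.0, hi=4.0, step=0.001):
    """Brute-force check: the grid point with the largest information."""
    n = int(round((hi - lo) / step))
    xs = [lo + i * step for i in range(n + 1)]
    return max(xs, key=lambda x: info_3pl(x, a, c, D))


# Illustrative item: a = 1, c = 0.2 (values chosen arbitrarily).
print(optimal_difference(1.0, 0.2))
print(grid_argmax(1.0, 0.2))
```

With c = 0 the optimal difference is zero, so the item should match the examinee's ability exactly. Changing D from 1.7 to 1 multiplies the optimal difference by 1.7. Reading the abstract's factor of about 1.7 as this conventional logistic-to-normal scaling constant is an interpretation, not something the abstract states.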
8.
Fumiko Samejima 《Psychometrika》2000,65(3):319-335
The paper addresses and discusses whether the tradition of accepting point-symmetric item characteristic curves is justified by uncovering the inconsistent relationship between the difficulties of items and the order of maximum likelihood estimates of ability. This inconsistency is intrinsic in models that provide point-symmetric item characteristic curves, and in this paper focus is put on the normal ogive model for observation. It is also questioned if in the logistic model the sufficient statistic has forfeited the rationale that is appropriate to the psychological reality. It is observed that the logistic model can be interpreted as the case in which the inconsistency in ordering the maximum likelihood estimates is degenerated. The paper proposes a family of models, called the logistic positive exponent family, which provides asymmetric item characteristic curves. A model in this family has a consistent principle in ordering the maximum likelihood estimates of ability. The family is divided into two subsets each of which has its own principle, and includes the logistic model as a transition from one principle to the other. Rationale and some illustrative examples are given.
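To make the asymmetry concrete, here is a minimal sketch that assumes the commonly cited form of the logistic positive exponent model, in which the logistic curve is raised to a positive power ξ. The abstract does not give the formula, so the function names and the symbols a, b, ξ and D are assumptions for illustration only.

```python
import math


def lpe_icc(theta, a, b, xi, D=1.0):
    """Assumed logistic positive exponent ICC: the logistic curve raised to xi > 0."""
    logistic = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return logistic ** xi


def symmetry_gap(x, a, b, xi):
    """P(b + x) + P(b - x) - 1; zero when the curve is point-symmetric about theta = b."""
    return lpe_icc(b + x, a, b, xi) + lpe_icc(b - x, a, b, xi) - 1.0


# xi = 1 recovers the ordinary logistic model, which is point-symmetric about b.
print(symmetry_gap(0.5, a=1.0, b=0.0, xi=1.0))
# xi = 2 gives an asymmetric curve, so the gap is no longer zero.
print(symmetry_gap(0.5, a=1.0, b=0.0, xi=2.0))
```

In this sketch ξ = 1 sits between the two subsets ξ < 1 and ξ > 1, matching the abstract's description of the logistic model as the transition case.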
9.
Paul R. Rosenbaum 《Psychometrika》1989,54(4):625-633
Established results on latent variable models are applied to the study of the validity of a psychological test. When the test predicts a criterion by measuring a unidimensional latent construct, not only must the total score predict the criterion, but the joint distribution of criterion scores and item responses must exhibit a certain pattern. The presence of this population pattern may be tested with sample data using the stratified Wilcoxon rank sum test. Often, criterion information is available only for selected examinees, for instance, those who are admitted or hired. Three cases are discussed: (i) selection at random, (ii) selection based on the current test, and (iii) selection based on other measures of the latent construct. Discriminant validity is also discussed.
This work was supported in part by Grant SES-87-01890 from the Measurement Methods and Data Improvement Program of the U.S. National Science Foundation.
10.
11.
Estimating multiple classification latent class models (total citations: 4; self-citations: 0; citations by others: 4)
E. Maris 《Psychometrika》1999,64(2):187-212
This paper presents a new class of models for persons-by-items data. The essential new feature of this class is the representation of the persons: every person is represented by its membership to multiple latent classes, each of which belongs to one latent classification. The models can be considered as a formalization of the hypothesis that the responses come about in a process that involves the application of a number of mental operations. Two algorithms for maximum likelihood (ML) and maximum a posteriori (MAP) estimation are described. They both make use of the tractability of the complete data likelihood to maximize the observed data likelihood. Properties of the MAP estimators (i.e., uniqueness and goodness-of-recovery) and the existence of asymptotic standard errors were examined in a simulation study. Then, one of these models is applied to the responses to a set of fraction addition problems. Finally, the models are compared to some related models in the literature.
Thanks are due to Paul De Boeck for creating the intellectually stimulating atmosphere in which this class of models came about, Iven van Mechelen for the one-sided idea, Kikumi Tatsuoka for the use of her data, and Theodoor Bouw for running part of the simulation study.
12.
Fumiko Samejima 《Psychometrika》1997,62(4):471-493
Normal assumptions have been used in many psychometric methods, to the extent that most researchers do not even question their adequacy. With the rapid advancement of computer technologies in recent years, psychometrics has extended its territory to include intensive cognitive diagnosis, etcetera, and substantive mathematical modeling has become essential. As a natural consequence, it is time to consider departure from normal assumptions seriously. As examples of models which are not based on normality or its approximation, the logistic positive exponent family of models is discussed. These models include the item task complexity as the third parameter, which determines the single principle of ordering individuals on the ability scale.
13.
14.
Mark Hansen Li Cai Scott Monroe Zhen Li 《The British journal of mathematical and statistical psychology》2016,69(3):225-252
Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full‐information test statistics such as Pearson's X2 and the likelihood ratio statistic G2 suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited‐information fit statistics such as Maydeu‐Olivares and Joe's (2006, Psychometrika, 71, 713) M2 have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu‐Olivares and Joe's (2006, Psychometrika, 71, 713) M2 statistic to diagnostic classification models. Through a series of simulation studies, we found that M2 is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q‐matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M2 was largely insensitive to misspecifications in the distribution of higher‐order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M2, we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. 
The local dependence statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M2 and local dependence statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al. (2011, IJT, 11, 144).
15.
《The journal of positive psychology》2013,8(6):553-560
Background: There is accumulating evidence that positive mental health and psychopathology should be seen as separate indicators of mental health. This study contributes to this evidence by investigating the bidirectional relation between positive mental health and psychopathological symptoms over time. Methods: Positive mental health (MHC-SF) and psychopathological symptoms (BSI) were longitudinally measured in a representative adult sample (N = 1932) on four measurement occasions in nine months. A cross-lagged panel design was applied and evaluated with a latent growth model combined with an item response theory measurement model. Results: Psychopathological symptoms were longitudinally related to positive mental health and vice versa, controlling for initial levels. The changes over time were even more important than the absolute levels of psychopathological symptoms and positive mental health, respectively. Conclusions: The results underline the need for a comprehensive perspective on mental health, incorporating both the treatment of symptoms and the enhancement of well-being.
16.
A number of models for categorical item response data have been proposed in recent years. The models appear to be quite different.
However, they may usefully be organized as members of only three distinct classes, within which the models are distinguished
only by assumptions and constraints on their parameters. “Difference models” are appropriate for ordered responses, “divide-by-total”
models may be used for either ordered or nominal responses, and “left-side added” models are used for multiple-choice responses
with guessing. The details of the taxonomy and the models are described in this paper.
The present study was supported in part by two postdoctoral fellowships awarded to Lynne Steinberg: an Educational Testing
Service Postdoctoral Fellowship at ETS, Princeton, NJ and an NIMH Individual National Research Service Award at Stanford University,
Stanford, CA. Helpful comments by the editor and three anonymous reviewers are gratefully acknowledged.
17.
Defining a Family of Cognitive Diagnosis Models Using Log-Linear Models with Latent Variables (total citations: 2; self-citations: 0; citations by others: 2)
This paper uses log-linear models with latent variables (Hagenaars, in Loglinear Models with Latent Variables, 1993) to define a family of cognitive diagnosis models. In doing so, the relationship between many common models is explicitly
defined and discussed. In addition, because the log-linear model with latent variables is a general model for cognitive diagnosis,
new alternatives to modeling the functional relationship between attribute mastery and the probability of a correct response
are discussed.
18.
Item response theory posits local independence, or conditional independence of item responses given item parameters and examinee proficiency parameters. The usual definition of local independence, however, addresses the context of fixed tests, and initially appears to yield incorrect response-pattern probabilities in the context of adaptive testing. The paradox is resolved by introducing additional notation to deal with the item selection mechanism.
We are grateful to Charlie Lewis, Ming-Mei Wang, and Pao-Kuei Wu for discussions on this topic, and to the Editor, the reviewers, and Howard Wainer for helpful comments on an earlier version of the paper. The first author's work was supported in part by the National Center for Research on Evaluation, Standards, Student Testing (CRESST), Educational Research and Development Program, cooperative agreement number R117G10027 and CFDA catalog number 84.117G, as administered by the Office of Educational Research and Improvement, U.S. Department of Education.
19.
An IRT model with a parameter-driven process for change is proposed. Quantitative differences between persons are taken into
account by a continuous latent variable, as in common IRT models. In addition, qualitative interindividual differences and
autodependencies are accounted for by assuming within-subject variability with respect to the parameters of the IRT model.
In particular, the parameters of the IRT model are governed by an unobserved or “hidden” homogeneous Markov process. The
model includes the mixture linear logistic test model (Mislevy & Verhelst, 1990), the mixture Rasch model (Rost, 1990), and
the Saltus model (Wilson, 1989) as specific instances. The model is applied to a longitudinal experiment on discontinuity
in conservation acquisition (van der Maas, 1993).
Frank Rijmen was supported by the Fund for Scientific Research Flanders (FWO), the GOA/2000/02 granted by the Katholieke Universiteit
Leuven to Paul De Boeck and Iven Van Mechelen, and the PDM/02/067 granted by the Katholieke Universiteit Leuven to Paul De
Boeck.