Similar literature
20 similar documents found (search time: 93 ms)
1.
It is shown that the presently available statistical tests for the Rasch model are insensitive to violation of the unidimensionality axiom. Two new test statistics are presented. The first one, Q1, is sensitive to the same effects as the presently available statistics, but has some desirable properties of a nonstatistical nature. The second statistic, Q2, is sensitive to violation of local stochastic independence and unidimensionality and thus fills an existing gap.

2.
The LLRA (linear logistic model with relaxed assumptions; Fischer, 1974, 1977a, 1977b, 1983a) was developed, within the framework of generalized Rasch models, for assessing change in dichotomous item score matrices between two points in time; it allows one to quantify change on latent trait dimensions and to explain change in terms of treatment effects, treatment interactions, and a trend effect. A remarkable feature of the model is that unidimensionality of the item set is not required. The present paper extends this model to designs with any number of time points and even with different sets of items presented on different occasions, provided that one unidimensional subscale is available per latent trait. Thus unidimensionality assumptions within subscales are combined with multidimensionality of the item set. Conditional maximum likelihood methods for parameter estimation and hypothesis testing are developed, and a necessary and sufficient condition for unique identification of the model, given the data, is derived. Finally, a sample application is presented. To my friend Josef Roppert who has taught me how to apply statistical reasoning to substantive problems. This research was supported in part by Österreichische Forschungsgemeinschaft under grant No. 01/0054. The author wishes to thank B. Wild for the numerical computation of the sample application in section 5.

3.
The identifiability of item response models with nonparametrically specified item characteristic curves is considered. Strict identifiability is achieved, with a fixed latent trait distribution, when only a single set of item characteristic curves can possibly generate the manifest distribution of the item responses. When item characteristic curves belong to a very general class, this property cannot be achieved. However, for assessments with many items, it is shown that all models for the manifest distribution have item characteristic curves that are very near one another and pointwise differences between them converge to zero at all values of the latent trait as the number of items increases. An upper bound for the rate at which this convergence takes place is given. The main result provides theoretical support to the practice of nonparametric item response modeling, by showing that models for long assessments have the property of asymptotic identifiability. The research was partially supported by the National Institute of Health grant R01 CA81068-01.

4.
To assess the reliability of congeneric tests, specifically designed reliability measures have been proposed. This paper emphasizes that such measures rely on a unidimensionality hypothesis, which can neither be confirmed nor rejected when there are only three test parts, and will invariably be rejected when there are more than three test parts. Jackson and Agunwamba's (1977) greatest lower bound to reliability is proposed instead. Although this bound has a reputation for overestimating the population value when the sample size is small, this is no reason to prefer the unidimensionality-based reliability. Firstly, the sampling bias problem of the glb does not play a role when the number of test parts is small, as is often the case with congeneric measures. Secondly, glb and unidimensionality-based reliability are often equal when there are three test parts, and when there are more test parts, their numerical values are still very similar. To the extent that the bias problem of the greatest lower bound does play a role, unidimensionality-based reliability is equally affected. Although unidimensionality and reliability are often thought of as unrelated, this paper shows that, from at least two perspectives, they act as antagonistic concepts. A measure, based on the same framework that led to the greatest lower bound, is discussed for assessing how close a set of variables is to unidimensionality. It is the percentage of common variance that can be explained by a single factor. An empirical example is given to demonstrate the main points of the paper. The authors are obliged to Henk Kiers for commenting on a previous version. Gregor Sočan is now at the University of Ljubljana.

5.
A fundamental assumption of most IRT models is that items measure the same unidimensional latent construct. For the polytomous Rasch model two ways of testing this assumption against specific multidimensional alternatives are discussed. The first is a marginal approach assuming a multidimensional parametric latent variable distribution; the second is a conditional approach with no distributional assumptions about the latent variable. The second approach generalizes the Martin-Löf test for the dichotomous Rasch model in two ways: to polytomous items and to a test against an alternative that may have more than two dimensions. A study on occupational health is used to motivate and illustrate the methods. The authors would like to thank Niels Keiding, Klaus Larsen and the anonymous reviewers for valuable comments on a previous version of this paper. This research was supported by a grant from the Danish Research Academy and by a general research grant from Quality Metric, Inc.

6.
We illustrate a class of multidimensional item response theory models in which the items are allowed to have different discriminating power and the latent traits are represented through a vector having a discrete distribution. We also show how the hypothesis of unidimensionality may be tested against a specific bidimensional alternative by using a likelihood ratio statistic between two nested models in this class. For this aim, we also derive an asymptotically equivalent Wald test statistic which is faster to compute. Moreover, we propose a hierarchical clustering algorithm which can be used, when the dimensionality of the latent structure is completely unknown, for dividing items into groups referred to different latent traits. The approach is illustrated through a simulation study and an application to a dataset collected within the National Assessment of Educational Progress, 1996. The author would like to thank the Editor, an Associate Editor and three anonymous referees for stimulating comments. I also thank L. Scaccia, F. Pennoni and M. Lupparelli for having done part of the simulations.

7.
The Rasch model predicts that an individual's ability level is invariant over subtests of the total test, and thus, all subtests measure the same latent trait. A person test of this invariance hypothesis is discussed that is uniformly most powerful and standardized in the sense that the conditional distribution of the test statistic, given a particular level of ability, does not depend on the absolute value of the examinee's ability parameter. The test can be routinely performed by applying a computer program designed by and obtainable from the author. Finally, a suboptimal test is derived that is extremely easy to use, and an overall group test of the invariance hypothesis is discussed. None of the tests considered relies on asymptotic approximations; hence, they may be applied when the test is of only moderate length and the group of examinees is small.

8.
Although several goodness of fit tests have been developed for the Rasch model for dichotomous items, most of them are of a global, asymptotic, and confirmatory type. This paper, based on ideas from a recent thesis by Van den Wollenberg, offers some suggestions for local, small sample, and exploratory techniques: difficulty plots for person groups scoring right and wrong on a specific item, a slope test per item based on a binomial distribution per score group, and a unidimensionality check based on an extended hypergeometric distribution per score group. This paper owes much to the inspiring and pioneering work of Arnold Van den Wollenberg, of which only minor aspects are criticized. Thanks go to Charles Lewis for stimulating discussions and for solutions to some programming problems.

9.
A major research direction for ability measurement has been to identify the information processes that are involved in solving test items through mathematical modeling of item difficulty. However, this research has had limited impact on ability measurement, since person parameters are not included in the process models. The current paper presents some multicomponent latent trait models for reproducing test performance from both item and person parameters on processing components. Components are identified from item subtasks, in which performance is a logistic function (i.e., Rasch model) of person and item parameters, and then are combined according to a mathematical model of processing on the composite item. The author would like to thank David Thissen for his invaluable insights concerning this model and an anonymous reviewer for his suggestion about the sample space for the model. This research was partially supported by National Institute of Education grant number NIE-6-7-0156 to Susan E. Whitely, principal investigator. However, the opinions expressed herein do not necessarily reflect the position or policy of the National Institute of Education, and no official endorsement by the National Institute of Education should be inferred. Part of this paper was presented at the annual meeting of the Psychometric Society, Monterey, California: June, 1979.

10.
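Item response theory is a modern measurement theory for measuring examinees' latent traits, and latent class analysis is a model-based technique for classifying latent traits. Mixture item response theory combines item response theory with latent class analysis, making it possible to classify examinees and quantify their latent traits at the same time. After explaining the concepts and principles of mixture item response theory, this paper introduces several common mixture models, including the MRM, mNRM, and mPCM, together with their parameter estimation methods, and reviews the development of their applications in psychological testing from the perspectives of classifying psychological and behavioral characteristics, detecting differential item functioning, and evaluating test validity.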

11.
A general latent trait model for response processes (cited by: 1; self-citations: 0; citations by others: 1)
The purpose of the current paper is to propose a general multicomponent latent trait model (GLTM) for response processes. The proposed model combines the linear logistic latent trait model (LLTM) with the multicomponent latent trait model (MLTM). As with both LLTM and MLTM, the general multicomponent latent trait model can be used to (1) test hypotheses about the theoretical variables that underlie response difficulty and (2) estimate parameters that describe test items by basic substantive properties. However, GLTM contains both component outcomes and complexity factors in a single model and may be applied to data that neither LLTM nor MLTM can handle. Joint maximum likelihood estimators are presented for the parameters of GLTM and an application to cognitive test items is described. This research was partially supported by the National Institute of Education grant number NIE-6-7-0156 to Susan Embretson (Whitely), principal investigator. However, the opinions expressed herein do not necessarily reflect the position or policy of the National Institute of Education, and no official endorsement by the National Institute of Education should be inferred.

12.
The simultaneous and nonparametric estimation of latent abilities and item characteristic curves is considered. The asymptotic properties of ordinal ability estimation and kernel smoothed nonparametric item characteristic curve estimation are investigated under very general assumptions on the underlying item response theory model as both the test length and the sample size increase. A large deviation probability inequality is stated for ordinal ability estimation. The mean squared error of kernel smoothed item characteristic curve estimates is studied and a strong consistency result is obtained showing that the worst case error in the item characteristic curve estimates over all items and ability levels converges to zero with probability equal to one.
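As a rough illustration of the kernel smoothing idea mentioned in this abstract, the following Python sketch computes a Nadaraya-Watson estimate of one item characteristic curve. The function name `kernel_smoothed_icc`, the Gaussian kernel, and the default bandwidth are assumptions made for illustration and are not taken from the paper; the `abilities` argument stands in for whatever ability estimates are available.

```python
import math


def kernel_smoothed_icc(abilities, responses, grid, bandwidth=0.3):
    """Nadaraya-Watson estimate of P(correct | ability) on a grid.

    abilities: one ability estimate per examinee (hypothetical input).
    responses: 0/1 scores of the same examinees on a single item.
    grid:      ability values at which the curve is evaluated.
    """
    def gaussian(u):
        # Unnormalized Gaussian kernel; the constant cancels in the ratio.
        return math.exp(-0.5 * u * u)

    curve = []
    for point in grid:
        weights = [gaussian((a - point) / bandwidth) for a in abilities]
        weighted_hits = sum(w * y for w, y in zip(weights, responses))
        # Weighted proportion of correct responses near this grid point.
        curve.append(weighted_hits / sum(weights))
    return curve
```

Each curve value is a weighted proportion of correct responses, so it always lies between 0 and 1; a smaller bandwidth follows the data more closely at the cost of a noisier curve.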

13.
Extensions of the partial credit model (cited by: 1; self-citations: 0; citations by others: 1)
The partial credit model, developed by Masters (1982), is a unidimensional latent trait model for responses scored in two or more ordered categories. In the present paper some extensions of the model are presented. First, a marginal maximum likelihood estimation procedure is developed which allows for incomplete data and linear restrictions on both the item and the population parameters. Secondly, two statistical tests for evaluating model fit are presented: the former test has power against violation of the assumption about the ability distribution, the latter test offers the possibility of identifying specific items that do not fit the model. The authors are indebted to Professor Wim van der Linden and Huub Verstralen for their helpful comments.
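For readers unfamiliar with the partial credit model itself, the category probabilities can be sketched in a few lines of Python. This shows only the basic model in its usual step-parameter form, not the extensions proposed in the paper; the function name `pcm_probabilities` and the argument names are hypothetical.

```python
import math


def pcm_probabilities(theta, step_difficulties):
    """Category probabilities under the partial credit model.

    theta:             ability of one person.
    step_difficulties: step parameters delta_1..delta_m of one item,
                       giving m + 1 ordered score categories 0..m.
    """
    # Category x has log-numerator sum_{k <= x} (theta - delta_k);
    # the empty sum for category 0 is 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))

    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(cumulative)
    numerators = [math.exp(c - largest) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]
```

With a single step parameter the function reduces to the dichotomous Rasch model, so `pcm_probabilities(0.0, [0.0])` gives equal probability to both categories.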

14.
Using the responses of 197 suicidal and nonsuicidal patients of a crisis and short-term intervention unit, the item factor structure of the SCL-90 was examined. Results suggested a reduced dimensionality for the checklist rather than supporting the nine scales associated with the instrument's scoring key. The influence of various response styles on the SCL-90's dimensionality was also explored. Although a social desirability response style was strongly related to all SCL-90 subscales and factors, it did not offer a complete explanation for the checklist's apparent unidimensionality. Other explanations relating to methods of test construction and to patients' inability to differentiate symptoms were also preferred. Finally, although subscale scores may facilitate clinical interpretation, it was suggested that the SCL-90 might best be scored as a single index of general symptomatology. We wish to thank G. C. Fekken and R. N. MacLennan for their assistance with this project. This work was supported in part by Social Sciences and Humanities Research Council of Canada Grant 410-85-1043 and by an Ontario Ministry of Health Research Personnel Award.

15.
A method of estimating item characteristic functions is proposed, in which a set of test items, whose operating characteristics are known and which give a constant test information function for a substantially wide range of ability, are used. The method is based on the maximum likelihood estimates of ability for a group of several hundred examinees. Throughout the present study the Monte Carlo method is used.

16.
J. O. Ramsay, Psychometrika, 1995, 60(3): 323-339
The probability that an examinee chooses a particular option within an item is estimated by averaging over the responses to that item of examinees with similar response patterns for the whole test. The approach does not presume any latent variable structure or any dimensionality. But simulated and actual data analyses are presented to show that when the responses are determined by a latent ability variable, this similarity-based smoothing procedure can reveal the dimensionality of ability very satisfactorily. The author wishes to acknowledge the support of the Natural Sciences and Engineering Research Council of Canada through grant A320, and to thank Educational Testing Service for making the data on the Advanced Placement Chemistry Exam available.

17.
The aim of latent variable selection in multidimensional item response theory (MIRT) models is to identify latent traits probed by test items of a multidimensional test. In this paper the expectation model selection (EMS) algorithm proposed by Jiang et al. (2015) is applied to minimize the Bayesian information criterion (BIC) for latent variable selection in MIRT models with a known number of latent traits. Under mild assumptions, we prove the numerical convergence of the EMS algorithm for model selection by minimizing the BIC of observed data in the presence of missing data. For the identification of MIRT models, we assume that the variances of all latent traits are unity and each latent trait has an item that is only related to it. Under this identifiability assumption, the convergence of the EMS algorithm for latent variable selection in the multidimensional two-parameter logistic (M2PL) models can be verified. We give an efficient implementation of the EMS for the M2PL models. Simulation studies show that the EMS outperforms the EM-based L1 regularization in terms of correctly selected latent variables and computation time. The EMS algorithm is applied to a real data set related to the Eysenck Personality Questionnaire.
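The criterion being minimized in this abstract is the standard BIC, which can be written down directly. The Python sketch below shows only the criterion, not the EMS algorithm; the function name `bic` and its argument names are hypothetical.

```python
import math


def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian information criterion: -2 log L + k log n (lower is better)."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)
```

At the same log-likelihood, a model with more free parameters receives a larger (worse) BIC, which is what makes the criterion usable for deciding which latent traits each item should load on.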

18.
19.
A definition of essential independence is proposed for sequences of polytomous items. For items satisfying the reasonable assumption that the expected amount of credit awarded increases with examinee ability, we develop a theory of essential unidimensionality which closely parallels that of Stout. Essentially unidimensional item sequences can be shown to have a unique (up to change-of-scale) dominant underlying trait, which can be consistently estimated by a monotone transformation of the sum of the item scores. In more general polytomous-response latent trait models (with or without ordered responses), an M-estimator based upon maximum likelihood may be shown to be consistent for the latent trait under essentially unidimensional violations of local independence and a variety of monotonicity/identifiability conditions. A rigorous proof of this fact is given, and the standard error of the estimator is explored. These results suggest that ability estimation methods that rely on the summation form of the log likelihood under local independence should generally be robust under essential independence, but standard errors may vary greatly from what is usually expected, depending on the degree of departure from local independence. An index of departure from local independence is also proposed. This work was supported in part by Office of Naval Research Grant N00014-87-K-0277 and National Science Foundation Grant NSF-DMS-88-02556. The author is grateful to William F. Stout for many helpful comments, and to an anonymous reviewer for raising the questions addressed in section 2. A preliminary version of section 6 appeared in the author's Ph.D. thesis.

20.
Statistical methods are presented to facilitate a more complete analysis of results obtained when a scaling model is applied to data from two or more groups. These methods can be used to (a) compare the corresponding estimated latent distributions obtained using the scaling model applied to the different groups, (b) compare the corresponding estimated item reliabilities (or item response error rates) for the different groups, and (c) test whether the scaling model applied to the several groups can be replaced by a more parsimonious scaling model that includes various homogeneity constraints (i.e., constraints that describe which parameters in the model are the same for the several groups). Various kinds of scaling models are considered here in the multiple-group context. Support for this research was provided in part by the National Science Foundation, to Clogg by Grant No. SES-7823759 and to Goodman by Grant No. SES-8303838. Clogg and Goodman were Fellows at the Center for Advanced Study in the Behavioral Sciences when part of the research was done, with financial support provided in part by National Science Foundation grant BNS-8011494 to the Center. The authors are indebted to Mark P. Becker and James W. Shockey for helpful comments.
