Similar Documents
20 similar documents found.
1.
In contrast to multi-parameter, multidimensional IRT models, which improve model fit and explanatory power by adding parameters, the Rasch tradition emphasizes "theory-driven research" and "data fitting the model": it holds that a one-parameter, unidimensional measurement model minimizes the influence of extraneous factors on the true purpose of measurement, thereby safeguarding the objectivity and accuracy of measurement. The Rasch model focuses on the correspondence between the target of measurement and the measurement instrument; its "simplicity" helps researchers evaluate and interpret the fit between the measured target and the instrument more accurately, and it has a natural advantage in transforming nonlinear raw data into interval-scaled data.
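A minimal way to make the last point concrete (a standard Rasch-model identity, not part of the original abstract): the log-odds of a correct response is a linear, interval-scaled function of the difference between person ability and item difficulty, which is the sense in which nonlinear raw proportions are mapped onto an equal-interval logit scale:

$$P(X_{vi}=1 \mid \theta_v, b_i) = \frac{\exp(\theta_v - b_i)}{1 + \exp(\theta_v - b_i)}, \qquad \ln\frac{P(X_{vi}=1)}{1 - P(X_{vi}=1)} = \theta_v - b_i,$$

where $\theta_v$ is the ability of person $v$ and $b_i$ the difficulty of item $i$, both expressed in logits on a common scale.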

2.
We introduce a general response model that allows for several simple restrictions, resulting in other models such as the extended Rasch model. For the extended Rasch model, a dynamic Bayesian estimation procedure is provided, which is able to deal with data sets that change over time, and possibly include many missing values. To ensure comparability over time, a data augmentation method is used, which provides an augmented person-by-item data matrix and reproduces the sufficient statistics of the complete data matrix. Hence, longitudinal comparisons can be easily made based on simple summaries, such as proportion correct, sum score, etc. As an illustration of the method, an example is provided using data from a computer-adaptive practice mathematical environment.
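A standard Rasch-model fact (added here for context, not taken from the abstract) explains why such simple summaries carry all the relevant information: in the dichotomous Rasch model the raw marginal totals are the sufficient statistics, so an augmented matrix that reproduces them preserves everything the model uses about persons and items:

$$r_v = \sum_i x_{vi} \ \text{ is sufficient for } \theta_v, \qquad s_i = \sum_v x_{vi} \ \text{ is sufficient for } b_i .$$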

3.
Kohlberg's characterization of moral development as displaying an invariant hierarchical order of structurally consistent stages is losing ground. However, by applying Rasch analysis, Dawson recently gave new interpretation and support to his characterization of stage development. Using Rasch models, we replicated and strengthened her findings in a re-analysis of three sets of longitudinal socio-moral reasoning data collected in Iceland. A new application of Rasch analysis provided support for upward development. Our results supported Kohlberg's characterization of stage development and the cross-cultural stability of Dawson's findings that were exclusively based on US samples. We conclude that proposals to replace Kohlberg's characterization of moral development are premature.

4.
Loglinear unidimensional and multidimensional Rasch models are considered for the analysis of repeated observations of polytomous indicators with ordered response categories. Reparameterizations and parameter restrictions are provided which facilitate specification of a variety of hypotheses about latent processes of change. Models of purely quantitative change in latent traits are proposed as well as models including structural change. A conditional likelihood ratio test is presented for the comparison of unidimensional and multiple scales Rasch models. In the context of longitudinal research, this renders possible the statistical test of homogeneity of change against subject-specific change in latent traits. Applications to two empirical data sets illustrate the use of the models. The author is greatly indebted to Ulf Böckenholt, Rolf Langeheine, and several anonymous reviewers for many helpful suggestions.

5.
We present a method for studying experimental data based on a psychometric model, the “Rasch model” (Rasch, 1966; Thissen & Steinberg, 1986). We illustrate the method with the use of a data set in the field of concept research. More specifically, we investigate whether a conjunctive concept can be seen as an additive combination of its constituents. High correlations between model and data are obtained, but a formal goodness-of-fit test indicates that the model does not completely account for the data. We then alter the Rasch model in such a way as to capture our idea of why the model deviates from the data. This results in higher correlations and a strong increase in goodness-of-fit. It is concluded that our ideas, as incorporated in the model, adequately summarize the data. More generally, this research illustrates that applying the Rasch model and altering it according to one’s hypotheses is an excellent way to analyze experimental data.

6.
Passage-based reading tests are a typical form of testlet-based test, so differential item functioning (DIF) analyses of such tests require DIF methods that match this structure. DIF methods built on testlet response models are the ones that genuinely handle testlet effects: they provide a DIF measure for every item within a testlet and therefore hold a theoretical advantage among testlet DIF methods, the main approach in use being the Rasch testlet DIF method. This study applied the Rasch testlet DIF method to the DIF analysis of a reading achievement test. The results showed items with significant DIF on some subdimensions of both the content and the ability dimensions, and suggestions for further revision and construction of the test were offered from the perspective of test fairness. The study further compared the Rasch testlet DIF method with a DIF method based on the traditional Rasch model and with an ad hoc testlet DIF method; the comparison demonstrated the necessity and advantages of testlet-level DIF analysis. The results indicate that, for passage-based reading tests, testlet DIF methods that genuinely handle testlet effects have a stronger theoretical basis and greater value for test construction and quality improvement.
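For readers unfamiliar with testlet models, one common formalization of a Rasch testlet model (offered here for orientation; the abstract does not spell out the exact parameterization used in the study) adds a person-specific testlet effect to the standard Rasch term, and uniform DIF for an item can then be expressed as a group-dependent shift in its difficulty:

$$P(X_{vi}=1) = \frac{\exp\!\left(\theta_v - b_i + \gamma_{v,d(i)}\right)}{1 + \exp\!\left(\theta_v - b_i + \gamma_{v,d(i)}\right)},$$

where $\gamma_{v,d(i)}$ is person $v$'s random effect for the testlet (passage) $d(i)$ containing item $i$; uniform DIF replaces $b_i$ with $b_i + g_v\,\delta_i$, where $g_v$ codes group membership and $\delta_i$ is the DIF effect of item $i$.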

7.
Research has demonstrated that individual differences in numeracy may have important consequences for decision making. In the present paper, we develop a shorter, psychometrically improved measure of numeracy—the ability to understand, manipulate, and use numerical information, including probabilities. Across two large independent samples that varied widely in age and educational level, participants completed 18 items from existing numeracy measures. In Study 1, we conducted a Rasch analysis on the item pool and created an eight‐item numeracy scale that assesses a broader range of difficulty than previous scales. In Study 2, we replicated this eight‐item scale in a separate Rasch analysis using data from an independent sample. We also found that the new Rasch‐based numeracy scale, compared with previous measures, could predict decision‐making preferences obtained in past studies, supporting its predictive validity. In Study 3, we further established the predictive validity of the Rasch‐based numeracy scale. Specifically, we examined the associations between numeracy and risk judgments, compared with previous scales. Overall, we found that the Rasch‐based scale was a better linear predictor of risk judgments than prior measures. Moreover, this study is the first to present the psychometric properties of several popular numeracy measures across a diverse sample of ages and educational levels. We discuss the usefulness and the advantages of the new scale, which we feel can be used in a wide range of subject populations, allowing for a clearer understanding of how numeracy is associated with decision processes. Copyright © 2012 John Wiley & Sons, Ltd.
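The abstract does not say which software was used for the Rasch analyses; the sketch below is only a minimal, self-contained illustration (all function names and the simulated data are hypothetical) of fitting a dichotomous Rasch model to a person-by-item 0/1 matrix by joint maximum likelihood. JML is known to be biased for short tests, so a real analysis would typically use conditional or marginal ML, but the sketch shows the basic structure of the estimation problem.

```python
import numpy as np
from scipy.optimize import minimize

def rasch_joint_nll(params, X):
    """Joint negative log-likelihood of the dichotomous Rasch model.
    params = [theta_1..theta_P, b_1..b_I]; X is a P x I matrix of 0/1 responses."""
    P, I = X.shape
    theta, b = params[:P], params[P:]
    logits = theta[:, None] - b[None, :]                  # theta_v - b_i
    # log P(x_vi) = x_vi * logit - log(1 + exp(logit)), summed over all cells
    return -np.sum(X * logits - np.logaddexp(0.0, logits))

def fit_rasch_jml(X):
    """Minimal JML sketch: estimate abilities and difficulties jointly, then
    centre item difficulties at zero to fix the origin of the logit scale."""
    P, I = X.shape
    res = minimize(rasch_joint_nll, np.zeros(P + I), args=(X,), method="L-BFGS-B")
    theta, b = res.x[:P], res.x[P:]
    c = b.mean()
    return theta - c, b - c                               # same shift keeps theta - b intact

# Hypothetical usage with simulated data (8 items, echoing an 8-item scale).
rng = np.random.default_rng(0)
true_theta = rng.normal(size=300)
true_b = np.linspace(-2.0, 2.0, 8)
prob = 1.0 / (1.0 + np.exp(-(true_theta[:, None] - true_b[None, :])))
X = (rng.uniform(size=prob.shape) < prob).astype(float)
# Perfect (all-0 or all-1) score patterns have no finite ML estimate, so drop them.
keep = (X.sum(axis=1) > 0) & (X.sum(axis=1) < X.shape[1])
theta_hat, b_hat = fit_rasch_jml(X[keep])
print(np.round(b_hat, 2))
```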

8.
Four-dimensional and fifteen-dimensional Rasch models were used to analyze data from a science test with a within-item multidimensional structure, and the reliability of the dimension scores was estimated under both dimensional structures. The results show that, compared with the corresponding unidimensional model, both the four-dimensional and the fifteen-dimensional Rasch models greatly improve the reliability of score estimates on each content dimension. A comparison of the fit of the two models indicates that, for a test of fixed total length, increasing the number of dimensions can compensate for the loss of reliability caused by shorter subdimensions, but this effect depends on high correlations among the dimensions.

9.
Jansen and Roskam (1986) discussed the compatibility of the unidimensional polytomous Rasch model with dichotomization of the response continuum. They derived a rather strict condition under which dichotomization of multicategory data that fit the unidimensional polytomous Rasch model results in dichotomous data which fit the dichotomous Rasch model with effectively the same subject parameter. In this paper a more general dichotomization condition is derived for the polytomous Rasch model, which appears less restrictive, but upholds that the intrinsic logic of the unidimensional polytomous Rasch model defies dichotomization in general. The robustness of dichotomous analysis is investigated in a simulation study. It shows a close relation with the two-parameter (Birnbaum) model. Theoretical and methodological implications are discussed. The authors are indebted to H. Müller (personal communication, August 1986), for giving an example which pointed toward the core equation in this paper. The authors also acknowledge the critical comments of Th. Bezambinder and P. Wakker, and of Psychometrika's reviewers on an earlier version of this paper.
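For orientation (the paper's own parameterization may differ in detail), the unidimensional polytomous Rasch model for an item with categories $h = 0, \dots, m_i$ is often written with integer scoring as

$$P(X_{vi} = h) = \frac{\exp\!\left(h\,\theta_v + \beta_{ih}\right)}{\sum_{l=0}^{m_i} \exp\!\left(l\,\theta_v + \beta_{il}\right)},$$

and dichotomizing at a cut point $c$ means analysing the recoded response $Y_{vi} = \mathbf{1}\{X_{vi} \ge c\}$ with the dichotomous Rasch model; the question studied here is when that recoded data set still fits the dichotomous model with essentially the same $\theta_v$.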

10.
In this paper we derive optimal designs for the Rasch Poisson counts model and its extended version of the (generalized) negative binomial counts model incorporating several binary predictors for the difficulty parameter. To efficiently estimate the regression coefficients of the predictors, locally D-optimal designs are developed. After an introduction to the Rasch Poisson counts model and its extension, we will specify these models as particular generalized linear models. Based on this embedding, optimal designs for both models including several binary explanatory variables will be presented. Therefore, we will derive conditions on the effect sizes for certain designs to be locally D-optimal. Finally, it is pointed out that the results derived for the Rasch Poisson models can be applied for more general Poisson regression models which should receive more attention in future psychological research.
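The paper derives analytic optimality conditions; as a purely illustrative companion (the function names, candidate designs, and the assumed effect-size vector below are hypothetical), the snippet shows how the local D-criterion of a Poisson generalized linear model with log link — the embedding used for count models of this kind — can be evaluated numerically for candidate designs over binary predictors.

```python
import itertools
import numpy as np

def local_info_matrix(points, weights, beta):
    """Fisher information of a Poisson GLM with log link, lambda = exp(x' beta):
    M(xi; beta) = sum_j w_j * exp(x_j' beta) * x_j x_j', evaluated at an assumed beta."""
    M = np.zeros((len(beta), len(beta)))
    for x, w in zip(points, weights):
        x = np.asarray(x, dtype=float)
        M += w * np.exp(x @ beta) * np.outer(x, x)
    return M

# Intercept plus two binary difficulty predictors; beta is a guessed effect-size
# vector ("locally" optimal designs depend on such a guess).
beta = np.array([1.0, -0.5, -0.8])
points = [np.array([1.0, a, b]) for a, b in itertools.product([0, 1], repeat=2)]

uniform = local_info_matrix(points, [0.25] * 4, beta)        # equal weight on all 4 cells
three_pt = local_info_matrix(points[:3], [1 / 3] * 3, beta)  # drop the (1, 1) cell
print("log det, uniform design :", np.linalg.slogdet(uniform)[1])
print("log det, 3-point design :", np.linalg.slogdet(three_pt)[1])
# The design with the larger log-determinant is preferred under the D-criterion.
```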

11.
晏子. 《心理科学进展》 (Advances in Psychological Science), 2010, 18(8): 1298-1305
The Rasch model is a latent trait model that has received wide attention and in-depth study in the international academic community. It provides a highly practicable solution to the problem of objectivity in measurement within the psychological sciences, yet theoretical discussion and applied research on the Rasch model remain scarce in China. Unlike item response theory in general, the Rasch model requires that the collected data conform to the a priori requirements of the model, rather than adding parameters to accommodate the characteristics of the data. The key features of the Rasch model (a common scale for persons and items, linear (interval-level) data, and parameter separation) ensure that objective measurement can be achieved. Future research directions for the Rasch model include multidimensional Rasch models, test equating and linking, computerized adaptive testing, and large-scale applied measurement systems (such as the Lexile system).
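The "parameter separation" feature mentioned above can be stated concretely (a standard Rasch result, added here for context): conditional on a person's raw score, the probability of the response pattern no longer involves the person parameter, which is what allows item comparisons to be made independently of the particular sample of persons (specific objectivity):

$$P\!\left(x_{v1}, \dots, x_{vk} \,\middle|\, r_v = \sum_i x_{vi}\right) = \frac{\exp\!\left(-\sum_i x_{vi}\, b_i\right)}{\gamma_{r_v}(b_1, \dots, b_k)},$$

where $\gamma_r$ is the elementary symmetric function of order $r$ of the terms $\exp(-b_i)$; note that $\theta_v$ has cancelled out.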

12.
Methods for the identification of differential item functioning (DIF) in Rasch models are typically restricted to the case of two subgroups. A boosting algorithm is proposed that is able to handle the more general setting where DIF can be induced by several covariates at the same time. The covariates can be both continuous and (multi‐)categorical, and interactions between covariates can also be considered. The method works for a general parametric model for DIF in Rasch models. Since the boosting algorithm selects variables automatically, it is able to detect the items which induce DIF. It is demonstrated that boosting competes well with traditional methods in the case of subgroups. The method is illustrated by an extensive simulation study and an application to real data.

13.
Most natural domains can be represented in multiple ways: we can categorize foods in terms of their nutritional content or social role, animals in terms of their taxonomic groupings or their ecological niches, and musical instruments in terms of their taxonomic categories or social uses. Previous approaches to modeling human categorization have largely ignored the problem of cross-categorization, focusing on learning just a single system of categories that explains all of the features. Cross-categorization presents a difficult problem: how can we infer categories without first knowing which features the categories are meant to explain? We present a novel model that suggests that human cross-categorization is a result of joint inference about multiple systems of categories and the features that they explain. We also formalize two commonly proposed alternative explanations for cross-categorization behavior: a features-first and an objects-first approach. The features-first approach suggests that cross-categorization is a consequence of attentional processes, where features are selected by an attentional mechanism first and categories are derived second. The objects-first approach suggests that cross-categorization is a consequence of repeated, sequential attempts to explain features, where categories are derived first, then features that are poorly explained are recategorized. We present two sets of simulations and experiments testing the models’ predictions about human categorization. We find that an approach based on joint inference provides the best fit to human categorization behavior, and we suggest that a full account of human category learning will need to incorporate something akin to these capabilities.

14.
A new algorithm for obtaining exact person fit indexes for the Rasch model is introduced which realizes most powerful tests for a very general family of alternative hypotheses, including tests concerning DIF as well as model-deviating item correlations. The method is also used as a goodness-of-fit test for whole data sets where the item parameters are assumed to be known. For tests with 30 items at most, exact values are obtained; for longer tests a Monte Carlo algorithm is proposed. Simulated examples and an empirical investigation demonstrate test power and applicability to item elimination. The author wishes to thank Elisabeth Ponocny-Seliger and the reviewers for many helpful comments. All exact goodness-of-fit tests proposed in this article are implemented in the menu-driven program T-Rasch 1.0 by Ponocny and Ponocny-Seliger (1999) which can be obtained from ProGAMMA (WWW: http://www.gamma.rug.nl) and also performs nonparametric tests.
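As a rough, hypothetical illustration of what an exact conditional test of this kind involves (the statistic family and the implementation in the paper and in T-Rasch are more general): with known item difficulties, the conditional distribution of a response pattern given its raw score depends only on the item parameters, so the null distribution of a person-fit statistic can be enumerated exactly for short tests and sampled by Monte Carlo for longer ones.

```python
from itertools import combinations
import numpy as np

def exact_person_fit_p(x, b, stat=None):
    """Exact conditional person-fit p-value under the Rasch model with known item
    difficulties b.  Given raw score r, P(pattern y | r) is proportional to
    exp(-sum_i y_i * b_i), so the null distribution of any fit statistic can be
    enumerated over all patterns with the same raw score (feasible for short tests;
    a Monte Carlo version would sample patterns instead)."""
    x, b = np.asarray(x, dtype=float), np.asarray(b, dtype=float)
    k, r = len(x), int(x.sum())
    if stat is None:
        # Example statistic: total difficulty of the items solved; large values mean
        # the person solved unexpectedly hard items (one of many possible choices).
        stat = lambda y: float(y @ b)
    weights, stats = [], []
    for idx in combinations(range(k), r):          # every 0/1 pattern with raw score r
        y = np.zeros(k)
        y[list(idx)] = 1.0
        weights.append(np.exp(-(y @ b)))
        stats.append(stat(y))
    weights = np.asarray(weights) / np.sum(weights)
    stats = np.asarray(stats)
    # One-sided exact p-value: conditional probability of a statistic at least as
    # extreme as the observed one.
    return float(weights[stats >= stat(x)].sum())

# Hypothetical usage: 10 items of known difficulty, a person who solved the 4 hardest.
b = np.linspace(-1.5, 1.5, 10)
x = (b >= np.sort(b)[-4]).astype(float)
print(exact_person_fit_p(x, b))
```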

15.
The paper addresses three neglected questions from IRT. In section 1, the properties of the “measurement” of ability or trait parameters and item difficulty parameters in the Rasch model are discussed. It is shown that the solution to this problem is rather complex and depends both on general assumptions about properties of the item response functions and on assumptions about the available item universe. Section 2 deals with the measurement of individual change or “modifiability” based on a Rasch test. A conditional likelihood approach is presented that (a) yields an ML estimator of modifiability for given item parameters, (b) allows hypotheses about change to be tested by means of a Clopper-Pearson confidence interval for the modifiability parameter, and (c) permits modifiability to be estimated jointly with the item parameters. Uniqueness results for all three methods are also presented. In section 3, the Mantel-Haenszel method for detecting DIF is discussed under a novel perspective: What is the most general framework within which the Mantel-Haenszel method correctly detects DIF of a studied item? The answer is that this is a 2PL model where, however, all discrimination parameters are known and the studied item has the same discrimination in both populations. Since these requirements would hardly be satisfied in practical applications, the case of constant discrimination parameters, that is, the Rasch model, is the only realistic framework. A simple Pearson χ² test for DIF of one studied item is proposed as an alternative to the Mantel-Haenszel test; moreover, this test is generalized to the case of two items simultaneously studied for DIF.
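For reference (these are the standard Mantel-Haenszel definitions, not quoted from the paper itself): at each total-score level $k$, the 2×2 table cross-classifies group (reference/focal) by response (correct/incorrect), with counts $A_k$ (reference, correct), $B_k$ (reference, incorrect), $C_k$ (focal, correct), $D_k$ (focal, incorrect) and total $T_k$. The MH common odds ratio and chi-square statistic are then

$$\hat{\alpha}_{MH} = \frac{\sum_k A_k D_k / T_k}{\sum_k B_k C_k / T_k}, \qquad \chi^2_{MH} = \frac{\bigl(\bigl|\sum_k A_k - \sum_k E(A_k)\bigr| - \tfrac12\bigr)^2}{\sum_k \operatorname{Var}(A_k)},$$

with $E(A_k) = n_{Rk} m_{1k} / T_k$ and $\operatorname{Var}(A_k) = n_{Rk}\, n_{Fk}\, m_{1k}\, m_{0k} / \bigl(T_k^2 (T_k - 1)\bigr)$, where $n_{Rk}, n_{Fk}$ are the reference and focal group sizes and $m_{1k}, m_{0k}$ the numbers of correct and incorrect responses at score level $k$.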

16.
In between-item multidimensional item response models, it is often desirable to compare individual latent trait estimates across dimensions. These comparisons are only justified if the model dimensions are scaled relative to each other. Traditionally, this scaling is done using approaches such as standardization—fixing the latent mean and standard deviation to 0 and 1 for all dimensions. However, approaches such as standardization do not guarantee that Rasch model properties hold across dimensions. Specifically, for between-item multidimensional Rasch family models, the unique ordering of items holds within dimensions, but not across dimensions. Previously, Feuerstahler and Wilson described the concept of scale alignment, which aims to enforce the unique ordering of items across dimensions by linearly transforming item parameters within dimensions. In this article, we extend the concept of scale alignment to the between-item multidimensional partial credit model and to models fit using incomplete data. We illustrate this method in the context of the Kindergarten Individual Development Survey (KIDS), a multidimensional survey of kindergarten readiness used in the state of Illinois. We also present simulation results that demonstrate the effectiveness of scale alignment in the context of polytomous item response models and missing data.

17.
Two generalizations of the Rasch model are compared: the between-item multidimensional model (Adams, Wilson, and Wang, 1997), and the mixture Rasch model (Mislevy & Verhelst, 1990; Rost, 1990). It is shown that the between-item multidimensional model is formally equivalent with a continuous mixture of Rasch models for which, within each class of the mixture, the item parameters are equal to the item parameters of the multidimensional model up to a shift parameter that is specific for the dimension an item belongs to in the multidimensional model. In a simulation study, the relation between both types of models also holds when the number of classes of the mixture is as small as two. The relation is illustrated with a study on verbal aggression. Frank Rijmen was supported by the Fund for Scientific Research Flanders (FWO). This research is also funded by the GOA/2000/02 granted from the KU Leuven. We would like to thank Kristof Vansteelandt for providing the data of the study on verbal aggression.
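In symbols, the equivalence described above says roughly (notation added here, not taken verbatim from the paper) that within each latent class $c$ of the mixture the item parameters take the form

$$\beta_i^{(c)} = \beta_i + \delta_{d(i)}^{(c)},$$

i.e. the multidimensional model's item parameter $\beta_i$ plus a shift $\delta^{(c)}_{d(i)}$ that depends only on the class $c$ and on the dimension $d(i)$ to which item $i$ belongs.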

18.
The many null distributions of person fit indices
This paper deals with the situation of an investigator who has collected the scores of n persons to a set of k dichotomous items, and wants to investigate whether the answers of all respondents are compatible with the one parameter logistic test model of Rasch. Contrary to the standard analysis of the Rasch model, where all persons are kept in the analysis and badly fitting items may be removed, this paper studies the alternative model in which a small minority of persons has an answer strategy not described by the Rasch model. Such persons are called anomalous or aberrant. From the response vectors consisting of k symbols each equal to 0 or 1, it is desired to classify each respondent as either anomalous or as conforming to the model. As this model is probabilistic, such a classification will possibly involve false positives and false negatives. Both for the Rasch model and for other item response models, the literature contains several proposals for a person fit index, which expresses for each individual the plausibility that his/her behavior follows the model. The present paper argues that such indices can only provide a satisfactory solution to the classification problem if their statistical distribution is known under the null hypothesis that all persons answer according to the model. This distribution, however, turns out to be rather different for different values of the person's latent trait value. This value will be called the ability parameter, although our results are equally valid for Rasch scales measuring other attributes. As the true ability parameter is unknown, one can only use its estimate in order to obtain an estimated person fit value and an estimated null hypothesis distribution. The paper describes three specifications for the latter: assuming that the true ability equals its estimate, integrating across the ability distribution assumed for the population, and conditioning on the total score, which is in the Rasch model the sufficient statistic for the ability parameter. Classification rules for aberrance will be worked out for each of the three specifications. Depending on test length, item parameters and desired accuracy, they are based on the exact distribution, its Monte Carlo estimate and a new and promising approximation based on the moments of the person fit statistic. Results for the likelihood person fit statistic are given in detail; the methods could also be applied to other fit statistics. A comparison of the three specifications results in the recommendation to condition on the total score, as this avoids some problems of interpretation that affect the other two specifications. The authors express their gratitude to the reviewers and to many colleagues for comments on an earlier version.
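For concreteness (the formula is the standard one and is not quoted from the paper), the likelihood person-fit statistic referred to at the end is the log-likelihood of person $v$'s response pattern evaluated at the estimated ability,

$$l_v = \sum_{i=1}^{k} \Bigl[ x_{vi} \ln P_i(\hat\theta_v) + (1 - x_{vi}) \ln\bigl(1 - P_i(\hat\theta_v)\bigr) \Bigr],$$

whose null distribution is what the three specifications (plug-in $\hat\theta_v$, integration over the assumed ability distribution, or conditioning on the total score) approximate in different ways.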

19.
Depression is one of the most clinically relevant mood disorders, and many assessment instruments have been developed to measure it. Probably the most frequently used instrument is Beck’s Depression Inventory (BDI). The simplified BDI (BDI-S) is a more efficient version of the BDI that has been shown to be no less reliable or valid. As the BDI-S has not yet been subjected to rigorous tests of Item Response Theory, it is the aim of the present paper to conduct such an analysis using the Rasch model. This study subjected a simplified version of the BDI consisting of 20 items (BDI-S20) to a Rasch analysis in a sample of N = 5,035 participants. The scale, minus one misfitting item (BDI-S19), yielded a good approximation to Rasch assumptions. Moderate differential item functioning (DIF) was present. It is concluded that the BDI-S19 is an internally valid instrument for assessing depression, although some room for improvement exists.

20.
In this article, we emphasize that the Rasch model is not only very useful for psychological test calibration but is also necessary if the number of solved items is to be used as an examinee's score. Simplified proof that the Rasch model implies specific objective parameter comparisons is given. Consequently, a model check per se is possible. For data and item pools that fail to fit the Rasch model, various reasons are listed. For instance, the two-parameter logistic or three-parameter logistic models would probably be more suitable. Several suggestions are given for controlling the overall Type I risk, for including a power analysis (i.e., taking the Type II risk into account), for disclosing artificial model check results, and for the deletion of Rasch model misfitting examinees. These suggestions are empirically founded and may serve in the establishment of certain rough state-of-the-art standards. However, a degree of statistical elaboration is needed; and forthcoming test authors will still suffer from the fact that no standard software exists that offers all of the given approaches as a package.

