Similar literature
20 similar articles found
1.
In optimal design research, designs are optimized with respect to some statistical criterion under a certain model for the data. The ideas from optimal design research have spread into various fields of research, and recently have been adopted in test theory and applied to item response theory (IRT) models. In this paper a generalized variance criterion is used for sequential sampling in the two-parameter IRT model. Some general principles are offered to enable a researcher to select the best sampling design for the efficient estimation of item parameters.
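As a rough illustration of the generalized variance criterion mentioned in this abstract, the sketch below computes the determinant of the Fisher information matrix for the two item parameters of one item. It assumes the logistic two-parameter form P(θ) = 1 / (1 + exp(−a(θ − b))) with no scaling constant, and the function names `item_information_matrix` and `d_criterion` are hypothetical. It shows only how a fixed sample of abilities can be scored, not the paper's sequential procedure.

```python
import numpy as np


def item_information_matrix(thetas, a, b):
    """Fisher information for (a, b) of one 2PL item, summed over examinees.

    Assumes P(theta) = 1 / (1 + exp(-a * (theta - b))); this parameterization
    is an assumption of the sketch, not taken from the abstract.
    """
    thetas = np.asarray(thetas, dtype=float)
    p = 1.0 / (1.0 + np.exp(-a * (thetas - b)))
    weights = p * (1.0 - p)
    # Gradient of the logit a * (theta - b) with respect to (a, b).
    grad = np.stack([thetas - b, np.full_like(thetas, -a)], axis=1)
    return (grad * weights[:, None]).T @ grad


def d_criterion(thetas, a, b):
    """Determinant of the information matrix (larger means smaller generalized variance)."""
    return float(np.linalg.det(item_information_matrix(thetas, a, b)))


if __name__ == "__main__":
    # Compare three hypothetical ability samples for an item with a = 1, b = 0.
    for design in ([0.0, 0.0], [-0.1, 0.1], [-1.0, 1.0]):
        print(design, d_criterion(design, a=1.0, b=0.0))
```

Under these assumptions, a sample concentrated at a single ability level gives a singular information matrix, so both item parameters cannot be estimated from it.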

2.
Analysing ordinal data is becoming increasingly important in psychology, especially in the context of item response theory. The generalized partial credit model (GPCM) is probably the most widely used ordinal model and has found application in many large-scale educational assessment studies such as PISA. In the present paper, optimal test designs are investigated for estimating persons’ abilities with the GPCM for calibrated tests when item parameters are known from previous studies. We find that local optimality may be achieved by assigning non-zero probability only to the first and last categories independently of a person's ability. That is, when using such a design, the GPCM reduces to the dichotomous two-parameter logistic (2PL) model. Since locally optimal designs require the true ability to be known, we consider alternative Bayesian design criteria using weight distributions over the ability parameter space. For symmetric weight distributions, we derive necessary conditions for the optimal one-point design of two response categories to be Bayes optimal. Furthermore, we discuss examples of common symmetric weight distributions and investigate under what circumstances the necessary conditions are also sufficient. Since the 2PL model is a special case of the GPCM, all of these results hold for the 2PL model as well.

3.
For testlet response data, traditional item response theory (IRT) models are often not appropriate due to local dependence among items within a common testlet. Several testlet‐based IRT models have been developed to model examinees' responses. In this paper, a new two‐parameter normal ogive testlet response theory (2PNOTRT) model for dichotomous items is proposed by introducing testlet discrimination parameters. A Bayesian model parameter estimation approach via a data augmentation scheme is developed. Simulations are conducted to evaluate the performance of the proposed 2PNOTRT model. The results indicate that the estimation of item parameters is satisfactory overall from the viewpoint of convergence. Finally, the proposed 2PNOTRT model is applied to a set of real testlet data.

4.
A Bayesian procedure is developed for the estimation of parameters in the two-parameter logistic item response model. Joint modal estimates of the parameters are obtained and procedures for the specification of prior information are described. Through simulation studies it is shown that Bayesian estimates of the parameters are superior to maximum likelihood estimates in the sense that they are (a) more meaningful since they do not drift out of range, and (b) more accurate in that they result in smaller mean squared differences between estimates and true values. The research reported here was performed pursuant to Grant No. N0014-79-C-0039 with the Office of Naval Research.

5.
The linear logistic test model (LLTM) specifies the item parameters as a weighted sum of basic parameters. The LLTM is a special case of a more general nonlinear logistic test model (NLTM) where the weights are partially unknown. This paper is about the identifiability of the NLTM. Sufficient and necessary conditions for global identifiability are presented for an NLTM where the weights are linear functions, while conditions for local identifiability are shown to require a model with fewer restrictions. It is also discussed how these conditions are checked using an algorithm due to Bekker, Merckens, and Wansbeek (1994). Several illustrations are given. This article was written while the first author was a postdoctoral fellow at the University of Twente. He gratefully acknowledges the university's hospitality and the financial support by NWO (project nr. 30002).

6.
A joint Bayesian estimation procedure for the estimation of parameters in the three-parameter logistic model is developed in this paper. Procedures for specifying prior beliefs for the parameters are given. It is shown through simulation studies that the Bayesian procedure (i) ensures that the estimates stay in the parameter space, and (ii) produces better estimates than the joint maximum likelihood procedure as judged by such criteria as mean squared differences between estimates and true values. The research reported here was performed pursuant to Grant No. N0014-79-C-0039 with the Office of Naval Research. A related article by Robert J. Mislevy (1986) appeared when the present paper was in the printing stage.

7.
Two methods of estimating parameters in the Rasch model are compared. It is shown that estimates for a certain loglinear model for the score × item × response table are equivalent to the unconditional maximum likelihood estimates for the Rasch model.

8.
In item response theory, modelling the item response times in addition to the item responses may improve the detection of possible between- and within-subject differences in the process that resulted in the responses. For instance, if respondents rely on rapid guessing on some items but not on all, the joint distribution of the responses and response times will be a multivariate within-subject mixture distribution. Suitable parametric methods to detect these within-subject differences have been proposed. In these approaches, a distribution needs to be assumed for the within-class response times. In this paper, it is demonstrated that these parametric within-subject approaches may produce false positives and biased parameter estimates if the assumption concerning the response time distribution is violated. A semi-parametric approach is proposed which resorts to categorized response times. This approach is shown to produce hardly any false positives or parameter bias. In addition, the semi-parametric approach results in approximately the same power as the parametric approach.

9.
10.
A number of models for categorical item response data have been proposed in recent years. The models appear to be quite different. However, they may usefully be organized as members of only three distinct classes, within which the models are distinguished only by assumptions and constraints on their parameters. “Difference models” are appropriate for ordered responses, “divide-by-total” models may be used for either ordered or nominal responses, and “left-side added” models are used for multiple-choice responses with guessing. The details of the taxonomy and the models are described in this paper. The present study was supported in part by two postdoctoral fellowships awarded to Lynne Steinberg: an Educational Testing Service Postdoctoral Fellowship at ETS, Princeton, NJ and an NIMH Individual National Research Service Award at Stanford University, Stanford, CA. Helpful comments by the editor and three anonymous reviewers are gratefully acknowledged.

11.
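Warm's weighted maximum likelihood estimation of the latent trait parameter in the four-parameter logistic model   Cited by: 1 (self-citations: 0, citations by others: 1)
Meng Xiangbin (孟祥斌), Tao Jian (陶剑), Chen Shali (陈莎莉). Acta Psychologica Sinica (《心理学报》), 2016, (8): 1047-1056
This paper studies the four-parameter logistic (4PL) model. Following Warm's weighted maximum likelihood estimation technique, it proposes a weighted maximum likelihood estimator for the latent trait parameter of the 4PL model and verifies the properties of the estimator through simulation studies. The results show that, compared with the usual maximum likelihood estimator and the expected a posteriori estimator, the weighted maximum likelihood estimator has markedly smaller bias and good parameter recovery. Moreover, when the test is short and item discrimination is low, the weighted maximum likelihood estimator retains good statistical properties and shows an even more pronounced advantage.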

12.
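Zhan Peida (詹沛达). Journal of Psychological Science (《心理科学》), 2019, (1): 170-178
With the development of psychological and educational measurement research and advances in technology, computerized (large-scale) testing has drawn increasing attention. This study aims to explore how response time data can be used to help assess multidimensional latent abilities in computerized multidimensional tests, and to provide theoretical support, in terms of data analysis methods, for monitoring educational quality in China's compulsory education stage. Taking the computer-based mathematics test data from the 2012 and 2015 Programme for International Student Assessment (PISA) as an example, it proposes a joint responses-and-times multidimensional Rasch model that uses response time and response accuracy data simultaneously. The analysis of the PISA data with the new model shows that incorporating response time data not only helps improve the precision of model parameter estimates, but also helps analysts use respondents' response time information for further decisions and interventions (e.g., diagnosing aberrant response behavior or prerequisite knowledge).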

13.
In analyzing responses and response times to personality questionnaire items, models have been proposed which include the so-called “inverted-U effect.” These models predict that response times to personality test items decrease as the latent trait value of a given person gets closer to the attractiveness of an item. Initial studies into these models have focused on dichotomous personality items, and more recently, models for Likert-type scale items have been proposed. In all these models, it is assumed that the inverted-U effect is symmetrical around 0, while, as will be explained in this article, there are substantive and statistical reasons to study this assumption. Therefore, in this article, a general inverted-U model is proposed which accommodates two sources of asymmetry between the response times and the attractiveness of the items. The viability of this model is demonstrated in a simulation study, and the model is applied to the responses and response times of the Temperament and Character Inventory–Revised, covering a broad range of personality dimensions.

14.
Samejima (Psychometrika 65:319–335, 2000) proposed the logistic positive exponent family of models (LPEF) for dichotomous responses in the unidimensional latent space. The objective of the present paper is to propose and discuss a graded response model that is expanded from the LPEF, in the context of item response theory (IRT). This specific graded response model belongs to the general framework of graded response model (Samejima, Psychometrika Monograph, No. 17, 1969 and No. 18, 1972; Handbook of modern item response theory, Springer, New York, 1997; Encyclopedia of Social Measurement, Academic Press, San Diego, 2004), and, in particular, to the heterogeneous case (Samejima, Psychometrika Monograph, No. 18, 1972). Thus, the model can deal with any number of ordered polytomous responses, such as letter grades (e.g., A, B, C, D, F). For brevity, hereafter, the model will be called the LPEF graded response model, or LPEFG. This model reflects the two opposing principles contained in the LPEF for dichotomous responses, with the logistic model (Birnbaum, Statistical theories of mental test scores, Addison Wesley, Reading, 1968) as their transition, which provide a reasonable rationale for partial credits in LPEFG, among others.

15.
For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ2 − δ1 < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are derived. Furthermore, it is demonstrated that the value of the maximum is decreasing in δ2 − δ1. Consequently, the maximum of a unimodal item information function is always larger than the maximum of a bimodal one, and similarly for the item discrimination function. The work reported herein was partially supported under the National Assessment of Educational Progress (Grant No. R999G30002; CFDA No. 84.999G) as administered by the Office of Educational Research and Improvement, US Department of Education.
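The unimodal/bimodal condition in this abstract can be checked numerically. The sketch below assumes a three-category partial credit item with unit discrimination, for which the item information equals the conditional variance of the item score, and takes the condition to be on the distance between the two step parameters, written `delta2 - delta1` here. The function names and the grid are illustrative choices, not taken from the paper.

```python
import numpy as np


def pcm_information(theta, delta1, delta2):
    """Item information of a three-category partial credit item.

    Assumes unit discrimination, so the information is the variance of the
    item score (0, 1, 2) given theta.
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    # Unnormalized log-probabilities of categories 0, 1, 2.
    z = np.stack([np.zeros_like(theta),
                  theta - delta1,
                  2.0 * theta - delta1 - delta2])
    z = z - z.max(axis=0)  # stabilize the exponentials
    p = np.exp(z)
    p = p / p.sum(axis=0)
    scores = np.array([0.0, 1.0, 2.0])[:, None]
    mean = (scores * p).sum(axis=0)
    return ((scores - mean) ** 2 * p).sum(axis=0)


def count_modes(delta1, delta2, grid=None):
    """Count strict local maxima of the information function on a grid."""
    if grid is None:
        grid = np.linspace(-10.0, 10.0, 4001)
    info = pcm_information(grid, delta1, delta2)
    interior = info[1:-1]
    return int(np.sum((interior > info[:-2]) & (interior > info[2:])))


if __name__ == "__main__":
    print("threshold 4 ln 2 =", 4.0 * np.log(2.0))
    for d in (1.0, 2.0, 3.0, 4.0):
        grid = np.linspace(-10.0, 10.0, 4001)
        peak = pcm_information(grid, -d / 2, d / 2).max()
        print(f"delta2 - delta1 = {d}: modes = {count_modes(-d / 2, d / 2)}, max info = {peak:.4f}")
```

Under these assumptions the mode count should switch from one to two as the distance passes 4 ln 2 (about 2.77), and the peak value should shrink as the distance grows.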

16.
A commonly used method to evaluate the accuracy of a measurement is to provide a confidence interval that contains the parameter of interest with a given high probability. Smallest exact confidence intervals for the ability parameter of the Rasch model are derived and compared to the traditional, asymptotically valid intervals based on the Fisher information. Tables of the exact confidence intervals, termed Clopper-Pearson intervals, can be routinely drawn up by applying a computer program designed by and obtainable from the author. These tables are particularly useful for tests of only moderate lengths where the asymptotic method does not provide valid confidence intervals.

17.
By considering information about response time (RT) in addition to response accuracy (RA), joint models for RA and RT such as the hierarchical model (van der Linden, 2007) can improve the precision with which ability is estimated over models that only consider RA. The hierarchical model, however, assumes that only the person's speed is informative of ability. This assumption of conditional independence between RT and ability given speed may be violated in practice, and ignores collateral information about ability that may be present in the residual RTs. We propose a posterior predictive check for evaluating the assumption of conditional independence between RT and ability given speed. Furthermore, we propose an extension of the hierarchical model that contains cross-loadings between ability and RT, which enables one to take additional collateral information about ability into account beyond what is possible in the standard hierarchical model. A Bayesian estimation procedure is proposed for the model. Using simulation studies, the performance of the model is evaluated in terms of parameter recovery, and the possible gain in precision over the standard hierarchical model and an RA-only model is considered. The model is applied to data from a high-stakes educational test.

18.
The paper addresses and discusses whether the tradition of accepting point-symmetric item characteristic curves is justified by uncovering the inconsistent relationship between the difficulties of items and the order of maximum likelihood estimates of ability. This inconsistency is intrinsic in models that provide point-symmetric item characteristic curves, and in this paper focus is put on the normal ogive model for observation. It is also questioned if in the logistic model the sufficient statistic has forfeited the rationale that is appropriate to the psychological reality. It is observed that the logistic model can be interpreted as the case in which the inconsistency in ordering the maximum likelihood estimates is degenerated. The paper proposes a family of models, called the logistic positive exponent family, which provides asymmetric item characteristic curves. A model in this family has a consistent principle in ordering the maximum likelihood estimates of ability. The family is divided into two subsets each of which has its own principle, and includes the logistic model as a transition from one principle to the other. Rationale and some illustrative examples are given.
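To make the contrast between point-symmetric and asymmetric item characteristic curves concrete, the sketch below assumes the commonly cited form of the logistic positive exponent family, P(θ) = Ψ(θ)^ξ with Ψ the logistic function and ξ > 0. The abstract does not state the formula, so this form, the omission of any scaling constant, and the names `lpef_icc` and `symmetry_gap` are assumptions of the sketch.

```python
import math


def lpef_icc(theta, a, b, xi):
    """Assumed LPEF curve: the logistic function raised to the power xi > 0.

    With xi = 1 this is the ordinary logistic item characteristic curve.
    """
    psi = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return psi ** xi


def symmetry_gap(a, b, xi, x):
    """P(b + x) + P(b - x) - 1, which is zero for a curve point-symmetric about (b, 0.5)."""
    return lpef_icc(b + x, a, b, xi) + lpef_icc(b - x, a, b, xi) - 1.0


if __name__ == "__main__":
    for xi in (0.5, 1.0, 2.0):
        print(f"xi = {xi}: symmetry gap at x = 0.7 is {symmetry_gap(1.0, 0.0, xi, 0.7):+.4f}")
```

Under this assumed form, only ξ = 1 gives a zero gap, which matches the abstract's description of the logistic model as the transition between the two subsets of the family.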

19.
A model for multiple-choice exams is developed from a signal-detection perspective. A correct alternative in a multiple-choice exam can be viewed as being a signal embedded in noise (incorrect alternatives). Examinees are assumed to have perceptions of the plausibility of each alternative, and the decision process is to choose the most plausible alternative. It is also assumed that each examinee either knows or does not know each item. These assumptions together lead to a signal detection choice model for multiple-choice exams. The model can be viewed, statistically, as a mixture extension, with random mixing, of the traditional choice model, or similarly, as a grade-of-membership extension. A version of the model with extreme value distributions is developed, in which case the model simplifies to a mixture multinomial logit model with random mixing. The approach is shown to offer measures of item discrimination and difficulty, along with information about the relative plausibility of each of the alternatives. The model, parameters, and measures derived from the parameters are compared to those obtained with several commonly used item response theory models. An application of the model to an educational data set is presented.

20.
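Zhan Peida (詹沛达), Hong Jiao, Kaiwen Man. Acta Psychologica Sinica (《心理学报》), 2020, 52(9): 1132-1142
In psychological and educational measurement, latent processing speed reflects how efficiently students apply their latent ability to solve problems. To explore the multidimensionality of latent processing speed in multidimensional tests and to enable parameter estimation, this study proposes a multidimensional log-normal response time model. Results of an empirical data analysis and a simulation study show that: (1) latent processing speed has a multidimensional structure matching that of latent ability; (2) the new model accurately estimates individual-level multidimensional latent processing speed and the item parameters related to response time; (3) the negative consequences of redundantly specifying latent processing speed as multidimensional are smaller than those of ignoring its multidimensionality.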
