Similar literature
 20 similar articles found (search time: 15 ms)
1.
Current approaches to model responses and response times to psychometric tests solely focus on between-subject differences in speed and ability. Within subjects, speed and ability are assumed to be constants. Violations of this assumption are generally absorbed in the residual of the model. As a result, within-subject departures from the between-subject speed and ability level remain undetected. These departures may be of interest to the researcher as they reflect differences in the response processes adopted on the items of a test. In this article, we propose a dynamic approach for responses and response times based on hidden Markov modeling to account for within-subject differences in responses and response times. A simulation study is conducted to demonstrate acceptable parameter recovery and acceptable performance of various fit indices in distinguishing between different models. In addition, both a confirmatory and an exploratory application are presented to demonstrate the practical value of the modeling approach.

2.
With advances in computerized tests, it has become commonplace to register not just the accuracy of the responses provided to the items, but also the response time. The idea that for each response both response accuracy and response time are indicative of ability has explicitly been incorporated in the signed residual time (SRT) model (Maris & van der Maas, 2012, Psychometrika, 77, 615–633), which assumes that fast correct responses are indicative of a higher level of ability than slow correct responses. While the SRT model allows one to gain more information about ability than is possible based on considering only response accuracy, measurement may be confounded if persons show differences in their response speed that cannot be explained by ability, for example due to differences in response caution. In this paper we propose an adapted version of the SRT model that makes it possible to model person differences in overall speed, while maintaining the idea of the SRT model that the speed at which individual responses are given may be indicative of ability. We propose a two-dimensional SRT model that considers dichotomized response time, which allows one to model differences between fast and slow responses. The model includes both an ability and a speed parameter, and allows one to correct the estimates of ability for possible differences in overall speed. The performance of the model is evaluated through simulation, and the relevance of including the speed parameter is studied in the context of an empirical example from formative educational assessment.
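
As a rough illustration of the idea behind the original SRT model, the scoring rule described by Maris & van der Maas (2012) awards the remaining time for a correct response and subtracts it for an incorrect one. The sketch below shows only that scoring rule, not the two-dimensional model proposed in the abstract; the function name `srt_score` and the example time limit are hypothetical.

```python
def srt_score(correct, response_time, time_limit):
    """Signed residual time score: (2x - 1) * (d - t).

    correct       -- 1 for a correct response, 0 for an incorrect one
    response_time -- time taken, in the same unit as time_limit
    time_limit    -- item time limit d (hypothetical value in the example below)
    """
    if not 0 <= response_time <= time_limit:
        raise ValueError("response_time must lie within [0, time_limit]")
    return (2 * correct - 1) * (time_limit - response_time)


# A fast correct answer scores higher than a slow correct one,
# and a fast error is penalised more than a slow error.
fast_correct = srt_score(1, 5, 20)
slow_correct = srt_score(1, 15, 20)
fast_error = srt_score(0, 5, 20)
slow_error = srt_score(0, 15, 20)
```

Under this rule a fast error costs more than a slow one, which is what discourages rapid guessing.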

3.
When two test forms measure the same construct but are independently modelled using item response theory, the two forms’ respective metrics cannot be assumed to be equivalent. Thus, before comparing parameter estimates across forms, a linear transformation must be applied to at least one form's scale. The mean‐sigma method is a well‐known procedure for estimating this adjustment when a common set of items appears on both forms. In this paper, I show both analytically and empirically (through a small simulation study) that the mean‐sigma estimators of the transformation constants are biased. While this systematic error was modest relative to random error under the conditions studied here, it is nevertheless intrinsic and its magnitude is conditional on extrinsic design features that include the number of anchor items and the quality of their difficulty estimates.
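
For readers unfamiliar with the mean‐sigma method, a minimal sketch of the usual estimators follows: the slope A is the ratio of the standard deviations of the anchor items' difficulty estimates on the two forms, and the intercept B aligns their means. The function name and the anchor values are made up for illustration, and the example uses noise-free difficulties, so it does not reproduce the bias studied in the paper.

```python
from statistics import mean, pstdev


def mean_sigma(b_from, b_to):
    """Estimate (A, B) such that b_to is approximately A * b_from + B.

    b_from, b_to -- difficulty estimates of the same anchor items
                    on the source form and on the target form.
    """
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B


# Hypothetical anchor-item difficulties; b_y is exactly 1.2 * b_x + 0.5.
b_x = [-1.0, 0.0, 1.0, 2.0]
b_y = [-0.7, 0.5, 1.7, 2.9]
A, B = mean_sigma(b_x, b_y)

# Rescale a source-form difficulty onto the target metric.
b_rescaled = A * 1.5 + B
```

Abilities and difficulties are rescaled as A * value + B, while discriminations are divided by A.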

4.
In item response theory, modelling the item response times in addition to the item responses may improve the detection of possible between- and within-subject differences in the process that resulted in the responses. For instance, if respondents rely on rapid guessing on some items but not on all, the joint distribution of the responses and response times will be a multivariate within-subject mixture distribution. Suitable parametric methods to detect these within-subject differences have been proposed. In these approaches, a distribution needs to be assumed for the within-class response times. In this paper, it is demonstrated that these parametric within-subject approaches may produce false positives and biased parameter estimates if the assumption concerning the response time distribution is violated. A semi-parametric approach is proposed which resorts to categorized response times. This approach is shown to produce hardly any false positives or parameter bias. In addition, the semi-parametric approach results in approximately the same power as the parametric approach.

5.
In real testing, examinees may manifest different types of test‐taking behaviours. In this paper we focus on two types that appear to be among the more frequently occurring behaviours – solution behaviour and rapid guessing behaviour. Rapid guessing usually happens in high‐stakes tests when there is insufficient time, and in low‐stakes tests when there is lack of effort. These two qualitatively different test‐taking behaviours, if ignored, will lead to violation of the local independence assumption and, as a result, yield biased item/person parameter estimation. We propose a mixture hierarchical model to account for differences among item responses and response time patterns arising from these two behaviours. The model is also able to identify the specific behaviour an examinee engages in when answering an item. A Monte Carlo expectation maximization algorithm is proposed for model calibration. A simulation study shows that the new model yields more accurate item and person parameter estimates than a non‐mixture model when the data indeed come from two types of behaviour. The model also fits real, high‐stakes test data better than a non‐mixture model, and therefore the new model can better identify the underlying test‐taking behaviour an examinee engages in on a certain item.

6.
To provide more refined diagnostic feedback with collateral information in item response times (RTs), this study proposed joint modelling of attributes and response speed using item responses and RTs simultaneously for cognitive diagnosis. For illustration, an extended deterministic input, noisy ‘and’ gate (DINA) model was proposed for joint modelling of responses and RTs. Model parameter estimation was explored using the Bayesian Markov chain Monte Carlo (MCMC) method. The PISA 2012 computer-based mathematics data were analysed first. These real data estimates were treated as true values in a subsequent simulation study. A follow-up simulation study with ideal testing conditions was conducted as well to further evaluate model parameter recovery. The results indicated that model parameters could be well recovered using the MCMC approach. Further, incorporating RTs into the DINA model would improve attribute and profile correct classification rates and result in more accurate and precise estimation of the model parameters.
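
As background, the response part of the standard DINA model can be sketched in a few lines: an examinee who has mastered every attribute an item requires answers correctly unless they slip, and anyone else answers correctly only by guessing. This is the basic DINA item response function, not the extended joint model with RTs described in the abstract; the function name and parameter values are illustrative assumptions.

```python
def dina_prob(alpha, q, slip, guess):
    """Probability of a correct response under the standard DINA model.

    alpha -- examinee's attribute profile, e.g. [1, 0, 1] (1 = mastered)
    q     -- the item's row of the Q-matrix (1 = attribute required)
    slip  -- probability that a fully prepared examinee answers incorrectly
    guess -- probability that an unprepared examinee answers correctly
    """
    # eta is True only if every required attribute has been mastered.
    eta = all(a >= required for a, required in zip(alpha, q))
    return 1.0 - slip if eta else guess


# Hypothetical item requiring the first two of three attributes.
q_row = [1, 1, 0]
p_master = dina_prob([1, 1, 0], q_row, slip=0.1, guess=0.2)
p_partial = dina_prob([1, 0, 1], q_row, slip=0.1, guess=0.2)
```

Mastering only some of the required attributes gives no advantage over mastering none, which is the conjunctive "and" gate in the model's name.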

7.
Given a drift diffusion model with unknown drift and boundary parameters, we analyse the behaviour of maximum likelihood estimates with respect to changes of responses and response times. It is shown analytically that a single fast response time can dominate the estimation in that no matter how many correct answers a test taker provides, the estimate of the drift (ability) parameter decreases to zero. In addition, it is shown that although higher drift rates imply shorter response times, the reverse implication does not hold for the estimates: shorter response times can decrease the drift rate estimate. In the light of these analytical results, we illustrate the actual impact of the findings in a small simulation for a mental rotation test. The method of analysis outlined is applicable to a broader range of models, and we emphasize the need to further check currently used reaction time models within this framework.

8.
Understanding individual differences in cognitive performance is an important part of understanding how variations in underlying cognitive processes can result in variations in task performance. However, the exploration of individual differences in the components of the decision process—such as cognitive processing speed, response caution, and motor execution speed—in previous research has been limited. Here, we assess the heritability of the components of the decision process, with heritability having been a common aspect of individual differences research within other areas of cognition. Importantly, a limitation of previous work on cognitive heritability is the underlying assumption that variability in response times solely reflects variability in the speed of cognitive processing. This assumption has been problematic in other domains, due to the confounding effects of caution and motor execution speed on observed response times. We extend a cognitive model of decision‐making to account for relatedness structure in a twin study paradigm. This approach can separately quantify different contributions to the heritability of response time. Using data from the Human Connectome Project, we find strong evidence for the heritability of response caution, and more ambiguous evidence for the heritability of cognitive processing speed and motor execution speed. Our study suggests that the assumption made in previous studies—that the heritability of cognitive ability is based on cognitive processing speed—may be incorrect. More generally, our methodology provides a useful avenue for future research in complex data that aims to analyze cognitive traits across different sources of related data, whether the relation is between people, tasks, experimental phases, or methods of measurement.

9.
A brightness discrimination experiment was performed to examine how subjects decide whether a patch of pixels is “bright” or “dark,” and stimulus duration, brightness, and speed versus accuracy instructions were manipulated. The diffusion model (Ratcliff, 1978) was fit to the data, and it accounted for all the dependent variables: mean correct and error response times, the shapes of response time distributions for correct and error responses, and accuracy values. Speed-accuracy manipulations affected only boundary separation (response criteria settings) in the model. Drift rate (the rate of accumulation of evidence) in the diffusion model, which represents stimulus quality, increased as a function of stimulus duration and stimulus brightness but asymptoted as stimulus duration increased from 100 to 150 msec. To address the argument that the diffusion model can fit any pattern of data, simulated patterns of plausible data are presented that the model cannot fit.
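
To make the roles of drift rate and boundary separation concrete, the sketch below evaluates the textbook first-passage probability for a simple Wiener process with two absorbing boundaries. It is a simplification: it omits the across-trial variability and non-decision time of the full diffusion model, and the parameter values and the scaling s = 1 are assumptions chosen for illustration rather than values from the study.

```python
import math


def p_upper(v, a, z, s=1.0):
    """Probability that a Wiener process hits the upper boundary a before 0.

    v -- drift rate (rate of evidence accumulation)
    a -- boundary separation
    z -- starting point, with 0 < z < a
    s -- within-trial noise (standard deviation of the diffusion)
    """
    if not 0 < z < a:
        raise ValueError("starting point z must lie strictly between 0 and a")
    if v == 0:
        return z / a  # limit of the expression below as v -> 0
    num = 1.0 - math.exp(-2.0 * v * z / s ** 2)
    den = 1.0 - math.exp(-2.0 * v * a / s ** 2)
    return num / den


# Unbiased starting point: widening the boundaries (more caution)
# or raising the drift rate (better stimulus quality) both raise accuracy.
acc_speed = p_upper(v=1.0, a=2.0, z=1.0)
acc_accuracy = p_upper(v=1.0, a=3.0, z=1.5)
acc_high_drift = p_upper(v=2.0, a=2.0, z=1.0)
```

In this simplified setting, a speed-accuracy instruction that changes only `a` shifts accuracy without any change in `v`, which matches the pattern the abstract reports for boundary separation.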

10.
11.
Count data naturally arise in several areas of cognitive ability testing, such as processing speed, memory, verbal fluency, and divergent thinking. Contemporary count data item response theory models, however, are not flexible enough, especially to account for over- and underdispersion at the same time. For example, the Rasch Poisson counts model (RPCM) assumes equidispersion (conditional mean and variance coincide) which is often violated in empirical data. This work introduces the Conway–Maxwell–Poisson counts model (CMPCM) that can handle underdispersion (variance lower than the mean), equidispersion, and overdispersion (variance larger than the mean) in general and specifically at the item level. A simulation study revealed satisfactory parameter recovery at moderate sample sizes and mostly unbiased standard errors for the proposed estimation approach. In addition, plausible empirical reliability estimates resulted, while those based on the RPCM were biased downwards (underdispersion) and biased upwards (overdispersion) when the simulation model deviated from equidispersion. Finally, verbal fluency data were analysed and the CMPCM with item-specific dispersion parameters fitted the data best. Dispersion parameter estimates indicated underdispersion for three out of four items. Overall, these findings indicate the feasibility and importance of the suggested flexible count data modelling approach.
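
The three dispersion regimes can be checked numerically for the Conway–Maxwell–Poisson distribution itself. The sketch below uses the common (λ, ν) parameterization, with probability mass proportional to λ^y / (y!)^ν, which is an assumption here and may differ from the parameterization used in the CMPCM. It truncates the support at a hypothetical `max_count`, which is adequate for the small rates used in the example.

```python
import math


def cmp_moments(lam, nu, max_count=200):
    """Mean and variance of a Conway-Maxwell-Poisson(lam, nu) distribution.

    The probability of count y is proportional to lam**y / (y!)**nu.
    The infinite support is truncated at max_count (an approximation).
    """
    # Work on the log scale to avoid overflow in y! for large counts.
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(max_count + 1)]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    mean = sum(y * w for y, w in enumerate(weights)) / total
    var = sum((y - mean) ** 2 * w for y, w in enumerate(weights)) / total
    return mean, var


mean_equi, var_equi = cmp_moments(3.0, 1.0)      # nu = 1: ordinary Poisson
mean_under, var_under = cmp_moments(3.0, 2.0)    # nu > 1: underdispersion
mean_over, var_over = cmp_moments(3.0, 0.5)      # nu < 1: overdispersion
```

With ν = 1 the distribution reduces to the Poisson case assumed by the RPCM, so mean and variance coincide.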

12.
Among the most valuable tools in behavioral science is statistically fitting mathematical models of cognition to data—response time distributions, in particular. However, techniques for fitting distributions vary widely, and little is known about the efficacy of different techniques. In this article, we assess several fitting techniques by simulating six widely cited models of response time and using the fitting procedures to recover model parameters. The techniques include the maximization of likelihood and least squares fits of the theoretical distributions to different empirical estimates of the simulated distributions. A running example is used to illustrate the different estimation and fitting procedures. The simulation studies reveal that empirical density estimates are biased even for very large sample sizes. Some fitting techniques yield more accurate and less variable parameter estimates than do others. Methods that involve least squares fits to density estimates generally yield very poor parameter estimates.

13.
In high-stakes testing, often multiple test forms are used and a common time limit is enforced. Test fairness requires that ability estimates must not depend on the administration of a specific test form. Such a requirement may be violated if speededness differs between test forms. The impact of not taking speed sensitivity into account on the comparability of test forms regarding speededness and ability estimation was investigated. The lognormal measurement model for response times by van der Linden was compared with its extension by Klein Entink, van der Linden, and Fox, which includes a speed sensitivity parameter. An empirical data example was used to show that the extended model can fit the data better than the model without speed sensitivity parameters. A simulation was conducted, which showed that test forms with different average speed sensitivity yielded substantially different ability estimates for slow test takers, especially for test takers with high ability. Therefore, the use of the extended lognormal model for response times is recommended for the calibration of item pools in high-stakes testing situations. Limitations to the proposed approach and further research questions are discussed.

14.
This article proposes an approach to modelling partially cross‐classified multilevel data where some of the level‐1 observations are nested in one random factor and some are cross‐classified by two random factors. In a simulation study, the proposed approach is compared with two other commonly used approaches, which treat the partially cross‐classified data as either fully nested or fully cross‐classified. Results show that the proposed approach demonstrates desirable performance in terms of parameter estimates and statistical inferences. Both the fully nested model and the fully cross‐classified model suffer from biased estimates of some variance components and from biased statistical inferences for some fixed effects. Results also indicate that the proposed model is robust against cluster size imbalance.

15.
In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject’s response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.

16.
A method is proposed for the detection of item bias with respect to observed or unobserved subgroups. The method uses quasi-loglinear models for the incomplete subgroup × test score × item 1 × ... × item k contingency table. If subgroup membership is unknown, the models are Haberman's incomplete-latent-class models. The (conditional) Rasch model is formulated as a quasi-loglinear model. The parameters in this loglinear model that correspond to the main effects of the item responses are the conditional estimates of the parameters in the Rasch model. Item bias can then be tested by comparing the quasi-loglinear-Rasch model with models that contain parameters for the interaction of item responses and the subgroups. The author thanks Wim J. van der Linden and Gideon J. Mellenbergh for comments and suggestions and Frank Kok for empirical data.

17.
The diffusion model (Ratcliff, 1978) and the leaky competing accumulator model (LCA, Usher & McClelland, 2001) were tested against two-choice data collected from the same subjects with the standard response time procedure and the response signal procedure. In the response signal procedure, a stimulus is presented and then, at one of a number of experimenter-determined times, a signal to respond is presented. The models were fit to the data from the two procedures simultaneously under the assumption that responses in the response signal procedure were based on a mixture of decision processes that had already terminated at response boundaries before the signal and decision processes that had not yet terminated. In the latter case, decisions were based on partial information in one variant of each model or on guessing in a second variant. Both variants of the diffusion model fit the data well and both fit better than either variant of the LCA model, although the differences in numerical goodness-of-fit measures were not large enough to allow decisive selection between the models.

18.
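Meng Xiangbin (孟祥斌). 《心理科学》 (Journal of Psychological Science), 2016, 39(3): 727-734
In recent years, the modelling of item response time data has become one of the most active directions in psychological and educational measurement. To address the shortcomings of the lognormal model and the Box-Cox normal model for response times, this paper builds a log-linear model for response times based on the skew-normal distribution within van der Linden's hierarchical framework, and gives a Markov chain Monte Carlo (MCMC) algorithm for estimating the model parameters. The results of both a simulation study and an empirical example show that, compared with the lognormal model and the Box-Cox normal model, the log-skew-normal model fits better and is more flexible and more widely applicable.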

19.
Modeling Response Times for Two-Choice Decisions   Total citations: 8, self-citations: 0, citations by others: 8
The diffusion model for two-choice real-time decisions is applied to four psychophysical tasks. The model reveals how stimulus information guides decisions and shows how the information is processed through time to yield sometimes correct and sometimes incorrect decisions. Rapid two-choice decisions yield multiple empirical measures: response times for correct and error responses, the probabilities of correct and error responses, and a variety of interactions between accuracy and response time that depend on instructions and task difficulty. The diffusion model can explain all these aspects of the data for the four experiments we present. The model correctly accounts for error response times, something previous models have failed to do. Variability within the decision process explains how errors are made, and variability across trials correctly predicts when errors are faster than correct responses and when they are slower.

20.
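Ma Jie (马洁), Liu Hongyun (刘红云). 《心理科学》 (Journal of Psychological Science), 2018, (6): 1374-1381
Using empirical data from a high-school English reading test, this study compares the two-parameter logistic model (2PL-IRT) with two-parameter logistic testlet models (2PL-TRT) that include different numbers of testlets, in order to examine how the number of testlets affects parameter estimation and model fit. The results show that: (1) for examinees with ability between -1.50 and 0.50, the ability estimates from the 2PL-IRT model are relatively strongly biased; (2) treating the items of testlets with a testlet effect greater than 0.50 as locally independent items leads to underestimation of the discrimination parameters of some items and overestimation of the difficulty parameters of most items; (3) the larger the testlet effect, the larger the bias in the item parameter estimates when the testlet's items are treated as locally independent.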
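
For orientation, the difference between the two models compared above can be sketched with the usual item response functions: the 2PL model uses a logit of a(θ - b), and the testlet version subtracts a person-specific testlet effect γ inside the parentheses. This follows the common textbook form of the 2PL testlet model and is an assumption about, not a quotation of, the exact specification used in the study; all parameter values are hypothetical.

```python
import math


def p_correct(theta, a, b, gamma=0.0):
    """Probability of a correct response under the 2PL (testlet) model.

    theta -- examinee ability
    a     -- item discrimination
    b     -- item difficulty
    gamma -- examinee's effect for the testlet containing the item;
             gamma = 0 gives the ordinary 2PL model
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b - gamma)))


# Same examinee and item, with and without a testlet effect.
p_2pl = p_correct(theta=0.0, a=1.2, b=0.0)
p_testlet = p_correct(theta=0.0, a=1.2, b=0.0, gamma=0.5)
```

Because γ is shared by all items in a testlet, ignoring it leaves dependence among those items that the 2PL model has to absorb into its item parameters.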
