Similar literature
1.
In real testing, examinees may manifest different types of test‐taking behaviours. In this paper we focus on two types that appear to be among the more frequently occurring behaviours: solution behaviour and rapid guessing behaviour. Rapid guessing usually happens in high‐stakes tests when there is insufficient time, and in low‐stakes tests when there is a lack of effort. These two qualitatively different test‐taking behaviours, if ignored, will lead to violation of the local independence assumption and, as a result, yield biased item/person parameter estimation. We propose a mixture hierarchical model to account for differences among item responses and response time patterns arising from these two behaviours. The model is also able to identify the specific behaviour an examinee engages in when answering an item. A Monte Carlo expectation maximization algorithm is proposed for model calibration. A simulation study shows that the new model yields more accurate item and person parameter estimates than a non‐mixture model when the data indeed come from two types of behaviour. The model also fits real, high‐stakes test data better than a non‐mixture model, and therefore the new model can better identify the underlying test‐taking behaviour an examinee engages in on a certain item.
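The idea of classifying a single response as solution behaviour or rapid guessing can be illustrated with a much simpler stand-in than the hierarchical model the abstract describes. The sketch below assumes a two-component lognormal mixture on response time alone; the function names and all parameter values are hypothetical and are not taken from the paper.

```python
import math

def lognormal_pdf(t, mu, sigma):
    """Density of a lognormal distribution at response time t > 0."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))

def rapid_guess_posterior(t, pi_guess, mu_guess, sigma_guess, mu_solve, sigma_solve):
    """Posterior probability that a response with time t came from rapid guessing.

    pi_guess is the (assumed) prior proportion of rapid guesses; the two
    (mu, sigma) pairs describe log response time under each behaviour.
    """
    guess = pi_guess * lognormal_pdf(t, mu_guess, sigma_guess)
    solve = (1.0 - pi_guess) * lognormal_pdf(t, mu_solve, sigma_solve)
    return guess / (guess + solve)

# Hypothetical values: guesses centre near exp(1) seconds, solutions near exp(4) seconds.
p_fast = rapid_guess_posterior(3.0, 0.2, 1.0, 0.5, 4.0, 0.5)
p_slow = rapid_guess_posterior(60.0, 0.2, 1.0, 0.5, 4.0, 0.5)
```

With these made-up values a 3-second response is attributed almost entirely to guessing and a 60-second response almost entirely to solution behaviour. The model in the abstract additionally uses the item responses and a hierarchical person/item structure, which this sketch leaves out.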

2.
Statistical methods for identifying aberrances on psychological and educational tests are pivotal to detect flaws in the design of a test or irregular behavior of test takers. Two approaches have been taken in the past to address the challenge of aberrant behavior detection: (1) modeling aberrant behavior via mixture modeling methods, and (2) flagging aberrant behavior via residual based outlier detection methods. In this paper, we propose a two-stage method that is conceived of as a combination of both approaches. In the first stage, a mixture hierarchical model is fitted to the response and response time data to distinguish normal and aberrant behaviors using a Markov chain Monte Carlo (MCMC) algorithm. In the second stage, a further distinction between rapid guessing and cheating behavior is made at a person level using a Bayesian residual index. Simulation results show that the two-stage method yields accurate item and person parameter estimates, as well as a high true detection rate and a low false detection rate, under different manipulated conditions mimicking NAEP parameters. A real data example is given at the end to illustrate the potential application of the proposed method.

3.
In low-stakes assessments, test performance has few or no consequences for examinees themselves, so that examinees may not be fully engaged when answering the items. Instead of engaging in solution behaviour, disengaged examinees might randomly guess or generate no response at all. When ignored, examinee disengagement poses a severe threat to the validity of results obtained from low-stakes assessments. Statistical modelling approaches in educational measurement have been proposed that account for non-response or for guessing, but do not consider both types of disengaged behaviour simultaneously. We bring together research on modelling examinee engagement and research on missing values and present a hierarchical latent response model for identifying and modelling the processes associated with examinee disengagement jointly with the processes associated with engaged responses. To that end, we employ a mixture model that identifies disengagement at the item-by-examinee level by assuming different data-generating processes underlying item responses and omissions, respectively, as well as response times associated with engaged and disengaged behaviour. By modelling examinee engagement with a latent response framework, the model makes it possible to assess how examinee engagement relates to ability and speed and to identify items that are likely to evoke disengaged test-taking behaviour. An illustration of the model by means of an application to real data is presented.

4.
There is a consensus that visual working memory (WM) resources are sharply limited, but debate persists regarding the simple question of whether there is a limit to the total number of items that can be stored concurrently. Zhang and Luck (2008) advanced this debate with an analytic procedure that provided strong evidence for random guessing responses, but their findings can also be described by models that deny guessing while asserting a high prevalence of low precision memories. Here, we used a whole report memory procedure in which subjects reported all items in each trial and indicated whether they were guessing with each response. Critically, this procedure allowed us to measure memory performance for all items in each trial. When subjects were asked to remember 6 items, the response error distributions for about 3 out of the 6 items were best fit by a parameter-free guessing model (i.e. a uniform distribution). In addition, subjects’ self-reports of guessing precisely tracked the guessing rate estimated with a mixture model. Control experiments determined that guessing behavior was not due to output interference, and that there was still a high prevalence of guessing when subjects were instructed not to guess. Our novel approach yielded evidence that guesses, not low-precision representations, best explain limitations in working memory. These guesses also corroborate a capacity-limited working memory system – we found evidence that subjects are able to report non-zero information for only 3–4 items. Thus, WM capacity is constrained by an item limit that precludes the storage of more than 3–4 individuated feature values.
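The kind of mixture model referred to above can be sketched as a weighted sum of a von Mises distribution (noisy memory) and a uniform distribution (guessing) over the circular response error. This is an illustrative sketch only: the function names and the parameters `guess_rate` and `kappa` are assumptions, and the exact parameterisation used in the cited studies may differ.

```python
import math

def bessel_i0(kappa, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (kappa / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def error_density(err, guess_rate, kappa):
    """Density of a response error (radians, in [-pi, pi)) under the mixture.

    guess_rate is the probability that a response is a pure guess (uniform);
    kappa is the concentration of the von Mises memory component.
    """
    von_mises = math.exp(kappa * math.cos(err)) / (2.0 * math.pi * bessel_i0(kappa))
    uniform = 1.0 / (2.0 * math.pi)
    return (1.0 - guess_rate) * von_mises + guess_rate * uniform

# A guess rate of 0.5 would correspond to guessing on about 3 of 6 items.
density_at_zero = error_density(0.0, guess_rate=0.5, kappa=8.0)
density_at_pi = error_density(math.pi, guess_rate=0.5, kappa=8.0)
```

Far from zero error the memory component contributes almost nothing, so the density flattens to roughly `guess_rate / (2 * pi)`; that flat tail is what lets a fitted guess rate be separated from memory precision.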

5.
The diffusion model (Ratcliff, 1978) and the leaky competing accumulator model (LCA, Usher & McClelland, 2001) were tested against two-choice data collected from the same subjects with the standard response time procedure and the response signal procedure. In the response signal procedure, a stimulus is presented and then, at one of a number of experimenter-determined times, a signal to respond is presented. The models were fit to the data from the two procedures simultaneously under the assumption that responses in the response signal procedure were based on a mixture of decision processes that had already terminated at response boundaries before the signal and decision processes that had not yet terminated. In the latter case, decisions were based on partial information in one variant of each model or on guessing in a second variant. Both variants of the diffusion model fit the data well and both fit better than either variant of the LCA model, although the differences in numerical goodness-of-fit measures were not large enough to allow decisive selection between the models.

6.
An item response theory model for dealing with test speededness is proposed. The model consists of two random processes, a problem solving process and a random guessing process, with the random guessing gradually taking over from the problem solving process. The involved change point and change rate are considered random parameters in order to model examinee differences in both respects. The proposed model is evaluated on simulated data and in a case study. The research reported in this paper was supported by IAP P5/24 and GOA/2005/04, both awarded to Paul De Boeck and Iven Van Mechelen, and by IAP P6/03, awarded to Iven Van Mechelen. Yuri Goegebeur’s research was supported by a grant of the Danish Natural Science Research Council.

7.
Both the speed and accuracy of responding are important measures of performance. A well-known interpretive difficulty is that participants may differ in their strategy, trading speed for accuracy, with no change in underlying competence. Another difficulty arises when participants respond slowly and inaccurately (rather than quickly but inaccurately), e.g., due to a lapse of attention. We introduce an approach that combines response time and accuracy information and addresses both situations. The modeling framework assumes two latent competing processes. The first, the error-free process, always produces correct responses. The second, the guessing process, results in all observed errors and some of the correct responses (but does so via non-specific processes, e.g., guessing in compliance with instructions to respond on each trial). Inferential summaries of the speed of the error-free process provide a principled assessment of cognitive performance reducing the influences of both fast and slow guesses. Likelihood analysis is discussed for the basic model and extensions. The approach is applied to a data set on response times in a working memory test. The authors wish to thank Roger Ratcliff, Christopher Chabris, and three anonymous referees for their helpful comments, and Aureliu Lavric for providing the data analyzed in this paper.

8.
By considering information about response time (RT) in addition to response accuracy (RA), joint models for RA and RT such as the hierarchical model (van der Linden, 2007) can improve the precision with which ability is estimated over models that only consider RA. The hierarchical model, however, assumes that only the person's speed is informative of ability. This assumption of conditional independence between RT and ability given speed may be violated in practice, and ignores collateral information about ability that may be present in the residual RTs. We propose a posterior predictive check for evaluating the assumption of conditional independence between RT and ability given speed. Furthermore, we propose an extension of the hierarchical model that contains cross-loadings between ability and RT, which enables one to take additional collateral information about ability into account beyond what is possible in the standard hierarchical model. A Bayesian estimation procedure is proposed for the model. Using simulation studies, the performance of the model is evaluated in terms of parameter recovery, and the possible gain in precision over the standard hierarchical model and an RA-only model is considered. The model is applied to data from a high-stakes educational test.

9.
Van der Linden's (2007, Psychometrika, 72, 287) hierarchical model for responses and response times in tests has numerous applications in psychological assessment. The success of these applications requires the parameters of the model to have been estimated without bias. The data used for model fitting, however, are often contaminated, for example, by rapid guesses or lapses of attention. This distorts the parameter estimates. In the present paper, a novel estimation approach is proposed that is robust against contamination. The approach consists of two steps. In the first step, the response time model is fitted on the basis of a robust estimate of the covariance matrix. In the second step, the item response model is extended to a mixture model, which allows for a proportion of irregular responses in the data. The parameters of the mixture model are then estimated with a modified marginal maximum likelihood estimator. The modified marginal maximum likelihood estimator downweights responses of test-takers with unusual response time patterns. As a result, the estimator is resistant to several forms of data contamination. The robustness of the approach is investigated in a simulation study. An application of the estimator is demonstrated with real data.

10.
This report compares three feature list sets for capital letters, previously proposed by different investigators, on the ability of each to predict empirical confusion matrices. Toward this end, several variants of assumed information processes in recognition were also compared. The best model incorporated: (1) variable feature retrieval probabilities, (2) a goodness-of-match lower threshold below which guessing determines response, and (3) response bias on guessing trials. This model, when combined with one particular proposed feature list set, produced stress values of less than 9% in comparisons to empirical matrices for each of three different Ss. The feature retrieval probability vectors associated with these minimum-stress predictions were highly correlated (\(\bar r = .83\)), suggesting considerable generality of process and feature sets between Ss.

11.
Bayesian models of cognition assume that prior knowledge about the world influences judgments. Recent approaches have suggested that the loss of fidelity from working to long-term (LT) memory is simply due to an increased rate of guessing (e.g. Brady, Konkle, Gill, Oliva, & Alvarez, 2013). That is, recall is the result of either remembering (with some noise) or guessing. This stands in contrast to Bayesian models of cognition, which assume that prior knowledge about the world influences judgments, and that recall is a combination of expectations learned from the environment and noisy memory representations. Here, we evaluate the time course of fidelity in LT episodic memory, and the relative contribution of prior category knowledge and guessing, using a continuous recall paradigm. At an aggregate level, performance reflects a high rate of guessing. However, when aggregate data is partitioned by lag (i.e., the number of presentations from study to test), or is un-aggregated, performance appears to be more complex than just remembering with some noise and guessing. We implemented three models (the standard remember-guess model, a three-component remember-guess model, and a Bayesian mixture model) and evaluated these models against the data. The results emphasize the importance of taking into account the influence of prior category knowledge on memory.

12.
The speed-accuracy decomposition technique was developed by Meyer, Irwin, Osman, and Kounios (1988) to examine the time course of information processing. The technique allows for the estimation of the accuracy of guesses that are induced by the presentation of a response signal on a proportion of trials. Estimated guessing accuracy has been found to be above chance and to increase as time of guessing increases, suggesting that guesses are based on partial information that has accumulated prior to a response decision (sophisticated guesses). In this paper, a different interpretation of these data is presented. Results suggest that response signals may enhance the speed of regular processes, thereby violating the temporal-independence assumption that underlies the decomposition technique. As shown by Monte Carlo simulations, such facilitating effects of response signals can explain the results from the decomposition technique at least in part and possibly in full, even when guesses are actually at chance accuracy (pure guesses). The pure-guess model was supported by the results from an experiment designed to test between the alternative interpretations. These results point to the need for great caution in the attempt to infer the time course of information processing from guessing accuracies as estimated by the speed-accuracy decomposition technique.

14.
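Abstract: The recognition heuristic uses recognition cues to make decisions. Previous studies have used the adherence rate, hit rate, false alarm rate, and discrimination index to represent use of the recognition heuristic, but all of these methods have limitations. Multinomial processing tree models can separate different cognitive processes. To resolve the confounding of recognition use with knowledge use, researchers have proposed a multinomial processing tree model, the r-model, to measure use of the recognition heuristic. This paper focuses on the r-model, covering its content, data analysis, and a hierarchical r-model that accounts for individual differences. Finally, directions for future research are proposed with respect to model modification and the boundary conditions of the r-model. Keywords: recognition heuristic; fluency heuristic; multinomial processing tree; Bayesian hierarchical model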

15.
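Wang Wenyi (汪文义), Song Lihong (宋丽红), Ding Shuliang (丁树良). Acta Psychologica Sinica (《心理学报》), 2016, 48(12): 1612-1624
This paper introduces classification accuracy and classification consistency indices under multidimensional item response theory models, uses the Monte Carlo method to compute the indices under complex decision rules, and proves mathematically that the two types of estimators of the classification accuracy index converge in probability to the same true value under a uniform prior and the same decision rule. The results show that: the classification accuracy index evaluates the accuracy of classification results fairly accurately; the classification consistency index evaluates the test-retest consistency of classification results well; under certain conditions, indices based on the ability scale outperform indices based on the raw total score; estimation precision remains good even as test dimensionality increases; and classification accuracy and classification consistency increase with test length and with the correlation between dimensions. The indices can be used to evaluate classification reliability and validity under multiple decision rules for criterion-referenced tests or computerized classification tests.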

16.
Bayesian IRT Guessing Models for Partial Guessing Behaviors   Total citations: 1, self-citations: 0, citations by others: 1
According to the recent Nation’s Report Card, 12th-graders failed to produce gains on the 2005 National Assessment of Educational Progress (NAEP) despite earning better grades on average. One possible explanation is that 12th-graders were not motivated when taking the NAEP, which is a low-stakes test. We develop three Bayesian IRT mixture models to describe the results from a group of examinees including both nonguessers and partial guessers. The first assumes that the guesser answers questions based on his or her knowledge up to a certain test item, and guesses thereafter. The second model assumes that the guesser answers relatively easy questions based on his or her knowledge and guesses randomly on the remaining items. The third is constructed to describe more general low-motivation behavior. It assumes that the guesser gives less and less effort as he or she proceeds through the test. The models can provide not only consistent estimates of IRT parameters but also estimates of each examinee’s nonguesser/guesser status and degree of guessing behavior. We show results of a simulation study comparing the performance of the three guessing models to the 2PL-IRT model. Finally, an analysis of real data from a low-stakes test administered to university students is presented.
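The first of the three models described above (knowledge-based answers up to a certain item, guessing afterwards) can be sketched as a switch between a 2PL response probability and a chance-level probability. This is an illustrative sketch under stated assumptions: the function names, the `change_point` parameter, and the choice of `1 / n_options` as the guessing probability are hypothetical and are not taken from the paper.

```python
import math

def p_correct_2pl(theta, a, b):
    """2PL probability of a correct response for ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def p_correct_threshold_guesser(theta, a, b, item_index, change_point, n_options=4):
    """Probability of a correct response for a partial guesser.

    Items before change_point (0-based positions) are answered from knowledge
    via the 2PL; items at or after change_point are assumed to be blind guesses
    among n_options alternatives.
    """
    if item_index < change_point:
        return p_correct_2pl(theta, a, b)
    return 1.0 / n_options

# Hypothetical examinee who starts guessing at the 11th item (index 10).
early = p_correct_threshold_guesser(1.0, 1.2, 0.0, item_index=3, change_point=10)
late = p_correct_threshold_guesser(1.0, 1.2, 0.0, item_index=12, change_point=10)
```

A nonguesser corresponds to a `change_point` beyond the last item, in which case the function reduces to the plain 2PL for every item.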

17.
Most models of response time (RT) in elementary cognitive tasks implicitly assume that the speed-accuracy trade-off is continuous: When payoffs or instructions gradually increase the level of speed stress, people are assumed to gradually sacrifice response accuracy in exchange for gradual increases in response speed. This trade-off presumably operates over the entire range from accurate but slow responding to fast but chance-level responding (i.e., guessing). In this article, we challenge the assumption of continuity and propose a phase transition model for RTs and accuracy. Analogous to the fast guess model (Ollman, 1966), our model postulates two modes of processing: a guess mode and a stimulus-controlled mode. From catastrophe theory, we derive two important predictions that allow us to test our model against the fast guess model and against the popular class of sequential sampling models. The first prediction--hysteresis in the transitions between guessing and stimulus-controlled behavior--was confirmed in an experiment that gradually changed the reward for speed versus accuracy. The second prediction--bimodal RT distributions--was confirmed in an experiment that required participants to respond in a way that is intermediate between guessing and accurate responding.

18.
In item response theory, modelling the item response times in addition to the item responses may improve the detection of possible between- and within-subject differences in the process that resulted in the responses. For instance, if respondents rely on rapid guessing on some items but not on all, the joint distribution of the responses and response times will be a multivariate within-subject mixture distribution. Suitable parametric methods to detect these within-subject differences have been proposed. In these approaches, a distribution needs to be assumed for the within-class response times. In this paper, it is demonstrated that these parametric within-subject approaches may produce false positives and biased parameter estimates if the assumption concerning the response time distribution is violated. A semi-parametric approach is proposed which resorts to categorized response times. This approach is shown to hardly produce false positives and parameter bias. In addition, the semi-parametric approach results in approximately the same power as the parametric approach.

19.
J. O. Ramsay. Psychometrika, 1989, 54(3): 487-499
In very simple test theory models such as the Rasch model, a single parameter is used to represent the ability of any examinee or the difficulty of any item. Simple models such as these provide very important points of departure for more detailed modeling when a substantial amount of data are available, and are themselves of real practical value for small or even medium samples. They can also serve a normative role in test design. As an alternative to the Rasch model, or the Rasch model with a correction for guessing, a simple model is introduced which characterizes strength of response in terms of the ratio of ability and difficulty parameters rather than their difference. This model provides a natural account of guessing, and has other useful things to contribute as well. It also offers an alternative to the Rasch model with the usual correction for guessing. The three models are compared in terms of statistical properties and fits to actual data. The goal of the paper is to widen the range of minimal models available to test analysts. This research was supported by grant AP320 from the Natural Sciences and Engineering Research Council of Canada. The author is grateful for discussions with M. Abrahamowicz, I. Molenaar, D. Thissen, and H. Wainer.

20.
Response time modelling is developing rapidly in the field of psychometrics, and its use is growing in psychology. In most applications, component models for response times are modelled jointly with component models for responses, thereby stabilizing estimation of item response theory model parameters and enabling research on a variety of novel substantive research questions. Bayesian estimation techniques facilitate estimation of response time models. Implementations of these models in standard statistical software, however, are still sparse. In this accessible tutorial, we discuss one of the most common response time models, the lognormal response time model, embedded in the hierarchical framework by van der Linden (2007). We provide detailed guidance on how to specify and estimate this model in a Bayesian hierarchical context. One of the strengths of the presented model is its flexibility, which makes it possible to adapt and extend the model according to researchers' needs and hypotheses on response behaviour. We illustrate this based on three recent model extensions: (a) application to non-cognitive data incorporating the distance-difficulty hypothesis, (b) modelling conditional dependencies between response times and responses, and (c) identifying differences in response behaviour via mixture modelling. This tutorial aims to provide a better understanding of the use and utility of response time models, showcases how these models can easily be adapted and extended, and contributes to a growing need for these models to answer novel substantive research questions in both non-cognitive and cognitive contexts.
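As a rough illustration of the lognormal response time model mentioned above, the sketch below simulates response times under the commonly used parameterisation in which the log response time of person i on item j is normal with mean beta_j - tau_i and standard deviation 1 / alpha_j (tau: person speed, beta: item time intensity, alpha: item time discrimination). The function name and all numeric values are hypothetical, and the sketch only simulates data; it does not perform the Bayesian estimation the tutorial covers.

```python
import math
import random
import statistics

def simulate_log_rt(tau, beta, alpha, rng):
    """Draw one log response time with mean beta - tau and sd 1 / alpha."""
    return rng.gauss(beta - tau, 1.0 / alpha)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two hypothetical persons answering the same item (beta = 4.0, alpha = 2.0).
fast_person = [math.exp(simulate_log_rt(0.5, 4.0, 2.0, rng)) for _ in range(5000)]
slow_person = [math.exp(simulate_log_rt(-0.5, 4.0, 2.0, rng)) for _ in range(5000)]

median_fast = statistics.median(fast_person)
median_slow = statistics.median(slow_person)
```

Because the median of a lognormal variable is the exponential of its log-scale mean, the simulated medians should sit near exp(3.5) and exp(4.5) seconds: a higher speed parameter shifts the whole response time distribution downward by a constant factor.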
