In real testing, examinees may manifest different types of test‐taking behaviours. In this paper we focus on two types that appear to be among the more frequently occurring behaviours – solution behaviour and rapid guessing behaviour. Rapid guessing usually happens in high‐stakes tests when there is insufficient time, and in low‐stakes tests when there is lack of effort. These two qualitatively different test‐taking behaviours, if ignored, will lead to violation of the local independence assumption and, as a result, yield biased item/person parameter estimation. We propose a mixture hierarchical model to account for differences among item responses and response time patterns arising from these two behaviours. The model is also able to identify the specific behaviour an examinee engages in when answering an item. A Monte Carlo expectation maximization algorithm is proposed for model calibration. A simulation study shows that the new model yields more accurate item and person parameter estimates than a non‐mixture model when the data indeed come from two types of behaviour. The model also fits real, high‐stakes test data better than a non‐mixture model, and therefore the new model can better identify the underlying test‐taking behaviour an examinee engages in on a certain item.

In item response theory, modelling the item response times in addition to the item responses may improve the detection of possible between- and within-subject differences in the process that resulted in the responses. For instance, if respondents rely on rapid guessing on some items but not on all, the joint distribution of the responses and response times will be a multivariate within-subject mixture distribution. Suitable parametric methods to detect these within-subject differences have been proposed. In these approaches, a distribution needs to be assumed for the within-class response times. In this paper, it is demonstrated that these parametric within-subject approaches may produce false positives and biased parameter estimates if the assumption concerning the response time distribution is violated. A semi-parametric approach is proposed which resorts to categorized response times. This approach is shown to produce hardly any false positives or parameter bias. In addition, the semi-parametric approach results in approximately the same power as the parametric approach.

We show how the hierarchical model for responses and response times as developed by van der Linden (2007), Fox, Klein Entink, and van der Linden (2007), Klein Entink, Fox, and van der Linden (2009), and Glas and van der Linden (2010) can be simplified to a generalized linear factor model with only the mild restriction that there is no hierarchical model on the item side. This result is valuable as it enables the use of all well‐developed modelling tools and extensions that come with these methods. We show that the restriction we impose on the hierarchical model does not influence parameter recovery under realistic circumstances. In addition, we present two illustrative real data analyses to demonstrate the practical benefits of our approach.

Given a drift diffusion model with unknown drift and boundary parameters, we analyse the behaviour of maximum likelihood estimates with respect to changes of responses and response times. It is shown analytically that a single fast response time can dominate the estimation in that no matter how many correct answers a test taker provides, the estimate of the drift (ability) parameter decreases to zero. In addition, it is shown that although higher drift rates imply shorter response times, the reverse implication does not hold for the estimates: shorter response times can decrease the drift rate estimate. In the light of these analytical results, we illustrate the actual impact of the findings in a small simulation for a mental rotation test. The method of analysis outlined is applicable to a broader range of models, and we emphasize the need to further check currently used reaction time models within this framework.

We give closed form expressions for the mean and variance of RTs for Ratcliff’s diffusion model [Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59-108] under the simplifying assumption that there is no variability across trials in the parameters. These expressions are more general than those currently available. As an application, we demonstrate their use in a method-of-moments estimation procedure that addresses some of the weaknesses of the EZ method [Wagenmakers, E.-J., van der Maas, H. L. J., & Grasman, R. P. P. P. (2007). An EZ-diffusion model for response time and accuracy. Psychonomic Bulletin & Review, 14, 3-22], and illustrate this with lexical decision data. We discuss further possible applications.
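
As background for the method-of-moments idea, the following is a minimal sketch of the EZ method that the abstract cites as its point of comparison. It is not the more general procedure the paper proposes. The function name `ez_diffusion`, its argument names, and the default scaling parameter `s = 0.1` are illustrative assumptions; the formulas are the standard EZ-diffusion equations, which map the proportion correct and the mean and variance of response times onto drift rate, boundary separation, and non-decision time.

```python
import math


def ez_diffusion(prop_correct, rt_variance, rt_mean, s=0.1):
    """Return (drift, boundary, non_decision_time) from three summary statistics.

    Sketch of the EZ-diffusion equations; names and the default s are assumptions.
    Response times are expected in seconds.
    """
    if prop_correct in (0.0, 0.5, 1.0):
        # The closed-form solution is undefined here; an edge correction is needed.
        raise ValueError("prop_correct of 0, 0.5 or 1 requires an edge correction")
    logit = math.log(prop_correct / (1.0 - prop_correct))
    x = logit * (logit * prop_correct**2 - logit * prop_correct
                 + prop_correct - 0.5) / rt_variance
    # Drift takes the sign of (prop_correct - 0.5); x is positive on both sides.
    drift = math.copysign(1.0, prop_correct - 0.5) * s * x**0.25
    boundary = s**2 * logit / drift
    y = -drift * boundary / s**2
    mean_decision_time = (boundary / (2.0 * drift)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    return drift, boundary, rt_mean - mean_decision_time


if __name__ == "__main__":
    v, a, ter = ez_diffusion(prop_correct=0.802, rt_variance=0.112, rt_mean=0.723)
    print(f"drift={v:.4f} boundary={a:.4f} non-decision time={ter:.4f}")
```

Because the estimates depend only on three sample moments, they are cheap to compute but inherit any distortion in those moments, which is one reason a more general moment-based procedure can be attractive.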

Van der Linden's (2007, Psychometrika, 72, 287) hierarchical model for responses and response times in tests has numerous applications in psychological assessment. The success of these applications requires the parameters of the model to have been estimated without bias. The data used for model fitting, however, are often contaminated, for example, by rapid guesses or lapses of attention. This distorts the parameter estimates. In the present paper, a novel estimation approach is proposed that is robust against contamination. The approach consists of two steps. In the first step, the response time model is fitted on the basis of a robust estimate of the covariance matrix. In the second step, the item response model is extended to a mixture model, which allows for a proportion of irregular responses in the data. The parameters of the mixture model are then estimated with a modified marginal maximum likelihood estimator. The modified marginal maximum likelihood estimator downweights responses of test-takers with unusual response time patterns. As a result, the estimator is resistant to several forms of data contamination. The robustness of the approach is investigated in a simulation study. An application of the estimator is demonstrated with real data.
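
For orientation, the response-time part of van der Linden's hierarchical model is commonly written as a lognormal model. The notation below is the usual one and is an assumption here, since the abstract does not spell it out: $T_{ij}$ is the response time of test-taker $j$ on item $i$, $\beta_i$ the time intensity of the item, $\alpha_i$ its time discrimination, and $\tau_j$ the speed of the test-taker, which is modelled jointly with ability $\theta_j$ at the second level.

```latex
\ln T_{ij} = \beta_i - \tau_j + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N\!\left(0,\ \alpha_i^{-2}\right),
\qquad (\theta_j, \tau_j) \sim N_2(\boldsymbol{\mu}_P, \boldsymbol{\Sigma}_P)
```

Under this formulation the log response times of a test-taker are jointly normal, which is why an estimate of their covariance matrix is a natural starting point for fitting the response time model.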

The stochastic model for the evolution of preferences proposed by Falmagne, Regenwetter, and Grofman [1997. Journal of Mathematical Psychology, 41, 129-143] and tested by Regenwetter, Falmagne, and Grofman [1999. Psychological Review, 106, 362-384], as well as the alternative Thurstonian model of Böckenholt [Falmagne, J.-C., Regenwetter, M., & Grofman, B. (1997). A stochastic model for the evolution of preferences. In A. A. J. Marley (Ed.), Choice, decision and measurement: Essays in honor of R. Duncan Luce (pp. 113-131). Mahwah, NJ: Lawrence Erlbaum.], gave a good statistical account of attitudinal panel data from the 1992 US presidential election. We show, however, that both models have the defect of underestimating the number of respondents who did not change their order of preference for the candidates across different polls. We present a generalization of Falmagne et al.'s model based on the idea that some individuals may become momentarily impervious to all matters related to the campaign and ‘tune out.’ This behavior could be triggered by some personal reason or by some external event related to the campaign. Like the original model, the resulting model is a random walk, but on an augmented set of states. A respondent in a ‘live’ state behaves as in the previous model, except when receiving a ‘tune-out’ token, which effectively freezes the respondent's preference state until it is reversed by a ‘tune-in’ token. We describe and successfully test the new model on the same 1992 National Election Study panel data as those used by Böckenholt (2002) and Regenwetter et al. (1999).

Latent trait models for responses and response times in tests often lack a substantial interpretation in terms of a cognitive process model. This is a drawback because process models are helpful in clarifying the meaning of the latent traits. In the present paper, a new model for responses and response times in tests is presented. The model is based on the proportional hazards model for competing risks. Two processes are assumed, one reflecting the increase in knowledge and the second the tendency to discontinue. The processes can be characterized by two proportional hazards models whose baseline hazard functions correspond to the temporary increase in knowledge and discouragement. The model can be calibrated with marginal maximum likelihood estimation and an application of the ECM algorithm. Two tests of model fit are proposed. The amenability of the proposed approaches to model calibration and model evaluation is demonstrated in a simulation study. Finally, the model is used for the analysis of two empirical data sets.

Psychologists take two propositions for granted. Specifically, empirical verification of predictions derived from a theory is taken to show (a) that the theory is more likely to be true and (b) that additional predictions derived from the theory have an increased probability of being sustained if subjected to empirical testing. In contrast, I argue that both propositions depend strongly on whether auxiliary assumptions are taken into account. When auxiliary assumptions are not taken into account, the first proposition is valid but the second is not. When auxiliary assumptions are taken into account, the first proposition is not valid, and the second proposition encounters additional problems. I use Venn diagrams and Bayesian principles to demonstrate these conclusions.

An important distinction between different models for response time and accuracy is whether conditional independence (CI) between response time and accuracy is assumed. In the present study, a test for CI given an exponential family model for accuracy (for example, the Rasch model or the one‐parameter logistic model) is proposed and evaluated in a simulation study. The procedure is based on the non‐parametric Kolmogorov–Smirnov tests. As an illustrative example, the CI test was applied to data from an arithmetic test for secondary education.

The item response times (RTs) collected from computerized testing represent an underutilized source of information about items and examinees. In addition to knowing the examinees’ responses to each item, we can investigate the amount of time examinees spend on each item. In this paper, we propose a semi‐parametric model for RTs, the linear transformation model with a latent speed covariate, which combines the flexibility of non‐parametric modelling and the brevity as well as interpretability of parametric modelling. In this new model, the RTs, after some non‐parametric monotone transformation, become a linear model with latent speed as covariate plus an error term. The distribution of the error term implicitly defines the relationship between the RT and examinees’ latent speeds; whereas the non‐parametric transformation is able to describe various shapes of RT distributions. The linear transformation model represents a rich family of models that includes the Cox proportional hazards model, the Box–Cox normal model, and many other models as special cases. This new model is embedded in a hierarchical framework so that both RTs and responses are modelled simultaneously. A two‐stage estimation method is proposed. In the first stage, the Markov chain Monte Carlo method is employed to estimate the parametric part of the model. In the second stage, an estimating equation method with a recursive algorithm is adopted to estimate the non‐parametric transformation. Applicability of the new model is demonstrated with a simulation study and a real data application. Finally, methods to evaluate the model fit are suggested.

In tailored testing, it is important to determine the optimal difficulty of the next item to present to the examinee. This paper shows that the difference between item difficulty b and examinee ability that maximizes information for the three-parameter normal ogive response model is approximately 1.7 times the optimal difference for the three-parameter logistic model. Under the normal model, calculation of the optimal difficulty for minimizing the Bayes risk is equivalent to maximizing an associated information function. The views expressed herein are those of the author and do not necessarily reflect those of the Department of the Navy.
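
The factor of about 1.7 is presumably related to the familiar scaling constant D ≈ 1.702 that aligns the logistic curve with the normal ogive. The sketch below does not reproduce the paper's result on optimal item difficulty; it only illustrates how closely the two response functions agree once the logistic argument is rescaled. The function names, the grid, and the constant's exact value are illustrative assumptions.

```python
import math

D = 1.702  # conventional scaling constant (assumed value)


def normal_ogive(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic(z):
    """Standard logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def max_abs_gap(scale, lo=-6.0, hi=6.0, steps=12001):
    """Largest |normal_ogive(z) - logistic(scale * z)| on an evenly spaced grid."""
    width = (hi - lo) / (steps - 1)
    return max(
        abs(normal_ogive(lo + k * width) - logistic(scale * (lo + k * width)))
        for k in range(steps)
    )


if __name__ == "__main__":
    print("gap with scaling   :", max_abs_gap(D))
    print("gap without scaling:", max_abs_gap(1.0))
```

With the rescaling, the two curves differ by less than 0.01 everywhere on the grid, whereas the unscaled logistic differs from the normal ogive by more than 0.1 at its worst point.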

This article explores the consequences for factorial additivity in a Sternberg [(1969). The discovery of processing stages: Extensions of Donders' method. In: W.G. Koster (Ed.), Attention and performance II, Acta Psychologica, 30, 276-315] additive-factors paradigm of the assumptions adopted by models of perception that relate the representation of a stimulus to decision time. Three example models, signal detection theory with the latency-distance hypothesis, stochastic general recognition theory, and a random walk model of exemplar classification, are interrogated to determine what type of interaction they predict factors will yield in a hypothetical factorial (choice) reaction time experiment in which the ‘empirical’ factors’ effects are manifest as parameter changes. All frameworks make the critical assumption that decision time depends on the perceptual representation of the stimulus as well as the architecture. As a consequence, nonadditivity of factors thought to affect different “stages” in the classical approach emerges within the current modeling approach. The nature of this influence is revealed through analytic investigations and simulation. Earlier empirical findings of failures of selective influence that have defied adequate explanation are reinterpreted in light of the present findings.

In previous works, in which the topological model has been applied to martensitic phase transformations, the value of twist angle ω was determined based on the habit plane-(HP) matching method, where the physical realization of the so-predicted interfacial defect networks may require reorientations of defect line directions by short-range diffusion, though no long-range diffusion was needed. In the present work, a novel criterion for determining the optimum value of twist is proposed so that the predicted interface defects are not only able to fulfil the function of fully accommodating the coherency strains arising on the terrace plane, but also capable of reaching the required position at the HP without long- or short-range diffusions. A numerical analysis for an Fe–20Ni–5Mn alloy is demonstrated based on the newly proposed criterion, and the predictions so obtained are in good agreement with the results provided by the phenomenological theory and experimental measurements.

Three methods for fitting the diffusion model (Ratcliff, 1978) to experimental data are examined. Sets of simulated data were generated with known parameter values, and from fits of the model, we found that the maximum likelihood method was better than the chi-square and weighted least squares methods by criteria of bias in the parameters relative to the parameter values used to generate the data and standard deviations in the parameter estimates. The standard deviations in the parameter values can be used as measures of the variability in parameter estimates from fits to experimental data. We introduced contaminant reaction times and variability into the other components of processing besides the decision process and found that the maximum likelihood and chi-square methods failed, sometimes dramatically. But the weighted least squares method was robust to these two factors. We then present results from modifications of the maximum likelihood and chi-square methods, in which these factors are explicitly modeled, and show that the parameter values of the diffusion model are recovered well. We argue that explicit modeling is an important method for addressing contaminants and variability in nondecision processes and that it can be applied in any theoretical approach to modeling reaction time.

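WANG Wenyi, SONG Lihong, DING Shuliang. Acta Psychologica Sinica (《心理学报》), 2016, 48(12): 1612-1624
This paper introduces classification accuracy and classification consistency indices under multidimensional item response theory models, uses Monte Carlo methods to compute the indices under complex decision rules, and proves mathematically that the two types of estimators of the classification accuracy index converge in probability to the same true value under a uniform prior and the same decision rule. The results show that the classification accuracy index evaluates the accuracy of classification results fairly precisely; the classification consistency index evaluates the test-retest consistency of classification results well; under certain conditions, indices based on the ability scale outperform indices based on the raw total score; estimation precision remains fairly good even as the number of test dimensions increases; and classification accuracy and classification consistency increase with test length and with the correlation between dimensions. The indices can be used to evaluate classification reliability and validity under a variety of decision rules in criterion-referenced tests or computerized classification tests.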

This paper concerns items that consist of several item steps to be responded to sequentially. The item score $X$ is defined as the number of correct responses until the first failure. Samejima's graded response model states that each step $h = 1, \ldots, m$ is characterized by a parameter $b_h$, and, for a subject with ability $\theta$, $\Pr(X \ge h; \theta) = F(\theta - b_h)$. Tutz's general sequential model associates with each step a parameter $d_h$, and it states that $\Pr(X \ge h; \theta) = \prod_{r=1}^{h} G(\theta - d_r)$. Tutz (1991, 1997) conjectures that the models are equivalent if and only if $F(x) = G(x)$ is an extreme value distribution. This paper presents a proof for this conjecture.
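
The "if" direction of the conjecture can be checked numerically. The sketch below assumes that both models take arguments of the form θ − b_h and θ − d_r, and that the extreme value distribution is the Gumbel form G(x) = exp(−exp(−x)); which parameterization the paper uses is not stated in the abstract, so this choice and all function names are illustrative assumptions. Under that choice, the product of step probabilities collapses to a single Gumbel term with threshold b_h = log Σ_{r≤h} exp(d_r).

```python
import math


def gumbel_cdf(x):
    """Extreme value (Gumbel) distribution function, assumed form exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def sequential_prob(theta, step_params, h):
    """Pr(X >= h) in the sequential model: product of G(theta - d_r) for r = 1..h."""
    prob = 1.0
    for d in step_params[:h]:
        prob *= gumbel_cdf(theta - d)
    return prob


def implied_graded_thresholds(step_params):
    """Graded-model thresholds b_h = log(sum of exp(d_r) for r <= h)."""
    thresholds, running_sum = [], 0.0
    for d in step_params:
        running_sum += math.exp(d)
        thresholds.append(math.log(running_sum))
    return thresholds


def graded_prob(theta, thresholds, h):
    """Pr(X >= h) in the graded response model: F(theta - b_h) with F = G."""
    return gumbel_cdf(theta - thresholds[h - 1])


if __name__ == "__main__":
    d = [-0.5, 0.3, 1.1]  # hypothetical step parameters
    b = implied_graded_thresholds(d)
    for theta in (-2.0, 0.0, 1.5):
        for h in (1, 2, 3):
            print(theta, h, sequential_prob(theta, d, h), graded_prob(theta, b, h))
```

The two columns of probabilities agree up to floating-point error for every ability level and step, and the implied thresholds are increasing in h, as the graded response model requires.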

We examined matching bias in syllogistic reasoning by analysing response times, confidence ratings, and individual differences. Roberts’ (2005) “negations paradigm” was used to generate conflict between the surface features of problems and the logical status of conclusions. The experiment replicated matching bias effects in conclusion evaluation (Stupple & Waterhouse, 2009), revealing increased processing times for matching/logic “conflict problems”. Results paralleled chronometric evidence from the belief bias paradigm indicating that logic/belief conflict problems take longer to process than non-conflict problems (Stupple, Ball, Evans, & Kamal-Smith, 2011). Individuals’ response times for conflict problems also showed patterns of association with the degree of overall normative responding. Acceptance rates, response times, metacognitive confidence judgements, and individual differences all converged in supporting dual-process theory. This is noteworthy because dual-process predictions about heuristic/analytic conflict in syllogistic reasoning generalised from the belief bias paradigm to a situation where matching features of conclusions, rather than beliefs, were set in opposition to logic.

Van Breukelen offers a promising method for modeling both response speed and response accuracy. However, the underlying conception of both dependent measures is somewhat flawed, leading the author to conclude that the approach possesses limitations that, under revised assumptions, may not hold. The central misconception, and a set of related misconceptions, is addressed, and it is suggested that this approach holds a good deal of promise for application in the perceptual and cognitive sciences.
