Similar literature
 20 similar documents found (search time: 15 ms)
1.
I describe how multilevel logistic regression can be used to assess the consistency of an individual's response pattern with an item response theory measurement model. Specifically, by treating item responses as being nested within individuals, multilevel logistic regression is used to estimate a person-response curve that models how an individual's item endorsement rate decreases as a function of item difficulty. The slope of an individual's person-response curve is used as an indicator of the degree of response consistency or person-fit. I argue that the proposed multilevel modeling approach to person-fit assessment has several potential advantages over traditional techniques. The most important advantage is that the multilevel modeling approach allows explanatory variables to be entered into the model so that the causes of response inconsistency or differential test functioning can be investigated.
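As a rough illustration of the person-response curve described above, one common way to write a two-level logistic model is sketched below. The notation is an assumption introduced here, not taken from the abstract: $X_{ij}$ is person $j$'s response to item $i$, $b_i$ is the item's difficulty, and $u_{0j}$, $u_{1j}$ are person-level random effects.

```latex
\operatorname{logit} P(X_{ij} = 1) = \beta_{0j} + \beta_{1j}\, b_i,
\qquad
\beta_{0j} = \gamma_{00} + u_{0j},
\qquad
\beta_{1j} = \gamma_{10} + u_{1j}
```

Under this sketch, $\beta_{1j}$ is the slope of person $j$'s curve. Person-level explanatory variables could be added to the equation for $\beta_{1j}$, which is the kind of extension the abstract points to.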

2.
Ss were asked to name the typeface in which a printed item appeared in a discrete trial variant of the Stroop color-word test. Two kinds of items were used: (a) typeface names appearing in antagonistic typefaces, e.g., the word SCRIPT printed in bold type; and (b) nonsense strings constructed by jumbling the letters in a typeface name, e.g., PSRTCI printed in bold type. Typeface-naming latencies were found to be significantly longer for items of the first kind. Examination of the distributions of individual trial latencies for the two kinds of items indicated that modification of Morton’s (1969) account of the Stroop test is required.

3.
This study assessed the relationships between characteristics of biographical items from the Armed Services Applicant Profile and the items' validity in predicting the retention of enlisted military personnel. Item characteristics were appraised with ratings by expert judges and test takers, word and alternative counts, and response latencies. Item content was also appraised with ratings by expert judges. The more valid items involved overt behavior or experiences, dealt with discrete behavior or experiences, and had heterogeneous content. After controlling for item content, only the latter characteristic was related to validity. Item characteristics and item content interacted in several instances.

4.
This article proposes a factor-analytic model, intended for graded-response or continuous-response personality and attitude items, which includes an additional multiplicative person parameter that models the individual's response mapping process. The model, which is a modification of Spearman's (1904) factor analysis (FA) model, is parameterized as both an FA model and an item response theory (IRT) model and is fully developed to the extent that it can be used in applications. Procedures for (a) calibrating the items and assessing data fit, (b) obtaining individual estimates of both person parameters, (c) determining measurement precision, and (d) assessing differential predictability are proposed and discussed. The potential advantages of the proposal, its practical relevance, and its relations with other approaches are also discussed. Its functioning is assessed with a simulation study and 3 empirical examples in the personality domain.

5.
The response latencies of college students to computer-displayed emotionally arousing or emotionally neutral self-report statements were investigated. Arousing and neutral statements were matched for word and character length to control for reading time. Subjects responded to statements by pressing T (for true), F (for false), or the space bar (for cannot say) keys on the keyboard, and the computer recorded the elapsed time from the onset of each statement's display until the subject's response. Results indicated that response latencies were significantly longer for the arousing items. The findings suggest that the recording of response latencies in computerized personality inventories may help identify item content areas that individuals find emotionally arousing. Other possible clinical and research uses for response latencies in computerized inventories are discussed.

6.
Marginal maximum‐likelihood procedures for parameter estimation and testing the fit of a hierarchical model for speed and accuracy on test items are presented. The model is a composition of two first‐level models for dichotomous responses and response times along with multivariate normal models for their item and person parameters. It is shown how the item parameters can easily be estimated using Fisher's identity. To test the fit of the model, Lagrange multiplier tests of the assumptions of subpopulation invariance of the item parameters (i.e., no differential item functioning), the shape of the response functions, and three different types of conditional independence were derived. Simulation studies were used to show the feasibility of the estimation and testing procedures and to estimate the power and Type I error rate of the latter. In addition, the procedures were applied to an empirical data set from a computerized adaptive test of language comprehension.

7.
The original data of McGurk's classic study of black-white differences on cultural and noncultural tests is re-analyzed at the item level to investigate the role of possible item biases that would cause the noncultural items to be relatively more difficult than the cultural items for blacks than for whites. The evidence indicates that McGurk's results cannot be explained in terms of item biases, but appear to be the result of the noncultural items requiring more sheer reasoning ability than the cultural items, which depend more on acquired information.

8.
Although it is currently popular to model human associative learning using connectionist networks, the mechanism by which their output activations are converted to probabilities of response has received relatively little attention. Several possible models of this decision process are considered here, including a simple ratio rule, a simple difference rule, their exponential versions, and a winner-take-all network. Two categorization experiments that attempt to dissociate these models are reported. Analogues of the experiments were presented to a single-layer, feed-forward, delta-rule network. Only the exponential ratio rule and the winner-take-all architecture, acting on the networks' output activations that corresponded to responses available on test, were capable of fully predicting the mean response results. In addition, unlike the exponential ratio rule, the winner-take-all model has the potential to predict latencies. Further studies will be required to determine whether latencies produced under more stringent conditions conform to the model's predictions.
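To make two of the decision rules named above concrete, here is a minimal Python sketch of a simple ratio rule and an exponential ratio rule. The function names and the scaling parameter `k` are assumptions chosen for illustration, and the exact forms used in the study may differ. The difference rule and the winner-take-all network are not sketched because the abstract does not specify them.

```python
import math


def ratio_rule(activations):
    """Simple ratio rule: each response probability is its activation
    divided by the sum of all activations (assumes positive activations)."""
    total = sum(activations)
    return [a / total for a in activations]


def exponential_ratio_rule(activations, k=1.0):
    """Exponential ratio rule: exponentiate the scaled activations, then
    normalise. The scaling parameter k is a hypothetical free parameter."""
    # Subtracting the maximum keeps exp() numerically stable and does not
    # change the resulting probabilities.
    highest = max(activations)
    weights = [math.exp(k * (a - highest)) for a in activations]
    total = sum(weights)
    return [w / total for w in weights]
```

With a larger `k`, the exponential rule concentrates more probability on the most active output, whereas the simple ratio rule has no such adjustable sharpness.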

9.
Most item response theory (IRT) models for dichotomous responses are based on probit or logit link functions which assume a symmetric relationship between the probability of a correct response and the latent traits of individuals taking a test. This assumption restricts the use of those models to the case in which all items behave symmetrically. On the other hand, asymmetric models proposed in the literature impose that all the items in a test behave asymmetrically. This assumption is inappropriate for the great majority of tests which are, in general, composed of both symmetric and asymmetric items. Furthermore, a straightforward extension of the existing models in the literature would require a prior selection of the items' symmetry/asymmetry status. This paper proposes a Bayesian IRT model that accounts for symmetric and asymmetric items in a flexible but parsimonious way. That is achieved by assigning a finite mixture prior to the skewness parameter, with one of the mixture components being a point mass at zero. This allows for analyses under both model selection and model averaging approaches. Asymmetric item curves are designed through the centred skew normal distribution, which has a particularly appealing parametrization in terms of parameter interpretation and computational efficiency. An efficient Markov chain Monte Carlo algorithm is proposed to perform Bayesian inference and its performance is investigated in some simulated examples. Finally, the proposed methodology is applied to a data set from a large-scale educational exam in Brazil.
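A finite mixture prior with a point mass at zero, as described above, might be written along the following lines. The symbols are assumptions introduced here, not the paper's notation: $\gamma_i$ is the skewness parameter of item $i$, $\delta_0$ is a point mass at zero, and $f_1, \dots, f_K$ are unspecified continuous densities with weights $w_k$.

```latex
p(\gamma_i) = w_0\, \delta_0(\gamma_i) + \sum_{k=1}^{K} w_k\, f_k(\gamma_i),
\qquad
w_0 + \sum_{k=1}^{K} w_k = 1
```

In such a prior, the posterior weight on the $\delta_0$ component can be read as the probability that item $i$ is symmetric, which is what makes both model selection and model averaging possible.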

10.
In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject’s response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.

11.
Subjects studied either an 8- or 16-word list and later recalled the items while a voice key recorded each response latency. The trials were partitioned by recall total in order to examine the means and distributions of both latencies and interresponse times as a function of recall total. Each analysis was consistent with the view that an item’s absolute strength determines whether it is recalled whereas an item’s relative strength determines when it is recalled. In addition, mean latency was effectively proportional to study list length yet independent of recall total. All of the analyses were consistent with the view that the set of study items is sampled according to a relative-strength rule until all items are found and that a sampled item is recovered into consciousness only when its absolute strength exceeds a fixed threshold.

12.
In analyzing responses and response times to personality questionnaire items, models have been proposed which include the so-called “inverted-U effect.” These models predict that response times to personality test items decrease as the latent trait value of a given person gets closer to the attractiveness of an item. Initial studies into these models have focused on dichotomous personality items, and more recently, models for Likert-type scale items have been proposed. In all these models, it is assumed that the inverted-U effect is symmetrical around 0, while, as will be explained in this article, there are substantive and statistical reasons to study this assumption. Therefore, in this article, a general inverted-U model is proposed which accommodates two sources of asymmetry between the response times and the attractiveness of the items. The viability of this model is demonstrated in a simulation study, and the model is applied to the responses and response times of the Temperament and Character Inventory–Revised, covering a broad range of personality dimensions.

13.
Two studies were conducted to examine the effects of job familiarity and impression management on response latencies and scale scores for measures of personality and situational judgment. In a laboratory study using university students and a field study using U.S. Border Patrol Agent applicants, impression management was generally associated with faster personality item responses when job familiarity was high and with slower responses when job familiarity was low. Both impression management and job familiarity were associated with personality item responses that were more likely to lead to a job offer. The field study revealed a similar pattern of results for situational judgment scale response latencies, although only impression management was associated with item responses that were more likely to lead to a job offer. The implications for using response latencies to detect impression management on self-report measures are discussed.

14.
Tustin (1994) recently observed that an individual's preference for one of two concurrently available reinforcers under low schedule requirements (concurrent fixed-ratio [FR] 1) switched to the other reinforcer when the schedule requirements were high (concurrent FR 10). We extended this line of research by examining preference for similar and dissimilar reinforcers (i.e., those affecting the same sensory modality and those affecting different sensory modalities). Two individuals with developmental disabilities were exposed to an arrangement in which pressing two different panels produced two different reinforcers according to progressively increasing, concurrent-ratio schedules. When two dissimilar stimuli were concurrently available (food and a leisure item), no clear preference for one item over the other was observed, regardless of the FR schedules in effect (FR 1, 2, 5, 10, and 20). By contrast, when two similar stimuli were concurrently available (two food items), a clear preference for one item emerged as the schedule requirements were increased from FR 1 to FR 5 or FR 10. These results are discussed in terms of implications for conducting preference assessments and for selecting reinforcers to be used under training conditions in which response requirements are relatively high or effortful.

15.
Overdetection and underdetection of depression and anxiety in primary care are common and may partly reflect individuals' misperceptions of the severity of symptoms they experience. Here, we explore how people's judgments about the severity of their own symptoms are influenced by their beliefs about the distribution of symptoms experienced by the rest of the population. We apply the rank‐based decision by sampling cognitive model of judgment to symptom severity. The model proposes that judgments depend on the relative rank of an item within a mental sample of comparable items. It is predicted that judgments of symptom severity will be context dependent and more specifically that an individual's judgments will be invalid to the extent that the individual has inaccurate beliefs about the relevant social context. Two studies found that participants' assessments of symptom severity were rank based. Study 1 elicited participants' beliefs about the social distribution of symptoms and found that participants' judgments of whether they were depressed or anxious were mainly predicted not by their symptoms' objective severity but rather by where participants ranked the severity of their symptoms in comparison with the believed symptoms of others. Study 2 varied symptom distributions experimentally and again found relative rank effects as predicted. It is concluded that the real‐world application of contextual models of judgment requires investigation of individual differences in participants' background beliefs. Copyright © 2012 John Wiley & Sons, Ltd.
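The idea that a judgment depends on relative rank within a mental sample can be illustrated with a small Python sketch. The function name, the handling of ties, and the example scores are assumptions made here for illustration and are not taken from the study.

```python
def relative_rank(value, sample):
    """Proportion of items in the comparison sample that fall below `value`.

    Ties are ignored in this sketch; a fuller treatment would need a rule
    for sample items equal to `value`.
    """
    below = sum(1 for item in sample if item < value)
    return below / len(sample)


# The same hypothetical symptom score is ranked against two believed
# distributions of other people's symptoms.
own_score = 10
believed_mild_population = [2, 4, 6, 15, 20]
believed_severe_population = [12, 14, 18, 20, 25]

rank_mild = relative_rank(own_score, believed_mild_population)      # 3 of 5 below
rank_severe = relative_rank(own_score, believed_severe_population)  # 0 of 5 below
```

An identical objective score ranks high against one believed population and low against the other, which is the kind of context dependence the abstract predicts.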

16.
Assessment of irrational beliefs by such measures as the Common Beliefs Survey III (CBS) has traditionally relied upon classical test theory assumptions, in which the properties of specific test items are less important than the total test score as the aggregate of all item responses. An alternative approach using item response theory (IRT) methodology allows one to specify the parameters of difficulty and discrimination for each test item. Difficulty levels of CBS items range along a continuum of irrationality, the implied latent trait measured by responses to the questionnaire as a whole. We evaluated the CBS responses of 605 individuals from clinical and college settings, drawing from current and archival data. The original Likert scale ratings were recoded into dichotomous scores. Fourteen of the 54 items were highly or very highly discriminating in distinguishing respondents with high and low irrationality levels. However, discriminating items exhibited a very narrow range of difficulty; most functioned at a point a little above the halfway mark on the continuum of irrationality. Item characteristic curves and test information curves were very similar for female (n = 424) and male (n = 179) respondents. We derived a 4-item screening test for irrationality from our IRT analyses of the 54 CBS items. Further test development, focused on the selection and scaling of items with a much broader range of difficulty, would facilitate evaluation of the hierarchical structure of irrational beliefs. Portions of this paper were presented at the 39th Annual Convention of the Association for Behavioral and Cognitive Therapies, Washington, DC, November, 2005.
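For readers unfamiliar with difficulty and discrimination parameters, the sketch below shows an item characteristic curve under a two-parameter logistic (2PL) model. The abstract does not name the model that was fitted, so the 2PL form, the function name, and the parameter values are assumptions made for illustration.

```python
import math


def icc_2pl(theta, discrimination, difficulty):
    """Probability of endorsing a dichotomous item at trait level `theta`
    under a two-parameter logistic model (assumed here for illustration)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# At theta equal to the item's difficulty the endorsement probability is 0.5.
p_at_difficulty = icc_2pl(theta=0.5, discrimination=1.7, difficulty=0.5)

# A more discriminating item separates respondents just above and just
# below its difficulty more sharply.
gap_steep = icc_2pl(1.0, 2.0, 0.5) - icc_2pl(0.0, 2.0, 0.5)
gap_flat = icc_2pl(1.0, 0.5, 0.5) - icc_2pl(0.0, 0.5, 0.5)
```

Items that share nearly the same difficulty, as reported above for the discriminating CBS items, would have curves that all rise around the same point on the trait continuum.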

17.
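Item response theory (IRT) models, which predict examinees' response performance from the characteristics of items and examinees, are commonly used psychometric models. However, the effective use of IRT depends on how well the chosen IRT model matches the actual data (i.e., model-data goodness of fit). Only when the IRT model adopted fits the actual data well can the advantages and functions of IRT be fully realized (Orlando & Thissen, 2000). When the IRT model does not fit the data, or the wrong model has been chosen, substantial errors can arise in parameter estimation, test equating, and differential item functioning analysis (Kang, Cohen & Sung, 2009), with adverse consequences for practical work. Therefore, when conducting IRT analyses, one should first thoroughly examine and test whether the chosen model fits the actual data (McKinley & Mills, 1985). The model-data fit statistics commonly used in IRT can be described and compared from two perspectives: item fit and test fit. This is an important topic in psychological and educational measurement and a step that is easily overlooked in test analysis, yet no published article of this kind has appeared to date. Future research could pursue empirical comparisons of these statistics and their extension to cognitive diagnosis.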

18.
An experiment was conducted to investigate the effects of item order and questionnaire content on faking good or intentional response distortion. It was hypothesized that intentional response distortion would either increase towards the end of a long questionnaire, as learning effects might make it easier to adjust responses to a faking good schema, or decrease because applicants' will to distort responses is reduced if the questionnaire lasts long enough. Furthermore, it was hypothesized that certain types of questionnaire content are especially vulnerable to response distortion. Eighty‐four pre‐selected pilot applicants filled out a questionnaire consisting of 516 items including items from the NEO five factor inventory (NEO FFI), NEO personality inventory revised (NEO PI‐R) and business‐focused inventory of personality (BIP). The positions of the items were varied within the applicant sample to test if responses are affected by item order, and applicants' response behaviour was additionally compared to that of volunteers. Applicants reported significantly higher mean scores than volunteers, and results provide some evidence of decreased faking tendencies towards the end of the questionnaire. Furthermore, it could be demonstrated that lower variances or standard deviations in combination with appropriate (often higher) mean scores can serve as an indicator for faking tendencies in group comparisons, even if effects are not significant.

19.
According to the asymmetry model of bilingual representation (Kroll & Stewart, 1994), the first language (L1) lexicon is closely tied to an underlying conceptual memory, whereas second language (L2) items are mostly associated with their L1 equivalents. An outcome of this architecture is that L1-to-L2, or forward, translation must be mediated by the conceptual memory, whereas L2-to-L1 (backward) translation takes a direct lexical path. Some predictions derived from this hypothetical structure were tested in the present study, which took into account, through analysis of covariance, variations in response production time, concept retrieval time, and some other characteristics associated with the individual test items. Proficient Chinese-English bilinguals were tested on delayed production (Balota & Chumbley, 1985), picture naming, word translation, and category matching. The expected asymmetrical pattern of translation latencies (i.e., forward > backward) was demonstrated, although it could be statistically explained by the item characteristic of familiarity; matching an L1 item to a category name was faster than matching an L2 item, suggesting relatively strong L1 conceptual links. The present results are best accommodated by a form of asymmetry that allows for nondominant L2-concept linkage, the use of which is conditional upon the familiarity of the test item to the bilingual.

20.
Cognitive diagnosis models of educational test performance rely on a binary Q‐matrix that specifies the associations between individual test items and the cognitive attributes (skills) required to answer those items correctly. Current methods for fitting cognitive diagnosis models to educational test data and assigning examinees to proficiency classes are based on parametric estimation methods such as expectation maximization (EM) and Markov chain Monte Carlo (MCMC) that frequently encounter difficulties in practical applications. In response to these difficulties, non‐parametric classification techniques (cluster analysis) have been proposed as heuristic alternatives to parametric procedures. These non‐parametric classification techniques first aggregate each examinee's test item scores into a profile of attribute sum scores, which then serve as the basis for clustering examinees into proficiency classes. Like the parametric procedures, the non‐parametric classification techniques require that the Q‐matrix underlying a given test be known. Unfortunately, in practice, the Q‐matrix for most tests is not known and must be estimated to specify the associations between items and attributes, risking a misspecified Q‐matrix that may then result in the incorrect classification of examinees. This paper demonstrates that clustering examinees into proficiency classes based on their item scores rather than on their attribute sum‐score profiles does not require knowledge of the Q‐matrix, and results in a more accurate classification of examinees.
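The aggregation step used by the earlier non-parametric techniques, in which item scores are collapsed into attribute sum scores through the Q-matrix, can be sketched in Python as follows. The function name and the small Q-matrix are hypothetical, and the clustering step itself is omitted.

```python
def attribute_sum_scores(item_scores, q_matrix):
    """Collapse one examinee's 0/1 item scores into attribute sum scores.

    `q_matrix` has one row per item and one column per attribute; an entry
    of 1 means the item requires that attribute.
    """
    n_attributes = len(q_matrix[0])
    return [
        sum(score * q_row[k] for score, q_row in zip(item_scores, q_matrix))
        for k in range(n_attributes)
    ]


# Hypothetical 3-item test measuring 2 attributes.
example_q = [
    [1, 0],  # item 1 requires attribute 1 only
    [0, 1],  # item 2 requires attribute 2 only
    [1, 1],  # item 3 requires both attributes
]
example_profile = attribute_sum_scores([1, 0, 1], example_q)
```

Because this profile depends on every entry of the Q-matrix, a misspecified Q-matrix changes the profiles that are clustered. Clustering the raw item scores directly, as the paper proposes, avoids that dependence.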
