Similar literature
 20 similar documents found (search time: 15 ms)
1.
A conventional way to analyze item responses in multiple tests is to apply unidimensional item response models separately, one test at a time. This unidimensional approach, which ignores the correlations between latent traits, yields imprecise measures when tests are short. To resolve this problem, one can use multidimensional item response models that use correlations between latent traits to improve measurement precision of individual latent traits. The improvements are demonstrated using 2 empirical examples. It appears that the multidimensional approach improves measurement precision substantially, especially when tests are short and the number of tests is large. To achieve the same measurement precision, the multidimensional approach needs less than half of the comparable items required for the unidimensional approach.

2.
In a pre‐test–post‐test cluster randomized trial, one of the methods commonly used to detect an intervention effect involves controlling pre‐test scores and other related covariates while estimating an intervention effect at post‐test. In many applications in education, the total post‐test and pre‐test scores, ignoring measurement error, are used as response variable and covariate, respectively, to estimate the intervention effect. However, these test scores are frequently subject to measurement error, and statistical inferences based on the model ignoring measurement error can yield a biased estimate of the intervention effect. When multiple domains exist in test data, it is sometimes more informative to detect the intervention effect for each domain than for the entire test. This paper presents applications of the multilevel multidimensional item response model with measurement error adjustments in a response variable and a covariate to estimate the intervention effect for each domain.

3.
A new item response theory (IRT) model with a tree structure has been introduced for modeling item response processes with a tree structure. In this paper, we present a generalized item response tree model with a flexible parametric form, dimensionality, and choice of covariates. The utilities of the model are demonstrated with two applications in psychological assessments for investigating Likert scale item responses and for modeling omitted item responses. The proposed model is estimated with the freely available R package flirt (Jeon et al., 2014b).

4.
A generalized dimensionality discrepancy measure is introduced to facilitate a critique of dimensionality assumptions in multidimensional item response models. Connections between dimensionality and local independence motivate the development of the discrepancy measure from a conditional covariance theory perspective. A simulation study and a real‐data analysis demonstrate the utility of the discrepancy measure's application at multiple levels of analysis in a posterior predictive model checking framework.

5.
The use of multidimensional forced-choice (MFC) items to assess non-cognitive traits such as personality, interests and values in psychological tests has a long history, because MFC items show strengths in preventing response bias. Recently, there has been a surge of interest in developing item response theory (IRT) models for MFC items. However, nearly all of the existing IRT models have been developed for MFC items with binary scores. Real tests use MFC items with more than two categories; such items are more informative than their binary counterparts. This study developed a new IRT model for polytomous MFC items based on the cognitive model of choice, which describes the cognitive processes underlying humans' preferential choice behaviours. The new model is unique in its ability to account for the ipsative nature of polytomous MFC items, to assess individual psychological differentiation in interests, values and emotions, and to compare the differentiation levels of latent traits between individuals. Simulation studies were conducted to examine the parameter recovery of the new model with existing computer programs. The results showed that both statement parameters and person parameters were well recovered when the sample size was sufficient. The more complete the linking of the statements was, the more accurate the parameter estimation was. This paper provides an empirical example of a career interest test using four-category MFC items. Although some aspects of the model (e.g., the nature of the person parameters) require additional validation, our approach appears promising.

6.
Person-fit statistics have been proposed to investigate the fit of an item score pattern to an item response theory (IRT) model. The author investigated how these statistics can be used to detect different types of misfit. Intelligence test data were analyzed using person-fit statistics in the context of the G. Rasch (1960) model and R. J. Mokken's (1971, 1997) IRT models. The effect of the choice of an IRT model to detect misfitting item score patterns and the usefulness of person-fit statistics for diagnosis of misfit are discussed. Results showed that different types of person-fit statistics can be used to detect different kinds of person misfit. Parametric person-fit statistics had more power than nonparametric person-fit statistics.

7.
Multidimensional item response theory (MIRT) is widely used in assessment and evaluation of educational and psychological tests. It models the individual response patterns by specifying a functional relationship between individuals' multiple latent traits and their responses to test items. One major challenge in parameter estimation in MIRT is that the likelihood involves intractable multidimensional integrals due to the latent variable structure. Various methods have been proposed that involve either direct numerical approximations to the integrals or Monte Carlo simulations. However, these methods are known to be computationally demanding in high dimensions and rely on sampling data points from a posterior distribution. We propose a new Gaussian variational expectation-maximization (GVEM) algorithm which adopts variational inference to approximate the intractable marginal likelihood by a computationally feasible lower bound. In addition, the proposed algorithm can be applied to assess the dimensionality of the latent traits in an exploratory analysis. Simulation studies are conducted to demonstrate the computational efficiency and estimation precision of the new GVEM algorithm compared to the popular alternative Metropolis–Hastings Robbins–Monro algorithm. In addition, theoretical results are presented to establish the consistency of the estimator from the new GVEM algorithm.

8.
The aim of latent variable selection in multidimensional item response theory (MIRT) models is to identify latent traits probed by test items of a multidimensional test. In this paper the expectation model selection (EMS) algorithm proposed by Jiang et al. (2015) is applied to minimize the Bayesian information criterion (BIC) for latent variable selection in MIRT models with a known number of latent traits. Under mild assumptions, we prove the numerical convergence of the EMS algorithm for model selection by minimizing the BIC of observed data in the presence of missing data. For the identification of MIRT models, we assume that the variances of all latent traits are unity and each latent trait has an item that is only related to it. Under this identifiability assumption, the convergence of the EMS algorithm for latent variable selection in the multidimensional two-parameter logistic (M2PL) models can be verified. We give an efficient implementation of the EMS for the M2PL models. Simulation studies show that the EMS outperforms the EM-based L1 regularization in terms of correctly selected latent variables and computation time. The EMS algorithm is applied to a real data set related to the Eysenck Personality Questionnaire.
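
For readers unfamiliar with the two quantities this abstract relies on, the following is a minimal sketch, not the EMS algorithm itself. It assumes the common compensatory M2PL parameterization (a slope vector and an intercept per item) and the textbook BIC formula; the function names `m2pl_prob` and `bic` are illustrative and do not come from the paper.

```python
import math


def m2pl_prob(theta, slopes, intercept):
    """Probability of a correct response under an (assumed) compensatory M2PL.

    theta:     latent trait values, one per dimension
    slopes:    item discrimination on each dimension (0 means "not related")
    intercept: item intercept
    """
    z = sum(a * t for a, t in zip(slopes, theta)) + intercept
    return 1.0 / (1.0 + math.exp(-z))


def bic(log_likelihood, n_free_params, n_obs):
    """Textbook BIC: -2 log L + k log n. Smaller values are preferred."""
    return -2.0 * log_likelihood + n_free_params * math.log(n_obs)
```

In this form, latent variable selection amounts to deciding which slopes are fixed at zero for each item: a zero slope means the item's response probability does not depend on that trait, and each freed slope adds one to the parameter count that BIC penalizes.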

9.
The four-parameter logistic (4PL) item response model, which includes an upper asymptote for the correct response probability, has drawn increasing interest due to its suitability for many practical scenarios. This paper proposes a new Gibbs sampling algorithm for estimation of the multidimensional 4PL model based on an efficient data augmentation scheme (DAGS). With the introduction of three continuous latent variables, the full conditional distributions are tractable, allowing easy implementation of a Gibbs sampler. Simulation studies are conducted to evaluate the proposed method and several popular alternatives. An empirical data set was analysed using the 4PL model to show its improved performance over the three-parameter and two-parameter logistic models. The proposed estimation scheme is easily accessible to practitioners through the open-source IRTlogit package.
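
To make the role of the upper asymptote concrete, here is a minimal sketch of the standard unidimensional 4PL response function. It is a simplification: the abstract concerns the multidimensional model and a Gibbs sampler, neither of which is reproduced here, and the function and parameter names are illustrative assumptions.

```python
import math


def four_pl_prob(theta, discrimination, difficulty, lower, upper):
    """Unidimensional 4PL: lower + (upper - lower) * logistic(a * (theta - b)).

    lower: lower asymptote (guessing); upper: upper asymptote (slipping).
    Setting upper = 1 gives the 3PL; also setting lower = 0 gives the 2PL.
    """
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))
    return lower + (upper - lower) * logistic
```

With `upper` below 1, even a very able respondent answers correctly with probability at most `upper`, which is the feature that distinguishes the 4PL from the three-parameter and two-parameter models mentioned in the abstract.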

10.
Wagner TA, Harvey RJ. Psychological Assessment, 2006, 18(1): 100-105
The authors describe the initial development of the Wagner Assessment Test (WAT), an instrument designed to assess critical thinking, using the 5-faceted view popularized by the Watson-Glaser Critical Thinking Appraisal (WGCTA; G. B. Watson & E. M. Glaser, 1980). The WAT was designed to reduce the degree of successful guessing relative to the WGCTA by increasing the number of response alternatives (i.e., 80% of WGCTA items are 2-alternative, multiple-choice), a change that was hypothesized to result in more desirable test information and standard-error functions. Analyses using the 3-parameter logistic item response theory (IRT) model in a sample of undergraduates (N = 407) supported this prediction, even when the WAT item pool was shortened to match the length of the WGCTA. Convergent validity between full-pool IRT score estimates was r = .69. Implications for subsequent research on IRT-based measurement of critical thinking are discussed.

11.
A first-order autoregressive growth model is proposed for longitudinal binary item analysis where responses to the same items are conditionally dependent across time given the latent traits. Specifically, the item response probability for a given item at a given time depends on the latent trait as well as the response to the same item at the previous time, or the lagged response. An initial conditions problem arises because there is no lagged response at the initial time period. We handle this problem by adapting solutions proposed for dynamic models in panel data econometrics. Asymptotic and finite sample power for the autoregressive parameters are investigated. The consequences of ignoring local dependence and the initial conditions problem are also examined for data simulated from a first-order autoregressive growth model. The proposed methods are applied to longitudinal data on Korean students’ self-esteem.
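
The dependence on the lagged response can be illustrated with a minimal sketch. It assumes a Rasch-type logit with an additive autoregressive term, which is one plausible reading of the abstract rather than the authors' exact specification; the names `ar1_item_prob` and `gamma` are hypothetical. The initial time point, where no lagged response exists, is deliberately not handled here.

```python
import math


def ar1_item_prob(theta_t, difficulty, gamma, lagged_response):
    """P(y_t = 1) with logit = theta_t - difficulty + gamma * y_{t-1}.

    gamma is the (assumed) autoregressive parameter; gamma = 0 recovers a
    model with no dependence on the previous response to the same item.
    """
    if lagged_response not in (0, 1):
        raise ValueError("lagged_response must be 0 or 1")
    z = theta_t - difficulty + gamma * lagged_response
    return 1.0 / (1.0 + math.exp(-z))
```

Under this sketch, a positive `gamma` means that endorsing an item at the previous occasion raises the probability of endorsing it again, over and above what the latent trait at the current occasion predicts.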

12.
In this paper it will be shown that a certain class of constrained latent class models may be interpreted as a special case of nonparametric multidimensional item response models. The parameters of this latent class model will be estimated using an application of the Gibbs sampler. It will be illustrated that the Gibbs sampler is an excellent tool if inequality constraints have to be taken into consideration when making inferences. Model fit will be investigated using posterior predictive checks. Checks for manifest monotonicity, the agreement between the observed and expected conditional association structure, marginal local homogeneity, and the number of latent classes will be presented. This paper is supported by grant S40-645 of the Dutch Organization for Scientific Research (NWO).

13.
Additional information contained in incorrect responses calls for a multicategorical rather than a binary analysis of multiple choice data. A nonparametric divided-by-total model for joint maximum likelihood estimation of probability-of-choice functions (for particular responses) and of latent ability is proposed. The model approximates probability functions by rational splines. Some illustrative examples of real test data analysis and the results of a Monte Carlo study are presented. The research in this paper was supported by the National Sciences and Engineering Research Council of Canada Grants OGP0105521 and APA 320 awarded to the first and the second author, respectively. The authors are indebted to R. Melzack and A. Baker for making available the data analyzed in this paper. We would also like to thank J. McKenna and B. Cont for their assistance in editing this paper.

14.
Multidimensional item response theory (MIRT) models for response style (e.g., Bolt, Lu, & Kim, 2014, Psychological Methods, 19, 528; Falk & Cai, 2016, Psychological Methods, 21, 328) provide flexibility in accommodating various response styles, but often present difficulty in isolating the effects of response style(s) from the intended substantive trait(s). In the presence of such measurement limitations, we consider several ways in which MIRT models are nevertheless useful in lending insight into how response styles may interfere with measurement for a given test instrument. Such a study can also inform whether alternative design considerations (e.g., anchoring vignettes, self-report items of heterogeneous content) that seek to control for response style effects may be helpful. We illustrate several aspects of an MIRT approach using real and simulated analyses.

15.
Multidimensionality is a core concept in the measurement and analysis of psychological data. In personality assessment, for example, constructs are mostly theoretically defined as unidimensional, yet responses collected from the real world are almost always determined by multiple factors. Significant research efforts have concentrated on the use of simulated studies to evaluate the robustness of unidimensional item response models when applied to multidimensional data with a dominant dimension. In contrast, in the present paper, I report the result from a theoretical investigation that a multidimensional item response model is empirically indistinguishable from a locally dependent unidimensional model, of which the single dimension represents the actual construct of interest. A practical implication of this result is that multidimensional response data do not automatically require the use of multidimensional models. Circumstances under which the alternative approach of locally dependent unidimensional models may be useful are discussed.

16.
Marginal maximum‐likelihood procedures for parameter estimation and testing the fit of a hierarchical model for speed and accuracy on test items are presented. The model is a composition of two first‐level models for dichotomous responses and response times along with multivariate normal models for their item and person parameters. It is shown how the item parameters can easily be estimated using Fisher's identity. To test the fit of the model, Lagrange multiplier tests of the assumptions of subpopulation invariance of the item parameters (i.e., no differential item functioning), the shape of the response functions, and three different types of conditional independence were derived. Simulation studies were used to show the feasibility of the estimation and testing procedures and to estimate the power and Type I error rate of the latter. In addition, the procedures were applied to an empirical data set from a computerized adaptive test of language comprehension.

17.
Randomized response (RR) models are often used for analysing univariate randomized response data and measuring population prevalence of sensitive behaviours. There is much empirical support for the belief that RR methods improve the cooperation of the respondents. Recently, RR models have been extended to measure individual unidimensional behaviour. An extension of this modelling framework is proposed to measure compensatory or non‐compensatory multiple sensitive factors underlying the randomized item response process. A confirmatory multidimensional randomized item response theory model (MRIRT) is proposed for the analysis of multivariate RR data by modelling the response process and specifying structural relationships between sensitive behaviours and background information. A Markov chain Monte Carlo algorithm is developed to estimate simultaneously the parameters of the MRIRT model. The model extension enables the computation of individual true item response probabilities, estimates of individuals’ sensitive behaviour on different domains, and their relationships with background variables. An MRIRT analysis is presented of data from a college alcohol problem scale, measuring alcohol‐related socio‐emotional and community problems, and alcohol expectancy questionnaire, measuring alcohol‐related sexual enhancement expectancies. Students were interviewed via direct or RR questioning. Scores of alcohol‐related problems and expectancies are significantly higher for the group of students questioned using the RR technique. Alcohol‐related problems and sexual enhancement expectancies are positively moderately correlated and vary differently across gender and universities.

18.
19.
An IRT model based on the Rasch model is proposed for composite tasks, that is, tasks that are decomposed into subtasks of different kinds. There is one subtask for each component that is discerned in the composite tasks. A component is a generic kind of subtask of which the subtasks resulting from the decomposition are specific instantiations with respect to the particular composite tasks under study. The proposed model constrains the difficulties of the composite tasks to be linear combinations of the difficulties of the corresponding subtask items, which are estimated together with the weights used in the linear combinations, one weight for each kind of subtask. Although the model does not belong to the exponential family, its parameters can be estimated using conditional maximum likelihood estimation. The approach is demonstrated with an application to spelling tasks. We thank Eric Maris for his helpful comments.
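
The linear constraint on composite-task difficulties described in this abstract can be sketched as follows. This is an illustration under stated assumptions only: it shows the constraint and the Rasch response function, not the conditional maximum likelihood estimation, and the function names are hypothetical.

```python
import math


def composite_difficulty(weights, subtask_difficulties):
    """Composite difficulty as a weighted sum of subtask difficulties.

    weights:              one weight per kind of subtask (component)
    subtask_difficulties: difficulty of the matching subtask item for
                          this particular composite task
    """
    if len(weights) != len(subtask_difficulties):
        raise ValueError("need exactly one weight per subtask")
    return sum(w * b for w, b in zip(weights, subtask_difficulties))


def rasch_prob(theta, difficulty):
    """Rasch model: probability of success given ability and difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))
```

For example, `rasch_prob(theta, composite_difficulty(weights, subtask_difficulties))` gives the success probability on a composite task once the weights and subtask difficulties are known; in the model described above both are estimated jointly rather than supplied.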

20.
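Wang Wenyi, Song Lihong, Ding Shuliang. Acta Psychologica Sinica, 2016, 48(12): 1612-1624
This paper introduces classification accuracy and classification consistency indices under multidimensional item response theory models, uses Monte Carlo methods to compute the indices under complex decision rules, and proves mathematically that, under a uniform prior and the same decision rule, two kinds of estimators of the classification accuracy index converge in probability to the same true value. The results show that the classification accuracy index evaluates the accuracy of classification results fairly accurately; the classification consistency index evaluates the test-retest consistency of classification results well; under certain conditions, indices based on the ability scale outperform indices based on raw total scores; estimation precision remains fairly good even as the number of test dimensions increases; and classification accuracy and classification consistency are higher as test length and the correlations between dimensions increase. The indices can be used to evaluate classification reliability and validity under a variety of decision rules for criterion-referenced tests or computerized classification tests.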
