首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Multidimensional item response theory (MIRT) models for response style (e.g., Bolt, Lu, & Kim, 2014, Psychological Methods, 19, 528; Falk & Cai, 2016, Psychological Methods, 21, 328) provide flexibility in accommodating various response styles, but often present difficulty in isolating the effects of response style(s) from the intended substantive trait(s). In the presence of such measurement limitations, we consider several ways in which MIRT models are nevertheless useful in lending insight into how response styles may interfere with measurement for a given test instrument. Such a study can also inform whether alternative design considerations (e.g., anchoring vignettes, self-report items of heterogeneous content) that seek to control for response style effects may be helpful. We illustrate several aspects of an MIRT approach using real and simulated analyses.  相似文献   

2.
The use of multidimensional forced-choice (MFC) items to assess non-cognitive traits such as personality, interests and values in psychological tests has a long history, because MFC items show strengths in preventing response bias. Recently, there has been a surge of interest in developing item response theory (IRT) models for MFC items. However, nearly all of the existing IRT models have been developed for MFC items with binary scores. Real tests use MFC items with more than two categories; such items are more informative than their binary counterparts. This study developed a new IRT model for polytomous MFC items based on the cognitive model of choice, which describes the cognitive processes underlying humans' preferential choice behaviours. The new model is unique in its ability to account for the ipsative nature of polytomous MFC items, to assess individual psychological differentiation in interests, values and emotions, and to compare the differentiation levels of latent traits between individuals. Simulation studies were conducted to examine the parameter recovery of the new model with existing computer programs. The results showed that both statement parameters and person parameters were well recovered when the sample size was sufficient. The more complete the linking of the statements was, the more accurate the parameter estimation was. This paper provides an empirical example of a career interest test using four-category MFC items. Although some aspects of the model (e.g., the nature of the person parameters) require additional validation, our approach appears promising.  相似文献   

3.
Wendy M. Yen 《Psychometrika》1985,50(4):399-410
When the three-parameter logistic model is applied to tests covering a broad range of difficulty, there frequently is an increase in mean item discrimination and a decrease in variance of item difficulties and traits as the tests become more difficult. To examine the hypothesis that this unexpected scale shrinkage effect occurs because the items increase in complexity as they increase in difficulty, an approximate relationship is derived between the unidimensional model used in data analysis and a multidimensional model hypothesized to be generating the item responses. Scale shrinkage is successfully predicted for several sets of simulated data.The author is grateful to Robert Mislevy for kindly providing a copy of his computer program, RESOLVE.  相似文献   

4.
The aim of latent variable selection in multidimensional item response theory (MIRT) models is to identify latent traits probed by test items of a multidimensional test. In this paper the expectation model selection (EMS) algorithm proposed by Jiang et al. (2015) is applied to minimize the Bayesian information criterion (BIC) for latent variable selection in MIRT models with a known number of latent traits. Under mild assumptions, we prove the numerical convergence of the EMS algorithm for model selection by minimizing the BIC of observed data in the presence of missing data. For the identification of MIRT models, we assume that the variances of all latent traits are unity and each latent trait has an item that is only related to it. Under this identifiability assumption, the convergence of the EMS algorithm for latent variable selection in the multidimensional two-parameter logistic (M2PL) models can be verified. We give an efficient implementation of the EMS for the M2PL models. Simulation studies show that the EMS outperforms the EM-based L1 regularization in terms of correctly selected latent variables and computation time. The EMS algorithm is applied to a real data set related to the Eysenck Personality Questionnaire.  相似文献   

5.
汪文义  宋丽红  丁树良 《心理学报》2016,48(12):1612-1624
介绍多维项目反应理论模型下分类准确性和分类一致性指标, 采用蒙特卡罗方法实现复杂决策规则下指标计算, 并从数学上证明分类准确性指标两类估计量在均匀先验和相同决策规则条件下依概率收敛于同一真值。研究结果表明:分类准确性指标可以比较准确地评价分类结果的准确性; 分类一致性指标可以较好地评价分类结果的重测一致性; 在一定条件下, 基于能力量尺的指标优于基于原始总分的指标; 纵使测验维度增加, 估计精度仍比较好; 随着测验长度和维度间相关增加, 分类准确性和分类一致性更高。指标可以用来评价标准参照测验或计算机分类测验的多种决策规则下分类信度和效度。  相似文献   

6.
Information functions are used to find the optimum ability levels and maximum contributions to information for estimating item parameters in three commonly used logistic item response models. For the three and two parameter logistic models, examinees who contribute maximally to the estimation of item difficulty contribute little to the estimation of item discrimination. This suggests that in applications that depend heavily upon the veracity of individual item parameter estimates (e.g. adaptive testing or text construction), better item calibration results may be obtained (for fixed sample sizes) from examinee calibration samples in which ability is widely dispersed.This work was supported by Contract No. N00014-83-C-0457, project designation NR 150-520, from Cognitive Science Program, Cognitive and Neural Sciences Division, Office of Naval Research and Educational Testing Service through the Program Research Planning Council. Reproduction in whole or in part is permitted for any purpose of the United States Government. The author wishes to acknowledge the invaluable assistance of Maxine B. Kingston in carrying out this study, and to thank Charles Lewis for his many insightful comments on earlier drafts of this paper.  相似文献   

7.
8.
Item response curves for a set of binary responses are studied from a Bayesian viewpoint of estimating the item parameters. For the two-parameter logistic model with normally distributed ability, restricted bivariate beta priors are used to illustrate the computation of the posterior mode via the EM algorithm. The procedure is illustrated by data from a mathematics test.This work was supported under Contract No. N00014-85-K-0113, NR 150-535, from Personnel and Training Research Programs, Psychological Sciences Division, Office of Naval Research. The authors wish to thank Mark D. Reckase for providing the ACT data used in the illustration and Michael J. Soltys for computational assistance. They also wish to thank the editor and four anonymous reviewers for many valuable suggestions.  相似文献   

9.
In a pre‐test–post‐test cluster randomized trial, one of the methods commonly used to detect an intervention effect involves controlling pre‐test scores and other related covariates while estimating an intervention effect at post‐test. In many applications in education, the total post‐test and pre‐test scores, ignoring measurement error, are used as response variable and covariate, respectively, to estimate the intervention effect. However, these test scores are frequently subject to measurement error, and statistical inferences based on the model ignoring measurement error can yield a biased estimate of the intervention effect. When multiple domains exist in test data, it is sometimes more informative to detect the intervention effect for each domain than for the entire test. This paper presents applications of the multilevel multidimensional item response model with measurement error adjustments in a response variable and a covariate to estimate the intervention effect for each domain.  相似文献   

10.
Bayes modal estimation in item response models   总被引:1,自引:0,他引:1  
This article describes a Bayesian framework for estimation in item response models, with two-stage prior distributions on both item and examinee populations. Strategies for point and interval estimation are discussed, and a general procedure based on the EM algorithm is presented. Details are given for implementation under one-, two-, and three-parameter binary logistic IRT models. Novel features include minimally restrictive assumptions about examinee distributions and the exploitation of dependence among item parameters in a population of interest. Improved estimation in a moderately small sample is demonstrated with simulated data.This research was supported by a grant from the Spencer Foundation, Chicago, IL. Comments and suggestions on earlier drafts by Charles Lewis, Frederic Lord, Rosenbaum, James Ramsey, Hiroshi Watanabe, the editor, and two anonymous referees are gratefully acknowledged.  相似文献   

11.
When scaling data using item response theory, valid statements based on the measurement model are only permissible if the model fits the data. Most item fit statistics used to assess the fit between observed item responses and the item responses predicted by the measurement model show significant weaknesses, such as the dependence of fit statistics on sample size and number of items. In order to assess the size of misfit and to thus use the fit statistic as an effect size, dependencies on properties of the data set are undesirable. The present study describes a new approach and empirically tests it for consistency. We developed an estimator of the distance between the predicted item response functions (IRFs) and the true IRFs by semiparametric adaptation of IRFs. For the semiparametric adaptation, the approach of extended basis functions due to Ramsay and Silverman (2005) is used. The IRF is defined as the sum of a linear term and a more flexible term constructed via basis function expansions. The group lasso method is applied as a regularization of the flexible term, and determines whether all parameters of the basis functions are fixed at zero or freely estimated. Thus, the method serves as a selection criterion for items that should be adjusted semiparametrically. The distance between the predicted and semiparametrically adjusted IRF of misfitting items can then be determined by describing the fitting items by the parametric form of the IRF and the misfitting items by the semiparametric approach. In a simulation study, we demonstrated that the proposed method delivers satisfactory results in large samples (i.e., N ≥ 1,000).  相似文献   

12.
Some standard errors in item response theory   总被引:2,自引:0,他引:2  
The mathematics required to calculate the asymptotic standard errors of the parameters of three commonly used logistic item response models is described and used to generate values for some common situations. It is shown that the maximum likelihood estimation of a lower asymptote can wreak havoc with the accuracy of estimation of a location parameter, indicating that if one needs to have accurate estimates of location parameters (say for purposes of test linking/equating or computerized adaptive testing) the sample sizes required for acceptable accuracy may be unattainable in most applications. It is suggested that other estimation methods be used if the three parameter model is applied in these situations.The research reported here was supported, in part, by contract #F41689-81-6-0012 from the Air Force Human Resources Laboratory to McFann-Gray & Associates, Benjamin A. Fairbank, Jr., Principal Investigator. Further support of Wainer's effort was supplied by the Educational Testing Service, Program Statistics Research Project.  相似文献   

13.
This paper proposes a multi-objective programming method for determining samples of examinees needed for estimating the parameters of a group of items. In the numerical experiments, optimum samples are compared to uniformly and normally distributed samples. The results show that the samples usually recommended in the literature are well suited for estimating the difficulty parameters. Furthermore, they are also adequate for estimating the discrimination parameters in the three-parameter model, butnot for the guessing parameters.  相似文献   

14.
A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.  相似文献   

15.
16.
17.
In this paper, we explore the use of the stochastic EM algorithm (Celeux & Diebolt (1985) Computational Statistics Quarterly, 2, 73) for large-scale full-information item factor analysis. Innovations have been made on its implementation, including an adaptive-rejection-based Gibbs sampler for the stochastic E step, a proximal gradient descent algorithm for the optimization in the M step, and diagnostic procedures for determining the burn-in size and the stopping of the algorithm. These developments are based on the theoretical results of Nielsen (2000, Bernoulli, 6, 457), as well as advanced sampling and optimization techniques. The proposed algorithm is computationally efficient and virtually tuning-free, making it scalable to large-scale data with many latent traits (e.g. more than five latent traits) and easy to use for practitioners. Standard errors of parameter estimation are also obtained based on the missing-information identity (Louis, 1982, Journal of the Royal Statistical Society, Series B, 44, 226). The performance of the algorithm is evaluated through simulation studies and an application to the analysis of the IPIP-NEO personality inventory. Extensions of the proposed algorithm to other latent variable models are discussed.  相似文献   

18.
The four-parameter logistic (4PL) item response model, which includes an upper asymptote for the correct response probability, has drawn increasing interest due to its suitability for many practical scenarios. This paper proposes a new Gibbs sampling algorithm for estimation of the multidimensional 4PL model based on an efficient data augmentation scheme (DAGS). With the introduction of three continuous latent variables, the full conditional distributions are tractable, allowing easy implementation of a Gibbs sampler. Simulation studies are conducted to evaluate the proposed method and several popular alternatives. An empirical data set was analysed using the 4PL model to show its improved performance over the three-parameter and two-parameter logistic models. The proposed estimation scheme is easily accessible to practitioners through the open-source IRTlogit package.  相似文献   

19.
Reliability of scores from psychological or educational assessments provides important information regarding the precision of measurement. The reliability of scores is however population dependent and may vary across groups. In item response theory, this population dependence can be attributed to differential item functioning or to differences in the latent distributions between groups and needs to be accounted for when estimating the reliability of scores for different groups. Here, we introduce group-specific and overall reliability coefficients for sum scores and maximum likelihood ability estimates defined by a multiple group item response theory model. We derive confidence intervals using asymptotic theory and evaluate the empirical properties of estimators and the confidence intervals in a simulation study. The results show that the estimators are largely unbiased and that the confidence intervals are accurate with moderately large sample sizes. We exemplify the approach with the Montreal Cognitive Assessment (MoCA) in two groups defined by education level and give recommendations for applied work.  相似文献   

20.
Applications of item response theory, which depend upon its parameter invariance property, require that parameter estimates be unbiased. A new method, weighted likelihood estimation (WLE), is derived, and proved to be less biased than maximum likelihood estimation (MLE) with the same asymptotic variance and normal distribution. WLE removes the first order bias term from MLE. Two Monte Carlo studies compare WLE with MLE and Bayesian modal estimation (BME) of ability in conventional tests and tailored tests, assuming the item parameters are known constants. The Monte Carlo studies favor WLE over MLE and BME on several criteria over a wide range of the ability scale.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号