首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Generalized fiducial inference (GFI) has been proposed as an alternative to likelihood-based and Bayesian inference in mainstream statistics. Confidence intervals (CIs) can be constructed from a fiducial distribution on the parameter space in a fashion similar to those used with a Bayesian posterior distribution. However, no prior distribution needs to be specified, which renders GFI more suitable when no a priori information about model parameters is available. In the current paper, we apply GFI to a family of binary logistic item response theory models, which includes the two-parameter logistic (2PL), bifactor and exploratory item factor models as special cases. Asymptotic properties of the resulting fiducial distribution are discussed. Random draws from the fiducial distribution can be obtained by the proposed Markov chain Monte Carlo sampling algorithm. We investigate the finite-sample performance of our fiducial percentile CI and two commonly used Wald-type CIs associated with maximum likelihood (ML) estimation via Monte Carlo simulation. The use of GFI in high-dimensional exploratory item factor analysis was illustrated by the analysis of a set of the Eysenck Personality Questionnaire data.  相似文献   

2.
3.
The problem of fitting unidimensional item response models to potentially multidimensional data has been extensively studied. The focus of this article is on response data that have a strong dimension but also contain minor nuisance dimensions. Fitting a unidimensional model to such multidimensional data is believed to result in ability estimates that represent a combination of the major and minor dimensions. We conjecture that the underlying dimension for the fitted unidimensional model, which we call the functional dimension, represents a nonlinear projection. In this article we investigate 2 issues: (a) can a proposed nonlinear projection track the functional dimension well, and (b) what are the biases in the ability estimate and the associated standard error when estimating the functional dimension? To investigate the second issue, the nonlinear projection is used as an evaluative tool. An example regarding a construct of desire for physical competency is used to illustrate the functional unidimensional approach.  相似文献   

4.
5.
Residual analysis (e.g. Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.  相似文献   

6.
Log-Multiplicative Association Models as Item Response Models   总被引:1,自引:0,他引:1  
Log-multiplicative association (LMA) models, which are special cases of log-linear models, have interpretations in terms of latent continuous variables. Two theoretical derivations of LMA models based on item response theory (IRT) arguments are presented. First, we show that Anderson and colleagues (Anderson &; Vermunt, 2000; Anderson &; Böckenholt, 2000; Anderson, 2002), who derived LMA models from statistical graphical models, made the equivalent assumptions as Holland (1990) when deriving models for the manifest probabilities of response patterns based on an IRT approach. We also present a second derivation of LMA models where item response functions are specified as functions of rest-scores. These various connections provide insights into the behavior of LMA models as item response models and point out philosophical issues with the use of LMA models as item response models. We show that even for short tests, LMA and standard IRT models yield very similar to nearly identical results when data arise from standard IRT models. Log-multiplicative association models can be used as item response models and do not require numerical integration for estimation.  相似文献   

7.
8.
9.
Nested logit item response models for multiple-choice data are presented. Relative to previous models, the new models are suggested to provide a better approximation to multiple-choice items where the application of a solution strategy precedes consideration of response options. In practice, the models also accommodate collapsibility across all distractor categories, making it easier to allow decisions about including distractor information to occur on an item-by-item or application-by-application basis without altering the statistical form of the correct response curves. Marginal maximum likelihood estimation algorithms for the models are presented along with simulation and real data analyses.  相似文献   

10.
Differential item functioning (DIF), referring to between-group variation in item characteristics above and beyond the group-level disparity in the latent variable of interest, has long been regarded as an important item-level diagnostic. The presence of DIF impairs the fit of the single-group item response model being used, and calls for either model modification or item deletion in practice, depending on the mode of analysis. Methods for testing DIF with continuous covariates, rather than categorical grouping variables, have been developed; however, they are restrictive in parametric forms, and thus are not sufficiently flexible to describe complex interaction among latent variables and covariates. In the current study, we formulate the probability of endorsing each test item as a general bivariate function of a unidimensional latent trait and a single covariate, which is then approximated by a two-dimensional smoothing spline. The accuracy and precision of the proposed procedure is evaluated via Monte Carlo simulations. If anchor items are available, we proposed an extended model that simultaneously estimates item characteristic functions (ICFs) for anchor items, ICFs conditional on the covariate for non-anchor items, and the latent variable density conditional on the covariate—all using regression splines. A permutation DIF test is developed, and its performance is compared to the conventional parametric approach in a simulation study. We also illustrate the proposed semiparametric DIF testing procedure with an empirical example.  相似文献   

11.
Current approaches to model responses and response times to psychometric tests solely focus on between-subject differences in speed and ability. Within subjects, speed and ability are assumed to be constants. Violations of this assumption are generally absorbed in the residual of the model. As a result, within-subject departures from the between-subject speed and ability level remain undetected. These departures may be of interest to the researcher as they reflect differences in the response processes adopted on the items of a test. In this article, we propose a dynamic approach for responses and response times based on hidden Markov modeling to account for within-subject differences in responses and response times. A simulation study is conducted to demonstrate acceptable parameter recovery and acceptable performance of various fit indices in distinguishing between different models. In addition, both a confirmatory and an exploratory application are presented to demonstrate the practical value of the modeling approach.  相似文献   

12.
13.
Models specifying indirect effects (or mediation) and structural equation modeling are both popular in the social sciences. Yet relatively little research has compared methods that test for indirect effects among latent variables and provided precise estimates of the effectiveness of different methods.

This simulation study provides an extensive comparison of methods for constructing confidence intervals and for making inferences about indirect effects with latent variables. We compared the percentile (PC) bootstrap, bias-corrected (BC) bootstrap, bias-corrected accelerated (BC a ) bootstrap, likelihood-based confidence intervals (Neale & Miller, 1997), partial posterior predictive (Biesanz, Falk, and Savalei, 2010), and joint significance tests based on Wald tests or likelihood ratio tests. All models included three reflective latent variables representing the independent, dependent, and mediating variables. The design included the following fully crossed conditions: (a) sample size: 100, 200, and 500; (b) number of indicators per latent variable: 3 versus 5; (c) reliability per set of indicators: .7 versus .9; (d) and 16 different path combinations for the indirect effect (α = 0, .14, .39, or .59; and β = 0, .14, .39, or .59). Simulations were performed using a WestGrid cluster of 1680 3.06GHz Intel Xeon processors running R and OpenMx.

Results based on 1,000 replications per cell and 2,000 resamples per bootstrap method indicated that the BC and BC a bootstrap methods have inflated Type I error rates. Likelihood-based confidence intervals and the PC bootstrap emerged as methods that adequately control Type I error and have good coverage rates.  相似文献   

14.
Liu  Yang  Yang  Ji Seung  Maydeu-Olivares  Alberto 《Psychometrika》2019,84(2):529-553
Psychometrika - In item response theory (IRT), it is often necessary to perform restricted recalibration (RR) of the model: A set of (focal) parameters is estimated holding a set of (nuisance)...  相似文献   

15.
When categorical ordinal item response data are collected over multiple timepoints from a repeated measures design, an item response theory (IRT) modeling approach whose unit of analysis is an item response is suitable. This study proposes a few longitudinal IRT models and illustrates how a popular compensatory multidimensional IRT model can be utilized to formulate such longitudinal IRT models, which permits an investigation of ability growth at both individual and population levels. The equivalence of an existing multidimensional IRT model and those longitudinal IRT models is also elaborated so that one can make use of an existing multidimensional IRT model to implement the longitudinal IRT models.  相似文献   

16.
17.
Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.  相似文献   

18.
19.
Bounds are established for log odds ratios (log cross-product ratios) involving pairs of items for item response models. First, expressions for bounds on log odds ratios are provided for one-dimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model. Results are also illustrated through an example from a study of model-checking procedures. The bounds obtained can provide an elementary basis for assessment of goodness of fit of these models. Any opinions expressed in this publication are those of the authors and not necessarily those of the Educational Testing Service. The authors thank Dan Eignor, Matthias von Davier, Lydia Gladkova, Brian Junker, and the three anonymous reviewers for their invaluable advice. The authors gratefully acknowledge the help of Kim Fryer with proofreading.  相似文献   

20.
在测量具有层阶结构的潜质时, 标准项目反应模型对项目参数估计和能力参数估计都具有较低的效率, 多维项目反应模型虽然在估计第一阶潜质时具有高效性, 但没有考虑到潜质层阶的情况, 所以它不适合用来处理具有层阶结构的潜质; 而高阶项目反应模型在处理这种具有层阶结构的潜质时, 不仅能够高效准确地对项目参数和能力参数进行估计, 而且还能同时获得高阶潜质与低阶潜质。目前存在的高阶项目反应模型有高阶DINA模型、高阶双参数正态肩型层阶模型、高阶逻辑斯蒂模型、多级评分的高阶项目反应模型和高阶题组模型。未来对高阶项目反应模型的研究方向应注意多水平高阶项目反应模型、项目内多维情况下的高阶项目反应模型以及高阶认知诊断模型。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号