Similar articles
1.
Bayesian estimation of a multilevel IRT model using Gibbs sampling   Total citations: 3, self-citations: 0, citations by others: 3
In this article, a two-level regression model is imposed on the ability parameters in an item response theory (IRT) model. The advantage of using latent rather than observed scores as dependent variables of a multilevel model is that it offers the possibility of separating the influence of item difficulty and ability level and modeling response variation and measurement error. Another advantage is that, contrary to observed scores, latent scores are test-independent, which offers the possibility of using results from different tests in one analysis where the parameters of the IRT model and the multilevel model can be concurrently estimated. The two-parameter normal ogive model is used for the IRT measurement model. It will be shown that the parameters of the two-parameter normal ogive model and the multilevel model can be estimated in a Bayesian framework using Gibbs sampling. Examples using simulated and real data are given.
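
As an illustration of the measurement model named in this abstract, the sketch below evaluates a two-parameter normal ogive item response function. The parameterization Phi(a * theta - b) and the names `normal_cdf` and `normal_ogive_prob` are assumptions made for this example; the abstract does not state the exact form the authors use, and the multilevel regression on the abilities and the Gibbs sampler are not shown.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_ogive_prob(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under an assumed two-parameter
    normal ogive model, P(Y = 1 | theta) = Phi(a * theta - b).

    theta: latent ability; a: discrimination; b: difficulty (assumed names).
    """
    return normal_cdf(a * theta - b)


if __name__ == "__main__":
    # Response probabilities for one hypothetical item across three ability levels.
    for theta in (-1.0, 0.0, 1.0):
        print(theta, round(normal_ogive_prob(theta, a=1.2, b=0.3), 4))
```

In a multilevel extension of the kind the abstract describes, `theta` would itself be the outcome of a regression with person-level and group-level terms rather than a fixed input.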

2.
Constant latent odds-ratios models and the Mantel-Haenszel null hypothesis   Total citations: 1, self-citations: 0, citations by others: 1
In the present paper, a new family of item response theory (IRT) models for dichotomous item scores is proposed. Two basic assumptions define the most general model of this family. The first assumption is local independence of the item scores given a unidimensional latent trait. The second assumption is that the odds-ratios for all item-pairs are constant functions of the latent trait. Since the latter assumption is characteristic of the whole family, the models are called constant latent odds-ratios (CLORs) models. One nonparametric special case and three parametric special cases of the general CLORs model are shown to be generalizations of the one-parameter logistic Rasch model. For all CLORs models, the total score (the unweighted sum of the item scores) is shown to be a sufficient statistic for the latent trait. In addition, conditions under the general CLORs model are studied for the investigation of differential item functioning (DIF) by means of the Mantel-Haenszel procedure. This research was supported by the Dutch Organization for Scientific Research (NWO), grant number 400-20-026.
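
For readers unfamiliar with the Mantel-Haenszel procedure mentioned here, the following is a minimal sketch of the standard Mantel-Haenszel common odds-ratio estimator computed over total-score strata. The function name and the layout of each 2x2 table (reference/focal group by correct/incorrect) are assumptions for illustration; the sketch does not implement the CLORs models themselves.

```python
def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel common odds-ratio estimate across strata.

    tables: iterable of (a, b, c, d) counts, one tuple per total-score stratum:
        a = reference group, item correct
        b = reference group, item incorrect
        c = focal group, item correct
        d = focal group, item incorrect
    Returns sum(a*d/n) / sum(b*c/n), where n is the stratum size.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0.0:
        raise ValueError("odds ratio is undefined: sum of b*c/n is zero")
    return numerator / denominator


if __name__ == "__main__":
    # Two hypothetical score strata with identical odds in both groups.
    print(mantel_haenszel_odds_ratio([(10, 10, 10, 10), (20, 5, 20, 5)]))
```

An estimate near 1 is consistent with the null hypothesis of no DIF for the studied item, conditional on the matching score.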

3.
Structural equation modeling of paired-comparison and ranking data   Total citations: 1, self-citations: 0, citations by others: 1
L. L. Thurstone's (1927) model provides a powerful framework for modeling individual differences in choice behavior. An overview of Thurstonian models for comparative data is provided, including the classical Case V and Case III models as well as more general choice models with unrestricted and factor-analytic covariance structures. A flow chart summarizes the model selection process. The authors show how to embed these models within a more familiar structural equation modeling (SEM) framework. The different special cases of Thurstone's model can be estimated with a popular SEM statistical package, including factor analysis models for paired comparisons and rankings. Only minor modifications are needed to accommodate both types of data. As a result, complex models for comparative judgments can be both estimated and tested efficiently.

4.
Estimating the reliability of scores on single-item measures can be difficult because commonly used internal consistency estimates of reliability cannot be calculated. When longitudinal data are available, statistical models can be used to decompose the variability in the latent variable at each wave into trait versus state variance. Reliability can then be estimated as the ratio of the trait variance captured in repeated assessments to the total variance. The current study applied latent trait-state-error models to nine-year longitudinal data (N = 5,003) to estimate the test–retest reliability of scores on a single-item measure of job satisfaction. Results showed that job satisfaction scores were somewhat unreliable (rxx = .49–.59) and amenable to change.

5.
I describe how multilevel logistic regression can be used to assess the consistency of an individual's response pattern with an item response theory measurement model. Specifically, by treating item responses as being nested within individuals, multilevel logistic regression is used to estimate a person-response curve that models how an individual's item endorsement rate decreases as a function of item difficulty. The slope of an individual's person-response curve is used as an indicator of the degree of response consistency or person-fit. I argue that the proposed multilevel modeling approach to person-fit assessment has several potential advantages over traditional techniques. The most important advantage is that the multilevel modeling approach allows explanatory variables to be entered into the model so that the causes of response inconsistency or differential test functioning can be investigated.

6.
The application of psychological measures often results in item response data that arguably are consistent with both unidimensional (a single common factor) and multidimensional latent structures (typically caused by parcels of items that tap similar content domains). As such, structural ambiguity leads to seemingly endless "confirmatory" factor analytic studies in which the research question is whether scale scores can be interpreted as reflecting variation on a single trait. An alternative to the more commonly observed unidimensional, correlated traits, or second-order representations of a measure's latent structure is a bifactor model. Bifactor structures, however, are not well understood in the personality assessment community and thus rarely are applied. To address this, herein we (a) describe issues that arise in conceptualizing and modeling multidimensionality, (b) describe exploratory (including Schmid-Leiman [Schmid & Leiman, 1957] and target bifactor rotations) and confirmatory bifactor modeling, (c) differentiate between bifactor and second-order models, and (d) suggest contexts where bifactor analysis is particularly valuable (e.g., for evaluating the plausibility of subscales, determining the extent to which scores reflect a single variable even when the data are multidimensional, and evaluating the feasibility of applying a unidimensional item response theory (IRT) measurement model). We emphasize that the determination of dimensionality is a related but distinct question from either determining the extent to which scores reflect a single individual difference variable or determining the effect of multidimensionality on IRT item parameter estimates. Indeed, we suggest that in many contexts, multidimensional data can yield interpretable scale scores and be appropriately fitted to unidimensional IRT models.

7.
The interpretation of a Thurstonian model for paired comparisons where the utilities' covariance matrix is unrestricted proved to be difficult due to the comparative nature of the data. We show that under a suitable constraint the utilities' correlation matrix can be estimated, yielding a readily interpretable solution. This set of identification constraints can recover any true utilities' covariance matrix, but it is not unique. Indeed, we show how to transform the estimated correlation matrix into alternative correlation matrices that are equally consistent with the data but may be more consistent with substantive theory. Also, we show how researchers can investigate the sample size needed to estimate a particular model by exploiting the simulation capabilities of a popular structural equation modeling statistical package.

8.
The identifiability of item response models with nonparametrically specified item characteristic curves is considered. Strict identifiability is achieved, with a fixed latent trait distribution, when only a single set of item characteristic curves can possibly generate the manifest distribution of the item responses. When item characteristic curves belong to a very general class, this property cannot be achieved. However, for assessments with many items, it is shown that all models for the manifest distribution have item characteristic curves that are very near one another and pointwise differences between them converge to zero at all values of the latent trait as the number of items increases. An upper bound for the rate at which this convergence takes place is given. The main result provides theoretical support to the practice of nonparametric item response modeling, by showing that models for long assessments have the property of asymptotic identifiability. The research was partially supported by the National Institutes of Health grant R01 CA81068-01.

9.
It is generally assumed that the latent trait is normally distributed in the population when estimating logistic item response theory (IRT) model parameters. This assumption requires that the latent trait be fully continuous and the population homogeneous (i.e., not a mixture). When this normality assumption is violated, models are misspecified, and item and person parameter estimates are inaccurate. When normality cannot be assumed, it might be appropriate to consider alternative modeling approaches: (a) a zero-inflated mixture, (b) a log-logistic, (c) a Ramsay curve, or (d) a heteroskedastic-skew model. The first two models were developed to address modeling problems associated with so-called quasi-continuous or unipolar constructs, which apply only to a subset of the population, or are meaningful at one end of the continuum only. The latter two models were developed to address non-normal latent trait distributions and violations of homogeneity of error variance, respectively. To introduce these alternative IRT models and illustrate their strengths and weaknesses, we performed a real-data application, comparing results to those from a graded response model. We review both statistical and theoretical challenges in applying these models and choosing among them. Future applications of these and other alternative models (e.g., unfolding, diffusion) are needed to advance understanding about model choice in particular situations.

10.
This article examines the potential contribution of latent trait models to the study of intelligence. Nontechnical introductions to both unidimensional and multidimensional latent trait models are given, and possible research applications are considered. Latent trait models are shown to resolve several measurement problems in studies of intellectual change, including ability modification studies and life-span development studies. Furthermore, under certain conditions, latent trait models are found useful for construct validation research, since they can represent an individual differences model of cognitive processing on ability test items. Multidimensional latent trait models are shown to be especially useful as processing models, because they can be used to test alternative multiple component theories of test item processing. Furthermore, multidimensional models can be used to decompose test item difficulty into component contributions and estimate individual differences in processing abilities.

11.
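Liu Hongyun (刘红云), Luo Fang (骆方). Acta Psychologica Sinica (心理学报), 2008, 40(1): 92-100
The authors briefly introduce multilevel item response models, examine the relationship between multilevel item response theory and conventional item response theory, derive the relationship between the parameters of multilevel item response models and those of conventional item response models, and discuss extensions of the multilevel item response model. Using a real example, they apply a multilevel item response model to analyze the characteristics of test items, to test the effects of individual-level and group-level predictors on the ability parameters, and to analyze differential item functioning. Finally, the article discusses the strengths and limitations of multilevel item response theory models.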

12.
A multinormal partial credit model for factor analysis of polytomously scored items with ordered response categories is derived using an extension of the Dutch Identity (Holland in Psychometrika 55:5–18, 1990). In the model, latent variables are assumed to have a multivariate normal distribution conditional on unweighted sums of item scores, which are sufficient statistics. Attention is paid to maximum likelihood estimation of item parameters, multivariate moments of latent variables, and person parameters. It is shown that the maximum likelihood estimates can be found without the use of numerical integration techniques. More general models are discussed which can be used for testing the model, and it is shown how models with different numbers of latent variables can be tested against each other. In addition, multi-group extensions are proposed, which can be used for testing both measurement invariance and latent population differences. Models and procedures discussed are demonstrated in an empirical data example.

13.
We explore the justification and formulation of a four-parameter item response theory model (4PM) and employ a Bayesian approach to successfully recover parameter estimates for items and respondents. For data generated using a 4PM item response model, overall fit is improved when using the 4PM rather than the 3PM or the 2PM. Furthermore, although estimated trait scores under the various models correlate almost perfectly, inferences at the high and low ends of the trait continuum are compromised, with poorer coverage of the confidence intervals when the wrong model is used. We also show in an empirical example that the 4PM can yield new insights into the properties of a widely used delinquency scale. We discuss the implications for building appropriate measurement models in education and psychology to model more accurately the underlying response process.
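
To make the difference between the 2PM, 3PM, and 4PM concrete, here is a minimal sketch of the usual logistic four-parameter item response function, with a lower asymptote c and an upper asymptote d. The function name `irf_4pm` and the symbols a, b, c, d are assumptions for this example; the abstract does not give the authors' exact notation, and their Bayesian estimation procedure is not reproduced.

```python
import math


def irf_4pm(theta: float, a: float, b: float, c: float, d: float) -> float:
    """Four-parameter logistic item response function (assumed form):

        P(Y = 1 | theta) = c + (d - c) / (1 + exp(-a * (theta - b)))

    a: discrimination, b: difficulty, c: lower asymptote, d: upper asymptote.
    Setting d = 1 gives the 3PM; setting c = 0 and d = 1 gives the 2PM.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # A hypothetical item whose endorsement probability never exceeds 0.9.
    for theta in (-3.0, 0.0, 3.0):
        print(theta, round(irf_4pm(theta, a=1.5, b=0.0, c=0.1, d=0.9), 4))
```

Because the curve flattens at d rather than at 1, the models differ most for respondents far above the item difficulty, which matches the abstract's remark that inferences at the ends of the trait continuum are affected most.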

14.
Edward H. Ip. Psychometrika, 2002, 67(3): 367-386
In this paper, we propose a class of locally dependent latent trait models for responses to psychological and educational tests. Typically, item response models treat an individual's multiple responses to stimuli as conditionally independent given the individual's latent trait. Here the focus is instead on models based on a family of conditional distributions, or kernel, that describes joint multiple item responses as a function of student latent trait, without assuming conditional independence. Specifically, we examine a hybrid kernel which comprises a component for one-way item response functions and a component for conditional associations between items given latent traits. The class of models allows the extension of item response theory to cover some new and innovative applications in psychological and educational research. An EM algorithm for marginal maximum likelihood of the hybrid kernel model is proposed. Furthermore, we delineate the relationship of the class of locally dependent models and the log-linear model by revisiting the Dutch identity (Holland, 1990). The work is supported by a research grant from the Marshall School of Business, University of Southern California. The author thanks the anonymous referees for their suggestions.

15.
A Rasch model for partial credit scoring   Total citations: 24, self-citations: 0, citations by others: 24
A unidimensional latent trait model for responses scored in two or more ordered categories is developed. This “Partial Credit” model is a member of the family of latent trait models which share the property of parameter separability and so permit “specifically objective” comparisons of persons and items. The model can be viewed as an extension of Andrich's Rating Scale model to situations in which ordered response alternatives are free to vary in number and structure from item to item. The difference between the parameters in this model and the “category boundaries” in Samejima's Graded Response model is demonstrated. An unconditional maximum likelihood procedure for estimating the model parameters is developed. Preparation of this paper was supported by grants from the Spencer Foundation and the National Institute for Justice. I would like to thank Professor Benjamin D. Wright of the University of Chicago for his very kind help with the various drafts of this paper.
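
The category probabilities of a partial credit model can be sketched as below, using the commonly cited form in which the log-odds of scoring x rather than x - 1 equals theta minus a step parameter delta_x. The function name `pcm_probs` and the argument names are assumptions for illustration, and the estimation procedure from the paper is not shown.

```python
import math


def pcm_probs(theta: float, deltas: list[float]) -> list[float]:
    """Category probabilities for one item under a partial credit model.

    theta:  person location on the latent trait.
    deltas: step parameters delta_1..delta_m (assumed names), so the item
            has m + 1 ordered score categories 0..m.
    P(X = x) is proportional to exp(sum_{k=1..x} (theta - delta_k)),
    with the empty sum for x = 0 equal to 0.
    """
    cumulative = [0.0]
    for delta in deltas:
        cumulative.append(cumulative[-1] + (theta - delta))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # A hypothetical four-category item (three steps) for a person at theta = 0.5.
    print([round(p, 4) for p in pcm_probs(0.5, [-1.0, 0.0, 1.0])])
```

With a single step parameter the function reduces to the dichotomous Rasch model, and the number of steps may differ from item to item, which is the flexibility the abstract contrasts with the Rating Scale model.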

16.
In between-item multidimensional item response models, it is often desirable to compare individual latent trait estimates across dimensions. These comparisons are only justified if the model dimensions are scaled relative to each other. Traditionally, this scaling is done using approaches such as standardization, that is, fixing the latent mean and standard deviation to 0 and 1 for all dimensions. However, approaches such as standardization do not guarantee that Rasch model properties hold across dimensions. Specifically, for between-item multidimensional Rasch family models, the unique ordering of items holds within dimensions, but not across dimensions. Previously, Feuerstahler and Wilson described the concept of scale alignment, which aims to enforce the unique ordering of items across dimensions by linearly transforming item parameters within dimensions. In this article, we extend the concept of scale alignment to the between-item multidimensional partial credit model and to models fit using incomplete data. We illustrate this method in the context of the Kindergarten Individual Development Survey (KIDS), a multidimensional survey of kindergarten readiness used in the state of Illinois. We also present simulation results that demonstrate the effectiveness of scale alignment in the context of polytomous item response models and missing data.

17.
For computer-administered tests, response times can be recorded conjointly with the corresponding responses. This broadens the scope of potential modelling approaches because response times can be analysed in addition to analysing the responses themselves. For this purpose, we present a new latent trait model for response times on tests. This model is based on the Cox proportional hazards model. According to this model, latent variables alter a baseline hazard function. Two different approaches to item parameter estimation are described: the first approach uses a variant of the Cox model for discrete time, whereas the second approach is based on a profile likelihood function. Properties of each estimator will be compared in a simulation study. Compared to the estimator for discrete time, the profile likelihood estimator is more efficient, that is, has smaller variance. Additionally, we show how the fit of the model can be evaluated and how the latent traits can be estimated. Finally, the applicability of the model to an empirical data set is demonstrated.

18.
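This study used the revised Chinese version of the Rosenberg Self-Esteem Scale (RSES-R) to examine how well the random intercept factor analysis model controls for item wording effects. A questionnaire combining the RSES-R and the Over-Claiming Questionnaire was administered to 621 secondary school students. The random intercept model showed good fit indices and reasonable factor variances and loadings, and the self-esteem factor scores correlated very highly with RSES-R total scores, indicating that the model can effectively separate trait and wording effects in RSES-R scores. The separated wording-effect factor scores correlated significantly but weakly with respondents' level of self-enhancement, indicating that wording effects share a common component with respondents' social desirability.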

19.
To prevent response bias, personality questionnaires may use comparative response formats. These include forced choice, where respondents choose among a number of items, and quantitative comparisons, where respondents indicate the extent to which items are preferred to each other. The present article extends Thurstonian modeling of binary choice data to “proportion-of-total” (compositional) formats. Following the seminal work of Aitchison, compositional item data are transformed into log ratios, conceptualized as differences of latent item utilities. The mean and covariance structure of the log ratios is modeled using confirmatory factor analysis (CFA), where the item utilities are first-order factors, and personal attributes measured by a questionnaire are second-order factors. A simulation study with two sample sizes, N = 300 and N = 1,000, shows that the method provides very good recovery of true parameters and near-nominal rejection rates. The approach is illustrated with empirical data from N = 317 students, comparing model parameters obtained with compositional and Likert-scale versions of a Big Five measure. The results show that the proposed model successfully captures the latent structures and person scores on the measured traits.
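
The log-ratio step described in this abstract can be illustrated with a small sketch. It assumes an additive log-ratio transformation with the last item in a block as the reference; the abstract does not say which log-ratio variant the authors use, and the function name `additive_log_ratios` is hypothetical. The CFA stage is not shown.

```python
import math


def additive_log_ratios(shares: list[float]) -> list[float]:
    """Transform one block of "proportion-of-total" responses into log ratios.

    shares: positive amounts a respondent allocated to each item in a block
            (for example, points out of 100). The last item is used as the
            reference, which is an assumption made for this sketch.
    Returns log(share_i / share_last) for every item except the last.
    """
    if any(share <= 0 for share in shares):
        raise ValueError("log ratios require strictly positive shares")
    reference = shares[-1]
    return [math.log(share / reference) for share in shares[:-1]]


if __name__ == "__main__":
    # A hypothetical respondent distributing 100 points over three items.
    print(additive_log_ratios([50.0, 25.0, 25.0]))
```

Each log ratio depends only on the relative sizes of two shares, so rescaling the whole block (points versus proportions) leaves the result unchanged; this is what allows the values to be read as differences between two latent utilities.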

20.
A central assumption that is implicit in estimating item parameters in item response theory (IRT) models is the normality of the latent trait distribution, whereas a similar assumption made in categorical confirmatory factor analysis (CCFA) models is the multivariate normality of the latent response variables. Violation of the normality assumption can lead to biased parameter estimates. Although previous studies have focused primarily on unidimensional IRT models, this study extended the literature by considering a multidimensional IRT model for polytomous responses, namely the multidimensional graded response model. Moreover, this study is one of the few studies that specifically compared the performance of full-information maximum likelihood (FIML) estimation versus robust weighted least squares (WLS) estimation when the normality assumption is violated. The research also manipulated the number of nonnormal latent trait dimensions. Results showed that FIML consistently outperformed WLS when there were one or multiple skewed latent trait distributions. More interestingly, the bias of the discrimination parameters was non-ignorable only when the corresponding factor was skewed. Having other skewed factors did not further exacerbate the bias, whereas biases of boundary parameters increased as more nonnormal factors were added. The item parameter standard errors were recovered well with both estimation algorithms regardless of the number of nonnormal dimensions.
