Similar literature
1.
The item response function (IRF) for a polytomously scored item is defined as a weighted sum of the item category response functions (ICRF, the probability of getting a particular score for a randomly sampled examinee of ability θ). This paper establishes the correspondence between an IRF and a unique set of ICRFs for two of the most commonly used polytomous IRT models (the partial credit models and the graded response model). Specifically, a proof of the following assertion is provided for these models: If two items have the same IRF, then they must have the same number of categories; moreover, they must consist of the same ICRFs. As a corollary, for the Rasch dichotomous model, if two tests have the same test characteristic function (TCF), then they must have the same number of items. Moreover, for each item in one of the tests, an item in the other test with an identical IRF must exist. Theoretical as well as practical implications of these results are discussed. This research was supported by Educational Testing Service Allocation Projects No. 79409 and No. 79413. The authors wish to thank John Donoghue, Ming-Mei Wang, Rebecca Zwick, and Zhiliang Ying for their useful comments and discussions. The authors also wish to thank three anonymous reviewers for their comments.
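
As a rough illustration of the definition in the abstract above, the following Python sketch computes the ICRFs of a partial credit item and then the IRF as the score-weighted sum of those ICRFs. The function names and the step difficulties are hypothetical, and category k is assumed to carry the score k; this is not taken from the paper itself.

```python
import math


def pcm_category_probs(theta, step_difficulties):
    """ICRFs of a partial credit item: P(score = k | theta) for k = 0..m."""
    # Exponent for category k is the sum over j <= k of (theta - delta_j);
    # the empty sum for k = 0 is 0.
    exponents = [0.0]
    for delta in step_difficulties:
        exponents.append(exponents[-1] + (theta - delta))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def item_response_function(theta, step_difficulties):
    """IRF: expected item score, i.e. the ICRFs weighted by category score."""
    probs = pcm_category_probs(theta, step_difficulties)
    return sum(k * p for k, p in enumerate(probs))


steps = [-1.0, 0.5]  # hypothetical step difficulties, for illustration only
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(item_response_function(theta, steps), 3))
```

With a single step difficulty the item reduces to a Rasch binary item, so the IRF at theta equal to that difficulty is 0.5.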

2.
Person-fit statistics have been proposed to investigate the fit of an item score pattern to an item response theory (IRT) model. The author investigated how these statistics can be used to detect different types of misfit. Intelligence test data were analyzed using person-fit statistics in the context of the G. Rasch (1960) model and R. J. Mokken's (1971, 1997) IRT models. The effect of the choice of an IRT model to detect misfitting item score patterns and the usefulness of person-fit statistics for diagnosis of misfit are discussed. Results showed that different types of person-fit statistics can be used to detect different kinds of person misfit. Parametric person-fit statistics had more power than nonparametric person-fit statistics.

3.
Assessing item fit for unidimensional item response theory models for dichotomous items has always been an issue of enormous interest, but there exists no unanimously agreed item fit diagnostic for these models, and hence there is room for further investigation of the area. This paper employs the posterior predictive model‐checking method, a popular Bayesian model‐checking tool, to examine item fit for the above‐mentioned models. An item fit plot, comparing the observed and predicted proportion‐correct scores of examinees with different raw scores, is suggested. This paper also suggests how to obtain posterior predictive p‐values (which are natural Bayesian p‐values) for the item fit statistics of Orlando and Thissen that summarize numerically the information in the above‐mentioned item fit plots. A number of simulation studies and a real data application demonstrate the effectiveness of the suggested item fit diagnostics. The suggested techniques seem to have adequate power and reasonable Type I error rate, and psychometricians will find them promising.

4.
We present a semi-parametric approach to estimating item response functions (IRF) useful when the true IRF does not strictly follow commonly used functions. Our approach replaces the linear predictor of the generalized partial credit model with a monotonic polynomial. The model includes the regular generalized partial credit model at the lowest order polynomial. Our approach extends Liang’s (A semi-parametric approach to estimate IRFs, Unpublished doctoral dissertation, 2007) method for dichotomous item responses to the case of polytomous data. Furthermore, item parameter estimation is implemented with maximum marginal likelihood using the Bock–Aitkin EM algorithm, thereby facilitating multiple group analyses useful in operational settings. Our approach is demonstrated on both educational and psychological data. We present simulation results comparing our approach to more standard IRF estimation approaches and other non-parametric and semi-parametric alternatives.

5.
This paper discusses the application of a class of Rasch models to situations where test items are grouped into subsets and the common attributes of items within these subsets bring into question the usual assumption of conditional independence. The models are all expressed as particular cases of the random coefficients multinomial logit model developed by Adams and Wilson. This formulation allows a very flexible approach to the specification of alternative models, and makes model testing particularly straightforward. The use of the models is illustrated using item bundles constructed in the framework of the SOLO taxonomy of Biggs and Collis. The work of both authors was supported by fellowships from the National Academy of Education Spencer Fellowship.

6.
Wendy M. Yen 《Psychometrika》1985,50(4):399-410
When the three-parameter logistic model is applied to tests covering a broad range of difficulty, there frequently is an increase in mean item discrimination and a decrease in variance of item difficulties and traits as the tests become more difficult. To examine the hypothesis that this unexpected scale shrinkage effect occurs because the items increase in complexity as they increase in difficulty, an approximate relationship is derived between the unidimensional model used in data analysis and a multidimensional model hypothesized to be generating the item responses. Scale shrinkage is successfully predicted for several sets of simulated data. The author is grateful to Robert Mislevy for kindly providing a copy of his computer program, RESOLVE.

7.
For each Rasch (Masters) partial credit item, there exists a set of independent Rasch binary and indecomposable trinary items for which the sum of the scores and the partial credit score have identical probability density functions. If each indecomposable trinary item is further expressed as the sum of two binary items, then the binary items are positively dependent and cannot be both of the Rasch type. This paper was written while the author was working with Steve Ferrara and Hillary Michaels on some technical aspects of the Maryland School Performance Assessment Program. The author had been puzzled by the fact that most MSPAP assessment items have three or fewer score categories. With a psychometric justification now being apparent, this paper is dedicated to both of them.

8.
Differential item functioning (DIF), referring to between-group variation in item characteristics above and beyond the group-level disparity in the latent variable of interest, has long been regarded as an important item-level diagnostic. The presence of DIF impairs the fit of the single-group item response model being used, and calls for either model modification or item deletion in practice, depending on the mode of analysis. Methods for testing DIF with continuous covariates, rather than categorical grouping variables, have been developed; however, they are restrictive in parametric forms, and thus are not sufficiently flexible to describe complex interaction among latent variables and covariates. In the current study, we formulate the probability of endorsing each test item as a general bivariate function of a unidimensional latent trait and a single covariate, which is then approximated by a two-dimensional smoothing spline. The accuracy and precision of the proposed procedure are evaluated via Monte Carlo simulations. If anchor items are available, we propose an extended model that simultaneously estimates item characteristic functions (ICFs) for anchor items, ICFs conditional on the covariate for non-anchor items, and the latent variable density conditional on the covariate—all using regression splines. A permutation DIF test is developed, and its performance is compared to the conventional parametric approach in a simulation study. We also illustrate the proposed semiparametric DIF testing procedure with an empirical example.

9.
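For dual-objective CD-CAT, six item discrimination indices (the discrimination index D, the general discrimination index GDI, the odds ratio OR, the 2PL discrimination parameter a, the attribute discrimination index ADI, and the cognitive diagnostic discrimination index CDI) were each combined with the IPA method to obtain new item selection strategies. A simulation study compared their performance and also examined how well discrimination-based stratification controls item exposure. The results showed that all of the new methods clearly improved both the correct classification rate of knowledge states and the precision of ability estimation, and that every stratified selection strategy substantially improved item pool utilization. Overall, OR weighting significantly improved measurement precision, and OR-stratified item selection significantly improved the uniformity of item exposure while maintaining measurement precision.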

10.
11.
This study sought to apply an item parcelling approach to confirm the factor structure of trust in the direct supervisor as measured by the trust relationship audit (TRA). The researchers analysed an existing data set on the TRA from 9 060 South African employees. For the analysis, the researchers utilised structural equation modelling, using item parcelling to confirm the factor structure. The results confirm that, in essence, the large sample structural model replicates the original small sample model, consisting of separate personality and managerial practices factors as antecedents of trust in supervisors. Two items measuring personality traits loaded differently in the small and the combined sample. The results suggest item parcelling to be a value-add in measure validation when data mining.

12.
The simultaneous and nonparametric estimation of latent abilities and item characteristic curves is considered. The asymptotic properties of ordinal ability estimation and kernel smoothed nonparametric item characteristic curve estimation are investigated under very general assumptions on the underlying item response theory model as both the test length and the sample size increase. A large deviation probability inequality is stated for ordinal ability estimation. The mean squared error of kernel smoothed item characteristic curve estimates is studied and a strong consistency result is obtained showing that the worst case error in the item characteristic curve estimates over all items and ability levels converges to zero with probability equal to one.
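
The two ingredients named in the abstract above, ordinal ability estimation and kernel smoothing of item responses, might be sketched in Python as follows. This is only a minimal illustration under stated assumptions: abilities are taken to be standard normal quantiles of total-score ranks, the kernel is Gaussian, and the function names and bandwidth are hypothetical. None of these choices is confirmed by the abstract.

```python
import math
from statistics import NormalDist


def rank_based_abilities(total_scores):
    """Ordinal ability estimates: normal quantiles of total-score ranks.

    Assumption: ties are broken by order of appearance (stable sort).
    """
    n = len(total_scores)
    order = sorted(range(n), key=lambda i: total_scores[i])
    abilities = [0.0] * n
    for rank, i in enumerate(order, start=1):
        abilities[i] = NormalDist().inv_cdf(rank / (n + 1))
    return abilities


def kernel_smoothed_icc(abilities, responses, grid, bandwidth):
    """Nadaraya-Watson estimate of P(correct | ability) at each grid point."""
    estimates = []
    for t in grid:
        # Gaussian kernel weight of every examinee relative to grid point t.
        weights = [math.exp(-0.5 * ((t - a) / bandwidth) ** 2) for a in abilities]
        total = sum(weights)
        estimates.append(sum(w * y for w, y in zip(weights, responses)) / total)
    return estimates


# Hypothetical data: total scores and 0/1 responses to one item.
totals = [2, 9, 5, 7, 3, 8]
item = [0, 1, 0, 1, 0, 1]
theta_hat = rank_based_abilities(totals)
print(kernel_smoothed_icc(theta_hat, item, grid=[-1.0, 0.0, 1.0], bandwidth=0.5))
```

Because each estimate is a weighted average of 0/1 responses, it always lies between 0 and 1; the bandwidth trades smoothness against local fidelity.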

13.
This article describes the functions of a SAS macro and an SPSS syntax that produce common statistics for conventional item analysis including Cronbach’s alpha, item difficulty index (p-value or item mean), and item discrimination indices (D-index, point biserial and biserial correlations for dichotomous items and item-total correlation for polytomous items). These programs represent an improvement over the existing SAS and SPSS item analysis routines in terms of completeness and user-friendliness. To promote routine evaluations of item qualities in instrument development of any scale, the programs are available at no charge for interested users. The program codes along with a brief user’s manual that contains instructions and examples are downloadable from suen.ed.psu.edu/~pwlei/plei.htm.
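
Several of the statistics listed in the abstract above have short textbook definitions. The Python sketch below computes Cronbach's alpha, the item difficulty index (item mean), and an uncorrected point-biserial item-total correlation. It is not the SAS macro or SPSS syntax described in the article; the function names and the score matrix are hypothetical, and the total score is assumed to include the item being correlated.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def cronbach_alpha(scores):
    """scores: one row per examinee, one column per item."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    item_var_sum = sum(sample_variance(col) for col in zip(*scores))
    return k / (k - 1) * (1 - item_var_sum / sample_variance(totals))


def item_difficulty(scores):
    """Item mean; for 0/1 items this is the proportion correct."""
    return [mean(col) for col in zip(*scores)]


def point_biserial(scores):
    """Uncorrected correlation of each 0/1 item with the total score."""
    totals = [sum(row) for row in scores]
    return [pearson(col, totals) for col in zip(*scores)]


# Hypothetical responses of six examinees to four dichotomous items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(cronbach_alpha(scores))
print(item_difficulty(scores))
print(point_biserial(scores))
```

An item with zero variance would make the correlation undefined, so a production routine would need to guard against that case.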

14.
Huynh Huynh 《Psychometrika》1994,59(1):111-119
Given a Masters partial credit item with n known step difficulties, conditions are stated for the existence of a set of (locally) independent Rasch binary items such that their raw score and the partial credit raw score have identical probability density functions. The conditions are those for the existence of n positive values with predetermined elementary symmetric functions and include the requirement that the n step difficulties form an increasing sequence.
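
The role of elementary symmetric functions in the abstract above can be checked numerically in the easy direction: starting from independent Rasch binary items and recovering the step difficulties of the equivalent partial credit item. The Python sketch below assumes the standard parameterizations, with P(correct) = 1 / (1 + exp(-(theta - b))) for a binary item and category k of the partial credit item proportional to exp(k * theta - (delta_1 + ... + delta_k)). Writing eps_i = exp(-b_i) and e_k for the k-th elementary symmetric function of the eps_i, matching the two distributions gives delta_k = log(e_{k-1} / e_k). The function names and difficulty values are hypothetical.

```python
import math


def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_n] for the given values."""
    e = [1.0] + [0.0] * len(values)
    for count, v in enumerate(values, start=1):
        # Update from high order to low so each value is used once per term.
        for k in range(count, 0, -1):
            e[k] += v * e[k - 1]
    return e


def implied_step_difficulties(binary_difficulties):
    """Step difficulties of the partial credit item matching the raw score."""
    e = elementary_symmetric([math.exp(-b) for b in binary_difficulties])
    return [math.log(e[k - 1] / e[k]) for k in range(1, len(e))]


def sum_score_distribution(theta, binary_difficulties):
    """Raw-score distribution of independent Rasch binary items."""
    dist = [1.0]
    for b in binary_difficulties:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)
            new[score + 1] += prob * p
        dist = new
    return dist


def pcm_distribution(theta, steps):
    """Category probabilities of a partial credit item."""
    exponents = [0.0]
    for delta in steps:
        exponents.append(exponents[-1] + (theta - delta))
    weights = [math.exp(x) for x in exponents]
    total = sum(weights)
    return [w / total for w in weights]


binary = [-0.5, 0.3, 1.2]  # hypothetical Rasch binary difficulties
steps = implied_step_difficulties(binary)
print(steps)
print(sum_score_distribution(0.4, binary))
print(pcm_distribution(0.4, steps))
```

The recovered step difficulties come out in increasing order, in line with the requirement stated in the abstract. The harder converse, finding binary items for given step difficulties, amounts to finding positive roots of a polynomial and is not attempted here.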

15.
The use of multidimensional forced-choice (MFC) items to assess non-cognitive traits such as personality, interests and values in psychological tests has a long history, because MFC items show strengths in preventing response bias. Recently, there has been a surge of interest in developing item response theory (IRT) models for MFC items. However, nearly all of the existing IRT models have been developed for MFC items with binary scores. Real tests use MFC items with more than two categories; such items are more informative than their binary counterparts. This study developed a new IRT model for polytomous MFC items based on the cognitive model of choice, which describes the cognitive processes underlying humans' preferential choice behaviours. The new model is unique in its ability to account for the ipsative nature of polytomous MFC items, to assess individual psychological differentiation in interests, values and emotions, and to compare the differentiation levels of latent traits between individuals. Simulation studies were conducted to examine the parameter recovery of the new model with existing computer programs. The results showed that both statement parameters and person parameters were well recovered when the sample size was sufficient. The more complete the linking of the statements was, the more accurate the parameter estimation was. This paper provides an empirical example of a career interest test using four-category MFC items. Although some aspects of the model (e.g., the nature of the person parameters) require additional validation, our approach appears promising.

16.
The identifiability of item response models with nonparametrically specified item characteristic curves is considered. Strict identifiability is achieved, with a fixed latent trait distribution, when only a single set of item characteristic curves can possibly generate the manifest distribution of the item responses. When item characteristic curves belong to a very general class, this property cannot be achieved. However, for assessments with many items, it is shown that all models for the manifest distribution have item characteristic curves that are very near one another and pointwise differences between them converge to zero at all values of the latent trait as the number of items increases. An upper bound for the rate at which this convergence takes place is given. The main result provides theoretical support to the practice of nonparametric item response modeling, by showing that models for long assessments have the property of asymptotic identifiability. The research was partially supported by the National Institute of Health grant R01 CA81068-01.

17.
Applications of signal detection theory (SDT) often involve presentations of different items on each trial, such as slides in a medical imaging study or words in a memory study. If factors particular to the items themselves, apart from being a signal or noise, affect observers’ responses, then ‘item effects’ are present. One way to model these effects is to use a latent continuous variable as an item ‘factor’, such as item ‘difficulty’. Details of SDT models with item effects are clarified via derivations of their implied conditional means, variances, and covariances. Intra-item correlations are defined and suggested as measures of the magnitude of item effects. The SDT-item models are simple random coefficient models and can be fit with standard software. More general models, such as item models with mixing and/or with random observer effects, are also considered.

18.
The aim of the current study was to reduce the number of items in the 48-item hypomanic personality scale (HPS) and determine whether a unidimensional scale of the hypomanic trait could be derived. Previously collected HPS data from University students (n = 318) were applied to the Rasch model (one-parameter item response theory). Overall scale and individual item fit statistics were used to judge fit to the model and item maps employed to determine coverage of the trait. Cronbach’s Alpha and correlations with other questionnaires pre- and post-item reduction were evaluated. Rasch analysis indicated that the original HPS was not unidimensional, had significant redundancy and differential item functioning by age and gender. An iterative process of item reduction produced a 20-item HPS (HPS-20) that retained the concepts of the original HPS and had excellent fit to the Rasch model (χ² p = 0.27). Unidimensionality of the HPS-20 was confirmed. The traditional psychometric properties of the HPS-20 and coverage of the underlying hypomanic construct were similar to the original. It was possible to derive a unidimensional measure of the hypomanic trait. Further use of the HPS-20 is encouraged as it may increase understanding of the risk factors for affective disorders.

19.
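杨向东 (Yang Xiangdong) 《心理学报》 (Acta Psychologica Sinica) 2010,42(7):802-812
In automatic item generation (AIG), item parameters are predicted from the set of stimulus features specified by the cognitive item design, so their sources of uncertainty are more complex than those of parameters calibrated from empirical data. Through an empirical study, this article analyzes how precisely ability parameters are estimated under computerized adaptive testing when predicted item parameters are used for Abstract Reasoning Test (ART) items generated with the cognitive design system approach. The results show that the predicted item parameters are distributed more centrally than the corresponding calibrated parameters. This regression effect both affects the size of the ability estimation error and leads to differences in item selection during adaptive testing. After controlling for differences in item selection, the ability estimation error was larger than that obtained with calibrated item parameters, but the difference was not pronounced. The two sets of ability estimates were highly correlated, and the differences between corresponding ability values were very small across almost the entire range of the ability distribution.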

20.
Many authors have demonstrated for idealized item configurations that equal item weights are often virtually as good for a particular predictive purpose as the item weights that are theoretically optimal. What has not been heretofore clear, however, is what happens to the similarity between weighted and unweighted composites of the same items when the item configuration's variance structure is complex.
