11.

Hierarchical Diagnostic Classification Models Morphing into Unidimensional ‘Diagnostic’ Classification Models—A Commentary

Matthias von Davier, Shelby J. Haberman 《Psychometrika》2014,79(2):340-346

This commentary addresses the modeling and final analytical path taken, as well as the terminology used, in the paper “Hierarchical diagnostic classification models: a family of models for estimating and testing attribute hierarchies” by Templin and Bradshaw (Psychometrika, doi:10.1007/s11336-013-9362-0, 2013). It raises several issues concerning use of cognitive diagnostic models that either assume attribute hierarchies or assume a certain form of attribute interactions. The issues raised are illustrated with examples, and references are provided for further examination.

12.

Assessing Item Fit for Unidimensional Item Response Theory Models Using Residuals from Estimated Item Response Functions

Shelby J. Haberman, Sandip Sinharay, Kyong Hee Chon 《Psychometrika》2013,78(3):417-440

Residual analysis (e.g., Hambleton & Swaminathan, Item response theory: principles and applications, Kluwer Academic, Boston, 1985; Hambleton, Swaminathan, & Rogers, Fundamentals of item response theory, Sage, Newbury Park, 1991) is a popular method to assess fit of item response theory (IRT) models. We suggest a form of residual analysis that may be applied to assess item fit for unidimensional IRT models. The residual analysis consists of a comparison of the maximum-likelihood estimate of the item characteristic curve with an alternative ratio estimate of the item characteristic curve. The large sample distribution of the residual is proved to be standardized normal when the IRT model fits the data. We compare the performance of our suggested residual to the standardized residual of Hambleton et al. (Fundamentals of item response theory, Sage, Newbury Park, 1991) in a detailed simulation study. We then calculate our suggested residuals using data from an operational test. The residuals appear to be useful in assessing the item fit for unidimensional IRT models.

13.

Does subgroup membership information lead to better estimation of true subscores?

Shelby J. Haberman, Sandip Sinharay 《The British journal of mathematical and statistical psychology》2013,66(3):452-469

Haberman (2008) suggested a method to determine if subtest scores have added value over the total score. The method is based on classical test theory and considers the estimation of the true subscores. Performance of subgroups, for example, those based on gender or ethnicity, on subtests is often of interest. Researchers such as Stricker (1993) and Livingston and Rupp (2004) found that the difference in performance between the subgroups often varies over the different subtests. We suggest a method to examine whether the knowledge of the subgroup membership of the examinees leads to a better estimation of the true subscores. We apply our suggested method to data from two operational testing programmes. The knowledge of the subgroup membership of the examinees does not lead to a better estimation of the true subscore for the data sets.

14.

Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Sandip Sinharay, Gautam Puhan, Shelby J. Haberman 《Multivariate behavioral research》2013,48(3):553-573

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as is the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting in educational testing. The existing methods for diagnostic score reporting are discussed. A recent method (Haberman, 2008a) that examines if diagnostic scores are worth reporting is reviewed. It is demonstrated, using results from operational and simulated data, that diagnostic scores have to be based on a sufficient number of items and have to be sufficiently distinct from each other to be worth reporting and that several operationally reported subscores are actually not worth reporting. Several recommendations are made for those interested in reporting diagnostic scores for educational tests.

15.

Limits on Log Odds Ratios for Unidimensional Item Response Theory Models

Shelby J. Haberman, Paul W. Holland, Sandip Sinharay 《Psychometrika》2007,72(4):551-561

Bounds are established for log odds ratios (log cross-product ratios) involving pairs of items for item response models. First, expressions for bounds on log odds ratios are provided for one-dimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model. Results are also illustrated through an example from a study of model-checking procedures. The bounds obtained can provide an elementary basis for assessment of goodness of fit of these models. Any opinions expressed in this publication are those of the authors and not necessarily those of the Educational Testing Service. The authors thank Dan Eignor, Matthias von Davier, Lydia Gladkova, Brian Junker, and the three anonymous reviewers for their invaluable advice. The authors gratefully acknowledge the help of Kim Fryer with proofreading.
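
For readers unfamiliar with the quantity the abstract bounds, the following is a minimal sketch of how a sample log odds ratio (log cross-product ratio) for one pair of dichotomously scored items might be computed. It illustrates only the definition of the statistic, not the bounds derived in the paper; the function name `log_odds_ratio` and the toy response vectors are hypothetical.

```python
import math


def log_odds_ratio(item_i, item_j):
    """Sample log cross-product ratio for two 0/1 item-score sequences.

    Hypothetical helper for illustration only; it does not implement
    the bounds discussed in the abstract.
    """
    if len(item_i) != len(item_j):
        raise ValueError("both items need one score per examinee")

    # counts[a][b] = number of examinees scoring a on item i and b on item j
    counts = [[0, 0], [0, 0]]
    for a, b in zip(item_i, item_j):
        counts[a][b] += 1

    # The log odds ratio is undefined when any cell of the 2x2 table is empty.
    if min(counts[0] + counts[1]) == 0:
        raise ValueError("empty cell in the 2x2 table")

    return math.log(
        (counts[1][1] * counts[0][0]) / (counts[1][0] * counts[0][1])
    )


# Toy data (assumed, not from the paper): eight examinees, two items.
scores_i = [1, 1, 1, 0, 0, 0, 1, 0]
scores_j = [1, 1, 0, 0, 0, 1, 1, 0]
value = log_odds_ratio(scores_i, scores_j)
```

In a goodness-of-fit check of the kind the abstract describes, a value like this would be compared against the model-implied bounds, which are given in the paper itself and are not reproduced here.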

16.

Investigating Test-Taking Behaviors Using Timing and Process Data

Yi-Hsuan Lee, Shelby J. Haberman 《International Journal of Testing》2016,16(3):240-267

The use of computer-based assessments makes it possible to collect detailed data that capture examinees’ progress through the tests and the time spent on individual actions. This article presents a study using process and timing data to aid understanding of an international language assessment and the examinees. Issues regarding test-taking strategies, test speededness, test design, and their relationship to examinees’ demographic backgrounds and performance are also discussed.

17.

Reporting of Subscores Using Multidimensional Item Response Theory

Shelby J. Haberman, Sandip Sinharay 《Psychometrika》2010,75(2):209-227

Recently, there has been increasing interest in reporting subscores. This paper examines reporting of subscores using multidimensional item response theory (MIRT) models (e.g., Reckase in Appl. Psychol. Meas. 21:25–36, 1997; C.R. Rao and S. Sinharay (Eds), Handbook of Statistics, vol. 26, pp. 607–642, North-Holland, Amsterdam, 2007; Beguin & Glas in Psychometrika, 66:471–488, 2001). A MIRT model is fitted using a stabilized Newton–Raphson algorithm (Haberman in The Analysis of Frequency Data, University of Chicago Press, Chicago, 1974; Sociol. Methodol. 18:193–211, 1988) with adaptive Gauss–Hermite quadrature (Haberman, von Davier, & Lee in ETS Research Rep. No. RR-08-45, ETS, Princeton, 2008). A new statistical approach is proposed to assess when subscores using the MIRT model have any added value over (i) the total score or (ii) subscores based on classical test theory (Haberman in J. Educ. Behav. Stat. 33:204–229, 2008; Haberman, Sinharay, & Puhan in Br. J. Math. Stat. Psychol. 62:79–95, 2008). The MIRT-based methods are applied to several operational data sets. The results show that the subscores based on MIRT are slightly more accurate than subscore estimates derived by classical test theory.

18.

Relation of social competence to scores on two scales of psychosis proneness

M C Haberman, L J Chapman, J S Numbers, R M McFall 《Journal of abnormal psychology》1979,88(6):675-677

19.

Flexible target templates improve visual search accuracy for faces depicting emotion

Bo-Yeong Won, Jason Haberman, Eliza Bliss-Moreau, Joy J. Geng 《Attention, perception & psychophysics》2020,82(6):2909-2923

Theories of visual attention hypothesize that target selection depends upon matching visual inputs to a memory representation of the target – i.e., the target or attentional template. Most theories assume that the template contains a veridical copy of target features, but recent studies suggest that target representations may shift "off veridical" from actual target features to increase target-to-distractor distinctiveness. However, these studies have been limited to simple visual features (e.g., orientation, color), which leaves open the question of whether similar principles apply to complex stimuli, such as a face depicting an emotion, the perception of which is known to be shaped by conceptual knowledge. In three studies, we find confirmatory evidence for the hypothesis that attention modulates the representation of an emotional face to increase target-to-distractor distinctiveness. This occurs over-and-above strong pre-existing conceptual and perceptual biases in the representation of individual faces. The results are consistent with the view that visual search accuracy is determined by the representational distance between the target template in memory and distractor information in the environment, not the veridical target and distractor features.