Reliability measures in item response theory: Manifest versus latent correlation functions |
| |
Authors: | Elasma Milanzi, Geert Molenberghs, Ariel Alonso, Geert Verbeke, Paul De Boeck |
| |
Affiliation: | 1. Interuniversity Institute for Biostatistics and statistical Bioinformatics, Universiteit Hasselt, Diepenbeek, Belgium; 2. Interuniversity Institute for Biostatistics and statistical Bioinformatics, Katholieke Universiteit Leuven, Belgium; 3. Department of Psychology, Higher Cognition and Individual Differences, Katholieke Universiteit Leuven, Belgium; 4. Universiteit van Amsterdam, the Netherlands |
| |
Abstract: | For item response theory (IRT) models, which belong to the class of generalized linear or non‐linear mixed models, reliability at the scale of observed scores (i.e., manifest correlation) is more difficult to calculate than latent correlation based reliability, but is usually of greater scientific interest. The difficulty arises not least because the manifest correlation cannot be calculated explicitly when the logit link is used in conjunction with normal random effects. As a result, approximations such as Fisher's information coefficient, Cronbach's α, or the latent correlation are calculated instead, allegedly because they are easy to compute. Cronbach's α has well‐known and serious drawbacks, Fisher's information is not meaningful under certain circumstances, and there is an important but often overlooked difference between latent and manifest correlations. Here, manifest correlation refers to the correlation between observed scores, while latent correlation refers to the correlation between scores on the latent (e.g., logit or probit) scale. Using one in place of the other can therefore lead to erroneous conclusions. Taylor series based reliability measures, which are based on manifest correlation functions, are derived, and a careful comparison is carried out of reliability measures based on latent correlations, Fisher's information, and exact reliability. The latent correlations are virtually always considerably higher than their manifest counterparts, Fisher's information measure shows no coherent behaviour (it is even negative in some cases), while the newly introduced Taylor series based approximations reflect the exact reliability very closely. Comparisons among the various types of correlations, for various IRT models, are made using algebraic expressions, Monte Carlo simulations, and data analysis. Given their light computational burden and good performance, the use of Taylor series based reliability measures is recommended. |
| |
Keywords: | one‐parameter logistic model; two‐parameter logistic model; logit link; probit link; Rasch model |
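The distinction the abstract draws between latent and manifest correlation can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes a random-intercept logistic (Rasch-type) model with two binary items, a normal ability with variance `sigma2`, and hypothetical item difficulties `beta1` and `beta2`. The latent correlation is taken as the usual logit-scale intraclass correlation, sigma2 / (sigma2 + pi^2/3), which is an assumption about the convention rather than something stated in the abstract. The manifest correlation is obtained by plain numerical integration over the ability distribution; the function names are illustrative only.

```python
import math


def latent_correlation(sigma2):
    """Logit-scale intraclass correlation (assumed convention):
    random-effect variance over total latent variance, where the
    standard logistic residual has variance pi^2 / 3."""
    return sigma2 / (sigma2 + math.pi ** 2 / 3.0)


def _normal_expectation(f, sigma, n=4001, width=8.0):
    """Approximate E[f(theta)] for theta ~ N(0, sigma^2) using the
    trapezoidal rule on [-width*sigma, width*sigma]."""
    lo = -width * sigma
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        t = lo + k * h
        dens = math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * f(t) * dens
    return total * h


def manifest_correlation(sigma2, beta1=0.0, beta2=0.0):
    """Correlation between two observed binary item scores that are
    conditionally independent given the ability theta."""
    sigma = math.sqrt(sigma2)

    def prob(t, b):
        # Rasch-type success probability on the observed scale
        return 1.0 / (1.0 + math.exp(-(t - b)))

    m1 = _normal_expectation(lambda t: prob(t, beta1), sigma)
    m2 = _normal_expectation(lambda t: prob(t, beta2), sigma)
    m12 = _normal_expectation(lambda t: prob(t, beta1) * prob(t, beta2), sigma)
    cov = m12 - m1 * m2  # Cov(Y1, Y2) = Cov(p1(theta), p2(theta))
    return cov / math.sqrt(m1 * (1.0 - m1) * m2 * (1.0 - m2))


if __name__ == "__main__":
    for s2 in (0.5, 1.0, 2.0, 4.0):
        print(s2, round(latent_correlation(s2), 4), round(manifest_correlation(s2), 4))
```

Under these assumptions the manifest value comes out below the latent one for each variance tried, which is in line with the abstract's remark that latent correlations are virtually always higher than their manifest counterparts. The sketch covers only a pair of items under one simple model and does not implement the Taylor series based measures.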