
1.

Asymptotically Correct Standardization of Person-Fit Statistics Beyond Dichotomous Items

Sandip Sinharay 《Psychometrika》2016,81(4):992-1013

The $l_z$ statistic (Drasgow et al. in Br J Math Stat Psychol 38:67–86, 1985) is one of the most popular person-fit statistics (Armstrong et al. in Pract Assess Res Eval 12(16):1–10, 2007). Snijders (Psychometrika 66:331–342, 2001) derived the asymptotic null distribution of $l_z$ when the examinee ability parameter is estimated. He also suggested the $l^*_z$ statistic, which is the asymptotically correct standardized version of $l_z$. However, Snijders (Psychometrika 66:331–342, 2001) only considered tests with dichotomous items. In this paper, the asymptotic null distribution of $l_z$ is derived for mixed-format tests (those that include both dichotomous and polytomous items). The asymptotically correct standardized version of $l_z$, which can be considered as the extension of $l^*_z$ to such tests, is suggested. The Type I error rate and power of the suggested statistic are examined from several simulated datasets. The suggested statistic is computed using a real dataset. The suggested statistic appears to be a satisfactory tool for assessing person fit for mixed-format tests.
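As a point of reference for the abstract above, the following is a minimal sketch of the classical $l_z$ statistic for dichotomous items only. It assumes the success probabilities are already known (for example, evaluated from an item response model at a given ability), so it does not include the correction for an estimated ability that $l^*_z$ provides, and it is not the mixed-format extension proposed in the paper. The function name `lz_statistic` and its arguments are hypothetical.

```python
import math


def lz_statistic(responses, probs):
    """Classical l_z for dichotomous items (illustrative sketch).

    responses: list of 0/1 item scores for one examinee.
    probs:     model-implied probabilities of a correct response,
               each strictly between 0 and 1 (assumed known here).
    """
    log_lik = 0.0    # observed log-likelihood l_0
    expected = 0.0   # E(l_0) under the model
    variance = 0.0   # Var(l_0) under the model
    for u, p in zip(responses, probs):
        q = 1.0 - p
        log_lik += u * math.log(p) + (1 - u) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    # Note: variance is zero if every probability equals 0.5.
    return (log_lik - expected) / math.sqrt(variance)
```

Large negative values indicate a response pattern that is less likely than the model expects, which is the usual signal of person misfit.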

2.

The mixed model for multivariate repeated measures: validity conditions and an approximate test

Robert J. Boik 《Psychometrika》1988,53(4):469-486

Repeated measures on multivariate responses can be analyzed according to either of two models: a doubly multivariate model (DMM) or a multivariate mixed model (MMM). This paper reviews both models and gives three new results concerning the MMM. The first result is, primarily, of theoretical interest; the second and third have implications for practice. First, it is shown that, given multivariate normality, a condition called multivariate sphericity of the covariance matrix is both necessary and sufficient for the validity of the MMM analysis. To test for departure from multivariate sphericity, the likelihood ratio test can be employed. The second result is an approximation to the null distribution of the likelihood ratio test statistic, useful for moderate sample sizes. Third, for situations satisfying multivariate normality, but not multivariate sphericity, a multivariate correction factor is derived. The correction factor generalizes Box's and can be used to construct an adjusted MMM test. I am grateful to an anonymous referee for carefully attending to the mathematical details of this paper.

3.

Modified Distribution-Free Goodness-of-Fit Test Statistic

So Yeon Chun Michael W. Browne Alexander Shapiro 《Psychometrika》2018,83(1):48-66

Covariance structure analysis and its structural equation modeling extensions have become one of the most widely used methodologies in social sciences such as psychology, education, and economics. An important issue in such analysis is to assess the goodness of fit of a model under analysis. One of the most popular test statistics used in covariance structure analysis is the asymptotically distribution-free (ADF) test statistic introduced by Browne (Br J Math Stat Psychol 37:62–83, 1984). The ADF statistic can be used to test models without any specific distribution assumption (e.g., multivariate normal distribution) of the observed data. Despite its advantage, it has been shown in various empirical studies that unless sample sizes are extremely large, this ADF statistic could perform very poorly in practice. In this paper, we provide a theoretical explanation for this phenomenon and further propose a modified test statistic that improves the performance in samples of realistic size. The proposed statistic deals with the possible ill-conditioning of the involved large-scale covariance matrices.

4.

A latent-trait based reliability estimate and upper bound

W. Alan Nicewander 《Psychometrika》1990,55(1):65-74

An estimate and an upper-bound estimate for the reliability of a test composed of binary items is derived from the multidimensional latent trait theory proposed by Bock and Aitkin (1981). The estimate derived here is similar to internal consistency estimates (such as coefficient alpha) in that it is a function of the correlations among test items; however, it is not a lower-bound estimate as are all other similar methods. An upper bound to reliability that is less than unity does not exist in the context of classical test theory. The richer theoretical background provided by Bock and Aitkin's latent trait model has allowed the development of an index (called here) that is always greater-than or equal-to the reliability coefficient for a test (and is less-than or equal-to one). The upper bound estimate of reliability has practical uses, one of which makes use of the greatest lower bound.

5.

A stochastic multidimensional scaling procedure for the spatial representation of three-mode, three-way pick any/J data

Kamel Jedidi Wayne S. DeSarbo 《Psychometrika》1991,56(3):471-494


6.

Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF

R. Philip Chalmers 《Psychometrika》2018,83(2):376-386

This paper demonstrates that, after applying a simple modification to Li and Stout’s (Psychometrika 61(4):647–677, 1996) CSIBTEST statistic, an improved variant of the statistic could be realized. It is shown that this modified version of CSIBTEST has a more direct association with the SIBTEST statistic presented by Shealy and Stout (Psychometrika 58(2):159–194, 1993). In particular, the asymptotic sampling distributions and general interpretation of the effect size estimates are the same for SIBTEST and the new CSIBTEST. Given the more natural connection to SIBTEST, it is shown that Li and Stout’s hypothesis testing approach is insufficient for CSIBTEST; thus, an improved hypothesis testing procedure is required. Based on the presented arguments, a new chi-squared-based hypothesis testing approach is proposed for the modified CSIBTEST statistic. Positive results from a modest Monte Carlo simulation study strongly suggest the original CSIBTEST procedure and randomization hypothesis testing approach should be replaced by the modified statistic and hypothesis testing method.

7.

Analysis of the elements of attention: A neuropsychological approach (cited by: 15; self-citations: 0; citations by others: 15)

Allan F. Mirsky Bruno J. Anthony Connie C. Duncan Mary Beth Ahearn Sheppard G. Kellam 《Neuropsychology review》1991,2(2):109-145

A model for conceptualizing the components or elements of attention is presented. The model substitutes for the diffuse and global concept of attention a group of four processes and links them to a putative system of cerebral structures. Data in support of the model are presented; they are derived from neuropsychological test scores obtained from two samples, the first consisting of 203 adult neuropsychiatric patients and normal control subjects, and the second, an epidemiologically-based sample of 435 elementary school children. Principal components analyses of test scores from these two populations yielded similar results: a set of independent elements of attention that are assayed by different tests. This work presents a heuristic for clinical research in which the measurement of attention is essential.

8.

Empirical bayes point estimates of latent trait scores without knowledge of the trait distribution

William Meredith Jack Kearns 《Psychometrika》1973,38(4):533-554

In this paper, recent developments in empirical Bayes procedures are tied in with current work in mental test theory. Point estimators of true scores are derived for the binomial and Rasch test models. These estimators are shown to be asymptotically optimal. Smoothing and an empirical study of the behavior of empirical Bayes estimates are taken up in the final section. This research was supported by the National Science Foundation, Division of Biological and Medical Sciences, Program in Psycho-Biology, Grant No. NSF GB-30779.

9.

The average error of a learning model, estimation and use in testing the fit of models

Helena Chmura Kraemer 《Psychometrika》1965,30(3):343-352

A measure of the discrepancy between observed transition frequencies and those predicted by a learning model, an average error of a learning model, is presented. The maximum-likelihood estimator of the average error is derived and its use in a modified test of goodness of fit is demonstrated.

10.

Rasch's model for reading speed with manifest explanatory variables

Margo G. H. Jansen 《Psychometrika》1997,62(3):393-409

In educational and psychological measurement we find the distinction between speed and power tests. Although most tests are partially speeded, the speed element is usually neglected. Here we consider a latent trait model developed by Rasch for the response time on a (set of) pure speed test(s), which is based on the assumption that the test response times are approximately gamma distributed, with known shape parameters and scale parameters depending on subject ability and test difficulty parameters. In our approach the subject parameters are treated as random variables having a common gamma distribution. From this, maximum marginal likelihood estimators are derived for the test difficulties and the parameters of the latent subject distribution. This basic model can be extended in a number of ways. Explanatory variables for the latent subject parameters and for the test parameters can be incorporated in the model. Our methods are illustrated by the analysis of a simulated and an empirical data set.

11.

Reliability as a function of the number of item options derived from the “knowledge or random guessing” model

Robert G. MacCann 《Psychometrika》2004,69(1):147-157

For (0, 1) scored multiple-choice tests, a formula giving test reliability as a function of the number of item options is derived, assuming the knowledge or random guessing model, the parallelism of the new and old tests (apart from the guessing probability), and the assumptions of classical test theory. It is shown that the formula is a more general case of an equation by Lord, and reduces to Lord's equation if the items are effectively parallel. Further, the formula is shown to be closely related to another formula derived from Lord's randomly parallel tests model.

12.

Some exact conditional tests of independence for R × C cross-classification tables

Alan Agresti Dennis Wackerly 《Psychometrika》1977,42(1):111-125

Exact conditional tests of independence in cross-classification tables are formulated based on the χ² statistic and statistics with stronger operational interpretations, such as some nominal and ordinal measures of association. Guidelines for the table dimensions and sample sizes for which the tests are economically implemented on a computer are given. Some selected sample sizes and marginal distributions are used in a numerical comparison between the significance levels of the approximate and exact conditional tests based on the χ² statistic. The authors are grateful for the suggestions of the referees and for computer funding provided by the Northeast Regional Data Center at the University of Florida.

13.

Continuous Orthogonal Complement Functions and Distribution-Free Goodness of Fit Tests in Moment Structure Analysis

Robert Jennrich Albert Satorra 《Psychometrika》2013,78(3):545-552

It is shown that for any full column rank matrix X₀ with more rows than columns there is a neighborhood $\mathcal{N}$ of X₀ and a continuous function f on $\mathcal{N}$ such that f(X) is an orthogonal complement of X for all X in $\mathcal{N}$. This is used to derive a distribution-free goodness of fit test for covariance structure analysis. This test was proposed some time ago and is extensively used. Unfortunately, there is an error in the proof that the proposed test statistic has an asymptotic χ² distribution. This is a potentially serious problem: without a proof, the test statistic may not, in fact, be asymptotically χ². The proof, however, is easily fixed using a continuous orthogonal complement function. Similar problems arise in other applications where orthogonal complements are used. These can also be resolved by using continuous orthogonal complement functions.

14.

The many null distributions of person fit indices (cited by: 1; self-citations: 0; citations by others: 1)

Ivo W. Molenaar Herbert Hoijtink 《Psychometrika》1990,55(1):75-106

This paper deals with the situation of an investigator who has collected the scores of n persons to a set of k dichotomous items, and wants to investigate whether the answers of all respondents are compatible with the one parameter logistic test model of Rasch. Contrary to the standard analysis of the Rasch model, where all persons are kept in the analysis and badly fitting items may be removed, this paper studies the alternative model in which a small minority of persons has an answer strategy not described by the Rasch model. Such persons are called anomalous or aberrant. From the response vectors consisting of k symbols each equal to 0 or 1, it is desired to classify each respondent as either anomalous or as conforming to the model. As this model is probabilistic, such a classification will possibly involve false positives and false negatives. Both for the Rasch model and for other item response models, the literature contains several proposals for a person fit index, which expresses for each individual the plausibility that his/her behavior follows the model. The present paper argues that such indices can only provide a satisfactory solution to the classification problem if their statistical distribution is known under the null hypothesis that all persons answer according to the model. This distribution, however, turns out to be rather different for different values of the person's latent trait value. This value will be called ability parameter, although our results are equally valid for Rasch scales measuring other attributes. As the true ability parameter is unknown, one can only use its estimate in order to obtain an estimated person fit value and an estimated null hypothesis distribution.
The paper describes three specifications for the latter: assuming that the true ability equals its estimate, integrating across the ability distribution assumed for the population, and conditioning on the total score, which is in the Rasch model the sufficient statistic for the ability parameter. Classification rules for aberrance will be worked out for each of the three specifications. Depending on test length, item parameters and desired accuracy, they are based on the exact distribution, its Monte Carlo estimate and a new and promising approximation based on the moments of the person fit statistic. Results for the likelihood person fit statistic are given in detail; the methods could also be applied to other fit statistics. A comparison of the three specifications results in the recommendation to condition on the total score, as this avoids some problems of interpretation that affect the other two specifications. The authors express their gratitude to the reviewers and to many colleagues for comments on an earlier version.

15.

When can we trust the F-approximation of the Box-test?

Friedrich Foerster Gerhard Stemmler 《Psychometrika》1990,55(4):727-728

Consider a multivariate context with p variates and k independent samples, each of size n. To test equality of the k population covariance matrices, the likelihood ratio test is commonly employed. Box's F-approximation to the null distribution of the test statistic can be used to compute p-values, if sample sizes are not too small. It is suggested to regard the F-approximation as accurate if the sample sizes n are greater than or equal to $1 + 0.0613p^2 + 2.7265p - 1.4182p^{0.5} + 0.235p^{1.4}\ln(k)$, for $5 \le p \le 30$, $k \le 20$. This research was supported by the Deutsche Forschungsgemeinschaft through Ste 405/2-1.
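The sample-size rule quoted above can be evaluated directly. The sketch below assumes that the garbled "In (k)" in the extracted text is the natural logarithm and that the stated range is 5 ≤ p ≤ 30, k ≤ 20; the lower limit k ≥ 2 is an added assumption (at least two samples are needed to compare covariance matrices). The function name `box_f_min_sample_size` is hypothetical.

```python
import math


def box_f_min_sample_size(p, k):
    """Per-group sample size above which the F-approximation is
    regarded as accurate, following the rule quoted in the abstract.

    p: number of variates (rule stated for 5 <= p <= 30).
    k: number of independent samples (rule stated for k <= 20;
       k >= 2 is an assumption made here).
    """
    if not (5 <= p <= 30) or not (2 <= k <= 20):
        raise ValueError("rule is only stated for 5 <= p <= 30 and k <= 20")
    return (1.0
            + 0.0613 * p ** 2
            + 2.7265 * p
            - 1.4182 * math.sqrt(p)
            + 0.235 * p ** 1.4 * math.log(k))


# Smallest integer group size that satisfies the rule for p = 5, k = 2.
needed = math.ceil(box_f_min_sample_size(5, 2))
```

The function returns the real-valued threshold so that the caller can decide how to round; `math.ceil` gives the smallest integer n that meets it.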

16.

Some approximate tests for repeated measurement designs

Huynh Huynh 《Psychometrika》1978,43(2):161-175

Four approximate tests are considered for repeated measurement designs in which observations are multivariate normal with arbitrary covariance matrices. In these tests traditional within-subject mean square ratios are compared with critical values derived from F distributions with adjusted degrees of freedom. Two of them, the approximate and the improved general approximate (IGA) tests, behave adequately in terms of Type I error. Generally, the IGA test functions better than the approximate test; however, the latter involves fewer computations. With regard to power, the IGA test may compete with one multivariate procedure when the assumptions of the latter are tenable. The author wishes to thank Garrett K. Mandeville for his careful reading of the final version of the paper.

17.

A k-sample significance test for independent alpha coefficients (cited by: 1; self-citations: 0; citations by others: 1)

A. Ralph Hakstian Thomas E. Whalen 《Psychometrika》1976,41(2):219-231

The earlier two-sample procedure of Feldt [1969] for comparing independent alpha reliability coefficients is extended to the case of K ≥ 2 independent samples. Details of a normalization of the statistic under consideration are presented, leading to computational procedures for the overall K-group significance test and accompanying multiple comparisons. Results based on computer simulation methods are presented, demonstrating that the procedures control Type I error adequately. The results of a power comparison of the case of K = 2 with Feldt's [1969] F test are also presented. The differences in power were negligible. Some final observations, along with suggestions for further research, are noted. The authors gratefully acknowledge the assistance of Michael E. Masson, in the computations performed, and of Leonard S. Feldt, in suggesting the data generation procedures used in the study. In addition, the authors thank James Zidek and the Institute of Applied Mathematics and Statistics, University of British Columbia, for advice concerning some of the theoretical development.

18.

A priori tests in repeated measures designs: Effects of nonsphericity

Robert J. Boik 《Psychometrika》1981,46(3):241-255

The validity conditions for univariate repeated measures designs are described. Attention is focused on the sphericity requirement. For a v degree of freedom family of comparisons among the repeated measures, sphericity exists when all contrasts contained in the v dimensional space have equal variances. Under nonsphericity, upper and lower bounds on test size and power of a priori, repeated measures, F tests are derived. The effects of nonsphericity are illustrated by means of a set of charts. The charts reveal that small departures from sphericity (.97 ≤ ε < 1.00) can seriously affect test size and power. It is recommended that separate rather than pooled error term procedures be routinely used to test a priori hypotheses. Appreciation is extended to Milton Parnes for his insightful assistance.
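One common index of departure from sphericity is Box's ε, which equals 1 under sphericity and has a lower bound of 1/(k − 1) for k repeated measures. The sketch below computes it from a known covariance matrix; it is offered only as an illustration of the sphericity condition described above, on the assumption that this is the index the abstract refers to. It does not reproduce the paper's bounds on test size and power, and the function name `box_epsilon` is hypothetical.

```python
def box_epsilon(cov):
    """Box's epsilon for a symmetric k x k covariance matrix
    given as a list of lists (illustrative sketch).

    With M the covariance matrix after removing row and column
    means (double centering), epsilon = tr(M)^2 / ((k - 1) * tr(M^2)).
    The traces are computed from sums so that no matrix library
    is needed; symmetry of cov is assumed.
    """
    k = len(cov)
    total = sum(sum(row) for row in cov)              # sum of all entries
    trace = sum(cov[i][i] for i in range(k))          # sum of variances
    sum_sq = sum(x * x for row in cov for x in row)   # sum of squared entries
    row_sums_sq = sum(sum(row) ** 2 for row in cov)   # sum of squared row sums
    tr_m = trace - total / k
    tr_m2 = sum_sq - 2.0 * row_sums_sq / k + total ** 2 / k ** 2
    return tr_m ** 2 / ((k - 1) * tr_m2)


# Compound symmetry satisfies sphericity, so epsilon is 1.
eps_cs = box_epsilon([[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]])
# Unequal variances with zero covariances violate it.
eps_diag = box_epsilon([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
```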

19.

On testing hypotheses regarding a class of covariance structures

J. N. Srivastava 《Psychometrika》1966,31(2):147-164

Let x be a p-component random variable having a multivariate normal distribution with covariance matrix $\Sigma$. In this paper, we consider the problem of testing hypotheses of the form $H_0\colon \Sigma = b_1\Sigma_1 + \cdots + b_m\Sigma_m$, where the $b_i$'s are unknown scalars and the $\Sigma_i$'s are a set of known and simultaneously diagonalizable matrices. This problem has both psychometric and statistical interest, and its basic theory is developed here. Besides, the problem of obtaining the likelihood-ratio statistic for testing $H_0$ is studied, and the statistic is obtained in a special case.

20.

Two simple approximations to the distributions of quadratic forms

Ke‐Hai Yuan Peter M. Bentler 《The British journal of mathematical and statistical psychology》2010,63(2):273-291

Many test statistics are asymptotically equivalent to quadratic forms of normal variables, which are further equivalent to $T = \sum_{i=1}^{d} \lambda_i z_i^2$ with z_i being independent and following N(0,1). Two approximations to the distribution of T have been implemented in popular software and are widely used in evaluating various models. It is important to know how accurate these approximations are when compared to each other and to the exact distribution of T. The paper systematically studies the quality of the two approximations and examines the effect of the λ_i and the degrees of freedom d by analysis and Monte Carlo. The results imply that the adjusted distribution for T can be as good as knowing its exact distribution. When the coefficient of variation of the λ_i is small, the rescaled statistic is also adequate for practical model inference. But comparing T_R against $\chi^2_d$ will inflate type I errors when substantial differences exist among the λ_i, especially when d is also large.
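A small sketch can illustrate why the two approximations behave differently. It assumes that the "rescaled" approximation divides T by the mean of the λ_i and refers the result to a chi-square distribution with d degrees of freedom, and that the "adjusted" approximation matches both the mean and the variance of T with a scaled chi-square a·χ²_b (a Satterthwaite-type adjustment). These definitions are standard but are assumptions here, since the abstract does not spell them out; all function names are hypothetical.

```python
def exact_moments(lams):
    """Mean and variance of T = sum(lam_i * z_i**2), z_i iid N(0, 1)."""
    return sum(lams), 2.0 * sum(l * l for l in lams)


def rescaled_reference(lams):
    """Scale and df when T / mean(lam) is referred to chi-square(d)."""
    d = len(lams)
    return sum(lams) / d, float(d)


def adjusted_reference(lams):
    """Scale a and df b such that a * chi-square(b) matches the
    mean and variance of T."""
    s1 = sum(lams)
    s2 = sum(l * l for l in lams)
    return s2 / s1, s1 * s1 / s2


def reference_moments(scale, df):
    """Mean and variance of scale * chi-square(df)."""
    return scale * df, 2.0 * scale * scale * df


lams = [4.0, 1.0, 1.0]
print(exact_moments(lams))                            # moments of T itself
print(reference_moments(*rescaled_reference(lams)))   # same mean, smaller variance
print(reference_moments(*adjusted_reference(lams)))   # same mean and variance
```

Both reference distributions reproduce the mean of T, but the rescaled one understates the variance whenever the λ_i differ, so its upper tail is too short. This is consistent with the inflated type I error the abstract reports for the rescaled statistic.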