Similar literature
20 similar documents found
1.
The use of one-way analysis of variance tables for obtaining unbiased estimates of true score variance and error score variance in the classical test theory model is discussed. Attention is paid to both balanced (equal numbers of observations on each person) and unbalanced designs, and estimates provided for both homoscedastic (common error variance for all persons) and heteroscedastic cases. It is noted that optimality properties (minimum variance) can be claimed for estimates derived from analysis of variance tables only in the balanced, homoscedastic case, and that there they are essentially a reflection of the symmetry inherent in the situation. Estimates which might be preferable in other cases are discussed. An example is given where a natural analysis of variance table leads to estimates which cannot be derived from the set of statistics which is sufficient under normality assumptions. Reference is made to Bayesian studies which shed light on the difficulties encountered. Work on this paper was carried out at the headquarters of the American College Testing Program, Iowa City, Iowa, while the author was on leave from the University College of Wales.
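
For the balanced, homoscedastic case mentioned in this abstract, the textbook estimates from a one-way table are the within-person mean square for error variance and (MS_between - MS_within) / n for true score variance. The sketch below is a minimal illustration of that standard result, not the paper's own derivation; the function name `anova_variance_components` and the data layout (one list of repeated measurements per person) are assumptions made for the example.

```python
from statistics import mean


def anova_variance_components(scores):
    """Estimate (true_var, error_var) from a balanced one-way layout.

    scores: list of per-person lists, each holding the same number of
    repeated measurements (hypothetical layout chosen for this sketch).
    """
    n_persons = len(scores)
    n_reps = len(scores[0])
    if any(len(row) != n_reps for row in scores):
        raise ValueError("this sketch covers the balanced design only")

    grand = mean(x for row in scores for x in row)
    person_means = [mean(row) for row in scores]

    # Sums of squares for the one-way table (persons as the factor).
    ss_between = n_reps * sum((m - grand) ** 2 for m in person_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(scores, person_means) for x in row
    )

    ms_between = ss_between / (n_persons - 1)
    ms_within = ss_within / (n_persons * (n_reps - 1))

    error_var = ms_within                        # unbiased for error variance
    true_var = (ms_between - ms_within) / n_reps  # unbiased, may be negative
    return true_var, error_var
```

The true score variance estimate is unbiased but can come out negative in small samples, which is one of the difficulties with ANOVA-based estimates.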

2.
This paper gives a method of estimating the reliability of a test which has been divided into three parts. The parts do not have to satisfy any statistical criteria like parallelism or τ-equivalence. If the parts are homogeneous in content (congeneric), i.e., if their true scores are linearly related, and if sample size is large, then the method described in this paper will give the precise value of the reliability parameter. If the homogeneity condition is violated then underestimation will typically result. However, the estimate will always be at least as accurate as coefficient α and Guttman's lower bound λ3 when the same data are used. An application to real data is presented by way of illustration. Seven different splits of the same test are analyzed. The new method yields remarkably stable reliability estimates across splits as predicted by the theory. One deviating value can be accounted for by a certain unsuspected peculiarity of the test composition. Neither coefficient α nor λ3 would have led to the same discovery. Expanded version of a paper given at the Psychometric Society Meeting in Stanford, California, March 1974.
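
The abstract does not give the formula for the three-part method, so it is not reproduced here. As a point of reference, the sketch below computes only the familiar lower-bound baseline, coefficient alpha, from part scores; for a test split into parts it has the same algebraic form as Guttman's λ3. The function name `coefficient_alpha` and the data layout (one list of person scores per part) are assumptions for illustration.

```python
from statistics import variance


def coefficient_alpha(parts):
    """Coefficient alpha for a test split into k parts.

    parts: list of k lists; parts[j][i] is person i's score on part j
    (hypothetical layout chosen for this sketch).
    """
    k = len(parts)
    # Total score for each person is the sum over the k parts.
    totals = [sum(person_scores) for person_scores in zip(*parts)]
    sum_part_var = sum(variance(p) for p in parts)
    return k / (k - 1) * (1 - sum_part_var / variance(totals))
```

With three strictly parallel, error-free parts the function returns 1; as the parts become less consistent with one another the value falls.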

3.
Several procedures have been proposed in the statistical literature for estimating simultaneously the mean of each of k binomial populations. In terms of mental test theory, however, it is not clear that these procedures should be used when an item sampling model applies since the binomial error model is usually viewed as an oversimplification of the true situation. In this study we compare empirically several of these estimation techniques. Particular attention is given to situations where observations are generated according to a two-term approximation to the compound binomial distribution. The author would like to thank Shelley Niwa for writing the computer programs used in this study. The work upon which this publication is based was performed pursuant to Grant # NIE-G-76-0083 with the National Institute of Education, Department of Health, Education and Welfare. Points of view or opinions stated do not necessarily represent official NIE position or policy.

4.
5.
A common question of interest to researchers in psychology is the equivalence of two or more groups. Failure to reject the null hypothesis of traditional hypothesis tests such as the ANOVA F‐test (i.e., H0: μ1 = … = μk) does not imply the equivalence of the population means. Researchers interested in determining the equivalence of k independent groups should apply a one‐way test of equivalence (e.g., Wellek, 2003). The goals of this study were to investigate the robustness of the one‐way Wellek test of equivalence to violations of the homogeneity of variance assumption, and to compare the Type I error rates and power of the Wellek test with those of a heteroscedastic version based on the logic of the one‐way Welch (1951) F‐test. The results indicate that the proposed Wellek–Welch test was insensitive to violations of the homogeneity of variance assumption, whereas the original Wellek test was not appropriate when the population variances were not equal.
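
The heteroscedastic variant in this abstract is described as building on the one-way Welch (1951) F-test. The sketch below computes only that Welch statistic and its degrees of freedom; it is not the Wellek or Wellek–Welch equivalence test, whose decision rule the abstract does not spell out. The function name `welch_f` and the list-of-groups input are assumptions for illustration.

```python
from statistics import mean, variance


def welch_f(groups):
    """Welch's heteroscedastic one-way F statistic.

    groups: list of k lists of observations, each with at least two
    values and nonzero sample variance.
    Returns (F, df1, df2).
    """
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Precision weights: group size over group sample variance.
    w = [ni / variance(g) for ni, g in zip(n, groups)]
    w_sum = sum(w)
    grand = sum(wi * mi for wi, mi in zip(w, m)) / w_sum

    # Correction term shared by the denominator and the second df.
    lam = sum((1 - wi / w_sum) ** 2 / (ni - 1) for wi, ni in zip(w, n))

    numerator = sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    df2 = (k ** 2 - 1) / (3 * lam)
    return numerator / denominator, k - 1, df2
```

With two groups the statistic reduces to the square of Welch's t and `df2` to the Welch-Satterthwaite degrees of freedom, which gives a convenient sanity check.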

6.
7.
Octave equivalence occurs when an observer judges notes separated by a doubling in frequency to be perceptually similar. The octave appears to form the basis of pitch change in all human cultures and thus may be of biological origin. Previously, we developed a nonverbal operant conditioning test of octave generalization and transfer in humans. The results of this testing showed that humans with and without musical training perceive the octave relationship between pitches. Our goal in the current study was to determine whether black-capped chickadees, a North American songbird, perceive octave equivalence. We chose these chickadees because of their reliance on pitch in assessing conspecific vocalizations, our strong background knowledge on their pitch height perception (log-linear perception of frequency), and the phylogenetic disparity between them and humans. Compared to humans, songbirds are highly skilled at using pitch height perception to classify pitches into ranges, independent of the octave. Our results suggest that chickadees used that skill, rather than octave equivalence, to transfer the note-range discrimination from one octave to the next. In contrast, there is evidence that at least some mammals, including humans, do perceive octave equivalence.

8.
9.
10.
BETTS GL 《Psychometrika》1950,15(4):435-439
The P50-discriminant has been reported elsewhere in connection with its use in predicting whether selective service registrants if inducted would become normal operative soldiers or would commit offenses causing their imprisonment. The standard error of the P50-discriminant is a good measure to use in determining how far to the side of this statistic a particular case falls. The standard error formula itself has also been published elsewhere; but its derivation, as the variance error, is given here. The author gratefully acknowledges the very extensive assistance kindly given to him by Dr. Truman L. Kelley and Dr. Frederick Mosteller. This assistance was given without reference to the utility of the P50-discriminant, upon which matter the author reports elsewhere and for which he takes full responsibility.

11.
HAMILTON CH 《Psychometrika》1950,15(2):151-168
A formula for estimating real scores on a multiple-choice test from a knowledge of raw scores is derived. This formula does not involve the assumption of a binomial distribution of real scores as does the Calandra formula. Other important formulas derived show: the variance of real scores in terms of the variance of raw scores and the correlation between real scores and raw scores. If the variance of real scores (or of raw scores also) is binomial, the regression of real scores on raw scores is linear; but, otherwise the regression is curvilinear. Yet the linear estimating formula is a close approximation to the curvilinear relationship. Factors affecting the regression of real scores on raw scores and the correlation coefficient are: (1) the number of choices per question; (2) the number of questions answered; (3) the ratio of the average group raw score to the variance of raw scores.

12.
13.
Pigeons were trained on many-to-one matching in which pairs of samples, each consisting of a visual stimulus and a distinctive pattern of center-key responding, occasioned the same reinforced comparison choice. Acquired equivalence between the visual and response samples then was evaluated by reinforcing new comparison choices to one set of samples, and examining generalization of these choices to the other samples. Three separate experiments found no evidence of such generalization, as indexed by performance on class-consistent versus class-inconsistent tests. Other tests showed that the pigeons' center-key response patterns during training had indeed served as a conditional cue for choice. These results do not support the hypothesis that different defined responses can become members of acquired equivalence classes.

14.
15.
16.
When a planar shape is viewed obliquely, it is deformed by a perspective deformation. If the visual system were to pick up geometrical invariants from such projections, these would necessarily be invariant under the wider class of projective transformations. To what extent can the visual system tell the difference between perspective and nonperspective but still projective deformations of shapes? To investigate this, observers were asked to indicate which of two test patterns most resembled a standard pattern. The test patterns were related to the standard pattern by a perspective or projective transformation, or they were completely unrelated. Performance was slightly better in a matching task with perspective and unrelated test patterns (92.6%) than in a projective-random matching task (88.8%). In a direct comparison, participants had a small preference (58.5%) for the perspectively related patterns over the projectively related ones. Preferences were based on the values of the transformation parameters (slant and shear). Hence, perspective and projective transformations yielded perceptual differences, but they were not treated in a categorically different manner by the human visual system.

17.
Score tests for identifying locally dependent item pairs have been proposed for binary item response models. In this article, both the bifactor and the threshold shift score tests are generalized to the graded response model. For the bifactor test, the generalization is straightforward; it adds one secondary dimension associated only with one pair of items. For the threshold shift test, however, multiple generalizations are possible: in particular, conditional, uniform, and linear shift tests are discussed in this article. Simulation studies show that all of the score tests have accurate Type I error rates given large enough samples, although their small‐sample behaviour is not as good as that of Pearson's X² and M₂ as proposed in other studies for the purpose of local dependence (LD) detection. All score tests have the highest power to detect the LD which is consistent with their parametric form, and in this case they are uniformly more powerful than X² and M₂; even wrongly specified score tests are more powerful than X² and M₂ in most conditions. An example using empirical data is provided for illustration.

18.
19.
The representation of test scores as n-dimensional points leads directly to an estimate of error variance at a particular score level in the case of equivalent items. Approximations are suggested for the case of non-equivalent items. These approximations are compared, with satisfactory results, with empirical data prepared by Dr. Mollenkopf.

20.
Bauer DJ 《Psychological Methods》2005,10(3):305-316
Measurement invariance is a necessary condition for the evaluation of factor mean differences over groups or time. This article considers the potential problems that can arise for tests of measurement invariance when the true factor-to-indicator relationship is nonlinear (quadratic) and invariant but the linear factor model is nevertheless applied. The factor loadings and indicator intercepts of the linear model will diverge across groups as the factor mean difference increases. Power analyses show that even apparently small quadratic effects can result in rejection of measurement invariance at moderate sample sizes when the factor mean difference is medium to large. Recommendations include the identification of nonlinear relationships using diagnostic plots and consideration of newly developed methods for fitting nonlinear factor models.
