Similar literature
 20 similar articles found
1.
The development of a test to measure Ellis's concept of rationality is described. In the first study discussed, a 58-item test is developed to measure rationality, and the reliability and convergent validity of the test are described. In a second study, the discriminant validity of the test is examined. An attempt is also made to reduce social-desirability content in the test by eliminating the items most highly correlated with a Social Desirability Scale. The final 44-item test is found to be high in both reliability and validity. The factor structure of the test is also examined.

2.
In classical test theory, a high-reliability test always leads to a precise measurement. However, when it comes to the prediction of test scores, this is not necessarily so. Based on a Bayesian statistical approach, we predicted the distributions of test scores for a new subject, a new test, and a new subject taking a new test. Under some reasonable conditions, the predicted means, variances, and covariances of predicted scores were obtained and investigated. We found that high test reliability did not necessarily lead to small variances or covariances. For a new subject, higher test reliability led to larger predicted variances and covariances, because high test reliability enabled a more accurate prediction of test score variances. Regarding a new subject taking a new test, in this study, higher test reliability led to a large variance when the sample size was smaller than half the number of tests. Classical test theory is reanalyzed from the viewpoint of predictions and some suggestions are made.

3.
Preliminary tests of equality of variances used before a test of location are no longer widely recommended by statisticians, although they persist in some textbooks and software packages. The present study extends the findings of previous studies and provides further reasons for discontinuing the use of preliminary tests. The study found Type I error rates of a two‐stage procedure, consisting of a preliminary Levene test on samples of different sizes with unequal variances, followed by either a Student pooled‐variances t test or a Welch separate‐variances t test. Simulations disclosed that the two‐stage procedure fails to protect the significance level and usually makes the situation worse. Earlier studies have shown that preliminary tests often adversely affect the size of the test, and also that the Welch test is superior to the t test when variances are unequal. The present simulations reveal that changes in Type I error rates are greater when sample sizes are smaller, when the difference in variances is slight rather than extreme, and when the significance level is more stringent. Furthermore, the validity of the Welch test deteriorates if it is used only on those occasions where a preliminary test indicates it is needed. Optimum protection is assured by using a separate‐variances test unconditionally whenever sample sizes are unequal.
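For readers comparing the two second-stage tests named in this abstract, the standard textbook forms of the Student pooled-variances statistic and the Welch separate-variances statistic are sketched below. The symbols ($\bar{x}_i$, $s_i^2$, $n_i$ for the sample mean, sample variance, and size of group $i$) are notation assumed here, not taken from the abstract.

```latex
% Student pooled-variances t test, with n_1 + n_2 - 2 degrees of freedom
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}

% Welch separate-variances t test, with approximate (Welch-Satterthwaite) degrees of freedom
t' = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^{2}}
{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}}
```

When $n_1 = n_2$ the two statistics coincide and only the degrees of freedom differ, which is consistent with the abstract's focus on unequal sample sizes.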

4.
The test information function serves important roles in latent trait models and in their applications. Among other things, it has been used as the measure of accuracy in ability estimation. A question arises, however, whether the test information function is accurate enough for all meaningful levels of ability relative to the test, especially when the number of test items is relatively small (e.g., fewer than 50). In the present paper, using the constant information model and constant amounts of test information for a finite interval of ability, simulated data were produced for eight different levels of ability and for twenty different numbers of test items ranging between 10 and 200. Analyses of these data suggest that it is desirable to consider some modification of the test information function when it is used as the measure of accuracy in ability estimation.

5.
An asymmetrical test of homogeneity of proportions possesses distinct advantages over a symmetrical test. The symmetric chi square test of homogeneity is widely employed in psychological research. An asymmetrical alternative to the chi square test of homogeneity is proposed and described.

6.
Cluster bias refers to measurement bias with respect to the clustering variable in multilevel data. The absence of cluster bias implies absence of bias with respect to any cluster‐level (level 2) variable. The variables that possibly cause the bias do not have to be measured to test for cluster bias. Therefore, the test for cluster bias serves as a global test of measurement bias with respect to any level 2 variable. However, the validity of the global test depends on the Type I and Type II error rates of the test. We compare the performance of the test for cluster bias with the restricted factor analysis (RFA) test, which can be used if the variable that leads to measurement bias is measured. The RFA test appeared to have considerably more power than the test for cluster bias. However, the false positive rates of the test for cluster bias were generally around the expected values, while the RFA test showed unacceptably high false positive rates in some conditions. We conclude that even if no significant cluster bias is found, significant bias with respect to a level 2 violator can still be detected with an RFA model. Although the test for cluster bias is less powerful, an advantage of the test is that the cause of the bias does not need to be measured, or even known.

7.
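The interpretation of intelligence test scores is an important research area in the psychology of intelligence. The most basic purpose of interpreting intelligence test scores is understanding. With the development of intelligence theories and intelligence tests, the interpretation of intelligence test scores has shown some new features and trends: an emphasis on the theoretical basis of score interpretation, an emphasis on enriching the validity of score interpretation, and a tendency to interpret test scores by extracting useful information.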

8.
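叶宝娟 (Ye Baojuan), 温忠粦 (Wen Zhonglin) 《心理科学》 (Journal of Psychological Science) 2013, 36(3): 728-733
In psychology, education, management, and other research fields, two-level data structures are common, such as students nested within classes or employees nested within companies. In two-level studies, participants are usually not independent, and estimating reliability directly with a single-level formula overestimates test reliability. Previous studies have discussed how to estimate the reliability of unidimensional tests more accurately in two-level research. This study points out the shortcomings of the existing estimation formulas, derives a new reliability formula using two-level confirmatory factor analysis, demonstrates the computation with an example, and provides a simple computation program.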

9.
A modern test that takes advantage of the opportunities provided by advancements in computer technology is the multimedia test. The purpose of this study was to investigate the criterion-related validity of a specific open-ended multimedia test, namely a webcam test, by means of a concurrent validity study. In a webcam test a number of work-related situations are presented and participants have to respond as if these were real work situations. The responses are recorded with a webcam. The aim of the webcam test which we investigated is to measure the effectiveness of social work behaviour. This first field study on a webcam test was conducted in an employment agency in The Netherlands. The sample consisted of 188 consultants who participated in a certification process. For the webcam test, good interrater reliabilities and internal consistencies were found. The results showed the webcam test to be significantly correlated with job placement success. The webcam test scores were also found to be related to job knowledge. Hierarchical regression analysis demonstrated that the webcam test has incremental validity up to and beyond job knowledge in predicting job placement success. The webcam test, therefore, seems a promising type of instrument for personnel selection.

10.
A method is provided for estimating the nonspurious correlation of a part of a test with the total test. Two cases are considered: one in which the actual subtest is parallel to the total test, the other in which the actual subtest is not parallel to the total test.

11.
The content unreliability of an essay test is the error due to the items used or the content of the test. The reader unreliability is due to variation in judgment of the persons who read and score the essay test. The content reliability of an essay test is accordingly defined as being independent of the reader reliability. Formulae are derived for the reader reliability and for the content reliability. The content reliability is found to be equal to the geometric mean of the test reliabilities computed from the scores assigned by the two readers, divided by the reader reliability.
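The final sentence of this abstract can be written as a single expression. The symbols are assumptions introduced here for illustration: $r_A$ and $r_B$ denote the test reliabilities computed from the scores assigned by readers A and B, $r_R$ the reader reliability, and $r_C$ the content reliability.

```latex
r_C = \frac{\sqrt{r_A \, r_B}}{r_R}
```

As a purely illustrative calculation with made-up numbers, $r_A = 0.64$, $r_B = 0.81$, and $r_R = 0.90$ would give $r_C = \sqrt{0.64 \times 0.81} / 0.90 = 0.72 / 0.90 = 0.80$.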

12.
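This study examined how the degree to which an anchor test's difficulty level represents the level of its source test affects vertical scaling. Using a simulation approach, it compared parameter recovery for the grade ability distributions and the properties of the vertical scale when the anchor test difficulty equaled that of the source test, when it lay at the 25th percentile of the difficulty interval between the lower-grade and higher-grade tests, and when it lay at the 50th percentile of that interval. The results showed that anchor items more difficult than their source test did not degrade vertical scaling results and, in some situations, might even improve accuracy. The study indicates that, when constructing a vertical scale, the difficulty level of the anchor test may be adjusted appropriately to meet content requirements and other statistical characteristics.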

13.
If a loss function is available specifying the social cost of an error of measurement in the score on a unidimensional test, an asymptotic method, based on item response theory, is developed for optimal test design for a specified target population of examinees. Since in the real world such loss functions are not available, it is more useful to reverse this process; thus a method is developed for finding the loss function for which a given test is an optimally designed test for the target population. An illustrative application is presented for one operational test. This work was supported in part by contract N00014-80-C-0402, project designation NR 150-453 between the Office of Naval Research and Educational Testing Service. Reproduction in whole or in part is permitted for any purpose of the United States Government.

14.
Green BF 《Psychometrika》 1950, 15(3): 251-257
A procedure is proposed for testing the significance of group differences in the standard error of measurement of a psychological test. Wilks' criterion is used to assure that the tests used in ascertaining reliability and hence variance of errors of measurement may be assumed parallel for each group. Votaw's criterion may be used to check whether the test scores of all the groups have the same mean, variance, and covariance. It is possible, however, for the variance and reliability of the test to differ widely from group to group, so that Votaw's criterion is not satisfied even though the variance of errors of measurement stays relatively constant. For this case a modification of Neyman and Pearson's criterion is developed to test agreement among standard errors of measurement despite group differences in mean, variance, and reliability of the test. The author wishes to acknowledge the helpful criticisms of Dr. Harold Gulliksen, who suggested the problem.
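The point that variance and reliability can differ across groups while the error variance stays constant follows from the classical test theory definition of the standard error of measurement. The notation ($s_x$ for the observed-score standard deviation, $r_{xx}$ for the reliability) and the numbers below are illustrative assumptions, not values from the paper.

```latex
\mathrm{SEM} = s_x \sqrt{1 - r_{xx}}

% Two hypothetical groups with different variances and reliabilities:
\text{Group 1: } s_x = 10,\; r_{xx} = 0.84 \;\Rightarrow\; \mathrm{SEM} = 10\sqrt{0.16} = 4
\qquad
\text{Group 2: } s_x = 5,\; r_{xx} = 0.36 \;\Rightarrow\; \mathrm{SEM} = 5\sqrt{0.64} = 4
```

Both hypothetical groups have the same standard error of measurement even though they would fail a check of equal variances and covariances.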

15.
Researchers often test a null hypothesis of no ability in the population (the so-called “parity” hypothesis) using a single, forced-choice question with k alternatives. In this study a result is presented which should help researchers select the number of alternatives that maximizes the statistical power of the parity hypothesis test. Also the conditions under which it is always beneficial to add alternatives to the test are derived. Finally, the derived result is used to compare several popular parity test designs. The results show that the frequently used triangle test is optimal under a very broad range of plausible conditions.
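A minimal sketch of how the power of such a parity test could be computed for a given number of alternatives k is shown below. It is not the paper's derivation: it assumes a simple "proportion of discriminators" model in which a fraction `d` of respondents always answer correctly and the rest guess with probability 1/k, and it uses an exact one-sided binomial test. The function names and the parameters `n`, `d`, and `alpha` are hypothetical.

```python
from math import comb


def upper_tail(n, c, p):
    """Return P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(c, n + 1))


def critical_value(n, k, alpha=0.05):
    """Smallest number of correct answers c such that P(X >= c) <= alpha
    under the parity hypothesis p = 1/k."""
    p0 = 1.0 / k
    for c in range(n + 2):  # c = n + 1 gives an empty tail, so the loop always returns
        if upper_tail(n, c, p0) <= alpha:
            return c


def power(n, k, d, alpha=0.05):
    """Power of the exact test, assuming (hypothetically) that a fraction d
    of respondents always answer correctly and the rest guess at random."""
    p1 = d + (1.0 - d) / k
    return upper_tail(n, critical_value(n, k, alpha), p1)


if __name__ == "__main__":
    # Compare designs with 2, 3, and 4 alternatives for 20 respondents.
    for k in (2, 3, 4):
        print(k, critical_value(20, k), round(power(20, k, d=0.3), 3))
```

Under this simplified model the probability of a correct answer does not depend on how hard the discrimination becomes as alternatives are added, so it should not be expected to reproduce the paper's conclusion about the triangle test; it only illustrates how the guessing rate 1/k enters the critical value and the power.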

16.
A visual illusion known as the motion aftereffect is considered to be the perceptual manifestation of motion sensors that are recovering from adaptation. This aftereffect can be obtained for a specific range of adaptation speeds with its magnitude generally peaking for speeds around 3 deg s⁻¹. The classic motion aftereffect is usually measured with a static test pattern. Here, we measured the magnitude of the motion aftereffect for a large range of velocities covering also higher speeds, using both static and dynamic test patterns. The results suggest that at least two (sub)populations of motion-sensitive neurons underlie these motion aftereffects. One population shows itself under static test conditions and is dominant for low adaptation speeds, and the other is prevalent under dynamic test conditions after adaptation to high speeds. The dynamic motion aftereffect can be perceived for adaptation speeds up to three times as fast as the static motion aftereffect. We tested predictions that follow from the hypothesised division in neuronal substrates. We found that for exactly the same adaptation conditions (oppositely directed transparent motion with different speeds), the aftereffect direction differs by 180 degrees depending on the test pattern. The motion aftereffect is opposite to the pattern moving at low speed when the test pattern is static, and opposite to the high-speed pattern for a dynamic test pattern. The determining factor is the combination of adaptation speed and type of test pattern.

17.
Simultaneous brightness contrast is investigated in the fovea as a function of (1) amount of surround of the inducing field (Experiment 1) and (2) separation between the test and inducing fields (Experiment 2). Circular test and match fields subtending 14 min (radius) are used throughout. The inducing field, held constant in area, is a circular annulus (615 sq min) varying from a quadrant on one side of the test circle to an annulus completely surrounding the test circle. Test-field apparent brightness is not significantly affected by amount of inducing-field surround when the separation between centers of the test and inducing fields is held constant (Experiment 1). Experiment 2, though, shows that apparent brightness increases significantly as the separation between the centers of the test and inducing fields is increased.

18.
Stockman A, Mollon J 《Perception》 1986, 15(6): 729-754
When a tiny centred test flash is presented on a small concentric background, the threshold rises with background radiance more quickly than Weber's law would predict. It is argued that under such conditions it is possible, by means of a test sensitivity method, to isolate either the M-cone or the L-cone types throughout the visible spectrum. As predicted, double-branched M- and L-cone tvr functions are found when the test flash and the field are of the same wavelength. From the independent vertical displacements of the two branches as test wavelength is varied, it is possible to derive spectral sensitivities that agree well with dichromatic sensitivities and König fundamentals. The test sensitivities deviate from π4 at longer wavelengths and from π5 at shorter wavelengths.

19.
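The effect of cognitive training on different types of test anxiety
With 163 first-year senior high school students as participants, a field experiment examined the effects of cognitive training on cognition-dominated test anxiety (Type C), physiological-arousal-dominated test anxiety (Type P), and study-skill-deficit test anxiety (Type S). The results showed that cognitive training significantly reduced state test anxiety in all three types and improved the test performance of Types C and P. Its effects on both reducing test anxiety and improving test performance were especially pronounced for Type C. Cognitive training was not shown to improve the test performance of Type S students.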

20.
In this investigation, the relative importance of the effects of anticipated test format and anticipated test difficulty on performance was examined by simultaneously manipulating both. Experiments 1 and 2 showed that test performance was affected more by anticipated test format than by anticipated test difficulty. This suggests that the superior performance of subjects who had anticipated a recall test versus those who had anticipated a recognition test, reported here and in previous studies, is more likely to be due to anticipating a recall format than to anticipating a more difficult test. Experiment 2 showed that subjects who had anticipated a recall test studied longer than subjects who had anticipated a recognition test, even when recall tests were less difficult than recognition tests. One explanation for this finding is that subjects inaccurately monitor the relative difficulty of tests across test formats. Subjects rated recall items as more difficult than recognition items, even when recall items were actually less difficult (Experiment 3). These findings suggest that a priori metacognitive knowledge may reduce the accuracy of on-line metacognitive monitoring.
