Similar literature
1.
The development of a test to measure Ellis's concept of rationality is described. In the first study discussed, a 58-item test is developed to measure rationality, and the reliability and convergent validity of the test are described. In a second study, the discriminant validity of the test is examined. An attempt is also made to reduce social-desirability content in the test by eliminating the items most highly correlated with a Social Desirability Scale. The final 44-item test is found to be high in both reliability and validity. The factor structure of the test is also examined.

2.
The present research examined the extent to which sleep disturbance is involved in the experience of test anxiety. In Study 1, a sample of 80 subjects completed a trait measure of test anxiety and completed a sleep inventory with reference to the past 30 days. In Study 2, a sample of 188 subjects provided measures of trait and state test anxiety and completed a sleep inventory for the night preceding an actual test. The results of Study 1 and Study 2 confirmed that test anxiety is associated with self-reported sleep disturbance. In addition, the results of Study 2 showed that sleep disturbance is also associated with increased state test anxiety. Finally, it was found in Study 2 that sleep disturbance was not related to actual test performance. However, poorer test performance was associated with increased state and trait test anxiety. It is concluded that certain characteristics associated with test anxiety are stable and may be detected in evaluative and non-evaluative situations. The results are discussed with particular reference to their implications for the test anxiety construct itself as well as treatment strategies for the test-anxious student.

3.
Unproctored Internet testing (UIT) is becoming more popular in employment settings due to its cost effectiveness and efficiency. However, one of the major concerns with UIT is the possibility of cheating: a more capable conspirator can sit beside the real applicant and answer test items, or the applicant may use unauthorized materials. The present study examined the effectiveness of using a proctored verification test following the UIT to identify cheating in UIT, in which two test statistics, a Z-test and a likelihood ratio (LR) test, compare the consistency of test performance across the testing conditions. A simulation study was conducted to test the effectiveness of the two test statistics for a computerized adaptive test format. Results indicate that both test statistics have high power to detect dishonest job applicants at low Type I error rates. Compared with the LR test, the Z-test was more efficient and effective and is therefore recommended for practical applications. The theoretical and practical implications are discussed.
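
The Z-test idea in the abstract above can be illustrated with a minimal sketch. It assumes the statistic is the difference between the proctored and unproctored ability estimates divided by the pooled standard error, evaluated one-sided. The abstract does not give the exact formula, so the function name `verification_z_test`, its parameters, and the 0.05 threshold are all hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def verification_z_test(theta_unproctored, se_unproctored,
                        theta_proctored, se_proctored, alpha=0.05):
    """Flag a test taker whose proctored ability estimate is significantly lower.

    Assumed form (not taken from the abstract): the two estimates are
    independent and approximately normal, so their difference has
    variance se_unproctored**2 + se_proctored**2.
    """
    z = (theta_proctored - theta_unproctored) / math.sqrt(
        se_unproctored ** 2 + se_proctored ** 2)
    # One-sided p-value: small when the proctored score drops sharply.
    p_value = normal_cdf(z)
    return z, p_value, p_value < alpha


# Hypothetical applicant: high unproctored estimate, average proctored estimate.
z, p, flagged = verification_z_test(1.5, 0.3, 0.0, 0.4)
print(f"z = {z:.2f}, p = {p:.4f}, flagged = {flagged}")
```

A one-sided test is used here because only a drop in performance under proctoring is taken as evidence of cheating. That is a design choice of this sketch, not a claim about the study.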

4.
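Ye Baojuan, Wen Zhonglin. Journal of Psychological Science (《心理科学》), 2013, 36(3): 728-733
Two-level (two-layer) data structures are common in psychology, education, management, and related research fields, for example students nested within classes or employees nested within companies. In two-level studies the participants are usually not independent, and estimating reliability directly with a single-level formula overestimates test reliability. Previous studies have discussed how to estimate the reliability of a unidimensional test more accurately in two-level research. This study points out the shortcomings of the existing estimation formulas, derives a new reliability formula using two-level confirmatory factor analysis, demonstrates the calculation with an example, and provides a simple computing program.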

5.
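Estimating test reliability: from coefficient α to internal consistency reliability   Total citations: 5 (self-citations: 0; citations by others: 5)
Wen Zhonglin, Ye Baojuan. Acta Psychologica Sinica (《心理学报》), 2011, 43(7): 821-829
Following the classical definition of test reliability, this article briefly introduces the relationship between reliability and coefficient α, as well as the limitations of coefficient α. To recommend reliability estimates that can replace coefficient α, it discusses in depth homogeneity reliability and internal consistency reliability, both of which are closely related to coefficient α. Under very general conditions, it is proved that neither coefficient α nor homogeneity reliability exceeds internal consistency reliability, and that the latter does not exceed test reliability, which shows that internal consistency reliability is relatively close to test reliability. A procedure for analyzing test reliability is summarized, indicating when coefficient α still has reference value and when it is no longer applicable and internal consistency reliability (often called composite reliability in the literature) should be used instead. Programs for computing homogeneity reliability and internal consistency reliability are provided, which applied researchers can use directly.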
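
The relationship between coefficient α and composite reliability can be checked numerically in the simplest case. This is a minimal sketch, not the program provided with the article: it assumes a unidimensional factor model with uncorrelated errors and a factor variance of 1. The loadings and all function names are hypothetical.

```python
def implied_covariance(loadings, error_vars):
    """Model-implied item covariance matrix: loadings * loadings' + diag(errors)."""
    k = len(loadings)
    return [[loadings[i] * loadings[j] + (error_vars[i] if i == j else 0.0)
             for j in range(k)] for i in range(k)]


def coefficient_alpha(cov):
    """Coefficient alpha computed from an item covariance matrix."""
    k = len(cov)
    total_var = sum(sum(row) for row in cov)       # variance of the sum score
    item_var = sum(cov[i][i] for i in range(k))    # sum of item variances
    return k / (k - 1) * (1.0 - item_var / total_var)


def composite_reliability(loadings, error_vars):
    """Composite reliability for a one-factor model with uncorrelated errors."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_vars))


# Hypothetical standardized loadings that differ across items (congeneric items).
loadings = [0.9, 0.7, 0.5, 0.3]
error_vars = [1.0 - l ** 2 for l in loadings]

alpha = coefficient_alpha(implied_covariance(loadings, error_vars))
omega = composite_reliability(loadings, error_vars)
print(f"alpha = {alpha:.4f}, composite reliability = {omega:.4f}")
```

With unequal loadings, α falls below composite reliability. With equal loadings the two coincide, which matches the abstract's statement that α does not exceed internal consistency reliability.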

6.
A modern test that takes advantage of the opportunities provided by advancements in computer technology is the multimedia test. The purpose of this study was to investigate the criterion-related validity of a specific open-ended multimedia test, namely a webcam test, by means of a concurrent validity study. In a webcam test a number of work-related situations are presented and participants have to respond as if these were real work situations. The responses are recorded with a webcam. The aim of the webcam test which we investigated is to measure the effectiveness of social work behaviour. This first field study on a webcam test was conducted in an employment agency in The Netherlands. The sample consisted of 188 consultants who participated in a certification process. For the webcam test, good interrater reliabilities and internal consistencies were found. The results showed the webcam test to be significantly correlated with job placement success. The webcam test scores were also found to be related to job knowledge. Hierarchical regression analysis demonstrated that the webcam test has incremental validity over and above job knowledge in predicting job placement success. The webcam test, therefore, seems a promising type of instrument for personnel selection.

7.
Measures of effective test length are developed for speeded and power tests, which are independent of the number of items in the test or of the time required for administration. These measures are used in determining reliability for (1) speeded and power tests, where a separately timed short parallel form is administered in addition to the full-length test; (2) power tests, where a subset of items is embedded within the total test, parallel to the total test; and (3) power tests, where the subset of items is correlated with the complementary parallel subset in the test.

8.
A new method is proposed for estimating factor means and factor covariances in a group of individuals selected on their observed scores. The selection variable is, for example, the total score on an admissions test. Given a factor model for the test items based on the group of test takers, we may be interested in the factor structure for those in the top quartile. The differences in factor means and covariances between this selected group and the full group give useful information both on successful test performance and on test validity. The new method draws on the classic Pearson-Lawley selection formulas. It avoids the fallacy of factor analysis on the selected group, which would lead to incorrect estimates. The new method is applied to a simple factor structure model for the GMAT test. Although the majority of the GMAT items test verbal skills, it is found that a quantitative factor shows the greatest change in moving from average to top quartile test takers.

9.
In classical test theory, a high-reliability test always leads to a precise measurement. However, when it comes to the prediction of test scores, it is not necessarily so. Based on a Bayesian statistical approach, we predicted the distributions of test scores for a new subject, a new test, and a new subject taking a new test. Under some reasonable conditions, the predicted means, variances, and covariances of predicted scores were obtained and investigated. We found that high test reliability did not necessarily lead to small variances or covariances. For a new subject, higher test reliability led to larger predicted variances and covariances, because high test reliability enabled a more accurate prediction of test score variances. Regarding a new subject taking a new test, in this study, higher test reliability led to a large variance when the sample size was smaller than half the number of tests. Classical test theory is reanalyzed from the viewpoint of predictions and some suggestions are made.

10.
Cluster bias refers to measurement bias with respect to the clustering variable in multilevel data. The absence of cluster bias implies absence of bias with respect to any cluster-level (level 2) variable. The variables that possibly cause the bias do not have to be measured to test for cluster bias. Therefore, the test for cluster bias serves as a global test of measurement bias with respect to any level 2 variable. However, the validity of the global test depends on the Type I and Type II error rates of the test. We compare the performance of the test for cluster bias with the restricted factor analysis (RFA) test, which can be used if the variable that leads to measurement bias is measured. It appeared that the RFA test has considerably more power than the test for cluster bias. However, the false positive rates of the test for cluster bias were generally around the expected values, while the RFA test showed unacceptably high false positive rates in some conditions. We conclude that even if no significant cluster bias is found, significant bias with respect to a level 2 violator can still be detected with an RFA model. Although the test for cluster bias is less powerful, an advantage of the test is that the cause of the bias does not need to be measured, or even known.

11.
Baumgartner, Weiss, and Schindler (1998) introduced a novel non-parametric test for the two-sample comparison that is superior to commonly used tests such as the Wilcoxon rank-sum test. A modification of the novel test statistic can be used for one-sided comparisons based on ordinal data. Such comparisons frequently occur in psychological research, and the Wilcoxon test is often recommended for their analysis. Here, the two tests were compared in a simulation study. According to this study the tests have a similar type I error rate, but the modified Baumgartner-Weiss-Schindler test is more powerful than the Wilcoxon test.

12.
We propose a sampling-based Bayesian t test that allows researchers to quantify the statistical evidence in favor of the null hypothesis. This Savage-Dickey (SD) t test is inspired by the Jeffreys-Zellner-Siow (JZS) t test recently proposed by Rouder, Speckman, Sun, Morey, and Iverson (2009). The SD test retains the key concepts of the JZS test but is applicable to a wider range of statistical problems. The SD test allows researchers to test order restrictions and applies to two-sample situations in which the different groups do not share the same variance.

13.
A method is provided for estimating the nonspurious correlation of a part of a test with the total test. Two cases are considered: one in which the actual subtest is parallel to the total test, the other in which the actual subtest is not parallel to the total test.
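
The abstract does not state its estimator, so the sketch below shows a different, widely used correction for the same spuriousness problem: correlating the part with the remainder of the test (total minus part) using only the part-total correlation and the two standard deviations. The function names and the toy scores are hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def part_remainder_correlation(r_part_total, sd_part, sd_total):
    """Correlation of a part with (total - part), from summary statistics."""
    numerator = r_part_total * sd_total - sd_part
    denominator = math.sqrt(
        sd_total ** 2 + sd_part ** 2 - 2.0 * r_part_total * sd_total * sd_part)
    return numerator / denominator


# Toy data: the subtest score and the score on the rest of the test.
part = [2, 4, 3, 5, 1, 4]
rest = [5, 6, 4, 9, 3, 5]
total = [p + r for p, r in zip(part, rest)]

r_pt = pearson(part, total)  # inflated because the part is inside the total
corrected = part_remainder_correlation(
    r_pt, statistics.pstdev(part), statistics.pstdev(total))
print(f"part-total r = {r_pt:.3f}, part-remainder r = {corrected:.3f}")
```

The formula is an algebraic identity, so the corrected value equals the correlation computed directly between `part` and `rest`.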

14.
The test information function serves important roles in latent trait models and in their applications. Among others, it has been used as the measure of accuracy in ability estimation. A question arises, however, if the test information function is accurate enough for all meaningful levels of ability relative to the test, especially when the number of test items is relatively small (e.g., fewer than 50). In the present paper, using the constant information model and constant amounts of test information for a finite interval of ability, simulated data were produced for eight different levels of ability and for twenty different numbers of test items ranging between 10 and 200. Analyses of these data suggest that it is desirable to consider some modification of the test information function when it is used as the measure of accuracy in ability estimation.
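
To make the role of the test information function concrete, the following sketch computes it under a two-parameter logistic (2PL) model and converts it to the asymptotic standard error of the ability estimate. The 2PL model is an assumption made here for illustration, not the constant information model used in the paper. The item parameters and function names are hypothetical.

```python
import math


def item_information(theta, a, b):
    """Fisher information of one 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))  # probability of a correct response
    return a * a * p * (1.0 - p)


def total_information(theta, items):
    """Test information: the sum of item informations at theta."""
    return sum(item_information(theta, a, b) for a, b in items)


def asymptotic_standard_error(theta, items):
    """Large-sample standard error of the ability estimate, 1 / sqrt(information)."""
    return 1.0 / math.sqrt(total_information(theta, items))


# Hypothetical 16-item test: discrimination 1.0 and difficulty 0.0 for every item.
items = [(1.0, 0.0)] * 16
info_at_zero = total_information(0.0, items)
se_at_zero = asymptotic_standard_error(0.0, items)
print(f"information = {info_at_zero:.2f}, standard error = {se_at_zero:.2f}")
```

The standard error here is an asymptotic approximation. The abstract questions how well it holds for short tests, and this sketch does not address that.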

15.
Priming effects in perceptual tests of implicit memory are assumed to be perceptually specific. Surprisingly, changing object colors from study to test did not diminish priming in most previous studies. However, these studies used implicit tests that are based on object identification, which mainly depends on the analysis of the object shape and therefore operates color-independently. The present study shows that color effects can be found in perceptual implicit tests when the test task requires the processing of color information. In Experiment 1, reliable color priming was found in a mere exposure design (preference test). In Experiment 2, the preference test was contrasted with a conceptually driven color-choice test. Altering the shape of the object from study to test resulted in significant priming in the color-choice test but eliminated priming in the preference test. Preference judgments thus largely depend on perceptual processes. In Experiment 3, the preference and the color-choice test were studied under explicit test instructions. Differences in reaction times between the implicit and the explicit test suggest that the implicit test results were not an artifact of explicit retrieval attempts. In contrast with previous assumptions, it is therefore concluded that color is part of the representation that mediates perceptual priming.

16.
We propose a default Bayesian hypothesis test for the presence of a correlation or a partial correlation. The test is a direct application of Bayesian techniques for variable selection in regression models. The test is easy to apply and yields practical advantages that the standard frequentist tests lack; in particular, the Bayesian test can quantify evidence in favor of the null hypothesis and allows researchers to monitor the test results as the data come in. We illustrate the use of the Bayesian correlation test with three examples from the psychological literature. Computer code and example data are provided in the journal archives.

17.
Famously, Frank P. Ramsey suggested a test for the acceptability of conditionals. Recently, David Chalmers and Alan Hájek (2007) have criticized a qualitative variant of the Ramsey test for indicative conditionals. In this paper we argue for the following three claims: (i) Chalmers and Hájek are right that the variant of the Ramsey test that they attack is not the correct way of spelling out an acceptability test for indicative conditionals. But there is a suppositional variant of the Ramsey test which is still stated in purely qualitative terms, which avoids the problems, and which looks correct. (ii) While the variant of the Ramsey test that Chalmers and Hájek criticize is not correct, it is still a good approximation of a correct formulation of the Ramsey test which may be usefully employed in various contexts. (iii) The variant of the Ramsey test that Chalmers and Hájek suggest as a substitute for the deficient version of the Ramsey test is itself subject to worries similar to those raised by Chalmers and Hájek, if it is given a non-suppositional interpretation.

18.
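Just as different illnesses call for different medical techniques to diagnose them, different cognitive structures call for correspondingly designed test patterns, so that the test delivers high-quality diagnostic assessment. Traditional test formats, however, do not take into account the need for diagnostic tests targeted at different cognitive structures, so giving every examinee the same test form falls short in test efficiency. Cognitive diagnostic computerized adaptive testing can administer different items to examinees with different cognitive structures, but the item bank that supports the adaptive process contains no items designed for examinees with different cognitive structures, which leads to low item bank usage efficiency. The key to solving these problems is to explore how to design test patterns that correspond to different cognitive structures. This study uses Monte Carlo simulation to examine test design patterns for different cognitive structures under six attribute hierarchies. The results show that (1) under the same attribute hierarchy, the optimal test design pattern differs across cognitive structures; and (2) item banks built according to the optimal test design patterns for different cognitive structures are used more efficiently. Test developers can use these results to optimize the test design pattern for each cognitive structure and to guide item bank construction.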

19.
An effort to shorten the time spent in psychological testing and reporting is represented by a four-page form which offers opportunity for systematic evaluation of the test battery. Opportunity is provided for reference to relevant test evidence.

While such a form in no way lessens the necessity for full understanding of test results, it helps systematize presentation of test findings as well as serving as a guide to interpretation of the test battery.

20.
This study examined whether the behavior of male NIH Swiss mice in a putative animal model of depression, Porsolt's swim test, is related to that in other behavioral tests. The other tests were the plus-maze test of anxiety, the holeboard test of exploration and locomotor activity, and a test of seizure threshold to bicuculline. The immobility of the mice in the swim test did not correlate with their behavior in any of the other tests used. The only significant correlations found occurred between individual measures in the holeboard and plus-maze tests. The data suggest that immobility in the swim test is not related to behavior in the tests of anxiety, directed exploration, locomotor activity, or seizure threshold.
