Similar articles
 20 similar articles found; search took 31 ms
1.
2.
3.
The test-retest stability of the Slosson Full-Range Intelligence Test by Algozzine, Eaves, Mann, and Vance was investigated with test scores from a sample of 103 students. With a mean interval of 13.7 mo. and different examiners for each of the two test administrations, the test-retest reliability coefficients for the Full-Range IQ, Verbal Reasoning, Abstract Reasoning, Quantitative Reasoning, and Memory were .93, .85, .80, .80, and .83, respectively. Mean differences between test and retest scores were not statistically significant for any of the scales. Results suggest that Slosson scores are stable over time even when different examiners administer the test.
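
As an illustration of the kind of statistic reported above, the sketch below computes a test-retest coefficient as the Pearson correlation between two administrations, plus the mean score change. The abstract does not say which coefficient the authors used, so Pearson's r is an assumption here, and the scores and the names `pearson_r`, `time_1`, and `time_2` are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sums of cross-products and squared deviations (the n - 1 terms cancel).
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for five examinees on two administrations (not study data).
time_1 = [95, 102, 110, 88, 120]
time_2 = [97, 100, 113, 90, 118]

retest_r = pearson_r(time_1, time_2)
mean_change = sum(b - a for a, b in zip(time_1, time_2)) / len(time_1)
print(f"test-retest r = {retest_r:.2f}, mean change = {mean_change:.1f}")
```

A high correlation together with a mean change near zero is the pattern the abstract describes as stability; the correlation alone would not detect a uniform shift in scores.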

4.
5.
6.
Long-term effects in a neurosurgically separated twin pair were illuminated by standard psychological test scores obtained over a period from 2 to 38 years of age. Interdigitation of the gyri of their right frontal lobes had necessitated separation in two stages at 4 months of age. One twin clearly suffered some brain injury and showed some impairment during the testing at 5 years of age. The scores of both twins rose at the adult testing. The brighter twin has an IQ comparable to that of the mother. The unique data set is a kind of model for long-term assessment of early brain surgery, particularly with craniopagus twins.

7.
A sample of 183 medical students completed the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT V2.0). Scores on the test were examined for evidence of reliability and factorial validity. Although Cronbach's alpha for the total scores was adequate (.79), many of the scales had low internal consistency (scale alphas ranged from .34 to .77; median = .48). Previous factor analyses of the MSCEIT are critiqued and the rationale for the current analysis is presented. Both confirmatory and exploratory factor analyses of the MSCEIT item parcels are reported. Pictures and faces items formed separate factors rather than loading on a Perception factor. Emotional Management appeared as a factor, but items from Blends and Facilitation failed to load consistently on any factor, rendering factors for Emotional Understanding and Emotional Facilitation problematic.
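
For readers unfamiliar with the internal-consistency statistic cited above, here is a minimal sketch of Cronbach's alpha using its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The item responses and the names `cronbach_alpha` and `items` are hypothetical and unrelated to the MSCEIT data.

```python
def variance(values):
    """Unbiased sample variance (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores holds one list per item; each list has one score per respondent,
    in the same respondent order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent: sum across items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: three items answered by five respondents.
items = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 4, 1],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```

Because alpha tends to rise with the number of items, a total score can reach an adequate alpha while short subscales remain low, which is consistent with the pattern the abstract reports.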

8.
In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e‐rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability.

9.
The representation of test scores as n-dimensional points leads directly to an estimate of error variance at a particular score level in the case of equivalent items. Approximations are suggested for the case of non-equivalent items. These approximations are compared, with satisfactory results, with empirical data prepared by Dr. Mollenkopf.

10.
Three studies are reported concerned with people's perception of their own personality, their acceptance of bogus ‘personality’ feedback, and the relationship between their ‘actual’ personality scores and their willingness to accept bogus feedback. In the first study subjects attempted to predict their own and a well-known other person's personality scores. They were fairly good at predicting some of their own scores (extraversion, neuroticism) but less so others, suggesting that people can recognize their own ‘correct’ personality feedback. In the second study subjects were given either positive (Barnum Statements) or negative (reverse Barnum Statements) ‘bogus’ feedback after a personality test. They tended to accept the positive feedback as more accurate than the negative feedback though this was not related to their actual scores. In the third study subjects were given four types of feedback statements after a personality test: general positive, general negative, specific positive and specific negative. As predicted, people tend to accept general rather than specific, and positive rather than negative feedback as true. Furthermore, acceptance was closely related to neuroticism and extraversion in a predicted direction. These results are discussed in terms of the uses and abuses of validation of personality feedback.

11.
12.
13.
In classical test theory, a high-reliability test always leads to a precise measurement. However, when it comes to the prediction of test scores, this is not necessarily so. Based on a Bayesian statistical approach, we predicted the distributions of test scores for a new subject, a new test, and a new subject taking a new test. Under some reasonable conditions, the predicted means, variances, and covariances of predicted scores were obtained and investigated. We found that high test reliability did not necessarily lead to small variances or covariances. For a new subject, higher test reliability led to larger predicted variances and covariances, because high test reliability enabled a more accurate prediction of test score variances. Regarding a new subject taking a new test, in this study, higher test reliability led to a large variance when the sample size was smaller than half the number of tests. Classical test theory is reanalyzed from the viewpoint of predictions and some suggestions are made.

14.
Background: UK schools have a long history of using reasoning tests, most frequently of Verbal Reasoning (VR), Non‐Verbal Reasoning (NVR), and to a lesser extent Quantitative Reasoning (QR). Results are used for identifying students' learning needs, for grouping students, for identifying underachievement, and for providing indicators of future academic performance. Despite this widespread use there is little empirical data on the long‐term consistency of VR, QR and NVR as discrete abilities. Aims: To evaluate and compare the consistency of VR, QR and NVR scores over a 3‐year period, and to explore the influence of the secondary school on pupils' progress in the tests. Sample: Data were collected on a longitudinal sample of over 10,000 pupils who completed the Cognitive Abilities Test Second Edition in year 6 (age 10+) and year 9 (age 13+), and GCSE public examinations in year 11 (age 15+). Methods: Correlation coefficients and change scores for individual pupils are calculated. Multilevel modelling is used to determine school effects on reasoning scores and GCSE public examination results. Results: The results reveal high correlations in scores over time, ranging from .87 for VR to .76 for NVR, but also show that around one‐sixth of pupils on the VR test and one‐fifth of pupils on the QR and NVR tests change their scores by 10 or more standard score points. Schools account for only a small part of the total variation in reasoning score, although they account for a much greater proportion of the variation in measures of attainment such as GCSE. School effects on pupils' progress in the reasoning tests between age 10 and age 13 are relatively modest. Conclusions: Reasoning tests make excellent baseline assessments for secondary schools. Some practical and policy implications for schools are discussed.

15.
The negative hypergeometric distribution of raw scores on mental tests is derived from certain assumptions relating to test theory. This result is checked empirically in a number of examples. Further derivations lead to the bivariate distribution of parallel tests which is also verified with actual data. The bivariate distribution of raw score and true score is also derived from a further assumption. This distribution is used to set confidence limits for true scores for persons with a given raw score. This work was supported in part by contract Nonr-2752(00) between the Office of Naval Research and Educational Testing Service. Reproduction in whole or in part for any purpose of the United States Government is permitted.
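
The abstract gives no formula, so the following is only a sketch under a stated assumption: that the raw-score distribution in question can be written in the form commonly parameterised as a beta-binomial, with binomial errors and a beta-distributed true proportion-correct score. The function names and the parameter values `n_items`, `a`, and `b` are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """P(raw score = x) on an n-item test under a beta-binomial model.

    Assumption (not from the abstract): true proportion-correct scores follow
    Beta(a, b) and, given the true score, the raw score is Binomial(n, p).
    """
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(x + a, n - x + b) - log_beta(a, b))


# Hypothetical 20-item test whose true-score distribution is Beta(6, 4).
n_items, a, b = 20, 6.0, 4.0
pmf = [beta_binomial_pmf(x, n_items, a, b) for x in range(n_items + 1)]
mean_score = sum(x * p for x, p in enumerate(pmf))
print(f"total probability = {sum(pmf):.6f}, mean raw score = {mean_score:.2f}")
```

Working in log space with `lgamma` avoids overflow in the binomial coefficient for longer tests.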

16.
17.
We introduce two simple empirical approximate Bayes estimators (EABEs) for estimating domain scores under binomial and hypergeometric distributions, respectively. Both EABEs (derived from corresponding marginal distributions of observed test score x without relying on knowledge of prior domain score distributions) have been proven to hold -asymptotic optimality in Robbins' sense of convergence in mean. We found that, for the monotonized versions of the two EABEs under Van Houwelingen's monotonization method, the convergence rate of the overall expected loss of Bayes risk depends on test length, sample size, and ratio of test length to size of domain items. In terms of conditional Bayes risk, and outperform their maximum likelihood counterparts over the middle range of domain scales. In terms of mean-squared error, we also found that: (a) given a unimodal prior distribution of domain scores, performs better than both and a linear EBE of the beta-binomial model when domain item size is small or when test items reflect a high degree of heterogeneity; (b) performs as well as when prior distribution is bimodal and test items are homogeneous; and (c) the linear EBE is extremely robust when a large pool of homogeneous items plus a unimodal prior distribution exists. The authors are indebted to both anonymous reviewers, especially Reviewer 2, and the Editor for their invaluable comments and suggestions. Thanks are also due to Yuan-Chin Chang and Chin-Fu Hsiao for their help with our simulation and programming work.

18.
19.
The relationship between students' performance on the Developing Cognitive Abilities Test (DCAT), a test of scholastic aptitude, and their subsequent performance on the Medical College Admission Test (MCAT) was examined for 122 nontraditional premedical students who participated in a medical educational preparatory program. A stepwise multiple regression analysis produced moderate, though significant, multiple correlations among subscores on the two tests. With a few exceptions, the subscores on the Developing Cognitive Abilities Test made a significant contribution to the regression equation in the prediction of scores on MCAT subtests. Implications for the value of the Developing Cognitive Abilities Test as an admissions tool, as well as for providing direction for possible intervention, are discussed.

20.
Participants completed two well-established questionnaires online (HPI: Hogan Personality Inventory; and the HDS: Hogan Developmental Survey). Time taken to complete each study was correlated with scale scores from both questionnaires including the occupational scales derived from the HPI. Those who scored higher on Adjustment (Stability), and Prudence (Conscientiousness) but lower on Learning Approach took longer to complete the test. Those who scored higher on Stress Tolerance and Reliability took significantly longer than those with low scores on these measures. With the exception of Diligent and Dutiful, all correlations between Dark Side variables and time taken were negative, particularly Leisurely, Excitable and Imaginative. Regression showed that up to 6% of the time taken variance could be accounted for. Implications for measurement were considered.
