Similar literature
Found 20 similar documents; search took 78 ms.
1.
2.
Recent conceptual and methodological innovations have led to new strategies for documenting the construct validity of test scores, including performance-based test scores. These strategies have the potential to generate more definitive evidence regarding the validity of scores derived from the Rorschach Inkblot Method (RIM) and help resolve some long-standing controversies regarding the clinical utility of the Rorschach. After discussing the unique challenges in studying the Rorschach and why research in this area is important given current trends in scientific and applied psychology, I offer 3 overarching principles to maximize the construct validity of RIM scores, arguing that (a) the method that provides RIM validation measures plays a key role in generating outcome predictions; (b) RIM variables should be linked with findings from neighboring subfields; and (c) rigorous RIM score validation includes both process-focused and outcome-focused assessments. I describe a 4-step strategy for optimal RIM score derivation (formulating hypotheses, delineating process links, generating outcome predictions, and establishing limiting conditions) and a 4-component template for RIM score validation (establishing basic psychometrics, documenting outcome-focused validity, assessing process-focused validity, and integrating outcome- and process-focused validity data). The proposed framework not only has the potential to enhance the validity and utility of the RIM, but might ultimately enable the RIM to become a model of test score validation for 21st-century personality assessment.

3.
Sleep disturbances are endemic in military personnel, with nonclinical populations averaging 6 hours of sleep. The Pittsburgh Sleep Quality Index (PSQI), however, has not been validated in this population. It is thus unknown whether the PSQI can differentiate clinically significant sleep disorders from sleep disturbances resulting from military duties with restricted sleep periods. After a clinical evaluation and polysomnogram, participants (N = 148) were classified as having insomnia only, obstructive sleep apnea (OSA) only, comorbid insomnia and OSA (CIO), service-related illnesses only (SRI–; pain, depression, posttraumatic stress disorder, traumatic brain injury), or as controls. Military personnel in the insomnia-only and CIO groups had higher PSQI scores (13.5 ± 2.8 and 14.7 ± 3.5, respectively) than the controls (8.9 ± 3.9). A cut-off score of ≥10 was optimal (90% sensitivity and 69% specificity) for determining clinically significant insomnia (≥12 for CIO; 84% sensitivity, 77% specificity). In military personnel, a PSQI score >5 is not necessarily indicative of a clinically significant sleep disorder. Elevated PSQI cut-off scores are likely better suited to differentiating military personnel who require further clinical evaluation from those who need only a more conservative sleep improvement protocol.
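
The cut-off comparison in this abstract can be illustrated with a short sketch. The scores and diagnostic labels below are invented for illustration and are not the study's data, and the function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(scores, has_disorder, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff counts as a positive screen."""
    tp = fn = tn = fp = 0
    for score, ill in zip(scores, has_disorder):
        positive = score >= cutoff
        if ill and positive:
            tp += 1      # disorder present, screen positive
        elif ill:
            fn += 1      # disorder present, screen negative
        elif positive:
            fp += 1      # disorder absent, screen positive
        else:
            tn += 1      # disorder absent, screen negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the disorder")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical PSQI totals (assumption: made-up values, not from the study).
scores = [14, 12, 9, 15, 11, 6, 8, 10, 4, 7]
has_insomnia = [True, True, True, True, True, False, False, False, False, False]

conventional = sensitivity_specificity(scores, has_insomnia, 6)   # score > 5
elevated = sensitivity_specificity(scores, has_insomnia, 10)      # score >= 10
```

In this toy sample the conventional threshold catches every case but flags most controls, while the elevated threshold trades a little sensitivity for much better specificity, which is the kind of trade-off the abstract describes.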

4.
Models describing the role of source credibility in information integration were tested in two experiments. In the first experiment, subjects estimated the value of used cars based on two cues: blue book value and an estimate provided by one of three friends who examined the car. The three sources were described as differing in mechanical expertise. In the second experiment, subjects rated the likeableness of persons described by either one or two adjectives, each adjective contributed by a different source. The sources differed with respect to the length of their acquaintance with the person to be rated. In both experiments, credibility of the source magnified the impact of the information he provided. Further, this multiplicative effect of a source was inversely related to the credibility of the other source, in violation of additive or constant-weight averaging models, but consistent with a relative-weight averaging model.

5.
In the theory of test validity it is assumed that error scores on two distinct tests, a predictor and a criterion, are uncorrelated. The expected-value concept of true score in the classical test-theory model as formulated by Lord and Novick, Guttman, and others, implies mathematically, without further assumptions, that true scores and error scores are uncorrelated. This concept does not imply, however, that error scores on two arbitrary tests are uncorrelated, and an additional axiom of “experimental independence” is needed in order to obtain familiar results in the theory of test validity. The formulas derived in the present paper do not depend on this assumption and can be applied to all test scores. These more general formulas reveal some unexpected and anomalous properties of test validity and have implications for the interpretation of validity coefficients in practice. Under some conditions there is no attenuation produced by error of measurement, and the correlation between observed scores sometimes can exceed the correlation between true scores, so that the usual correction for attenuation may be inappropriate and misleading. Observed scores on two tests can be positively correlated even when true scores are negatively correlated, and the validity coefficient can exceed the index of reliability. In some cases of practical interest, the validity coefficient will decrease with increase in test length. These anomalies sometimes occur even when the correlation between error scores is quite small, and their magnitude is inversely related to test reliability. The elimination of correlated errors in practice will not enhance a test's predictive value, but will restore the properties of the validity coefficient that are familiar in the classical theory.
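
A small numerical sketch may make these anomalies concrete. It is not taken from the paper: it assumes the simplest case, in which the only departure from the classical model is a nonzero correlation between the two error scores (each test's true scores are still taken to be uncorrelated with the other test's errors). Under that assumption, decomposing the covariance of the observed scores gives an observed correlation of `true_corr * sqrt(rel_x * rel_y) + error_corr * sqrt((1 - rel_x) * (1 - rel_y))`. The function names and the input values are hypothetical.

```python
import math


def observed_correlation(true_corr, error_corr, rel_x, rel_y):
    """Observed-score correlation when only the error scores are correlated.

    rel_x and rel_y are the reliabilities of the predictor and the criterion.
    """
    true_part = true_corr * math.sqrt(rel_x * rel_y)
    error_part = error_corr * math.sqrt((1 - rel_x) * (1 - rel_y))
    return true_part + error_part


def corrected_for_attenuation(observed_corr, rel_x, rel_y):
    """The usual correction, which presumes uncorrelated errors."""
    return observed_corr / math.sqrt(rel_x * rel_y)


# Illustrative inputs (assumptions, not values from the paper).
classical = observed_correlation(0.5, 0.0, 0.6, 0.6)    # errors uncorrelated
inflated = observed_correlation(0.5, 0.9, 0.6, 0.6)     # strongly correlated errors
sign_flip = observed_correlation(-0.2, 0.8, 0.5, 0.5)   # negative true correlation
overcorrected = corrected_for_attenuation(inflated, 0.6, 0.6)
```

With uncorrelated errors the observed correlation is attenuated below the true value of 0.5. With correlated errors it rises above 0.5, the usual correction then yields a value greater than 1, and a negative true correlation can appear as a positive observed one. Because the error term is scaled by `sqrt((1 - rel_x) * (1 - rel_y))`, its influence shrinks as reliability increases.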

6.
Faking on a biographical inventory compared to a traditional personality inventory was assessed in measuring the Five Factor Model of Personality. 705 subjects were randomly assigned to either an Answer Honestly or Faking condition. All subjects were recruited from psychology classes at two New Jersey State colleges. Women comprised 68.6% of the participants. The average age of the subjects was 24 yr. 383 subjects took part in the Answer Honestly condition. 322 participated in the Faking condition. In the Faking condition, subjects responded as if applying for the position of librarian. All subjects completed a biodata inventory, the NEO-Five Factor Inventory, a social desirability scale, and a letter-cancellation task, and self-reported their grade point average. Criterion-related validity was assessed for both test scores across samples. Comparisons between samples indicated that subjects inflated scores on both inventories in socially desirable directions. Biodata Inventory scores were less elevated under the Faking condition than the NEO-Five Factor Inventory scores.

7.
Maximum validity of a test with equivalent items (total citations: 1; self-citations: 0; citations by others: 1)
It is assumed that a scale of true scores on a function exists and that the probability of answering an item correctly is a curve of the type of the integral of the normal curve. The product moment correlation between the test score and true score is derived for a normal distribution of subjects and a test composed of equivalent items. Numerical examples demonstrate that the maximum correlation between test scores and true scores occurs for a one hundred item test when the point correlation between items is less than three tenths.

8.
In this paper, we report on the development and validity of the Professional Decision-Making in Research (PDR) measure, a vignette-based test that examines decision-making strategies used by investigators when confronted with challenging situations in the context of empirical research. The PDR was administered online with a battery of validity measures to a group of NIH-funded researchers and research trainees who were diverse in terms of age, years of experience, types of research, and race. The PDR demonstrated adequate reliability (alpha = .84) and parallel form correlation (r = .70). As hypothesized, the PDR was significantly negatively correlated with narcissism, cynicism, moral disengagement, and compliance disengagement; it was not correlated with socially desirable responding. In regression analysis, the strongest predictors of higher PDR scores were low compliance disengagement, speaking English as a native language, conducting clinical research with human subjects, and low levels of narcissism. Given that the PDR was written at an eighth-grade reading level to be suitable for use with English-as-a-second-language participants and that only one-fourth of items focused on clinical research, further research into the possible roles of culture and research ethics training across specialties is warranted. This initial validity study demonstrates the potential usefulness of the PDR as an educational outcome assessment measure and a research instrument for studies on professionalism and integrity in research.

9.
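刘玥 (Liu Yue), 刘红云 (Liu Hongyun). 《心理学报》 (Acta Psychologica Sinica), 2017, (9): 1234-1246
The bifactor model can include one general factor and several specific factors at the same time. It has unique advantages for describing the structure of multidimensional tests and has been applied increasingly widely in recent years. Based on the bifactor model, this article proposes four methods for computing composite total scores and dimension scores: the raw-score method, the summation method, the general-item weighted summation method, and the specific-item weighted summation method. A simulation study examined the performance of these methods and of the traditional multidimensional IRT method under varying sample size, test length, and correlation between dimensions. Finally, the results were verified in an empirical study. The results showed that (1) the total scores and dimension scores produced by the general weighted summation and specific weighted summation methods, especially the latter, were closest to the true values and had the highest reliability; (2) when correlations between dimensions were high and the test was long, the specific weighted summation method performed well, in some conditions even better than the multidimensional IRT method; and (3) only the dimension scores produced by the specific weighted summation method reflected the true correlations between dimensions.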

10.
Response style in objective psychological testing is an important issue in the reliability and validity of tests as well as in the interpretation of test results. The MCMI provides two response-style indices, the validity scale and the weight factor. The present work presents an additional statistic to assess random response in subjects. The Consistency Coefficient is the correlation between the subjects' endorsement of even and odd items across the 20 MCMI scales. The distributions of 500 patient and 500 randomly generated profiles were compared. Good separation between these distributions was found. The subject data were extremely negatively skewed, whereas the randomly generated data were normally distributed. Data are presented that display positive and negative predictive values, as well as sensitivity and specificity across ranges of prevalence and cut score. These data facilitate the identification of subjects who respond to the MCMI in a random manner so that their scores can be interpreted accordingly.
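
One way to read the Consistency Coefficient described here is sketched below. The abstract does not say how the item halves are scored, so this sketch assumes that each scale contributes one pair of numbers (the count of endorsed odd-positioned items and the count of endorsed even-positioned items) and that the coefficient is the Pearson correlation of those pairs across scales. The function names and the three-scale toy profiles are hypothetical; the actual MCMI has 20 scales.

```python
import math


def pearson(xs, ys):
    """Plain Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one half has no variance")
    return cov / math.sqrt(var_x * var_y)


def consistency_coefficient(responses_by_scale):
    """Correlate odd-half and even-half endorsement counts across scales.

    responses_by_scale: one list of 0/1 endorsements per scale, in item order.
    """
    odd_counts = [sum(items[0::2]) for items in responses_by_scale]   # items 1, 3, 5, ...
    even_counts = [sum(items[1::2]) for items in responses_by_scale]  # items 2, 4, 6, ...
    return pearson(odd_counts, even_counts)


# Toy profiles (assumption: invented responses on three four-item scales).
consistent = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
inconsistent = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0]]

r_consistent = consistency_coefficient(consistent)
r_inconsistent = consistency_coefficient(inconsistent)
```

A respondent who answers the two halves of each scale alike gets a coefficient near 1, whereas responses unrelated to item content would be expected to scatter around zero, in line with the separation between patient and random profiles that the abstract reports.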

11.
The predictive validity of a measure of job compatibility was studied for theater personnel. Scores on a forced-choice instrument, developed from the Job Compatibility Questionnaire (JCQ), predicted employee performance (r = .22, p < .05), turnover (r = -.35, p < .01), and scores on a "value composite" (reflecting a combination of job performance and employee retention criteria) as defined by the research sponsor (P = .41, p < .01). Furthermore, job compatibility scores explained a statistically significant increment in turnover and value composite score variance when analyzed concomitantly with verbal and numerical ability test scores. Finally, job compatibility scores were shown to be nonredundant with hiring decisions based on an application review, reference check, and interview, whereas the cognitive ability test scores shared considerable redundancy with hiring decisions based on the current selection system.

12.
This study examined the magnitude of differences in standard scores, convergent validity, and concurrent validity when an individual's performance was gauged using the revised and the normative update (Woodcock, 1998) editions of the Woodcock Reading Mastery Test in which the actual test items remained identical but norms have been updated. From three metropolitan areas, 899 first to third grade students referred by their teachers for a reading intervention program participated. Results showed the inverse Flynn effect, indicating systematic inflation averaging 5 to 9 standard score points, regardless of gender, IQ, city site, or ethnicity, when calculated using the updated norms. Inflation was greater at lower raw score levels. Implications for using the updated norms for identifying children with reading disabilities and changing norms during an ongoing study are discussed.

13.
The aim of our study was to provide validation and reproducibility data for the anxiety thermometer. This thermometer is either a continuous or a 10-point Likert-type scale on which subjects are asked to rate their anxiety feelings at a particular moment. It is a quick way to measure state anxiety. As a validation criterion the State-Trait Anxiety Inventory (STAI) A-State scale was used. To test the reproducibility of the thermometer, a test-retest correlation coefficient was calculated, with a retrospective second thermometer score. The ego-threatening situation used was a written examination. Two experiments were carried out during different examination conditions. The data consistently indicated that the validity and reproducibility of the anxiety thermometer are fair (correlation coefficients between .60 and .78). In the second study, the possible influence of two factors on the retrospective scores was additionally tested.

14.
The hypothesis that belief in the paranormal is related to reasoning errors was tested. College students were administered the Belief in the Paranormal Scale (Jones, Russell, & Nickel, 1977) and a syllogistic reasoning test. A slight but statistically significant correlation was observed between BPS scores and the number of errors made on the reasoning test. This relationship was larger, but not significantly so, for reasoning items with paranormal content than for items with symbolic content. An a priori comparison indicated that the relationship between BPS scores and reasoning errors was significantly greater for problems that required subjects to determine the validity of hypotheses given statements of evidence than for problems that required subjects to determine the validity of deduced empirical predictions given hypotheses. Thus, belief in the paranormal among college students was very moderately correlated with reasoning ability and was observed most clearly when the reasoning problems contained paranormal content and when they required subjects to determine the validity of hypotheses given evidential statements.

15.
General cognitive ability (GCA) is a recognized construct for predicting job performance and capacity to learn. However, it has recently been argued that the time constraints under which GCA is assessed might provoke test anxiety, which negatively biases GCA scores. This can then lead to erroneous rejection of qualified candidates in personnel hiring contexts. This paper aimed to investigate: (1) to what extent candidates’ GCA scores increase when tested without time constraints and the ability of this GCA score to predict job performance; and (2) the personality characteristics that hinder GCA test performance under time constraints. Results from two field studies conducted in an actual personnel selection context partially confirmed the hypotheses. They revealed that, aside from the improvement of all candidates’ GCA scores when time constraint was removed, only GCA assessed without time constraints predicts job performance. Furthermore, while all candidates’ scores were influenced by the time constraint condition, individuals who are anxious, low-impulse, low value-questioning and deliberating are more penalized by the time constraint condition of such testing and, thus, are more likely to be erroneously eliminated in a selection process.

16.
This study examines the predictive criterion-related validity of a series of professional certification tests for water and wastewater management operators. Certification test data were obtained on 164 operators holding one of three jobs in water or wastewater management facilities. The certification test scores were broken down into four component scores and a total score. Criterion data consisted of performance evaluations obtained from the operator's supervisor, a self-rating from the operator, or both. Test scores were correlated with the job performance evaluations. The results indicated that scores on the certification tests were not related to rated job performance by job type, job level, source of performance evaluation, or component of job performance. The findings are discussed in the context of establishing an appropriate criterion against which to validate certification tests, and the practical problems of doing so in an applied setting.

17.
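Objective: To revise the Meier Art Judgment Test and examine its reliability and validity. Methods: The Meier Art Judgment Test was administered to a total of 2,270 people from six universities and secondary vocational schools. Items were screened using CTT discrimination indices together with IRT model-fit tests and discrimination parameters. Criterion-related validity was examined using the Holland artistic subtest, students' self-rated level of artistic creation, and a subscale of past art experience as criteria, and by the criterion-group method (art versus non-art majors). Results: All 61 retained items fit the IRT two-parameter logistic model; scale scores correlated significantly with each criterion score, and the scores of art and non-art majors differed significantly. However, test information analysis indicated that measurement error was relatively large for high-ability examinees. Conclusion: The revised scale can measure an individual's art judgment ability; future improvement should focus on adding more difficult items.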

18.
Work simulation performance tests of filing and proofreading were the principal criteria in a validation study of a paper and pencil clerical selection test constructed by content-oriented methods; supervisory ratings were also used. Experiment 1 was a concurrent study using as subjects 59 provisional employees. Experiment 2 was a predictive study using as subjects 184 employees actually selected on the basis of test scores. In both studies, substantial correlations were found between scores on the selection test and performance on the work samples: Estimates of correlations in the original selection group ranged from .4 to .8. Correlations between the selection test and the supervisor ratings were significant in Experiment 1, but not in Experiment 2; even when significant, they were much lower than the correlations with the work samples. These results suggest the value of using work samples as criteria for validation studies. Implications for other validation efforts are considered.

19.
This report analyzes the contribution of gender, ethnic status, age, and school classification to the five factor scores and the comparison score of the Adaptive Behavior Scale-School Edition (ABS-SE). These factor scores were derived from extensive analysis of the performance of subjects of different ages and different levels of mental retardation. The comparison score evolved from discriminant analysis of the factor scores and is computed as a weighted sum of the three factor scores from Part One of the ABS-SE. The results of the ANOVAs conducted to test the main and interaction effects showed significant mean differences between normal, mildly retarded, and moderately to severely retarded subjects over the age range from 3 through 16 years on the ABS-SE factor and comparison scores. In general, there was no significant contribution of either gender or ethnic status to scores from ages 7 through 16, but there was a significant difference attributable to ethnic status with a meaningful amount of explained variance in the community and self-sufficiency and comparison scores of subjects aged 3–6. Although these differences were significant, children 3–6 years old classified as white did not necessarily perform better on all factor scores than those of minority ethnic groups. Discussion of the results in the context of contemporary criteria for test bias and of competing explanations for ethnic group differences in performance of young children on the ABS-SE follows the presentation of the findings. The results provide additional evidence for the validity of the ABS-SE factor and comparison scores and show that in general the factor and comparison scores are not affected by gender or ethnic status.

20.
The predictive validity of General Aptitude Test Battery and Sixteen Personality Factor Questionnaire scores was compared to that of standard training ratings made by vocational instructors against the criterion of work performance measured by the Minnesota Satisfactoriness Scales for a sample of 106 employees with severe handicaps. The psychometric test variables were not correlated with the criterion; however, the training ratings were consistently predictive of the job satisfactoriness scores. These results suggest that the employment potential of job applicants with disabilities can be assessed more accurately using situational training ratings, as opposed to standardized psychometric test scores.
