Similar articles
 Found 20 similar articles; search took 31 ms
1.
Significant job-relatedness was found for a posttraining job knowledge test criterion using an application of Lawshe's content validity method. The aide test was used as a criterion to assess the predictive validity of a vocabulary test and a civil service test with samples of black (N = 43) and white (N = 62) psychiatric aides. Significant validities were found on both tests, but the vocabulary test proved to be the better predictor of the criterion in both samples. The obtained validities were discussed in terms of differential validity, test fairness, and sample size. This study demonstrated that a content validity method could be applied to criteria as well as selection tests. It was concluded that content validity methods may be able to help solve the problem of criterion relevance in validation research by providing quantitative evidence of the job-relatedness of criteria.

2.
The relative validities of forced‐choice (ipsative) and Likert rating‐scale item formats as criterion measures are examined. While there has been much debate about the relative technical and psychometric merits and demerits of ipsative instruments, the present research focused on the crucial question of whether the use of this format has any practical benefit in terms of improved validity. An analysis is reported from a meta‐analysis data set. This demonstrates that higher operational validity coefficients (prediction of line‐manager ratings of competencies) are associated with the use of forced‐choice (r=.38) rather than rating scale (r=.25) item formats for the criterion measurement instrument when performance is rated by the same line managers on both formats and where the predictor is held constant. Thus the apparent criterion‐related validity of a predictor can increase by about 50% simply by changing the format of the criterion measurement instrument. The implications of this for practice are discussed.

3.
The correspondence between inferences made using two validation strategies (content and criterion-related) was examined in a specific personnel selection application. Empirical validity values and Lawshe's (1975) content validity ratios (CVR) were obtained for items from three structured interview guides used in the selection of insurance agents. Ratings of each item by over 300 field managers were used to calculate the CVR values. Statistically significant yet modest correlations were found between empirical item validities and content validities for an interview guide used to select applicants with prior insurance sales experience. No significant differences were found among these correlations by comparing job experts of different levels of managerial experience and experience in selection. Data for the interview guide used to select experienced applicants also indicated that a content validity approach can be useful in developing a selection instrument with an empirically valid composite rating. The hypotheses were not confirmed for interview guides used to select applicants with no prior insurance sales experience. The practical importance of these results is discussed, as are plans for future research.
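As an illustration of the content validity ratio mentioned in this abstract, here is a minimal sketch of Lawshe's (1975) formula, CVR = (n_e − N/2) / (N/2), where n_e is the number of panelists rating an item "essential" and N is the panel size. The function name and the example counts are hypothetical and are not taken from the study.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's CVR for one item: (n_e - N/2) / (N/2).

    n_essential: panelists who rated the item "essential" (n_e)
    n_panelists: total panelists who rated the item (N)
    The result ranges from -1 (nobody) through 0 (half) to +1 (everybody).
    """
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical example: 240 of 300 raters call an item "essential".
example_cvr = content_validity_ratio(240, 300)
print(round(example_cvr, 2))
```

A CVR of 0 means exactly half the panel rated the item essential; positive values mean a majority did.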

4.
Expert review sessions are often conducted to determine the content validity of scale items. The accurate quantification of content validity is usually limited by a relatively small number of experts as well as by a small number of rating categories. These factors, combined with the bounded and discrete nature of rating scale categories, hinder use of traditional methods for computing standard errors and confidence intervals. Using an application of the score method, researchers can construct an asymmetric interval that is better suited for these situations. SAS code is provided to automate the computations, and a discussion of two methods for using the obtained results for content validation decision making follows.
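The abstract does not give the formula, and the authors' SAS code is not reproduced here. As a loosely related sketch, the Wilson score interval for a binomial proportion shows how a score-type interval becomes asymmetric near the bounds of a scale when the number of experts is small. Treating the quantity of interest as the proportion of experts endorsing an item is an assumption made for illustration only; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def wilson_score_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score confidence interval for a binomial proportion.

    Unlike the symmetric Wald interval p +/- z*SE, this interval stays
    inside [0, 1] (up to rounding) and is asymmetric around p when p is
    near 0 or 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical example: 9 of 10 experts endorse an item.
lower, upper = wilson_score_interval(9, 10)
print(round(lower, 3), round(upper, 3))
```

With a panel this small, the lower limit sits much farther from the observed proportion than the upper limit does, which is the kind of asymmetry the abstract refers to.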

5.
The cross-cultural validity of a North American personality inventory, namely, the Personality Research Form (Jackson, 1984), was examined using 394 university students in the Philippines who were able to speak and read English. Scale validities, with self and peer ratings as criteria, were generally significant but modest. Moderate scale and peer rating reliabilities probably contributed to these results. Elevated scores on a PRF scale designed to detect careless responding suggested that failure to understand instructions or insufficient motivation may also account for the findings. Interestingly, recalculating validities for subsamples comprising ‘dependable’ and ‘undependable’ subjects yielded no substantial differences in overall validity. Implications for cross-cultural personality assessment are discussed.

6.
We developed a 36-item scale to measure Openness, using items on the California Psychological Inventory (CPI; Gough, 1957, 1987, 1996), Form 434. Items were initially chosen on the basis of content validity. Five samples (N = 2,375) were used to establish reliability, validity, and norms; 4 samples consisted of university undergraduate students, and 1 comprised applicants for nonmanagement call center jobs. Internal consistency estimates obtained in each sample averaged approximately .75, and test-retest stability, assessed in 1 sample, was estimated at .84. Cross-correlations with related scales, for example, the NEO Personality Inventory-Revised Openness scale (Costa & McCrae, 1992) and other CPI-based scales, provided evidence of construct validity. Statistically significant predictive validities were obtained in 2 call center job-incumbent samples, with range-corrected true validities of .20 to .36 for a number of job performance criteria. Construct and predictive validity were found to be higher than for other scales consisting of CPI items designed to measure Openness or a related construct. Finally, norms were prepared for university undergraduate students (n = 1,847) and nonmanagement service-sector job applicants (n = 528).

7.
We used meta-analytic procedures to investigate the criterion-related validity of assessment center dimension ratings. By focusing on dimension-level information, we were able to assess the extent to which specific constructs account for the criterion-related validity of assessment centers. From a total of 34 articles that reported dimension-level validities, we collapsed 168 assessment center dimension labels into an overriding set of 6 dimensions: (a) consideration/awareness of others, (b) communication, (c) drive, (d) influencing others, (e) organizing and planning, and (f) problem solving. Based on this set of 6 dimensions, we extracted 258 independent data points. Results showed a range of estimated true criterion-related validities from .25 to .39. A regression-based composite consisting of 4 out of the 6 dimensions accounted for the criterion-related validity of assessment center ratings and explained more variance in performance (20%) than Gaugler, Rosenthal, Thornton, and Bentson (1987) were able to explain using the overall assessment center rating (14%).

8.
This paper describes the development of a behaviorally based performance appraisal system. Blanz and Ghiselli's Mixed Standard Scale was used as the basis for developing the performance appraisal system for assessing the performance of highway patrol personnel. However, the particular developmental procedures described here differ in some respects from those reported in the literature. Rather than developing rating items describing general traits such as "diligence," "initiative," or "enthusiasm" in behavioral terms, the items in the present scale were developed to describe proficiency levels of specific job tasks. This characteristic is expected to enhance the objectivity of the evaluation system for both appraisal and job counseling purposes. The appraisal instrument was subjected to a series of reliability and validity tests that demonstrated its high reliability and validity. Although the content of the appraisal system described here included highway patrol tasks, a similar system could be developed using the procedures described for a wide variety of jobs and job levels.

9.
What type of items, keyed positively or negatively, makes social-emotional skill or personality scales more valid? The present study examines the different criterion validities of true- and false-keyed items, before and after correction for acquiescence. The sample included 12,987 children and adolescents from 425 schools of the State of São Paulo, Brazil (ages 11–18, attending grades 6–12). They answered a computerized 162-item questionnaire measuring 18 facets grouped into five broad domains of social-emotional skills: Open-mindedness (O), Conscientious Self-Management (C), Engaging with Others (E), Amity (A), and Negative-Emotion Regulation (N). All facet scales were fully balanced (3 true-keyed and 3 false-keyed items per facet). Criterion validity coefficients of scales composed of only true-keyed items versus only false-keyed items were compared. The criterion measure was a standardized achievement test of language and math ability. We found that coefficients were almost twice as large for scales of false-keyed items as for scales of true-keyed items. After correcting for acquiescence, the coefficients became more similar. Acquiescence suppresses the criterion validity of unbalanced scales composed of true-keyed items. We conclude that balanced scales with pairs of true- and false-keyed items make a better scale in terms of internal structural and predictive validity.

10.
The criterion‐related validities of empirical, rational, and hybrid keying procedures for a biodata inventory were compared at different sample sizes. Rational keying yielded the lowest validities. Hybrid keying performed best at the smallest sample sizes studied, followed by empirical keying at moderate sizes, and stepwise regression weighting of items at the largest sample sizes.

11.
A review and meta-analyses of validation studies published from 1964 to 1982 in the Journal of Applied Psychology and Personnel Psychology were undertaken to examine the effect of (1) research design; (2) criterion used; (3) type of selection instrument used; (4) occupational group studied; and (5) predictor-criterion combination on the level of observed validity coefficients. Results indicate that concurrent validation designs produce validity coefficients roughly equivalent to those obtained in predictive validation designs and that both of these designs produce higher validity coefficients than does a predictive design which includes use of the selection instrument. Of the criteria examined, performance rating criteria generally produced lower validity coefficients than did the use of other more "objective" criteria. In comparing the validities of various types of predictors, it was found that cognitive ability tests were not superior to other predictors such as assessment centers, work samples, and supervisory/peer evaluations, as has been found in previous meta-analytic work. Personality measures were clearly less valid. Compared to previous validity generalization work, much unexplained variance in validity coefficients remained after corrections for differences in sample size. Finally, the studies reviewed were deficient for our purposes with respect to the data reported. Selection ratios, standard deviations, reliabilities, and predictor and criterion intercorrelations were rarely and inconsistently reported. There are also many predictor-criterion relationships for which very few validation efforts have been undertaken.

12.
Despite extensive evidence that tests are valid for employee selection, Federal Guidelines have urged employers to seek alternative selection procedures that are equally valid but have less adverse impact on minorities. Research on the validity, adverse impact and fairness of eight categories of alternatives was reviewed. Feasibility of operational use of each type of alternative in an employment setting was also discussed. Only biodata and peer evaluation were supported as having validities substantially equal to those for standardized tests. Previous reviews and more recent research indicated that interviews, self-assessments, reference checks, academic achievement, expert judgment and projective techniques had levels of validity generally below those reported for tests. Data, where available, offered no clear indication that any of the alternatives met the criterion of having equal validity with less adverse impact. Results are discussed and several additional promising alternatives are described.

13.
The reliabilities and validities of true-false and forced-choice formats in personality assessment were compared. Subjects from college residential units were assigned randomly to groups receiving the Personality Research Form (PRF) in either forced-choice or standard true-false form. Reliabilities were substantially higher for the true-false form. Peer rating validities for each format were in a comparable range, but correlations with self-ratings were higher for the true-false form. Results do not support the contention that a forced-choice format is consistently more valid than a standard format. Subjects well acquainted with ratees manifested more highly differentiated judgments, showed consistently higher validity, but were more prone to show a bias to attribute more salient traits, like dominance and exhibition, to ratees.

14.
Supervisors' opportunity to observe incumbents' performance (i.e., how often a supervisor typically sees an employee's performance) has been suggested to be important for accurate performance rating and to be a moderator of criterion‐related validity. Here we test these suggestions and present empirical evidence for the effects of opportunity to observe. In Study 1, supervisors in a multi‐occupation/organization criterion‐related validation study for a biodata measure indicated the opportunity they had to observe incumbents. The data were split according to different levels of opportunity to observe. Higher validities were found when opportunity to observe was maximal. In Study 2, this finding was replicated using a cognitive ability test. These results suggest that psychologists should consider measuring opportunity to observe in criterion‐related validity studies.

15.
A criterion-related validation was conducted to assess the validity of four aptitude tests and five tests of content taken directly from job tasks in predicting job sample performance of apprentices in eight skilled trades. Observed validities were above .40 (corrected for range restriction, validities averaged .52). Though there were large subgroup mean differences on both predictor and criterion measures, there was no evidence of significant differential prediction.
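The abstract does not say which range restriction correction was applied. As a sketch under that caveat, the widely used correction for direct range restriction on the predictor (often called Thorndike's Case II) is r_c = rU / sqrt(1 + r²(U² − 1)), where U is the ratio of the unrestricted to the restricted predictor standard deviation. The function name and the input values below are illustrative assumptions, not figures from the study.

```python
import math


def correct_for_range_restriction(r_observed: float, sd_ratio: float) -> float:
    """Correct a validity coefficient for direct range restriction.

    r_observed: correlation observed in the restricted (selected) sample
    sd_ratio:   U = SD of the predictor in the unrestricted population
                divided by its SD in the restricted sample
    Returns r_c = r*U / sqrt(1 + r^2 * (U^2 - 1)).
    """
    if not -1.0 <= r_observed <= 1.0:
        raise ValueError("r_observed must lie in [-1, 1]")
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    u = sd_ratio
    return r_observed * u / math.sqrt(1 + r_observed**2 * (u**2 - 1))


# Illustrative values only: observed r = .40, applicant SD 1.5x incumbent SD.
corrected = correct_for_range_restriction(0.40, 1.5)
print(round(corrected, 2))
```

When U equals 1 there is no restriction and the coefficient is returned unchanged; larger values of U yield larger corrected validities.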

16.
A survey of the On-Line Psychology DEC Users’ Group was taken to evaluate DEC’s performance in the psychology research laboratory. Data were obtained for 20 rating scale items and 7 open-ended items in five categories. On a 10-point scale, DEC averaged about 6 on purchasing procedures and documentation, about 4 on delivery time and maintenance service, and about 7 on product satisfaction. The open-ended items provided users’ explanations for the ratings and new hardware and software that users would like to see from DEC in the near future.

17.
18.
PERSONALITY MEASURES AS PREDICTORS OF JOB PERFORMANCE: A META-ANALYTIC REVIEW   Total citations: 13 (self-citations: 0; citations by others: 13)
The purpose of this study was to investigate conflicting findings in previous research on personality and job performance. Meta-analysis was used to (a) assess the overall validity of personality measures as predictors of job performance, (b) investigate the moderating effects of several study characteristics on personality scale validity, and (c) appraise the predictability of job performance as a function of eight distinct categories of personality content, including the "Big Five" personality factors. Based on review of 494 studies, usable results were identified for 97 independent samples (total N = 13,521). Consistent with predictions, studies using confirmatory research strategies produced a corrected mean personality scale validity (.29) that was more than twice as high as that based on studies adopting exploratory strategies (.12). An even higher mean validity (.38) was obtained based on studies using job analysis explicitly in the selection of personality measures. Validities were also found to be higher in longer tenured samples and in published articles versus dissertations. Corrected mean validities for the "Big Five" factors ranged from .16 for Extroversion to .33 for Agreeableness. Weaknesses in the reporting of validation study characteristics are noted, and recommendations for future research in this area are provided. Contrary to conclusions of certain past reviews, the present findings provide some grounds for optimism concerning the use of personality measures in employee selection.

19.
Using the Theory of Planned Behavior as a framework, the Attitude to Leisure-time Physical Activity, Expectations of Others, Perceived Control, and Intention to Engage in Leisure-time Physical Activity scales were developed for use among high school students. The study population included 20 boys and 68 girls 13 to 17 years of age (for boys, M = 15.1 yr., SD = 1.0; for girls, M = 15.0 yr., SD = 1.1). Generation of items and the establishment of content validity were performed by professionals in exercise physiology, physical education, and clinical psychology. Each scale item was phrased in a Likert-type format. Both unipolar and bipolar scales with seven response choices were developed. Following the pilot testing and subsequent revisions, 32 items were retained in the Attitude to Leisure-time Physical Activity scale, 10 items were retained in the Expectations of Others scale, 3 items were retained in the Perceived Control scale, and 24 items were retained in the Intention to Engage in Leisure-time Physical Activity scale. Coefficients indicated adequate stability and internal consistency, with alpha ranging from .81 to .96. Studies of validities are underway, after which the scales will be made available to those interested in intervention techniques for promoting positive attitudes toward physical fitness, perception of control over engaging in leisure-time physical activities, and good intentions to engage in leisure-time physical activities. The present results are encouraging.

20.
This is a response to Gray and Wilson’s (2007) article: “A detailed analysis of the reliability and validity of the sensation seeking scale in a UK sample”. Gray and Wilson analysed the items in the four subscales of the SSS-V, using a Likert-type response format and deconstructing the forced-choice format of the original. However, they used some anachronistic items from the old 1978 form rather than the revisions of these items in the newer form. But even excluding the 19 items from the 80-item test that did not meet their internal reliability criterion did not improve the reliabilities of the old scales in their Likert format. Validity of the SSS is not really addressed despite the title of the article.
