Similar articles
1.
A meta-analysis of job analysis reliability   Total citations: 1 (self-citations: 0; citations by others: 1)

2.
The purpose and background of a series of three studies dealing with validity generalization are discussed, an overview of the series is provided, and the initial study is described.
In the initial study, a total of 76 insurance company jobs were analyzed by 203 raters in an effort to assess the potential usefulness of the Position Analysis Questionnaire (PAQ) as a job analysis device to be employed in a more extensive, companywide research program. Examination of rate-rerate and interrater reliability suggested that rate-rerate reliability was quite good, while interrater reliability estimates were somewhat lower. Cluster analysis techniques were applied to the five Overall and 17 Component dimensions of the PAQ, yielding six job families in both analyses. The job families were described in terms of the various PAQ dimensions and were judged to be organizationally meaningful. Although the job clustering results were generally acceptable, it was conjectured that job families formed on the basis of company-specific dimensions might prove even more meaningful.

3.
4.
Lopez MN, Lazar MD, Oh S. Assessment, 2003, 10(1): 66-70
The psychometric properties of the Hooper Visual Organization Test (VOT) have not been well investigated. Here the authors present internal consistency and interrater reliability coefficients, and an item analysis, using data from a sample (N = 281) of "cognitively impaired" and "cognitively intact" patients, and patients with undetermined cognitive status. Coefficient alpha for the VOT total sample was .882. An item analysis found that 26 of the 30 items were good at discriminating among patients. Also, the interrater reliabilities for three raters (.992), two raters (.988), and one rater (.977) were excellent. Therefore, the judgmental scoring of the VOT does not interfere significantly with its clinical utility. The authors conclude that the VOT is a psychometrically sound test.

5.
A finding by Smith and Hakel (1979) is that job expert ratings on the PAQ correlate highly with ratings obtained from college students who are given no information about jobs other than their titles. One possible explanation for this finding is that the PAQ measures only trivial or common knowledge about work that both experts and naive observers possess. This particular view, of course, has serious implications for the use of the PAQ. In this paper we point out a problem in the way Smith and Hakel calculated the convergent validity between expert raters and naive raters. We also present the results from a replication study that indicate convergent validities are much lower than those reported by Smith and Hakel. Additional points are presented in order to caution against the interpretation that the PAQ measures only common knowledge about jobs.

6.
Recent studies have attempted to reduce the cost and intrusiveness of the Position Analysis Questionnaire (PAQ) by limiting the amount of information provided to the analyst, with consistently negative results. We examined an alternative technique for improving the cost-effectiveness of the PAQ that avoids the need to rate the hundreds of items that constitute the instrument. Three groups of raters (professional job analysts, graduate students in industrial psychology who were familiar with the PAQ, and PAQ-unfamiliar undergraduates) made direct holistic ratings of the PAQ dimensions for four familiar jobs. The holistic ratings were compared with decomposed PAQ dimension profiles obtained from the item-level ratings of the professional analysts. Cronbach accuracy analyses indicated near-zero convergence between the holistic and decomposed dimension ratings, even for the professional PAQ job analysts. We conclude that holistic rating of dimensions is not an effective means of reducing the cost of PAQ job analyses and that it is likely to be similarly ineffective with task- or ability-based instruments.

7.
8.
A program is described for computing interrater reliability by averaging, for each rater, the correlations between that rater’s ratings and every other rater’s ratings. For situations in which raters rate more than one ratee, raters’ reliabilities can be computed for either each item or each ratee. The program reads data from a text file and writes the reliability coefficients to a text file. The standard Macintosh interface is implemented. The Quick-BASIC program is distributed both as a listing and in compiled form; it can take advantage of a math coprocessor when one is present.
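The averaging procedure described in this abstract can be illustrated with a short Python sketch. This is not the Quick-BASIC program itself: the function names, the dictionary input format, and the example ratings are all assumptions made for illustration, and the file input and output are omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_interrater_correlations(ratings):
    """Map each rater to the mean correlation with every other rater.

    `ratings` (hypothetical format) maps a rater name to that rater's
    scores, with all lists in the same ratee (or item) order.
    """
    result = {}
    for rater, scores in ratings.items():
        rs = [pearson_r(scores, other_scores)
              for other, other_scores in ratings.items() if other != rater]
        result[rater] = sum(rs) / len(rs)
    return result


# Hypothetical data: three raters scoring the same four ratees.
example = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
reliabilities = mean_interrater_correlations(example)
```

In this made-up example, rater C ranks the ratees in the opposite order to A and B, so C's mean correlation with the others is the lowest of the three.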

9.
The assessment of multiliterate handwriting performance is rarely reported despite increased globalization. The present study describes the psychometric properties of a handwriting speed test developed for children who are biliterate in English and Chinese. This included interrater reliability, test-retest reliability, interitem correlation, construct validity, and concurrent validity. The test's reliabilities between two raters and over a 1-wk. interval were high with ICCs ranging from .89 to .99. Interitem correlation between the English and Chinese items was .87. The presence of age trends but not sex differences was a positive indicator of the test's validity. Correlations of .91 and 1.00 between the Chinese and the English items of the Handwriting Assessment Tool with the Chinese Handwriting Speed Test and Handwriting Speed Test, respectively, provided evidence of concurrent validity. These preliminary results showed the Handwriting Assessment Tool is reliable and is a potentially useful handwriting test for children biliterate in English and Chinese. The feasibility of assessing biliterate handwriting speed performance with the same set of scoring criteria for different writing systems was supported.

10.
This paper demonstrates and compares methods for estimating the interrater reliability and interrater agreement of performance ratings. These methods can be used by applied researchers to investigate the quality of ratings gathered, for example, as criteria for a validity study, or as performance measures for selection or promotional purposes. While estimates of interrater reliability are frequently used for these purposes, indices of interrater agreement appear to be rarely reported for performance ratings. A recommended index of interrater agreement, the T index (Tinsley & Weiss, 1975), is compared to four methods of estimating interrater reliability (Pearson r, coefficient alpha, mean correlation between raters, and intraclass correlation). Subordinate and superior ratings of the performance of 100 managers were used in these analyses. The results indicated that, in general, interrater agreement and reliability among subordinates were fairly high. Interrater agreement between subordinates and superiors was moderately high; however, interrater reliability between these two rating sources was very low. The results demonstrate that interrater agreement and reliability are distinct indices and that both should be reported. Reasons are discussed as to why interrater reliability should not be reported alone. This paper is based, in part, on a thesis submitted to East Carolina University by the second author. Portions of this study were presented at the American Psychological Association meeting in New Orleans, LA, August, 1989. The authors would like to thank Michael Campion and two anonymous reviewers for their comments on earlier drafts of this paper.
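A small Python sketch may help show why agreement and reliability can diverge. It uses the proportion of exact matches as a simple stand-in for an agreement index (it is not the T index from the abstract) and Pearson r as the reliability estimate. All ratings and names below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def exact_agreement(x, y):
    """Proportion of ratees given identical scores by both raters."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Case 1 (hypothetical): one rater is uniformly two points harsher.
# The rank order is identical, yet no rating matches exactly.
lenient = [3, 4, 5, 6, 7]
harsh = [1, 2, 3, 4, 5]
case1 = (pearson_r(lenient, harsh), exact_agreement(lenient, harsh))

# Case 2 (hypothetical): ratings cluster on one scale point.
# Most ratings match, but the little variance present does not covary.
rater_1 = [3, 3, 3, 3, 4]
rater_2 = [3, 3, 3, 4, 3]
case2 = (pearson_r(rater_1, rater_2), exact_agreement(rater_1, rater_2))
```

Case 1 gives a perfect correlation with zero exact agreement, while case 2 gives majority agreement with a negative correlation, which is the kind of pattern that makes reporting only one of the two indices misleading.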

11.
Previous research on measurement error in job performance ratings estimated reliability using three coefficients: alpha, test–retest, and interrater correlation. None of these three coefficients controls for all four main sources of error in performance ratings. For this reason, the coefficient of equivalence and stability (CES) has been suggested as the ideal estimate of reliability. This article presents the estimates of CES for time intervals of 1, 2, and 3 years. The values obtained for a single rater were .51, .48, and .44, respectively. For two raters, the values were .59, .55, and .51. The findings suggest that previous reliability estimates based on alpha, test–retest, and interrater coefficients overestimated the reliability of job performance ratings. In the present study, the interrater coefficient overestimates reliability by 13.6–25.4% for a time interval of 1–3 years, as it does not control for transient error. Results also showed that the importance of transient error increases as the length of the interval between the measures increases. Based on the results, it is suggested that corrected validities based on interrater reliability underestimate the magnitude of the validity. The implications of these findings for future efforts to estimate criterion reliability and predictor validity are discussed.

12.
The available statistical tests of the equality of nonindependent alpha reliability coefficients require that the product of the number of test parts times the number of subjects be quite large (1000 or more). A modification of one of these tests is derived which avoids this limitation. Monte Carlo studies indicate that the modified test effectively controls the Type I error rate with as few as 2 or 3 test parts and 50 subjects. This means the modified test can be safely employed in comparisons between interrater reliabilities.

13.
Cooke DJ, Hart SD, Michie C. Psychological Assessment, 2004, 16(3): 335-339
Cross-national differences in the prevalence of psychopathy have been reported. This study examined whether rater effects could account for these differences. Psychopathy was assessed with the Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991). Videotapes of 6 Scottish prisoners and 6 Canadian prisoners were rated by 10 Scottish and 10 Canadian raters. No significant main or interaction effects involving the nationality of raters were detected at the level of full scores or factor scores. Using a generalizability theory approach, it was demonstrated that the interrater reliability of total scores was good, that is, the proportion of variance in test scores attributable to raters was small. The interrater reliability of factor scores was lower, typically falling in the fair range. Overall, the results suggest that the reported cross-national differences are more likely to be in the expression of the disorder rather than in the eye of the beholder.

14.
The purpose of this study was to investigate the interrater reliability of the visual-motor portion of the Copying subtest of the Stanford-Binet Intelligence Scale: Fourth Edition. Eight raters independently scored 11 protocols completed by children aged 5 through 10 years, using the scoring criteria and guidelines in the manual. The raters marked each of 10 items pass or fail and computed a total raw score for each protocol. Interrater reliability coefficients were obtained for each child's protocol, and the Kappa coefficient was computed for each item. Significant raters' reliability coefficients ranged from .82 to .91, which were low in comparison to test-retest reliability and Kuder-Richardson-20 coefficients for this and other subtests of the Stanford-Binet in the technical manual. Percent agreement among 8 raters also indicated weak reliability. Although the obtained results suggested some interrater reliability coefficients within acceptable levels, questions were raised about the scoring criteria for individual items. Caution is warranted in the use of cognitive measures which include subjective judgement of the examiner in applying scoring criteria.
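For readers unfamiliar with kappa, the following Python sketch computes Cohen's kappa for pass/fail scores from two raters. It is only an illustration under stated assumptions: the study used eight raters and the abstract does not say which kappa variant was computed, so the two-rater form, the function name, and the example scores are all hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    # Observed proportion of identical decisions.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical pass (1) / fail (0) decisions by two raters on 10 protocols.
scores_a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
scores_b = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
kappa = cohens_kappa(scores_a, scores_b)
```

Here the raters agree on 8 of 10 decisions, but because both pass 60% of the protocols, about half of that agreement would be expected by chance, so kappa comes out well below the raw 80% agreement.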

15.
A Monte Carlo computer study was conducted to investigate the statistical power of the univariate repeated measures ANOVA design proposed by Arvey and Mossholder (1977) to detect job differences. Also investigated was the relative value and usefulness of omega-squared estimates to indicate job similarities and differences. Job profile means and covariance structures were generated by using data from six relatively similar jobs and six dissimilar jobs based on Position Analysis Questionnaire (PAQ) data bank information. Different combinations of job differences (4 conditions), number of job raters (2 conditions), and violations of statistical assumptions (3 conditions) were generated (1000 sets for each of the 24 combinations) and each data set was analyzed using the ANOVA design. Results indicate that testing for statistical significance is not as useful in determining job differences as examining the omega-squared estimates. Specifically, the omega-squared estimate for the Jobs × Dimension interaction effect is a relatively sensitive and stable indicator of job differences regardless of the number of raters and violations of the statistical assumptions.

16.
This study examined the feasibility of developing a job family-specific (clerical) PAQ which could yield substantially the same amount of information as the original PAQ. By deleting 80 minimally applicable items, a 107-item custom PAQ was developed. A component analysis of the custom PAQ resulted in 18 interpretable dimensions which were quite similar to many of the original 32 dimensions. The ability of the 18 dimensions to predict compensation data and discriminate between job levels was investigated. The use of custom PAQ division data for these purposes was also investigated. Results of the study indicated that the 18 (custom PAQ) clerical dimensions were able to discriminate between job levels while retaining the predictive power of the original PAQ. Additionally, the modified PAQ division scores exhibited significant utility in differentiating among job levels, but were less efficient in predicting compensation than the dimension scores. The authors would like to express their deep appreciation to Brett Avner, Elizabeth Carlson, Dave Drehmer, and Jack Edwards for their help with the study. The authors would also like to thank Dr. Ernest J. McCormick and two anonymous reviewers for their comments.

17.
Interrater and intrarater reliability were evaluated for a test measuring active rotation range in a standing position. Subjects stood with their feet comfortably apart while a horizontal bar rested on their shoulders. A plumb bob attached to the end of the bar was allowed to drop to the floor, indicating maximal rotation range achieved. Two raters measured 24 subjects (M age = 35 +/- 14 yr.), who were sedentary office workers and active recreational golfers, on two occasions separated by two weeks to obtain values for left and right trunk rotation range. The test had good intrarater and interrater reliabilities, with standard error of measurement values varying from 5.6 degrees to 8.6 degrees against an overall mean range of 128 degrees. This simple active rotation test requires inexpensive equipment and could be incorporated into clinical examinations when there is a need to assess active rotation in standing with minimal constraints.

18.
Since the first published Situational Interview (SI) study (Latham, Saari, Pursell, & Campion, 1980), research has shown practical and psychometric support for the usefulness of this behavioral interview method. However, such studies have often failed to distinguish the effects of "interview context" factors, such as the SI's behaviorally anchored scale and the use of job expert interviewers on SI ratings. To aid HR managers interested in adopting a behavioral interview system, this study examined the contributions of the SI's behaviorally anchored scale and the interviewer's job expertise to the interrater agreement and accuracy of ratings of situational questions. Two police samples (job content experts) and a student sample (naive raters) showed that ratings of videotaped interviews for police sergeant/lieutenant positions based on the SI scale were significantly superior to those gained using a more traditional rating format, and that job experts did not produce better ratings than naive raters.

19.
We studied 205 low-income families, using the Family Needs Scale (FNS). Factor analysis of the FNS data resulted in a 7-factor solution with high internal consistency within the various subscales. We provide normative scores based on the factor structure of the FNS. A total of 53 parents completed the FNS on two occasions with an average of four weeks between these two ratings. In general, the test-retest reliabilities were low to moderate. A total of 61 pairs of parents independently rated their families with the FNS. Again, agreement between raters was low to moderate. Several factors that may have detracted from better test-retest and interrater reliability were identified. Our data point to the need for more psychometric studies with the FNS.

20.
Despite the rising popularity of the practice of competency modeling, research on competency modeling has lagged behind. This study begins to close this practice–science gap through 3 studies (1 lab study and 2 field studies), which employ generalizability analysis to shed light on (a) the quality of inferences made in competency modeling and (b) the effects of incorporating elements of traditional job analysis into competency modeling to raise the quality of competency inferences. Study 1 showed that competency modeling resulted in poor interrater reliability and poor between-job discriminant validity amongst inexperienced raters. In contrast, Study 2 suggested that the quality of competency inferences was higher among a variety of job experts in a real organization. Finally, Study 3 showed that blending competency modeling efforts and task-related information increased both interrater reliability among SMEs and their ability to discriminate among jobs. In general, this set of results highlights that the inferences made in competency modeling should not be taken for granted, and that practitioners can improve competency modeling efforts by incorporating some of the methodological rigor inherent in job analysis.
