Similar Literature
20 similar documents found (search time: 15 ms)
1.
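Rating Dimensions and Rating Effectiveness in Assessment Centers (total citations: 3; self-citations: 1; citations by others: 2)
This article gives a fairly systematic review of recent domestic and international research on assessment centers. First, it discusses how the number of rating dimensions affects rating results, as well as the four meta-dimensions in assessment centers. Second, it introduces the indices used to gauge rating effectiveness in assessment centers and discusses the types of rater training and their effects on rating effectiveness. Third, although assessment centers show good criterion-related validity, research on their construct validity has yet to reach a consensus. Finally, the article explores future research trends for assessment centers.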

2.
Frame-of-reference (FOR) rater training is one technique used to impart a theory of work performance to raters. In this study, the authors explored how raters' implicit performance theories may differ from a normative performance theory taught during training. The authors examined how raters' level and type of idiosyncrasy predict their rating accuracy and found that rater idiosyncrasy negatively predicts rating accuracy. Moreover, although FOR training may improve rating accuracy even for trainees with lower performance theory idiosyncrasy, it may be more effective in improving errors of omission than commission. The discussion focuses on the roles of idiosyncrasy in FOR training and the implications of this research for future FOR research and practice.

3.
We tested the effects of rater agreeableness on the rating of others' poor performance in performance appraisal (PA). We also examined the interactions between rater agreeableness and two aspects of the rating context: ratee self-ratings and the prospect of future collaboration with the ratee. Participants (n = 230) were allocated to one of six experimental groups (a 3 × 2 between-groups design) or a control group (n = 20). Participants received accurate, low-deviated, or high-deviated self-ratings from the ratee. Half were notified they would collaborate with the ratee in a future task. High rater agreeableness, positive deviations in self-rating, and the prospect of future collaboration were all independent predictors of higher PA ratings. The interactions between rater agreeableness and rating context were very small. We argue that conflict avoidance is an important motivation in the PA process.

4.
Once central to the identity and practice of clinical psychology, psychological assessment (PA) is currently more limited in professional practice and generally less emphasized in graduate training programs than in the past. Performance-based personality tests especially are taught and used less, even though scientific evidence of their utility and validity has never been stronger. We review research on training in PA and discuss challenges that contributed to its decreased popularity. We then review continuing education requirements for ethical practice in PA and recommend that PA should be reconceptualized as a specialty best practiced by psychologists who have the resources and time to maintain competency. We offer recommendations about how professional organizations concerned with PA can promote its practice and how individual expert clinicians can assist. We conclude by describing a collaborative model for providing group consultation in PA to practicing psychologists. If implemented widely, this model could help promote PA and raise its standard of practice.

5.
This review covers the past 25 years of research literature on training observers of behavior, specifically in the areas of interviewing, reducing rater bias, interpersonal perception and observation as a research tool. The focus is on training procedure (i.e., the various training designs and their components). An attempt is made to organize and systematize the research and to answer two important questions. Which approach(es) used to train observers of behavior has (have) been most successful? What are the theoretical or empirical bases for the development of those training programs?

6.
A total of 4 raters, including 2 teachers and 2 research assistants, used Direct Behavior Rating Single Item Scales (DBR-SIS) to measure the academic engagement and disruptive behavior of 7 middle school students across multiple occasions. Generalizability study results for the full model revealed modest to large magnitudes of variance associated with persons (students), occasions of measurement (day), and associated interactions. However, an unexpectedly low proportion of the variance in DBR data was attributable to the facet of rater, and the variance component for the facet of rating occasion nested within day (10-min interval within a class period) was negligible. Results of a reduced model and subsequent decision studies specific to individual rater and rater type (research assistant and teacher) suggested that reliability-like estimates differed substantially depending on the rater. Overall, findings supported previous recommendations that, in the absence of estimates of rater reliability and firm recommendations regarding rater training, ratings obtained from DBR-SIS be collected, and subsequent analyses be conducted, within rater. Additionally, results suggested that when selecting a teacher rater, the person most likely to substantially interact with target students during the specified observation period may be the best choice.

7.
This study examined the impact of various components of rater training on the accuracy of rating behavior using Direct Behavior Rating-Single Item Scales (DBR-SIS). Specifically, the addition of frame-of-reference and rater error training components to a standard package involving an overview and then modeling, practice, and feedback was investigated. In addition, amount of exposure to the direct training component (i.e., number of practice and feedback opportunities) was evaluated, and the rates at which behavior was displayed were carefully manipulated to control for and evaluate training impact by target and rate of behavior. The sample consisted of undergraduate students assigned to one of 6 possible conditions. Overall findings suggested that completion of a training package did result in enhanced accuracy when using DBR-SIS to rate academic engagement and disruption. However, results also supported that the most comprehensive package of DBR training may not always result in greater improvements over a standard package involving direct training. In general, a more intensive training package appeared beneficial at improving ratings for targets that had previously been difficult to rate accurately (e.g., medium rate disruptive behavior). Limitations and implications for future research are discussed.

8.
Eighty-six incumbents of three different jobs produced job-analytic ratings using either a decomposed (task-based) or a holistic (job-based) rating strategy. Approximately half of them received rater training in making inferential decisions. When tasks were less complex than the job as a whole, rating decomposition generally had positive effects on the quality of the ratings. Similarly, when the number of tasks rated was low to moderate, rater training was effective. A contingency approach, where limitations concerning the use of rating decomposition and inferential training were outlined, should serve to inform future uses and theories of rating aids in job analysis. We would like to acknowledge David Dorsey for his significant contribution to the rater training program. This article is a summary of the doctoral dissertation of Juan Sanchez, which was conducted under the supervision of Edward Levine.

9.
In performance-feedback situations, reactions to the rater have been examined rarely. A clearer understanding of what causes negative reactions toward raters might be used by them to better control feedback outcomes without having to distort the feedback message. In Study 1, ratee reactions to the packaging of feedback messages were examined in a laboratory experiment. A legitimizing statement included in the feedback message resulted in more positive reactions to the rater than when no such statement was presented. In addition, the use of less personal feedback language resulted in more positive reactions to the rater than when more personal language was used. Neither legitimization statement nor type of language significantly impacted reactions to the feedback message or perceptions of performance, indicating that they did not distort the feedback message. In Study 2, a laboratory observation, the use of more personal language by the rater was related negatively to ratee confidence in rater judgment and rater likability. More research on feedback packaging, with the goal of training raters in how best to convey the feedback message, is needed.

10.
Despite the popularity of frame-of-reference training (FORT), it is not clear how different structural elements of FORT work in concert to improve rating accuracy. Furthermore, past rater training studies have lacked rigorous control groups, leading to low thresholds for showing improvements in rating accuracy due to FORT. The current study allowed for the isolation of components of rater training that increase rating accuracy when compared to a rigorously designed control group. Results indicated that repeated rendering of practice ratings improves rating accuracy and that this practice effect was amplified by practice rating feedback. Although accuracy-based training content improved interrater agreement, it did not contribute to improvements in rating accuracy over and above the control group. We discuss the implications of the findings in relation to best practices for designing rater training programs.

11.
Aims: This study addresses the effects of structured training on the development of coding skills used in psychotherapy process research. Method: Participants included graduate trainees enrolled in an APA-approved Clinical PhD programme. A course outline for training is reviewed and examined in relation to ratings of therapist techniques used during psychotherapy sessions. Results: This structured training protocol for raters resulted in good to excellent levels of interrater reliability. Different groups of raters were compared along multiple factors such as level of graduate training, training received on a particular measure, and psychotherapy experience. Discussion: The implications of these findings for rater training in psychotherapy process research are discussed.

12.
Differential rater functioning (DRF) occurs when raters show evidence of exercising differential severity or leniency when scoring examinees within different subgroups. Previous studies of DRF have examined rater bias using manifest variables (e.g., use of covariates) to determine the subgroups. These manifest variables include gender and the ethnicity of the examinee. For example, a rater may score males more severely. Ideally, each rater’s severity should be invariant across subgroups. This study examines DRF in the context of latent subgroups that classify possible sources of DRF based on raters’ scoring behavior rather than manifest factors. An extension of the latent class signal detection theory (LC-SDT) model for identifying DRF is proposed and examined using real-world data and simulations. Results from real-world data show that the signal detection approach leads to an effective method to identify latent DRF. Simulations with varying sample sizes and conditions of rater precision were shown to recover parameters at an adequate level, supporting its use to identify latent DRF in large-scale data. These findings suggest that the DRF extension of the LC-SDT can be a useful model to examine characteristics of raters and add information that can aid rater training.

13.
The effects of two types of acute physical activity (PA) bouts on young adults’ free-recall and recognition memory were assessed in two experiments, which differed in the temporal relation of PA and word encoding. Before or following training on the Rey Auditory Verbal Learning Task, participants performed a simple two-step dance, a complex four-step dance, or remained seated. Hypotheses proposed that PA prior to encoding and complex PA would enhance PA’s mnemonic benefits. Memory assessed post-PA, 24 h, and 7 days after training indicated that timing and complexity of PA did not impact free-recall or recognition memory. Findings differ from a previous study showing complex PA benefited motor learning more than simple PA (Tomporowski & Pendleton, 2018). The inconsistency may be due to different working memory processes underlying consolidation and retrieval of procedural or episodic information. Theory-based explanations regarding memory storage and retrieval are proposed to elucidate this selective process.

14.
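For tests that take a long time to score, the effect of time on rating accuracy cannot be ignored, so research on rater drift has attracted considerable attention. Building on the graded response multilevel facets model proposed by Kang Chunhua, Sun Xiaojian, and Zeng Pingfei (2016), this study constructs a graded response multilevel rater drift model for detecting rater drift and verifies the model's performance through a simulation study. The results show that the model can accurately estimate item and ability parameters. Moreover, compared with the fixed-effects model, the rater random-effects model detects rater drift effects more effectively and is both more effective and more stable.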

15.
The present study updates Woehr and Huffcutt's (1994) rater training meta-analysis and demonstrates that frame-of-reference (FOR) training is an effective method of improving rating accuracy. The current meta-analysis includes over four times as many studies as included in the Woehr and Huffcutt meta-analysis and also provides a snapshot of current rater training studies. The present meta-analysis also extends the previous meta-analysis by showing that not all operationalizations of accuracy are equally improved by FOR training; Borman's differential accuracy appears to be the most improved by FOR training, along with behavioural accuracy, which provides a snapshot into the cognitive processes of the raters. We also investigate the extent to which FOR training protocols differ, the implications of protocol differences, and whether the criteria of interest to FOR researchers have changed over time.

16.
Parallel analysis (PA; Horn, 1965) is a technique for determining the number of factors to retain in exploratory factor analysis that has been shown to be superior to more widely known methods (Zwick & Velicer, 1986). Despite its merits, PA is not widely used in the psychological literature, probably because the method is unfamiliar and because modern, Windows-compatible software to perform PA is unavailable. We provide a FORTRAN-IMSL program for PA that runs on a PC under Windows; it is interactive and designed to suit the range of problems encountered in most psychological research. Furthermore, we provide sample output from the PA program in the form of tabled values that can be used to verify the program operation, or that can be used either directly or with interpolation to meet specific needs of the researcher.
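
As a rough illustration of the technique named in this abstract, the sketch below shows one common way to run a parallel analysis. It is not the FORTRAN-IMSL program the authors describe. It assumes NumPy is available, compares eigenvalues of the observed correlation matrix with a percentile of eigenvalues from random normal data of the same shape, and counts the leading eigenvalues that exceed that reference. The function name `parallel_analysis` and the parameters `n_sims`, `percentile`, and `seed` are hypothetical choices made for this example.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Illustrative parallel analysis (assumed variant, not the authors' program).

    data: 2-D array, rows are observations and columns are variables.
    Returns (number of factors to retain, observed eigenvalues, random thresholds).
    """
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    # Eigenvalues from uncorrelated random data with the same dimensions.
    random_eigs = np.empty((n_sims, n_vars))
    for i in range(n_sims):
        noise = rng.standard_normal((n_obs, n_vars))
        eigs = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        random_eigs[i] = np.sort(eigs)[::-1]

    # Reference value per eigenvalue position (assumed: 95th percentile).
    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Retain the leading run of eigenvalues that beat the random reference.
    exceeds = observed > threshold
    n_factors = n_vars if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold
```

Using a percentile rather than the mean of the random eigenvalues is a design choice in this sketch; setting `percentile=50` gives a looser criterion closer to a mean-based comparison.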

17.
18.
We examined Work Behavior to knowledge, skill, or ability linkage ratings for 9 jobs to determine the degree to which differences in the ratings were due to rater type. We collected ratings from incumbents and 2 types of job analysts: project job analysts (analysts knowledgeable of the job) and nonproject job analysts (analysts with very little or no knowledge of the job). In our analyses of the data, we calculated means, standard deviations, effect sizes, and correlations for each rater type, as well as compared the reliability of the ratings. We also estimated variance components for each job by conducting generalizability analyses (Brennan, 1983; Shavelson, Webb, & Rowley, 1989). Our findings indicate that the level of linkage ratings is similar across rater types, that it is important to obtain ratings from multiple raters regardless of rater type, and that ratings from job analysts may be more reliable than those of incumbents.

19.
The aim of this study was to evaluate inter-rater reliability when using the Swedish version of the Motivational Interviewing Treatment Code (MITI) as an adjunct to MI training, clinical practice and research. Coders were trained to use the MITI for scoring taped sessions. The 4-month basic training had a duration of 39 hours. Following training, 60 audio-taped live interviews were randomly assigned for MITI coding. Mean intra-class correlation (ICC) coefficients were calculated for 7 coders across all pairs of coders. Cronbach's alpha was calculated to estimate the covariance between each pair across their common interviews. Six months later, a second inter-rater reliability test was performed, when 5 coders coded the same 15 randomly selected tapes. At the second reliability testing the mean ICC was 0.81 and the mean Cronbach's alpha was 0.96. However, the ICC varied for different sub-variables of the MITI, ranging from 0.42 for empathy to 0.79 for number of closed questions. In conclusion, MITI shows promising potential to be a reliable tool to confirm and enhance MI training as well as practice in clinical settings and in evaluating MI integrity in clinical MI research. However, coder assessment of empathy and MI-spirit, “global” variables, requires further refinement.
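
For readers unfamiliar with the statistic reported in this abstract, the sketch below computes Cronbach's alpha with the standard formula alpha = k / (k - 1) * (1 - sum of per-coder variances / variance of row totals), where k is the number of coders. It is an illustration only: the abstract does not give the study's exact computation, and the function name `cronbach_alpha` and the row-per-interview data layout are assumptions made for this example.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores (illustrative layout).

    scores: list of rows, one row per interview, one score per coder in each row.
    """
    n_coders = len(scores[0])
    if n_coders < 2:
        raise ValueError("at least two coders are required")

    # Sample variance of each coder's scores across interviews.
    coder_vars = [variance([row[j] for row in scores]) for j in range(n_coders)]

    # Sample variance of the summed score per interview.
    total_var = variance([sum(row) for row in scores])

    return n_coders / (n_coders - 1) * (1 - sum(coder_vars) / total_var)
```

For a pair of coders, the function would be called with two-column rows such as `cronbach_alpha([[3, 4], [5, 5], [2, 3]])` (made-up scores). Alpha rises as the coders' scores covary more strongly across interviews.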

20.
This study examined the moderating effect of rater nationality on the relationships among ratee task, contextual, and counterproductive behaviors, and rater salary estimates gauging the dollar value of overall job performance. As hypothesized, rater nationality had a significant moderating effect, such that the Lebanese sample showed stronger relationships between each of the three types of job performance and their dollar value estimates than did the American sample. In other words, results indicated that Lebanese participants made stronger salary differentiations among the different levels of task, contextual, and counterproductive performance. These results seem to suggest that Lebanese participants provided salary estimates using more of an equity approach, whereas American participants provided salary estimates using more of an equality approach. Results contribute to the growing evidence that national culture is important in evaluating job performance.
