Two laboratory studies were conducted to assess the role of behavioral data in the formation of appraisal ratings. In the first study, which included 200 participants, a split-plot factorial analysis of variance yielded significant main effects for performance variability, performance level, and ratee order. A central drift phenomenon was also discovered: ratees whose performance profiles were both similar and extreme (extremely high or extremely low) received ratings that drifted toward average as a function of rating queue position. A second study, which included 126 participants, replicated this drift phenomenon and isolated the cause as profile similarity rather than profile extremity. Implications of these findings for rating accuracy and future "true score" research designs are discussed.

The present study demonstrated that the presence of evaluatively polarized context performances not only produces contrast and halo effects on judgments of a target performance, but also causes judgments to be made much faster. Processing speed and positive halo were highly correlated, supporting the notion that halo in performance ratings results from raters' recall and use of a single, general impression. Furthermore, regression analyses demonstrated that processing speed mediates the relationship between context and halo. The relationship between these findings, halo, processing speed, and general impressions, as well as implications for performance appraisals, are discussed.



The purpose of this study was to take an inductive approach in examining the extent to which organizational contexts represent significant sources of variance in supervisor performance ratings, and to explore various factors that may explain contextual rating variability.


Using archival field performance rating data from a large state law enforcement organization, we applied a multilevel modeling approach to partition the variance in ratings attributable to ratees, raters, and rating contexts.


Results suggest that much of what is often interpreted as idiosyncratic rater variance may actually reflect systematic rating variability across contexts. In addition, performance-related and non-performance factors, including contextual rating tendencies, accounted for significant rating variability.


Supervisor ratings represent the most common approach for measuring job performance, and understanding the nature and sources of rating variability is important for research and practice. Given the many uses of performance rating data, our findings suggest that continuing to identify contextual sources of variability is particularly important for addressing criterion problems, and improving ratings as a form of performance measurement.


Numerous performance appraisal models suggest the importance of context; however, previous research had not partitioned the variance in supervisor ratings due to omnibus context effects in organizational settings. The use of a multilevel modeling approach allowed the examination of contextual influences, while controlling for ratee and rater characteristics.

Reducing the Effects of Performance Expectations on Behavioral Ratings
In this study, we develop and test two strategies for reducing the effects of performance expectations on behavioral ratings. A 3 × 3 experimental design (N = 169), manipulating preobservation performance cues (positive, negative, or none) and the type of intervention (halo error training, structured recall memory, or none), was conducted. The results of this study indicate that both interventions reduced the effects of performance expectations on behavioral ratings. However, analyses of rating accuracy and measures from both recognition memory and recall memory tests suggest that the structured recall memory intervention has distinct advantages. These analyses indicate that the structured recall intervention can reduce raters' reliance on heuristics and increase the correspondence between raters' memory and their subsequent ratings.


Recent trends indicate that organizations will continue their strategic pursuit of teamwork for the foreseeable future, which will create a need for accurate assessments of individuals’ performance in teams. Although individual behaviors can be perceived and assessed by fellow team members (i.e., peers), the extent to which the team shapes perceivers’ judgments versus the target’s behavior is unclear. We conducted two studies to understand how and why team context influences peer ratings of individual performance. In study 1, we conducted cross-classified modeling on a sample of 7160 performance observations of 568 targets made by 567 perceivers, who were each members of four separate teams. Results indicated that team membership accounted for a substantially higher proportion of perceiver, relative to target, variance. In study 2, we conducted social relations modeling with a sample of 679 performance observations collected from 217 individuals nested in 46 teams to test the effects of psychological safety on perceiver, target, and team variance components. Perceptions of psychological safety accounted for proportionally larger perceiver, relative to target, variance in OCB, and task performance ratings. Altogether, team context appears to affect perceivers’ judgments of behavior more than the target’s behavior itself, implying that peer ratings sourced from different teams may not be comparable. We consider the implications for the collection and interpretation of peer performance ratings in teams and the potential implications for social cognitive theory, such that certain aspects of the team context, including psychological safety, may act as a cognitive heuristic by molding perceiver judgments of targets.


Previous research has demonstrated that performance information (e.g., prior supervisor ratings) can bias behavioral ratings. However, research has not fully explored the effects of performance cues on raters' memory. In addition, no studies have attempted to eliminate this performance cue effect. This study addressed these deficiencies by collecting both free recall and recognition memory measures while testing an unstructured free recall intervention. Results indicate that performance cues do affect the recall of performance relevant behaviors from memory. Contrary to expectations, free recall did not prove to be an effective intervention. Implications of these findings for future attempts to remove the performance cue effect are discussed.

Two studies were conducted to examine discrepancies in the evaluation of men and women regarding the performance of organizational citizenship behavior (OCB). In Study 1, base‐rate differences in the perceived frequency and value of citizenship behaviors performed by males and females were investigated. A gender by job type interaction was found indicating that women were perceived to engage in OCB more frequently than were men in gender‐neutral and male‐typed jobs. No gender differences were found regarding the value associated with citizenship behaviors. In Study 2, undergraduates rated videotaped male and female instructors who exhibited different levels of OCB. Results revealed a gender by OCB interaction such that more accurate behavioral observations were made when observing males exhibiting OCB and females exhibiting no OCB than when observing males who did not exhibit OCB and females who did exhibit OCB. No gender by OCB interactions were found with regard to ratings of overall performance evaluation or reward recommendations.

Racial bias in performance ratings may be inferred when ratings hold different meanings for different racial subgroups. Operationally, this would be indicated by differences (by ratee race) in the correlation between performance ratings and objective indices of performance. In this study, the effects of ratee race on the relations between supervisory ratings and more objective criteria of job knowledge and work performance were examined by aggregating correlations across 25 studies. The results indicated that supervisory ratings were more highly related to work-performance measures (and, to a lesser extent, to job-knowledge measures) for Black than for White ratees. Two theories were proposed that could account for such differences.

The impact of performance extremities (peaks and troughs) on performance ratings was unexamined. Based on judgment and decision-making theories, we hypothesized that performance extremities would exert an incremental (beyond performance mean and trend) impact on ratings and that the impact of troughs would exceed that of peaks. We also hypothesized that extremities would exert a greater impact when performance trends were in the opposite direction and when performance information was presented in a graphical rather than tabular display format. We tested these hypotheses via a policy-capturing study in which participants rated employee performance profiles across which extremities, trends, and mean levels were manipulated. The results consistently indicated that performance troughs, but not performance peaks, influenced performance ratings in expected ways.

The present study examined the impact of attentional and memory demands on work performance ratings accorded men and women in traditionally male jobs. Of interest was whether sex discrimination would abate in the face of individuating and job-relevant work behavior even when the demands likely to be faced in actual work settings were taken into account. Two hundred and two subjects read a vignette depicting the work behavior of a male or female police officer and then rated the individual's work performance. The attentional demands imposed on subjects while reading the vignette and the amount of time elapsed prior to issuing the performance ratings were systematically varied. As predicted, men were evaluated more favorably than women when raters were faced with an additional task requiring attention and time pressures were made salient. Only when subjects were able to carefully allocate all of their attentional resources did sex bias in work performance ratings abate. Memory demands had no effects on work performance ratings. Gender-related work characterizations paralleled the performance ratings, providing support for the idea that sex stereotypes mediate discrimination in performance appraisal judgments. The theoretical and practical implications of these findings, as well as suggestions for future research, are discussed.



This article investigates the efficacy of the Structured Free Recall Intervention (SFRI; J Bus Psychol 15:229–246, 2000a; Organ Behav Hum Decis Process 82:237–267, 2000b) for reducing the impact of bodyweight-based stereotype endorsement on performance ratings, both immediately and when a time delay occurs between the observation and rating of performance.


512 undergraduates participated in a 2 × 2 × 2 between-subjects factorial experiment. A measure of bodyweight-based stereotype endorsement was pre-screened, and participants were randomly assigned to (a) either a no-delay or two-day time delay condition, (b) view either an average bodyweight or overweight ratee, and (c) undergo the SFRI or not.


Results suggest that (a) bodyweight-based stereotype endorsement predicts performance ratings for overweight ratees, (b) the SFRI is effective at reducing the impact of such stereotypes on performance ratings when conducted immediately after the observation of performance, and (c) the SFRI maintains this efficacy after a two-day delay between the observation and rating of performance.


These findings suggest that the best real-world application of the SFRI paradigm may be to situations with minimal delays between the observation and rating of performance, such as selection assessment centers or pre-employment interviews.


Drawing on theories from the cognitive information processing literature, this paper extends previous research regarding the efficacy of the SFRI by demonstrating that short time delays between performance observation and rating (a common organizational phenomenon) have minimal observed effects on the efficacy of the SFRI as a performance rating intervention.

The present paper reports on a study investigating whether the presence of a foreign accent negatively affects credibility judgments. Previous research suggests that trivia statements recorded by speakers with a foreign accent are judged as less credible than when recorded by native speakers due to increased cognitive demands (Lev-Ari and Keysar in J Exp Soc Psychol 46(6):1093–1096, 2010. doi: 10.1016/j.jesp.2010.05.025). In the present study, 194 French- and 183 Swiss-German-speaking participants were asked to judge the truthfulness of 48 trivia statements recorded by speakers with French, Swiss-German, Italian and English accents by means of an online survey. Before submitting the survey, raters were asked to attribute given labels (including adjectives referring to credibility) to a language group, with the aim of eliciting raters’ stereotypes in a direct manner. Although the results of this task indicate that the raters do hold different stereotypes concerning credibility of speech communities, foreign accent does not seem to have an impact on credibility ratings in the Swiss context.

Social cognition theory asserts that perceivers (raters) assign stimulus persons (ratees) to social categories. These categories help the raters encode, store, and recall information. In a longitudinal design that represented a performance appraisal situation, this study examined the effects of information about a ratee's category membership on the amount of information that raters collected about the ratee prior to rating. One hundred fourteen subjects participated in three separate experimental sessions which spanned a 3-week time period. Among other tasks, subjects were required to rate a subordinate who was described in a manner which made it either difficult or easy to assign the subordinate to a social category. It was predicted and found that raters of ratees who were easily categorized spent less time observing the ratees' performance than raters of ratees who were less easily classified. Furthermore, results indicated that it was the effect of rater categorization on observation time that was critical to rating accuracy.

Shintaro Okazaki, Sex Roles, 2007, 57(11-12): 897-908
This study examines how gender affects mobile advertising acceptance in Japan. Drawing upon cultural, socioeconomic, and industry-specific factors, five hypotheses and two research questions are formulated for four dependent variables (trust, attitude toward the ad, attitude toward the brand, and ad recall) and two independent variables (gender and ad type). User frequency was considered a covariate. An empirical survey was conducted in Japan: Forty thousand respondents were randomly selected, and 3,254 responses were received. Two mobile campaigns (one durable and one nondurable good) were used as stimuli. Multivariate data analysis found significant multivariate effects as well as univariate effects. There was a significant interaction effect of gender and ad type on ad recall. In closing, the study’s implications are discussed.

The present study extended research on contrast effects by (a) examining the effect of context performances on ratings of a target performance when a prior impression of the target performer already exists, and (b) clarifying the issue of whether contrast is caused by attention to context-discrepant behavior or shifts in judgment standards. The results demonstrated that the existence of a prior impression mitigates the influence of context performances on ratings. Judgment standards were found to be unstable and dependent on information provided to raters by the experimental manipulations. Regression analyses showed that both attention and standards of judgment mediate the relationship between context and ratings. Implications of these findings for contrast effects, performance ratings, and the importance of reliable judgment standards for real-world performance appraisals are discussed.

Much of the prior research investigating the influence of cultural values on performance ratings has focused either on conducting cross-national comparisons among raters or on using cultural-level individualism/collectivism scales to measure the effects of cultural values on performance ratings. Recent research has shown that there is considerable within-country variation in cultural values; that is, people in one country can be more individualistic or more collectivistic in nature. Taking the latter perspective, the present study used Markus and Kitayama's (1991) conceptualization of independent and interdependent self-construals as measures of individual variations in cultural values to investigate within-culture variations in performance ratings. Results suggest that rater self-construal has a significant influence on overall performance evaluations; specifically, raters with a highly interdependent self-construal tend to show a preference for interdependent ratees, whereas raters high on independent self-construal do not show a preference for a specific type of ratee when making overall performance evaluations. Although rater self-construal significantly influenced overall performance evaluations, no such effects were observed for specific dimension ratings. Implications of these results for performance appraisal research and practice are discussed.

This study examined 2 possible ways of increasing the predictive validity of personality measures: using observer (i.e., supervisor and coworker) ratings and work‐specific self‐ratings of Big Five personality factors. Results indicated that among general self‐ratings of Big Five personality dimensions, Conscientiousness was the best predictor of in‐role performance, and Agreeableness and Emotional Stability were the best predictors of organizational citizenship behavior (OCB). Observer ratings of personality accounted for incremental variance in job performance (in‐role performance and OCB) beyond that accounted for by general self‐ratings. However, contrary to our expectations, work‐specific (i.e., contextual) self‐ratings of personality generally did not account for incremental variance in job performance beyond that accounted for by general self‐ratings.

The effect of congruity between the involvement type of an advertising commercial and that of a television program on the effectiveness of the commercial was studied. Participants (N = 103) viewed either a cognitive or an affective commercial for a product, which was embedded in either a cognitive or an affective television program. The results showed that program-commercial congruity significantly influenced memory for the commercial, as measured by both free recall and cued recall. Free recall and cued recall were significantly higher for the cognitively involving commercial in the cognitively involving program context than in the affectively involving program context. Similarly, free recall and cued recall were significantly higher for the affectively involving commercial in the affectively involving program context than in the cognitively involving program context.

