Similar literature
1.
Objectives

Serial performance evaluations show calibration effects: Judges avoid extreme categories (e.g., best or worst) in the beginning because they need to calibrate an internal judgment scale (Unkelbach et al., 2012). Successful calibration is therefore important for fair and unbiased evaluations. A central prerequisite for successful calibration is knowledge about the performance range. The present study tests whether advance knowledge about the range (best and worst) of performances in a series reduces calibration effects.

Design

A 2 × 2 × 2 design was developed with two between-subject factors: knowledge about the performance range (with vs. without) and type of talent test (specific vs. unspecific). The position of the performances in the series (positions 1–10 vs. 11–20) was included as a within-subject factor. The combination of the between-subject factors resulted in four experimental conditions.

Method

Handball coaches were randomly assigned to one of the conditions. The coaches then evaluated twenty performances in a randomized order.

Results

Without knowledge about the range, coaches showed the expected avoidance of extreme categories in the beginning, independent of the presented talent test. However, observing the best and worst performance in advance prevented the biases. Range presentation is therefore a viable theory-based intervention to improve fairness in serial judgments.

2.
Objectives

In the beginning of serial evaluations, raters assess performances without knowledge about the performances that follow. We assume that judges must observe a certain number of performances to calibrate their judgment scale, leading to systematic biases (avoidance of extreme judgments) in the beginning of judgment series. The present experiment investigates how many performance observations are necessary to calibrate internal judgment scales, leading to consistent judgments.

Design

A between-group design was used.

Method

Videos of a talent test were presented in different orders. Every performance was presented in early, medium, and late positions. Thirty participants rated these series, and we assessed the effect of performance position on performance evaluations.

Results

We found calibration biases within the first nine performances; after that, evaluations remained stable and consistent.

Conclusion

Calibration processes are completed after a specific number of judgments; this implies interesting interventions for consistent judgments independent of the position in a series.

3.

Objectives

Judges avoid extreme judgments in the beginning of evaluation sequences. The calibration hypothesis attributes this bias to judges’ need to preserve their judgmental degrees of freedom. It follows that the expectation of a sequence leads to avoiding extreme judgments in the beginning. Thus, judges may make extreme judgments if they expect only one performance but should avoid extreme judgments if they expect a sequence.

Design

A between-group design was used.

Method

One experimental group (n = 21) expected to judge only one gymnastics performance whereas the other group (n = 20) expected to judge a sequence of performances. Both groups then judged only one identical performance.

Results

Groups differed significantly in the frequency of extreme judgments. Participants expecting one performance used extreme judgment categories more often; participants expecting a series avoided extreme judgments.

Conclusion

The results support calibration processes in sequential judgments. The specification of the underlying process will allow testing possible interventions to avoid serial position biases in serial evaluations in the future.

4.
The effects of a variety of experimental conditions on the judgments (length of lines) of 16 normal and 16 mentally retarded observers were examined using category and magnitude scaling techniques. Using error and variability of judgment as criteria for measuring response bias, for normal subjects knowledge about the stimulus range, whether learned or provided, had as much to do with resulting judgments as the type of scale used. Judgment error of the retarded group was significantly greater than that of the normal group and appeared to be related to their limited ability to assign categories or proportions to the stimuli used.

5.
6.
Research and theory emphasizing the role of cue diagnosticity in judgment (e.g. Skowronski and Carlston, 1987, 1989) suggest that under the proper conditions: (a) negativity effects should be observed in judgments of honesty/dishonesty; (b) positivity effects should be observed in judgments of intelligence/unintelligence; and (c) intelligence-implicative and dishonesty-implicative cues should be increasingly difficult to contradict as those cues become more extreme. Two experiments yielded data consistent with these predictions. In addition, two other important findings emerged from these studies. The results of Experiments 1 and 2 indicated that subjects do not respond as if highly diagnostic cues are sufficient for category membership, suggesting that the representational format of trait categories does not correspond to the format suggested by the ‘classical model’ of categorization. The results of Experiment 2 also indicated that negativity and positivity effects are not substantially altered by a role-playing manipulation designed to increase subjects' involvement in the judgment task.

7.
In the first of a series of three experiments, two groups of 10 subjects judged either the bottom half or the top half of a series of 16 squares according to their size. After two presentations with the pretraining series, 10 trials with the total series followed. The initial judgments of the pretraining stimuli showed a marked tendency to persist all through the experiment, thus demonstrating a primacy effect. The effect was not completely removed by instructions to modify the judgment scale (Experiment 2). In the third experiment, the number of categories was varied. With a larger number of categories the primacy effect tended to wear off after a few postshift trials. The results are interpreted as providing an alternative explanation of the frequency effect and the number-of-categories effect discovered by Parducci. The present paper considers the frequency effect a special case of the primacy effect. The explanation is based on the fact that ordinarily stimulus frequency and time of first occurrence are confounded.

8.
Groups of naive judges rated 18 videotaped stimulus persons on masculinity, femininity, “dominance, assertiveness,” and “compassion, sensitivity to others.” Stimulus persons were broken down by sex and sex-typing—half were male, half female—and within sexes one third were classified as masculine, feminine, and androgynous on the basis of their scores on the Bem Sex Role Inventory. Two experiments are reported in which groups of judges rated stimulus persons on the basis of such different expressive information as videotaped pictures and recorded voices, videotaped pictures alone, videotaped bodies, videotaped heads, recorded voices, and still photos. The results showed: (1) judges reliably rated masculinity-femininity from largely expressive cues; (2) judgments of masculinity-femininity were not predominantly determined by judgments of sex role-related traits; (3) the naive judgment of masculinity-femininity significantly corresponded to stimulus subjects' assessed sex roles; (4) stimulus subjects (particularly males) showed a consistent display of masculinity-femininity across expressive channels; and (5) judges used different expressive cues in judging masculinity-femininity in males and females. These results are related to broader questions concerning the relation between expressive behavior and personality.

9.
We found that the depth of sequential effects depends on the judgment task. An experiment with squares indicated that stimulus-response pairs up to two trials back were included in the judgment process when subjects were required to make category judgments of size, whereas only the immediately preceding event was incorporated when subjects were making magnitude estimations. In the case of category judgment, interactions between the current stimulus and prior stimuli as well as configural effects indicated that events one and two trials back meet an equivalent function in the judgment process and that these events may jointly operate in one trial. These findings can be explained by a class of models that assume that the position of preceding stimuli relative to the current stimulus is decisive in the judgment process. The multiple-standards model is a representative of this class according to which there are two types of standards: (1) the endpoints of the range as long-term standards and (2) traces of preceding stimuli as short-term standards.

11.
The present investigation was based on the concept of invariance, which holds that identical principles govern the judgment of stimuli arrayed on both physically and socially defined scales. Two experiments were conducted in which involvement was manipulated through the use of instructions presented in conjunction with category judgments obtained in a training session. This was followed by an anchor session. Experiment 1 employed a series of weights as stimuli, while Experiment 2 used random pattern dot slides. The experiments were similar, except for the inclusion of two additional features in the weight study. These were: (1) positive feedback, introduced as an independent variable between the judgment sessions and tested for its effect as an enhancer of involvement; and (2) two stimuli in the anchor session not part of the original stimulus series. These weights provided a test for the generality of involvement set. Among various hypotheses tested in both studies was the expectancy that the introduction of involvement associated with the formation of a judgment scale would lessen the impact of an anchor on judgment. This expectancy was based on the observed effects of involvement in social judgment contexts. Results of both experiments supported the judgment maintenance hypothesis.

12.
Category representations and their implications for category structure   Cited by: 1 (self-citations: 0, citations by others: 1)
In a series of experiments and reanalyses of previous research, we tested the hypothesis that categories that are primarily represented by extrinsic features (i.e., those that are relations between two or more entities) would yield more graded structures than would categories primarily represented by intrinsic features (i.e., those features true of an item considered in isolation). These predictions were confirmed. Extrinsically represented categories showed (1) less agreement across subjects on membership judgments, (2) more graded membership in a membership judgment task, and (3) smaller differences between gradients of typicality and of membership judgments.

13.
To the extent that categories inform judgments about items, the accuracy with which categories capture the statistical structure of experience should affect judgment accuracy. The authors argue that representations of feature correlations can serve as Bayesian priors, increasing the accuracy of stimulus estimates by decreasing variability. Participants viewed a series of objects that varied on two dimensions that were either uncorrelated or correlated. They estimated each item by manipulating a response object to make it match the presented stimulus. Subsequent classification and feature-inference tasks indicated that the correlation was detected. The pattern of variability in recollections of stimuli suggested that the feature correlation informed estimates as predicted by a Bayesian model of category effects on memory. This work was supported by NIMH Grant 1 F31 MH12072-01A1 to the first author.

14.
The accuracy of confidence judgments can be determined using measures of discrimination and calibration. The present paper utilizes a new assessment methodology that decomposes the confidence assessment task, allowing us to investigate discrimination and calibration skills in greater depth than has been done in previous studies. Researchers investigating the goodness of confidence judgments have typically grouped forecasters' assessments into experimenter-defined categories, generally in equal widths of .10. In the present research, subjects created their own categories and later assigned confidence judgments to the categories, separating the tasks of discriminating categories (discrimination) and assigning numbers to categories (calibration). Further, the typical assessment procedure assumes that subjects are able to discriminate equally across the confidence scale. Since subjects in the present study defined their own assessment categories, they could locate those categories at any point on the scale. A final issue of interest was whether subjects were able to determine accurately the number of categories into which they could discriminate. Sixty subjects performed 1 of 2 tasks, general knowledge or forecasting, in both relatively easy and relatively hard conditions. Results showed a trade-off in performance: Calibration generally became worse as the number of categories increased, while discrimination generally improved. Overall accuracy was not affected by the number of categories used. Further, subjects partitioned categories more at the high end of the scale. Finally, measures showed that subjects were not accurate in their beliefs about their own discrimination ability.

15.
Judges were asked to evaluate the overall performance of hypothetical students, given their scores on two examinations. The distribution of total scores was manipulated in order to investigate the loci of contextual effects. The interaction between the two exams was reversed by manipulation of the distribution. When the distribution of total scores was positively skewed, judgments showed a convergent interaction as a function of the two exams; when the distribution was negatively skewed, the interaction was divergent. The data were consistent with the hypothesis that the distribution of total scores affects only the transformation from integrated impressions to overt responses. This transformation (judgment function) was well-described by an extension of range-frequency theory. The finding that the interaction can be manipulated by changing the stimulus distribution has methodological implications for the popular interpretation of interactions or lack thereof. A good model may be improperly rejected or a bad one improperly retained through lack of attention to contextual effects.

16.
17.
Two experiments are reported that test the hypothesis that the serial position effect in comparative judgment of ordinal position in arbitrary serial lists results from differential memory or associative strength among list items. The serial position effect in comparative judgment is typically a pattern in which pairs that contain a term from one of the two extremes of the list are processed faster and more accurately than pairs that contain no end terms. The experiments show that a new term added to either the end or the middle of a well-practiced four-term series behaves almost immediately like the end or central term, respectively, of a well-practiced five-term series. Furthermore, when the added term is removed, the list reverts immediately to the position effect obtained in a four-term series. Theories that explain the position effect by differential build-up of item strength or of interitem associative strength over practice cannot explain these effects. We propose instead that learning of a serial list is accomplished by assigning list members to positions in a general-purpose linear order schema and that subjects can make these assignments rapidly and flexibly.

18.
The reliability of subjects’ judgments of the groups present in dot patterns and the sensitivity of those judgments to stimulus transformation were assessed. The subjects indicated the groups that they saw within random dot patterns, and each judgment was compared with those of other subjects and with their own judgments for related presentations. Within subjects, each pattern appeared in an initial presentation, an identical repetition, and a transformed state (a rotation or a change in scale). Within-subjects judgments were more reliable than between-subjects judgments. An interpretation of within-subjects results was made in relation to predictions made by a formal algorithm of grouping by proximity (the CODE algorithm), which assumes that grouping by proximity is invariant over transformations such as rotations or changes in scale. A slight cost to transforming the patterns was found. The implications for CODE and for using grouping judgments as data are discussed.

19.
Confidence-accuracy calibration was examined for both absolute (recognizing single faces as old or new) and relative (selecting which of pairs of faces is old) judgments, using both full- (0%-100%) and half-range (50%-100%) confidence scales. The half-range confidence scale demonstrated superior calibration to the full-range scale, for which a confidence-accuracy association was evident only for the upper half (i.e., 50%-100%) of the scale. Good calibration was observed for the absolute judgment conditions, but the relative judgment conditions evidenced marked underconfidence. Also, in the absolute judgment conditions, good calibration for positive recognition decisions and poorer calibration for negative decisions was observed. These results are discussed in the context of theories of confidence and accuracy in face recognition memory and also of eyewitness identification research.

20.
I examined 2 different views on organization of linear order information in memory: the linear schema view and the hierarchical structure view. The linear schema view holds that there is a strong tendency in memory to organize an array of transitively related elements into a unidimensional order. The hierarchical structure view maintains that the transitively related elements are organized in memory in a hierarchy of items. I proposed an input structure and retrieval compatibility hypothesis as a coherent explanation for the contradictory views on the memory organization of serial elements. I argue that the organization of linear order information in memory is determined by the nature of the memory retrieval task used. For example, a comparative judgment task is more compatible with a unidimensional structure, whereas a sequential recall or serial position identification task is more compatible with a hierarchical organization. The input structure and retrieval compatibility concept can also explain why dichotomous categories imposed on a linear order play very little role in comparative judgments.
