Similar articles
1.
Interval by interval reliability has been criticized for "inflating" observer agreement when target behavior rates are very low or very high. Scored interval reliability and its converse, unscored interval reliability, however, vary as target behavior rates vary when observer disagreement rates are constant. These problems, along with the existence of "chance" values of each reliability which also vary as a function of response rate, may cause researchers and consumers difficulty in interpreting observer agreement measures. Because each of these reliabilities essentially compares observer disagreements to a different base, it is suggested that the disagreement rate itself be the first measure of agreement examined, and its magnitude relative to occurrence and to nonoccurrence agreements then be considered. This is easily done via a graphic presentation of the disagreement range as a bandwidth around reported rates of target behavior. Such a graphic presentation summarizes all the information collected during reliability assessments and permits visual determination of each of the three reliabilities. In addition, graphing the "chance" disagreement range around the bandwidth permits easy determination of whether or not true observer agreement has likely been demonstrated. Finally, the limits of the disagreement bandwidth help assess the believability of claimed experimental effects: those leaving no overlap between disagreement ranges are probably believable, others are not.
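The three reliabilities named in this abstract can be illustrated with a short sketch. It assumes the usual textbook definitions (agreements divided by agreements plus disagreements, computed over all intervals, over intervals at least one observer scored, and over intervals at least one observer left unscored); the function name and the boolean input format are hypothetical and do not come from the article.

```python
def interval_reliabilities(obs_a, obs_b):
    """Return (interval-by-interval, scored-interval, unscored-interval) agreement.

    obs_a, obs_b: equal-length sequences of booleans, where True means the
    observer scored the target behavior in that interval (assumed format).
    A ratio is None when its denominator is zero.
    """
    if len(obs_a) != len(obs_b) or len(obs_a) == 0:
        raise ValueError("both observers must rate the same non-empty set of intervals")

    total = len(obs_a)
    both = sum(1 for a, b in zip(obs_a, obs_b) if a and b)             # occurrence agreements
    neither = sum(1 for a, b in zip(obs_a, obs_b) if not a and not b)  # nonoccurrence agreements
    disagree = total - both - neither                                   # disagreements

    overall = (both + neither) / total
    scored = both / (both + disagree) if (both + disagree) else None
    unscored = neither / (neither + disagree) if (neither + disagree) else None
    return overall, scored, unscored


# Low-rate behavior: one disagreement in ten intervals.
a = [True, True] + [False] * 8
b = [True, False] + [False] * 8
overall, scored, unscored = interval_reliabilities(a, b)
```

With a single disagreement in ten intervals, the interval-by-interval figure stays high while the scored-interval figure drops sharply. The three ratios compare the same disagreement count to different bases, which is the point the abstract makes.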

2.
3.
In two experiments, 3- and 4-year-olds were tested for their sensitivity to agreement and disagreement among informants. In pretest trials, they watched as three of four informants (Experiment 1) or two of three informants (Experiment 2) indicated the same referent for an unfamiliar label; the remaining informant was a lone dissenter who indicated a different referent. Asked for their own judgment, the preschoolers sided with the majority rather than the dissenter. In subsequent test trials, one member of the majority and the dissenter remained present and continued to provide conflicting information about the names of unfamiliar objects. Children remained mistrustful of the dissenter. They preferred to seek and endorse information from the informant who had belonged to the majority. The implications and scope of children's early sensitivity to group consensus are discussed.

4.
The rater agreement literature is complicated by the fact that it must accommodate at least two different properties of rating data: the number of raters (two versus more than two) and the rating scale level (nominal versus metric). While kappa statistics are most widely used for nominal scales, intraclass correlation coefficients have been preferred for metric scales. In this paper, we suggest a dispersion-weighted kappa framework for multiple raters that integrates some important agreement statistics by using familiar dispersion indices as weights for expressing disagreement. These weights are applied to ratings identifying cells in the traditional inter-judge contingency table. Novel agreement statistics can be obtained by applying less familiar indices of dispersion in the same way. This revised article was published online in August 2005 with the PDF paginated correctly.

5.
Two sources of variability must each be considered when examining change in level between two sets of data obtained by human observers; namely, variance within data sets (phases) and variability attributed to each data point (reliability). Birkimer and Brown (1979a, 1979b) have suggested that both chance levels and disagreement bands be considered in examining observer reliability and have made both methods more accessible to researchers. By clarifying and extending Birkimer and Brown's papers, a system is developed using observer agreement to determine the data point variability and thus to check the adequacy of obtained data within the experimental context.

6.
This study compared a group decision support system (GDSS) with face-to-face (FTF) group discussion on characteristics of information exchange and decision quality. Participants given conflicting information tended to share more of their unique data and engaged in more critical argumentation when using the GDSS than when meeting FTF. Conversely, when information was consistent among members, there were no such differences between FTF and GDSS groups. The GDSS groups significantly outperformed the FTF groups in agreeing on the superior hidden profile candidate, especially when there was a lack of prediscussion consensus. Individual-level analyses revealed that members of GDSS groups that did not have a prediscussion consensus tended to experience stronger preference shifts toward the group's consensus decision.

7.
In this study, groups who could not reach a consensus were investigated using the group polarization paradigm. The purpose was to explore the conditions leading to intragroup disagreement and attitude change following disagreement among 269 participants. Analysis indicated that the probability of consensus was low when the group means differed from the grand mean of the entire sample. When small differences among group members were found, depolarization (reverse direction of polarization) followed disagreement. These results suggested the groups which deviated most from the population tendency were the most likely to cause within-group disagreement, while within-group variances determined the direction of attitude change following disagreement within the group.

8.
When sequences of discrete events, or other units, are independently coded by two coders using a set of mutually exclusive and exhaustive codes, but the onset times for the codes are not preserved, it is often unclear how pairs of protocols should be aligned. Yet such alignment is required before Cohen’s kappa, a common agreement statistic, can be computed. Here we describe a method, based on the Needleman and Wunsch (1970) algorithm originally devised for aligning nucleotide sequences, for optimally aligning such sequences; we also offer the results of a simulation study of the behavior of alignment kappa with a number of variables, including number of codes, varying degrees of observer accuracy, sequence length, code variability, and parameters governing the alignment algorithm. We conclude that (1) under most reasonable circumstances, observer accuracies of 90% or better result in alignment kappas of .60 or better; (2) generally, alignment kappas are not strongly affected by sequence length, the number of codes, or the variability in the codes’ probability; (3) alignment kappas are adversely affected when missed events and false alarms are possible; and (4) cost matrices and priority orders used in the algorithm should favor substitutions (i.e., disagreements) over insertions and deletions (i.e., missed events and false alarms). Two computer programs were developed: Global Sequence Alignment, or GSA, for carrying out the simulation study, and Event Alignment, or ELign, a user-oriented program that computes alignment kappa and provides the optimal alignment given a pair of event sequences.
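The alignment step described in this abstract can be sketched as a global alignment by dynamic programming in the style of Needleman and Wunsch, written here in a cost-minimizing form. This is not the GSA or ELign code: the function name and the cost values (substitution 1, gap 2, chosen only so that substitutions are favored over insertions and deletions, as conclusion (4) recommends) are assumptions for illustration.

```python
def align_codes(seq_a, seq_b, sub_cost=1, gap_cost=2):
    """Globally align two code sequences at minimal total cost.

    Returns a list of (code_a, code_b) pairs; None marks a gap, i.e. an
    event that one coder recorded and the other did not.
    The default costs are illustrative assumptions, not published values.
    """
    n, m = len(seq_a), len(seq_b)

    def step(i, j):
        # Cost of pairing seq_a[i-1] with seq_b[j-1]: free if the codes match.
        return 0 if seq_a[i - 1] == seq_b[j - 1] else sub_cost

    # cost[i][j] = minimal cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_cost
    for j in range(1, m + 1):
        cost[0][j] = j * gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + step(i, j),  # agreement or substitution
                cost[i - 1][j] + gap_cost,        # code present only in seq_a
                cost[i][j - 1] + gap_cost,        # code present only in seq_b
            )

    # Trace back from the bottom-right corner, preferring the diagonal move.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + step(i, j):
            pairs.append((seq_a[i - 1], seq_b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + gap_cost:
            pairs.append((seq_a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, seq_b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


aligned = align_codes("ABCD", "ABD")
```

The aligned pairs could then be tallied into an agreement table before computing kappa. How gaps enter that table is a design decision this sketch leaves open.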

9.
This paper seeks to refute one variant of a view that scientific disciplines are intrinsically more objective than non‐scientific ones, and that this greater objectivity explains increasing social agreement about the findings of science, by contrast with increasing disagreement about the findings of, e.g., ethics. Such a view rests on the implicit assumption that all forms of discourse aim equally at the generation of consensus; instead, differing degrees of consensus in different disciplines are often explicable by sociological, not metaphysical, differences in the disciplines concerned. A detailed example is presented of a discipline (Indian folk dietary medicine) in which considerable lack of consensus is observed, for sociologically explicable reasons, in spite of its claims to scientific objectivity. It is concluded that disciplines may differ in the degree of truth of the claims advanced in them, and in the importance of consensus among their social aims. But neither of these is to be explained by differences in respect of some independent property of objectivity.

10.
AIMS. To develop a new protocol for the assessment of action observation (AO) abilities and imitation of meaningful and non-meaningful gestures, and to examine its psychometric properties in children with DCD and typically developing (TD) children. BACKGROUND. For learning manual skills, AO and imitation are considered fundamental abilities. Knowledge about these modalities in children with DCD is scarce and an assessment protocol is lacking. METHOD. The protocol consists of 2 tests. The AO test consists of two assembly tasks. The imitation test includes 12 meaningful and 20 non-meaningful gestures. Items of both tests are rated on a 4-point scale. Twelve children with DCD (mean age 8y3m, SD 1.30) and 11 TD children (mean age 8y2m, SD 1.52) were enrolled. For inter-rater reliability, intraclass correlation coefficients (ICC) were calculated for the total score, and weighted kappa and percentage agreement for single items. Known group validity was assessed by comparison of the DCD and TD groups (Wilcoxon rank sum test). For construct validity, the mABC-2 test was used. The protocol was adapted and confirmed by an intra- and inter-rater reliability study (new sample of 11 DCD children, mean age 7y5m, SD 1.37). RESULTS. Excellent ICCs were reported for intra- and inter-rater reliability for the final protocol. A significant difference between the DCD and TD groups was found for AO abilities (p < .01) and for non-meaningful gestures (p < .001). A significant correlation was reported between the AO test and the mABC-2 test (r = 56; p ≤ 0.0001). No significant correlations were revealed for the imitation tests. DISCUSSION AND CONCLUSION. The results support the psychometric properties of this protocol. When fully validated, it may help to map the deficits in AO abilities and imitation and to evaluate treatment effects of imitation and AO interventions.

11.
When determining interrater reliability for scoring the Rorschach Comprehensive System (Exner, 1993), researchers often report coding agreement for response segments (i.e., Location, Developmental Quality, Determinants, etc.). Currently, however, it is difficult to calculate kappa coefficients for these segments because it is tedious to generate the chance agreement rates required for kappa computations. This study facilitated kappa calculations for response segments by developing and validating formulas to estimate chance agreement. Formulas were developed for 11 segments using 400 samples, cross-validated on 100 samples, and applied to the data from 5 reliability studies. On cross-validation, the validity of the prediction formulas ranged from .93 to 1.0 (M = .98). In the 5 reliability studies, the average difference between estimated and actual chance agreement rates was .00048 and the average difference between estimated and actual kappa values was .00011 (maximum = .0052). Thus, the regression formulas quite accurately predicted chance agreement rates and kappa coefficients for response segments.
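For orientation, the sketch below shows how chance agreement feeds into kappa in the standard two-coder case, where chance agreement is taken from each coder's marginal code frequencies. It does not reproduce the study's regression formulas, which are not given in the abstract; the function name and the example codes are hypothetical.

```python
from collections import Counter


def kappa_with_chance(codes_a, codes_b):
    """Return (observed agreement, chance agreement, Cohen's kappa).

    codes_a, codes_b: equal-length sequences of nominal codes, one per
    response. Chance agreement is the sum over codes of the product of the
    two coders' marginal proportions (the standard definition, not the
    estimation formulas developed in the study).
    """
    n = len(codes_a)
    if n == 0 or n != len(codes_b):
        raise ValueError("both coders must code the same non-empty set of responses")

    observed = sum(1 for a, b in zip(codes_a, codes_b) if a == b) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    chance = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if chance == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - chance) / (1 - chance)
    return observed, chance, kappa


# Hypothetical codes for four responses.
observed, chance, kappa = kappa_with_chance(["W", "W", "D", "D"], ["W", "D", "D", "D"])
```

Generating the chance term this way requires the full cross-tabulation of both coders' codes for every segment, which is the tedium the study's estimation formulas were meant to avoid.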

12.
As L. Festinger (1957) argued, the social group is a source of cognitive dissonance as well as a vehicle for reducing it. That is, disagreement from others in a group generates dissonance, and subsequent movement toward group consensus reduces this negative tension. The authors conducted 3 studies to demonstrate group-induced dissonance. In the first, students in a group with others who ostensibly disagreed with them experienced greater dissonance discomfort than those in a group with others who agreed. Study 2 demonstrated that standard moderators of dissonance in past research (lack of choice and opportunity to self-affirm) similarly reduced dissonance discomfort generated by group disagreement. In Study 3, the dissonance induced by group disagreement was reduced through a variety of interpersonal strategies to achieve consensus, including persuading others, changing one's own position, and joining an attitudinally congenial group.

13.
Lyle Crawford. Ratio, 2013, 26(3): 250-264
The simulation hypothesis claims that the whole observable universe, including us, is a computer simulation implemented by technologically advanced beings for an unknown purpose. The simulation argument (as I reconstruct it) is an argument for this hypothesis with moderately plausible premises. I develop two lines of objection to the simulation argument. The first takes the form of a structurally similar argument for a conflicting conclusion, the claim that I am a so‐called freak observer, formed spontaneously in a quantum or thermodynamic fluctuation rather than through ordinary processes of evolution and growth. The second rejects the basic line of reasoning of both arguments: the sort of evidence they cite is not capable of supporting either the claim that I am a simulant or the claim that I am a freak observer. The evidence that simulants or freak observers exist is not a reason to think that I am one of them.

14.
Despite widespread agreement that multi-method assessments are optimal in personality research, the literature is dominated by a single method: self-reports. This pattern seems to be based, at least in part, on widely held preconceptions about the costs of non-self-report methods, such as informant methods. Researchers seem to believe that informant methods are: (a) time-consuming, (b) expensive, (c) ineffective (i.e., informants will not cooperate), and (d) particularly vulnerable to faking or invalid responses. This article evaluates the validity of these preconceptions in light of recent advances in Internet technology, and proposes some strategies for making informant methods more effective. Drawing on data from three separate studies, I demonstrate that, using these strategies, informant reports can be collected with minimal effort and few monetary costs. In addition, informants are generally very willing to cooperate (e.g., response rates of 76–95%) and provide valid data (in terms of strong consensus and self-other agreement). Informant reports represent a mostly untapped resource that researchers can use to improve the validity of personality assessments and to address new questions that cannot be examined with self-reports alone.

15.
When two (or more) observers are independently categorizing a set of observations, Cohen’s kappa has become the most notable measure of interobserver agreement. When the categories are ordinal, a weighted form of kappa becomes desirable. The two most popular weighting schemes are the quadratic weights and linear weights. Quadratic weights have been justified by the fact that the corresponding weighted kappa is asymptotically equivalent to an intraclass correlation coefficient. This paper deals with linear weights and shows that the corresponding weighted kappa is equivalent to the unweighted kappa when cumulative probabilities are substituted for probabilities. A numerical example is provided.
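The two weighting schemes mentioned here can be compared with a minimal sketch of weighted kappa for two observers. It uses the common disagreement-weight form, with weight |i - j| / (k - 1) for linear and its square for quadratic. The function name, the integer coding of categories, and the example ratings are assumptions, and the sketch does not implement the cumulative-probability result the paper derives.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories, scheme="linear"):
    """Weighted kappa for two observers rating on an ordinal scale.

    ratings_a, ratings_b: equal-length sequences of integers in
    0 .. n_categories - 1 (assumed coding of the ordinal categories).
    scheme: "linear" or "quadratic" disagreement weights.
    """
    n, k = len(ratings_a), n_categories
    if n == 0 or n != len(ratings_b) or k < 2:
        raise ValueError("need matching non-empty ratings and at least two categories")
    power = {"linear": 1, "quadratic": 2}[scheme]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the two extreme categories.
        return (abs(i - j) / (k - 1)) ** power

    # Cross-tabulate the paired ratings and take the marginal counts.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        counts[a][b] += 1
    row = [sum(counts[i]) for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    observed = sum(weight(i, j) * counts[i][j] for i in range(k) for j in range(k)) / n
    expected = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k)) / (n * n)
    if expected == 0:
        raise ValueError("weighted kappa is undefined when expected disagreement is 0")
    return 1 - observed / expected


# Hypothetical ratings of six observations on a 3-point ordinal scale.
a = [0, 0, 1, 1, 2, 2]
b = [0, 1, 1, 1, 2, 0]
linear = weighted_kappa(a, b, 3, "linear")
quadratic = weighted_kappa(a, b, 3, "quadratic")
```

The two schemes differ only in how steeply the penalty grows with the distance between categories, so the same data can yield noticeably different kappa values.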

16.
The research published in the Journal of Applied Behavior Analysis (1968 to 1975) was surveyed for three basic elements: data-collection methods, reliability procedures, and reliability scores. Three-quarters of the studies reported observational data. Most of these studies' observational methods were variations of event recording, trial scoring, interval recording, or time-sample recording. Almost all studies reported assessment of observer reliability, usually total or point-by-point percentage agreement scores. About half the agreement scores were consistently above 90%. Less than one-quarter of the studies reported that reliability was assessed at least once per condition.

17.
This is a normative study with 409 adult nonpatients living in the state of Sao Paulo, Brazil. The Rorschach was administered by a team of nine psychologists; eight had received further training in the Rorschach method from the Brazilian Rorschach Society, and one was intensively prepared by the project coordinator. Of the study participants, 200 lived in the state capital (Sao Paulo) and the other 209 were in other large and small cities in the state, including a coastal city and one in the mountains. Previous psychological or psychiatric treatments were criteria for exclusion. Each protocol was coded independently by two examiners, and then agreement of the two codings was checked. Differences between the two codings were discussed in a meeting of the whole team, which was supervised by the project coordinator to guarantee codification quality control. Upon completion of the codings, an analysis of examiner differences was undertaken, the results of which are in the text. Interrater reliability statistics among examiners were calculated, including percentage of agreement and kappa. Reliability statistics among examiners at the response level are presented, as are Comprehensive System (CS; 1999, 2003) findings.

18.
19.
Clients' resistance relates negatively to their retention and outcomes in psychotherapy; thus, it has been increasingly identified as a key process marker in both research and practice. This study compared therapists' postsession ratings of resistance with those of trained observers in the context of 40 therapist–client dyads receiving 15 sessions of cognitive-behavioral therapy for generalized anxiety disorder. Therapist and observer ratings were then examined as correlates of proximal (therapeutic alliance quality and homework compliance) and distal (posttreatment worry severity) outcomes. Although there was reasonable concordance between rater perspectives, observer ratings were highly and consistently related to both proximal and distal outcomes, while therapist ratings were not. These findings underscore the need to enhance therapists' proficiency in identifying important and often covert in-session clinical phenomena such as the cues reflecting resistance and noncollaboration.

20.
The study investigated the level of agreement among graduate students (N = 14) and school psychologists (N = 18) in scoring drawings for the 10 designs on the WPPSI Geometric Design subtest. Considerable scoring disagreement occurred within each group. Unanimous agreement was found for only 11 out of 50 drawing items among the graduate students and for only 7 out of 50 drawing items among the school psychologists. While the raters were generally confident of their ratings, there also was a significant positive relationship between level of scoring agreement and confidence ratings (rho = .76, p < .05). Scoring disagreement was greater for the drawings on designs 6 through 9 than on other designs. The results suggest that careful study of the WPPSI scoring criteria is needed in order to achieve scoring proficiency.
