
1.

Back to basics: Percentage agreement measures are adequate, but there are easier ways

Birkimer JC Brown JH 《Journal of applied behavior analysis》1979,12(4):535-543

Percentage agreement measures of interobserver agreement or "reliability" have traditionally been used to summarize observer agreement from studies using interval recording, time-sampling, and trial-scoring data collection procedures. Recent articles disagree on whether to continue using these percentage agreement measures, on which ones to use, and on what to do about chance agreements if their use is continued. Much of the disagreement derives from the need to be reasonably certain we do not accept as evidence of true interobserver agreement those agreement levels which are substantially probable as a result of chance observer agreement. The various percentage agreement measures are shown to be adequate to this task, but easier ways are discussed. Tables are given for checking whether obtained disagreements are unlikely to be due to chance. Particularly important is the discovery of a simple rule that, when met, makes the tables unnecessary. If reliability checks using 50 or more observation occasions produce 10% or fewer disagreements, for behavior rates from 10% through 90%, the agreement achieved is quite improbably the result of chance agreement.

2.

Beurteilerübereinstimmung von Psychotherapie-Gutachtern [Interrater agreement among psychotherapy expert reviewers]

Dr. phil. Dipl.-Psych. H. Vogel K. Meng 《Psychotherapeut》2007,52(1):35-40

Access to outpatient psychotherapy in Germany is regulated by an application and expert opinion procedure in a peer-review system. In an external assessment procedure, each patient's application is evaluated with respect to the existence of a mental disorder, a positive prognosis, and the adequacy of the chosen therapy rationale. The present paper examines the reliability of this procedure by reanalysing the data from three studies on interrater agreement in expert opinions about psychoanalytic/psychodynamic therapy, behaviour therapy, or child and youth behaviour therapy. In the study of Rudolf et al. (2002), 48 experts re-examined two already assessed cases; in the studies of Sulz et al. (2003) and Sulz and Peterander (2004), 30 and 7 experts, respectively, judged five unselected or seven selected applications. The interrater agreement was calculated using the Fleiss kappa coefficient for agreement among many raters, which tests the observed agreement probability against the agreement probability expected by chance. The level of agreement among the experts ranges from 46% to 70%. With the chosen method, it is mostly not possible to show that agreement is significantly higher than chance. The generalizability of the results to the usual assessment procedure is discussed, as well as their potential for the advancement of the application procedure and expert peer-review system.
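As a rough illustration of the statistic named in this abstract, the following is a minimal sketch of the standard Fleiss kappa for many raters. The function name `fleiss_kappa` and the toy rating counts are assumptions made for illustration; they are not taken from the paper or its data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned subject i to category j.

    Assumes every subject was rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    # Per-subject agreement: share of rater pairs that agree on subject i.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects  # mean observed agreement

    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")

    return (p_bar - p_e) / (1 - p_e)


# Hypothetical data: 2 applications, 3 reviewers, categories (approve, reject).
print(fleiss_kappa([[3, 0], [0, 3]]))  # complete agreement
print(fleiss_kappa([[2, 1], [1, 2]]))  # split votes
```

A raw agreement percentage can look respectable while kappa stays near zero, which is one way to read the abstract's remark that agreement above chance could mostly not be demonstrated.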

3.

Comparison of the Matthews Youth Test for Health and the Hunter-Wolf A-B Rating Scale: measures of type A behavior in children

C Jackson D W Levine 《Health psychology》1987,6(3):255-267

This study evaluated the extent to which two youth measures of Type A behavior--the Matthews Youth Test for Health (MYTH) and the Hunter-Wolf A-B Rating Scale--similarly assess the Type A construct. Data from 25 elementary teachers and from 300 of their students revealed that these scales are weakly correlated and that the concordance of their Type A-Type B classifications was only slightly above that expected by chance. Weak agreement was found even when teachers and students rated the same Type A behaviors, which suggests that variability in content was not the principal reason for the lack of agreement between these measures. The major implication of this study is that the MYTH and the Hunter-Wolf scale should not be considered interchangeable measures of Type A behavior. The study reveals that investigators' choice of a Type A measure may strongly affect the nature of their research findings and may inhibit the integration of their data with data from other investigators who employ different measures of the Type A construct. It is recommended that, if multiple measures cannot be used in a study, the investigator employ the MYTH.

4.

Motoric indicators of laterality and determination of lateral dominance in schizophrenia

D G Scott 《Perceptual and motor skills》1985,60(3):971-985

The literature suggests that schizophrenics exhibit reduced or reversed cerebral lateral dominance relative to normal control subjects. An hypothesis which predicted reduced or reversed cerebral laterality for schizophrenics was tested on 60 young, familially right-handed males, with 20 men in each of the following three groups: schizophrenic inpatients, nonschizophrenic psychiatric inpatient controls, and normal controls. The subjects were administered a battery of seven measures of cerebral laterality. The application of multivariate statistical techniques showed groups did not differ significantly in the degree or the direction of their cerebral lateral dominance. Also there were no significant correlations between the measures of laterality. The findings suggest that cerebral lateral dominance is not necessarily altered concomitantly with psychopathology but rather that it is a complex phenomenon which may not be reliably determined on the basis of simple behavioral characteristics.

5.

A probability-based formula for calculating interobserver agreement

Yelton AR Wildman BG Erickson MT 《Journal of applied behavior analysis》1977,10(1):127-131

Estimates of observer agreement are necessary to assess the acceptability of interval data. A common method for assessing observer agreement, per cent agreement, has several major weaknesses: it varies as a function of the frequency of behavior recorded and of the inclusion or exclusion of agreements on nonoccurrences. Also, agreements that might be expected to occur by chance are not taken into account. An alternative method for assessing observer agreement that determines the exact probability that the obtained number of agreements or better would have occurred by chance is presented and explained. Agreements on both occurrences and nonoccurrences of behavior are considered in the calculation of this probability.
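The idea of asking how probable "this many agreements or more" would be by chance can be sketched as follows. This is a simplified binomial model, assumed here for illustration only: each interval is treated as an independent trial, and the per-interval chance of agreement is derived from each observer's scoring rate. The paper's own exact formula is not given in the abstract and may differ; the function name and inputs are hypothetical.

```python
from math import comb


def prob_agreements_at_least(n_intervals, n_agree, p1, p2):
    """Probability of n_agree or more agreements in n_intervals if two
    observers scored independently at rates p1 and p2.

    Simplifying assumption (not from the paper): agreements across
    intervals are independent Bernoulli trials.
    """
    # Chance of agreeing on one interval: both score it, or neither does.
    p_chance = p1 * p2 + (1 - p1) * (1 - p2)
    return sum(
        comb(n_intervals, k) * p_chance**k * (1 - p_chance) ** (n_intervals - k)
        for k in range(n_agree, n_intervals + 1)
    )


# Hypothetical check: 10 intervals, both observers score half of them,
# and they agree on all 10.
print(prob_agreements_at_least(10, 10, 0.5, 0.5))
```

Because the tail probability counts agreements on both occurrences and nonoccurrences, it reflects the same two sources of agreement that the abstract says the method considers.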

6.

Evaluating interobserver reliability of interval data

Hopkins BL Hermann JA 《Journal of applied behavior analysis》1977,10(1):121-126

Previous recommendations to employ occurrence, nonoccurrence, and overall estimates of interobserver reliability for interval data are reviewed. A rationale for comparing obtained reliability to reliability that would result from a random-chance model is explained. Formulae and graphic functions are presented to allow for the determination of chance agreement for each of the three indices, given any obtained per cent of intervals in which a response is recorded to occur. All indices are interpretable throughout the range of possible obtained values for the per cent of intervals in which a response is recorded. The level of chance agreement simply changes with changing values. Statistical procedures that could be used to determine whether obtained reliability is significantly superior to chance reliability are reviewed. These procedures are rejected because they yield significance levels that are partly a function of sample sizes and because there are no general rules to govern acceptable significance levels depending on the sizes of samples employed.
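To make the notion of chance agreement for the three indices concrete, here is a minimal sketch under one common random-chance model: two observers score intervals independently at rates `p1` and `p2`. The abstract does not reproduce the paper's formulae, so the expressions below are an assumption about how such a model can be written, and the function name is hypothetical.

```python
def chance_agreement(p1, p2):
    """Expected chance values of overall, occurrence, and nonoccurrence
    agreement when two observers score independently at rates p1 and p2.

    Assumed model (not quoted from the paper): independence between observers.
    """
    both = p1 * p2                 # both score the interval
    neither = (1 - p1) * (1 - p2)  # neither scores the interval
    nan = float("nan")

    overall = both + neither
    # Occurrence agreement: intervals scored by both / scored by at least one.
    scored_by_any = p1 + p2 - both
    occurrence = both / scored_by_any if scored_by_any > 0 else nan
    # Nonoccurrence agreement: unscored by both / unscored by at least one.
    unscored_by_any = 1 - both
    nonoccurrence = neither / unscored_by_any if unscored_by_any > 0 else nan

    return {
        "overall": overall,
        "occurrence": occurrence,
        "nonoccurrence": nonoccurrence,
    }


# Chance levels shift as the recorded rate of behavior changes.
for rate in (0.1, 0.5, 0.9):
    print(rate, chance_agreement(rate, rate))
```

Under this model a low behavior rate yields high chance values for overall and nonoccurrence agreement but a low chance value for occurrence agreement, which illustrates the abstract's point that the chance level changes with the recorded rate.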

7.

Personal efficacy, external locus of control, and perceived contingency of parental reinforcement among depressed, paranoid, and normal subjects

M Rosenbaum D Hadari 《Journal of personality and social psychology》1985,49(2):539-547

Bandura (1982) suggested that judgments of personal efficacy and outcome expectancies (i.e., locus of control) jointly affect behavior. We hypothesized that different combinations of these two sets of beliefs would characterize the thought structures of normal subjects and of psychiatric patients suffering from distinctly different disorders. Normal subjects, depressed subjects, and paranoid subjects completed scales with which we measured beliefs in personal efficacy and beliefs that outcomes are controlled either by chance or by powerful others, as well as a scale with which we assessed perceived contingency of parental reinforcement. The major findings were as follows: Normals judged themselves to be more efficacious than did psychiatric subjects; whereas depressives expected outcomes to be controlled by chance, paranoids expected outcomes to be under the control of powerful others; among the normals, outcome expectancies were strongly associated with personal efficacy, but among the psychiatric patients, these beliefs were unrelated; depressives and paranoids equally reported more noncontingent parental reinforcement than did normals; and perceived contingency of parental reinforcement was predictive of outcome expectancies but not of personal efficacy. The data suggest that low personal efficacy may be a distinguishing characteristic of all psychiatric patients, whereas outcome expectancies may determine the specific nature of the psychiatric disorder.

8.

A two-stage logistic regression model for analyzing inter-rater agreement

Stuart R. Lipsitz Michael Parzen Garrett M. Fitzmaurice Neil Klar 《Psychometrika》2003,68(2):289-298

Studies of agreement commonly occur in psychiatric research. For example, researchers are often interested in the agreement among radiologists in their review of brain scans of elderly patients with dementia or in the agreement among multiple informant reports of psychopathology in children. In this paper, we consider the agreement between two raters when rating a dichotomous outcome (e.g., presence or absence of psychopathology). In particular, we consider logistic regression models that allow agreement to depend on both rater- and subject-level covariates. Logistic regression has been proposed as a simple method for identifying covariates that are predictive of agreement (Coughlin et al., 1992). However, this approach is problematic since it does not take account of agreement due to chance alone. As a result, a spurious association between the probability (or odds) of agreement and a covariate could arise due entirely to chance agreement. That is, if the prevalence of the dichotomous outcome varies among subgroups of the population, then covariates that identify the subgroups may appear to be predictive of agreement. In this paper we propose a modification to the standard logistic regression model in order to take proper account of chance agreement. An attractive feature of the proposed method is that it can be easily implemented using existing statistical software for logistic regression. The proposed method is motivated by data from the Connecticut Child Study (Zahner et al., 1992) on the agreement among parent and teacher reports of psychopathology in children. In this study, parents and teachers provide dichotomous assessments of a child's psychopathology and it is of interest to examine whether agreement among the parent and teacher reports is related to the age and gender of the child and to the time elapsed between parent and teacher assessments of the child.
The authors thank the Associate Editor and the referees for their helpful comments and suggestions. We also thank Gwen Zahner for use of data from the Connecticut Child Study, which was conducted under contract to the Connecticut Department of Children and Youth Services. This research was supported by grants HL 69800, AHRQ 10871, HL52329, HL61769, GM 29745, MH 54693 and MH 17119 from the National Institutes of Health.

9.

Comparative analysis of three approaches for rater agreement

Ato M Benavente A López JJ 《Psicothema》2006,18(3):638-645


10.

Can group differences in hemispheric asymmetry be inferred from behavioral laterality indices?

Steven Schwartz Kim Kirsner 《Brain and cognition》1984,3(1):57-70

A large and often contradictory literature purports to demonstrate different patterns, or at least different degrees, of hemispheric specialization across various groups of people. Schizophrenics, dyslexics, stutterers, musicians, Orientals, Jews, and many other groups have been alleged to display idiosyncratic laterality patterns. An examination of this literature reveals three important problems. First, the groups concerned are rarely homogeneous. This makes it difficult to know which group characteristics, if any, are responsible for the observed differences. Second, most behavioral laterality indices are of low reliability, making group differences highly unstable. Third, the validity of many behavioral laterality indices has not been substantiated. Because of these problems, it is concluded that caution should be exercised in using and interpreting laterality measures to make between-group comparisons. For now at least, group differences in laterality cannot be inferred.

11.

Schizotypy, cognitive performance, and genetic risk for schizophrenia in a non-clinical population

Emma L. Leach Peter L. Hurd Bernard J. Crespi 《Personality and individual differences》2013

Schizophrenia risk alleles are expected to mediate effects on cognitive task performance, and aspects of personality including schizotypy, in nonclinical populations. We investigated how 32 of the best-validated schizophrenia risk alleles, singly and as summed genetic risk, were related to measures of schizotypal personality and measures of two aspects of cognitive performance, verbal skills (vocabulary) and visual-spatial skills (mental rotation), in healthy individuals. Summed genetic risk score was not associated with levels of total schizotypy or its three main subscales. Similarly, genotypic variation at none of the individual risk loci was related to cognitive performance measures, after correction for multiple tests. Higher overall genetic risk score was, however, associated with lower performance on the mental rotation test in males, with a broad set of loci contributing to this effect. These results imply that there is a lack of linear, genetically-based continuity connecting schizotypal cognition with the expression of schizophrenia itself, and indicate that, for males, higher genetic risk of schizophrenia exerts negative effects on visual-spatial skills, as measured by mental rotation.

12.

How incidental values from the environment affect decisions about money, risk, and delay

Ungemach C Stewart N Reimers S 《Psychological science》2011,22(2):253-260

How different are £0.50 and £1.50, "a small chance" and "a good chance," or "three months" and "nine months"? Our studies show that people behave as if the differences between these values are altered by incidental everyday experiences. Preference for a £1.50 lottery rather than a £0.50 lottery was stronger among individuals exposed to intermediate supermarket prices than among those exposed to lower or higher prices. Preference for "a good chance" rather than "a small chance" of winning a lottery was stronger among participants who predicted intermediate probabilities of rain than among those who predicted lower or higher chances of rain. Preference for consumption in "three months" rather than "nine months" was stronger among participants who planned for an intermediate birthday than among participants who planned for a sooner or later birthday. These fluctuations directly challenge economic accounts that translate monies, risks, and delays into subjective equivalents with stable functions. The decision-by-sampling model, in which subjective values are rank positions constructed from comparisons with samples, predicts these effects and indicates a primary role for sampling in decision making.

13.

A graphical judgmental aid which summarizes obtained and chance reliability data and helps assess the believability of experimental effects

Birkimer JC Brown JH 《Journal of applied behavior analysis》1979,12(4):523-533

Interval by interval reliability has been criticized for "inflating" observer agreement when target behavior rates are very low or very high. Scored interval reliability and its converse, unscored interval reliability, however, vary as target behavior rates vary when observer disagreement rates are constant. These problems, along with the existence of "chance" values of each reliability which also vary as a function of response rate, may cause researchers and consumers difficulty in interpreting observer agreement measures. Because each of these reliabilities essentially compares observer disagreements to a different base, it is suggested that the disagreement rate itself be the first measure of agreement examined, and its magnitude relative to occurrence and to nonoccurrence agreements then be considered. This is easily done via a graphic presentation of the disagreement range as a bandwidth around reported rates of target behavior. Such a graphic presentation summarizes all the information collected during reliability assessments and permits visual determination of each of the three reliabilities. In addition, graphing the "chance" disagreement range around the bandwidth permits easy determination of whether or not true observer agreement has likely been demonstrated. Finally, the limits of the disagreement bandwidth help assess the believability of claimed experimental effects: those leaving no overlap between disagreement ranges are probably believable, others are not.

14.

A cross-culturally standardized set of pictures for younger and older adults: American and Chinese norms for name agreement, concept agreement, and familiarity.

Carolyn Yoon Fred Feinberg Ting Luo Trey Hedden Angela Hall Gutchess Hiu-Ying Mary Chen Joseph A Mikels Shulan Jiao Denise C Park 《Behavior research methods, instruments & computers》2004,36(4):639-649

The present study presents normative measures for 260 line drawings of everyday objects, found in Snodgrass and Vanderwart (1980), viewed by individuals in China and the United States. Within each cultural group, name agreement, concept agreement, and familiarity measures were obtained separately for younger adults and older adults. For a subset of 57 pictures (22%), there was equivalence in both name agreement and concept agreement, and for an additional subset of 29 pictures (11%), there was nonequivalent name agreement but equivalent concept agreement, across all culture-by-age groups. The data indicate substantial differences across culture-by-age groups in name agreement percentages and number of distinct name responses provided. We discovered significant differences between older and younger American adults in both name agreement percentages (67 pictures, or 26%) and concept agreement percentages (44 pictures, or 17%). Written naming responses collected for the entire set of Snodgrass and Vanderwart pictures showed shifts in both naming and concept agreement percentages over the intervening decades: Although correlations in name agreement were strong (r = .71, p < .001) between our younger American samples and those of Snodgrass and Vanderwart, name agreement percentages have changed for a substantial proportion (33%) of the 260 pictures; moreover, 63% of the stimuli for which Snodgrass and Vanderwart reported concept agreement now appear to differ. We provide comprehensive comparison statistics and tests for both the present study and prior ones, finding differences across numerous item-level measures. The corpus of data suggests that substantial differences in all measures can be found across age as well as culture, so that unequivocal conclusions with respect to cross-cultural or age-related differences in cognition can be made only when appropriate stimuli are selected for studies. 
Data for all 260 pictures, for each of the four groups, and all supporting materials and tests are freely archived at http://agingmind.cns.uiuc.edu/Pict_Norms. The full set of these norms may be downloaded from www.psychonomic.org/archive/.

15.

Uncoupled leftward asymmetries for planum morphology and functional language processing

Eckert MA Leonard CM Possing ET Binder JR 《Brain and language》2006,98(1):102-111

Explanations for left hemisphere language laterality have often focused on hemispheric structural asymmetry of the planum temporale. We examined the association between an index of language laterality and brain morphology in 99 normal adults whose degree of laterality was established using a functional MRI single-word comprehension task. The index of language laterality was derived from the difference in volume of activation between the left and right hemispheres. Planum temporale and brain volume measures were made using structural MRI scans, blind to the functional data. Although both planum temporale asymmetry (t(1,99) = 6.86, p < .001) and language laterality (t(1,99) = 15.26, p < .001) were significantly left hemisphere biased, there was not a significant association between these variables (r(99) = .01, ns). Brain volume, a control variable for the planum temporale analyses, was related to language laterality in a multiple regression (beta = -.30, t = -2.25, p < .05). Individuals with small brains were more likely to demonstrate strong left hemisphere language laterality. These results suggest that language laterality is a multidimensional construct with complex neurological origins.

16.

Perceived probability, perceived severity, and health-protective behavior.

N D Weinstein 《Health psychology》2000,19(1):65-74

It seems obvious that 2 key attributes of health hazards, their perceived probability and perceived severity, do not act independently on the motivation to engage in protective behavior. If a health problem is perceived to have no chance of occurring, there should be no interest in acting against it, regardless of how serious it might be. Nevertheless, researchers seldom observe the expected interaction between probability and severity. A case study approach was used to examine how probability and severity combine to influence interest in protection. Ratings of motivation to act, probability, and severity for 201 hazards were collected from 12 participants, and data were analyzed for each person separately. Analyses revealed the expected Probability x Severity interaction. Additional calculations showed why it is difficult to detect this interaction using between-subjects designs. The data also revealed that people are surprisingly insensitive to variations in hazard probability when probabilities are in the moderate to high range.

17.

Dichhaptic laterality and field dependence

Paul Weener Malcolm Van Blerkom 《Brain and cognition》1982,1(3):323-330

Relationships among laterality, field dependence, and sex were investigated using right-handed subjects. Dichhaptic measures of laterality showed a significant left-hand advantage for the discrimination of irregular shapes. No significant relationships were found between laterality and sex or sex role perceptions, nor between laterality and field dependence. A modality of response effect on the laterality task was interpreted in terms of the functional cerebral space principle.

18.

Friends and strangers: acquaintanceship, agreement, and the accuracy of personality judgment

D C Funder C R Colvin 《Journal of personality and social psychology》1988,55(1):149-158

We examined the effect of acquaintanceship on interjudge agreement in personality ratings. Approximately 150 undergraduates described their own personalities using the Q-sort. They were also described by two close acquaintances and by two "strangers" who knew them only via a single, spontaneous interaction viewed on videotape. The effect of acquaintanceship was powerful: Judgments by close acquaintances agreed with each other and with subjects' self-judgments much better than did judgments by strangers, even though strangers' judgments agreed with each other and with subjects' self-judgments beyond a chance level. This result implies that agreement among acquaintances' judgments must derive at least partly from experience with and observation of the person who is judged. The same traits that yielded better agreement among acquaintances also yielded better agreement among strangers and tended to be rated higher in subjective visibility, suggesting that people are intuitively knowledgeable about the traits they can judge with more and less agreement.

19.

The use of configural frequency analysis for explorative data analysis

《The British journal of mathematical and statistical psychology》2006,59(1):59-73

Configural frequency analysis (CFA) is a widely used method of explorative data analysis. It tries to detect patterns in the data that occur significantly more or significantly less often than expected by chance. Patterns which occur more often than expected by chance are called CFA types, while those which occur less often than expected by chance are called CFA antitypes. The patterns detected are used to generate knowledge about the mechanisms underlying the data. We investigate the ability of CFA to detect adequate types and antitypes in a number of simulation studies. The basic idea of these studies is to predefine sets of types and antitypes and a mechanism which uses them to create a simulated data set. This simulated data set is then analysed with CFA and the detected types and antitypes are compared to the predefined ones. The predefined types and antitypes together with the method to generate the data are called a data generation model. The results of the simulation studies show that CFA can be used in quite different research contexts to detect structural dependencies in observed data. In addition, we can learn from these simulation studies how much data is necessary to enable CFA to reconstruct the predefined types and antitypes with sufficient accuracy. For one of the data generation models investigated, implicitly underlying knowledge space theory, it was shown that zero‐order CFA can be used to reconstruct the predefined types (which can be interpreted in this context as knowledge states) with sufficient accuracy. Theoretical considerations show that first‐order CFA cannot be used for this data generation model. Thus, it is wrong to consider first‐order CFA, as is done in many publications, as the standard or even only method of CFA.
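The contrast between zero-order and first-order CFA drawn in this abstract can be sketched as below. This is an illustrative simplification, not the paper's procedure: it assumes the usual textbook base models (equal cell probabilities for zero-order, independence with observed marginals for first-order) and uses a plain standardized residual rather than a formal significance test. The function name and the toy frequencies are hypothetical.

```python
from itertools import product
from math import prod, sqrt


def cfa_z_scores(observed, order):
    """Standardized residuals (o - e) / sqrt(e) for each configuration.

    observed maps a configuration tuple to its frequency. Assumes every
    level of every variable has at least one observation.
    order=0: all cells equally likely; order=1: variables independent.
    """
    n = sum(observed.values())
    n_vars = len(next(iter(observed)))
    levels = [sorted({cfg[k] for cfg in observed}) for k in range(n_vars)]
    cells = list(product(*levels))

    if order == 0:
        expected = {cell: n / len(cells) for cell in cells}
    elif order == 1:
        # Marginal frequency of each level of each variable.
        margins = [
            {lv: sum(c for cfg, c in observed.items() if cfg[k] == lv)
             for lv in levels[k]}
            for k in range(n_vars)
        ]
        expected = {
            cell: n * prod(margins[k][cell[k]] / n for k in range(n_vars))
            for cell in cells
        }
    else:
        raise ValueError("this sketch supports only order 0 or 1")

    return {
        cell: (observed.get(cell, 0) - expected[cell]) / sqrt(expected[cell])
        for cell in cells
    }


# Invented frequencies for two binary variables.
data = {(0, 0): 60, (0, 1): 20, (1, 0): 15, (1, 1): 5}
print(cfa_z_scores(data, order=0))
print(cfa_z_scores(data, order=1))
```

In this toy table the configuration (0, 0) stands out strongly under the zero-order model but not at all under the first-order model, because the first-order expectation already absorbs the marginal frequencies. That gives one intuition for why the choice of base model matters for which types can be recovered.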

20.

A cross-culturally standardized set of pictures for younger and older adults: American and Chinese norms for name agreement, concept agreement, and familiarity

Carolyn Yoon Fred Feinberg Ting Luo Trey Hedden Angela Hall Gutchess Hiu-Ying Mary Chen Joseph A. Mikels Shulan Jiao Denise C. Park 《Behavior research methods》2004,36(4):639-649

The present study presents normative measures for 260 line drawings of everyday objects, found in Snodgrass and Vanderwart (1980), viewed by individuals in China and the United States. Within each cultural group, name agreement, concept agreement, and familiarity measures were obtained separately for younger adults and older adults. For a subset of 57 pictures (22%), there was equivalence in both name agreement and concept agreement, and for an additional subset of 29 pictures (11%), there was nonequivalent name agreement but equivalent concept agreement, across all culture-by-age groups. The data indicate substantial differences across culture-by-age groups in name agreement percentages and number of distinct name responses provided. We discovered significant differences between older and younger American adults in both name agreement percentages (67 pictures, or 26%) and concept agreement percentages (44 pictures, or 17%). Written naming responses collected for the entire set of Snodgrass and Vanderwart pictures showed shifts in both naming and concept agreement percentages over the intervening decades: Although correlations in name agreement were strong (r = .71, p < .001) between our younger American samples and those of Snodgrass and Vanderwart, name agreement percentages have changed for a substantial proportion (33%) of the 260 pictures; moreover, 63% of the stimuli for which Snodgrass and Vanderwart reported concept agreement now appear to differ. We provide comprehensive comparison statistics and tests for both the present study and prior ones, finding differences across numerous item-level measures. The corpus of data suggests that substantial differences in all measures can be found across age as well as culture, so that unequivocal conclusions with respect to cross-cultural or age-related differences in cognition can be made only when appropriate stimuli are selected for studies.
Data for all 260 pictures, for each of the four groups, and all supporting materials and tests are freely archived at http://agingmind.cns.uiuc.edu/Pict_Norms. The full set of these norms may be downloaded from www.psychonomic.org/archive/.