Similar literature
20 similar documents found
1.
A standardized estimate of Rorschach interrater agreement is needed. Percentage agreement, although widely used, is found to be unsuitable. Forty-one protocols from adults in both a normal and a psychiatric sample were scored by two or three scorers, making 85 scoring pairs. Percentage agreement, correlations (phi and Pearson's r), and kappa were computed at the single-response, total-score, and category levels. Percentage agreement shows minimal variation. Even when exceeding 0.80, it can obscure major disagreements. Kappa and correlations both vary in a similar way with level of disagreement. The total-score level does not give additional information compared to the single-response and category levels. Kappa proved to be conservative and reliable and is therefore suggested as a standard estimate.
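The contrast between percentage agreement and kappa can be illustrated with a minimal sketch. The functions and the two score lists below are hypothetical and are not taken from the study; the only thing assumed about kappa is the standard Cohen formula, (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each scorer's marginal frequencies.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of items on which two scorers assign the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two scorers' marginal proportions,
    # summed over every code either scorer used.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when both scorers use one identical code
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: a rare code (1) scored on 100 responses. Each scorer
# assigns it five times, but never to the same response.
scorer_1 = [1] * 5 + [0] * 95
scorer_2 = [0] * 5 + [1] * 5 + [0] * 90

print(percent_agreement(scorer_1, scorer_2))
print(cohens_kappa(scorer_1, scorer_2))
```

In this made-up case the two scorers never agree on where the rare code occurs, yet percentage agreement is 0.90 because the common code dominates; kappa is slightly negative, which is the kind of masked disagreement the abstract describes.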

2.
Because behavior analysis is a data-driven process, a critical skill for behavior analysts is accurate visual inspection and interpretation of single-case data. Study 1 was a basic study in which we increased the accuracy of visual inspection methods for A-B designs through two refinements of the split-middle (SM) method called the dual-criteria (DC) and conservative dual-criteria (CDC) methods. The accuracy of these visual inspection methods was compared with one another and with two statistical methods (Allison & Gorman, 1993; Gottman, 1981) using a computer-simulated Monte Carlo study. Results indicated that the DC and CDC methods controlled Type I error rates much better than the SM method and had considerably higher power (to detect real treatment effects) than the two statistical methods. In Study 2, brief verbal and written instructions with modeling were used to train 5 staff members to use the DC method, and in Study 3, these training methods were incorporated into a slide presentation and were used to rapidly (i.e., 15 min) train a large group of individuals (N = 87). Interpretation accuracy increased from a baseline mean of 55% to a treatment mean of 94% in Study 2 and from a baseline mean of 71% to a treatment mean of 95% in Study 3. Thus, Study 1 answered basic questions about the accuracy of several methods of interpreting A-B designs; Study 2 showed how that information could be used to increase the accuracy of human visual inspectors; and Study 3 showed how the training procedures from Study 2 could be modified into a format that would facilitate rapid training of large groups of individuals to interpret single-case designs.

3.
The conservative dual‐criterion (CDC) method was developed to standardize the analysis of single‐subject experimental design data, but to date its accuracy has been evaluated only by comparing results to the statistical parameters of graphs. Our study investigated agreement between expert visual analysts and the CDC method on 66 AB tiers from published multiple baseline graphs. We found strong agreement between the two methods for certain types of graphs and discuss implications of the findings and areas for future research.

4.
Visual analysis is the dominant method of analysis for single-case time series. The literature assumes that visual analysts will be conservative judges. We show that previous research into visual analysis has not adequately examined false alarm and miss rates or the effect of serial dependence. In order to measure false alarm and miss rates while varying serial dependence, amount of random variability, and effect size, 37 students undertaking a postgraduate course in single-case design and analysis were required to assess the presence of an intervention effect in each of 27 AB charts constructed using a first-order autoregressive model. Three levels of effect size and three levels of variability, representative of values found in published charts, were combined with autocorrelation coefficients of 0, 0.3 and 0.6 in a factorial design. False alarm rates were surprisingly high (16% to 84%). Positive autocorrelation and increased random variation both significantly increased the false alarm rates and interacted in a nonlinear fashion. Miss rates were relatively low (0% to 22%) and were not significantly affected by the design parameters. Thus, visual analysts were not conservative, and serial dependence did influence judgment.
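A 3 x 3 x 3 set of AB charts built from a first-order autoregressive model might be generated along the following lines. This is only a sketch: the function name, the phase lengths, the effect-size and variability levels, the level-shift form of the effect, and the choice to start the error process at zero are all assumptions made for illustration. The abstract itself specifies only the three autocorrelation coefficients.

```python
import itertools
import random


def simulate_ab_chart(n_a, n_b, phi, effect, sd, seed=None):
    """Return (baseline, treatment) lists from an AR(1) error model.

    error[t] = phi * error[t-1] + noise[t], with noise ~ Normal(0, sd).
    `effect` is added to every treatment-phase point as a level shift.
    The error process starts at zero (a simplification, not a draw from
    the stationary distribution).
    """
    if not -1 < phi < 1:
        raise ValueError("phi must lie strictly between -1 and 1")
    rng = random.Random(seed)
    error = 0.0
    series = []
    for t in range(n_a + n_b):
        error = phi * error + rng.gauss(0.0, sd)
        shift = effect if t >= n_a else 0.0
        series.append(error + shift)
    return series[:n_a], series[n_a:]


AUTOCORRELATIONS = (0.0, 0.3, 0.6)  # values given in the abstract
EFFECT_SIZES = (0.0, 1.0, 2.0)      # assumed levels, for illustration only
NOISE_SDS = (0.5, 1.0, 2.0)         # assumed levels, for illustration only

# One chart per cell of the factorial design: 3 x 3 x 3 = 27 charts.
charts = [
    simulate_ab_chart(10, 10, phi, effect, sd, seed=i)
    for i, (phi, effect, sd) in enumerate(
        itertools.product(AUTOCORRELATIONS, EFFECT_SIZES, NOISE_SDS)
    )
]
```

Charts with a zero effect are the ones on which a judged "effect" would count as a false alarm; charts with a nonzero effect are the ones on which a judged "no effect" would count as a miss.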

5.
Using functional analysis results to prescribe treatments is the preferred method for developing behavioral interventions. Little is known, however, about the reliability and validity of visual inspection for the interpretation of functional analysis data. The purpose of this investigation was to develop a set of structured criteria for visual inspection of multielement functional analyses that, when applied correctly, would increase interrater agreement and agreement with interpretations reached by expert consensus. In Study 1, 3 predoctoral interns interpreted functional analysis graphs, and interrater agreement was low (M = .46). In Study 2, 64 functional analysis graphs were interpreted by a panel of experts, and then a set of structured criteria was developed that yielded interpretive results similar to those of the panel (exact agreement = .94). In Study 3, the 3 predoctoral interns from Study 1 were trained to use the structured criteria, and the mean interrater agreement coefficient increased to .81. The results suggest that (a) the interpretation of functional analysis data may be less reliable than is generally assumed, (b) decision-making rules used by experts in the interpretation of functional analysis data can be operationalized, and (c) individuals can be trained to apply these rules accurately to increase interrater agreement. Potential uses of the criteria are discussed.

6.
It is common practice in research on the treatment of problem behavior to compare levels of targeted behaviors during treatment to levels when treatment is not in place. Some researchers use data collected as part of a multielement functional analysis as the initial baseline, whereas others collect new baseline data following completion of the functional analysis. We evaluated whether the source of baseline data influences the reliability and efficiency of decision-making. Results suggest that similar decisions are made in regard to treatment efficacy using the different sources of baseline data, but using data from a multielement functional analysis as baseline may save time. Interrater agreement was adequate, but lower for some graphs than has been observed in past studies. Several potential explanations for this discrepancy are discussed.

7.
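徐晓锋, 刘勇. 《心理科学》, 2007, 30(5): 1175-1178
In behavioral science research and practice, researchers often need to aggregate individual-level ratings into group-level ratings. When assessing agreement for this bottom-up aggregation model, some scholars in China often mistakenly use interrater reliability as an index of interrater agreement. Interrater agreement and interrater reliability differ not only in their theoretical foundations; in practice, the former can be very high (or very low) while the latter is very low (or very high). This article traces how the question of rating agreement was raised, debated, and eventually settled into a consensus view in the academic community, in the hope that scholars will reflect on the issue more deeply and avoid similar errors in future research.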

8.
The current study examines agreement among individuals with varying expertise in behavior analysis about the length of baseline when data were presented point by point. Participants were asked to respond to baseline data and to indicate when to terminate the baseline phase. When only minimal information was provided about the data set, experts and Board Certified Behavior Analyst participants generated baselines of similar lengths, whereas novices did not. Agreement was similar across participants when variability was low but deteriorated as variability in the data set increased. Participants generated shorter baselines when provided with information regarding the independent or dependent variable. Implications for training and the use of visual inspection are discussed.

9.
We developed masked visual analysis (MVA) as a structured complement to traditional visual analysis. The purpose of the present investigation was to compare the effects of computer‐simulated MVA of a four‐case multiple‐baseline (MB) design in which the phase lengths are determined by an ongoing visual analysis (i.e., response‐guided) versus those in which the phase lengths are established a priori (i.e., fixed criteria). We observed an acceptably low probability (less than .05) of false detection of treatment effects. The probability of correctly detecting a true effect frequently exceeded .80 and was higher when: (a) the masked visual analyst extended phases based on an ongoing visual analysis, (b) the effects were larger, (c) the effects were more immediate and abrupt, and (d) the effects of random and extraneous error factors were simpler. Our findings indicate that MVA is a valuable combined methodological and data‐analysis tool for single‐case intervention researchers.

10.
Two sources of variability must each be considered when examining change in level between two sets of data obtained by human observers; namely, variance within data sets (phases) and variability attributed to each data point (reliability). Birkimer and Brown (1979a, 1979b) have suggested that both chance levels and disagreement bands be considered in examining observer reliability and have made both methods more accessible to researchers. By clarifying and extending Birkimer and Brown's papers, a system is developed using observer agreement to determine the data point variability and thus to check the adequacy of obtained data within the experimental context.

11.
12.
The percentage agreement index has been and continues to be a popular measure of interobserver reliability in applied behavior analysis and child development, as well as in other fields in which behavioral observation techniques are used. An algebraic method and a linear programming method were used to assess chance-corrected reliabilities for a sample of past observations in which the percentage agreement index was used. The results indicated that, had kappa been used instead of percentage agreement, between one-fourth and three-fourths of the reported observations could be judged as unreliable against a lenient criterion and between one-half and three-fourths could be judged as unreliable against a more stringent criterion. It is suggested that the continued use of the percentage agreement index has seriously undermined the reliabilities of past observations and can no longer be justified in future studies.

13.
The present study evaluated the effects of both a traditional lecture and the conservative dual-criterion (CDC) judgment aid on the ability of 6 university students to visually inspect AB-design line graphs. The traditional lecture reliably failed to improve visual inspection accuracy, whereas the CDC method substantially improved the performance of each participant.

14.
Current methods employed to interpret functional analysis data include visual analysis and post-hoc visual inspection (PHVI). However, these methods may be biased by dataset complexity, hand calculations, and rater experience. We examined whether an automated approach using nonparametric rank-based statistics could increase the accuracy and efficiency of functional analysis data interpretation. We applied Automated Nonparametric Statistical Analysis (ANSA) to a sample of 65 published functional analyses for which additional experimental evidence was available to verify behavior function. Results showed that exact behavior function agreement between ANSA and the publication authors was 83.1%, exact agreement between ANSA and PHVI was 75.4%, and exact agreement across all 3 methods was 64.6%. These preliminary findings suggest that ANSA has the potential to support the data interpretation process. A web application that incorporates the calculations and rules utilized by ANSA is accessible at https://ansa.shinyapps.io/ansa/

15.
Visual inspection of data is a common method for understanding, responding to, and communicating important behavior-environment relations in single-subject research. In a field that was once dominated by cumulative, moment-to-moment records of behavior, a number of graphic forms currently exist that aggregate data into larger units. In this paper, we describe the continuum of aggregation that ranges from distant to intimate displays of behavioral data. To aid in an understanding of the conditions under which a more intimate analysis is warranted (i.e., one that provides a richer analysis than that provided by condition or session aggregates), we review a sample of research articles for which within-session data depiction has enhanced the visual analysis of applied behavioral research.

16.
Ninety Board Certified Behavior Analysts (BCBAs) and 19 editorial board members evaluated hypothetical data presented in a multielement design. We manipulated the variability, trend, and mean shift of the data and asked the participants to determine if the data demonstrated experimental control. The results showed that variability, trend, and mean shift interacted to affect the participants’ ratings of experimental control. The level of agreement between participants was variable, but was generally lower than in previous research.

17.
Visual inspection of single‐case data is the primary method of interpretation of the effects of an independent variable on a dependent variable in applied behavior analysis. The purpose of the current study was to replicate and extend the results of DeProspero and Cohen (1979) by reexamining the consistency of visual analysis across raters. We recruited members of the board of editors and associate editors for the Journal of Applied Behavior Analysis to judge graphs on a 100‐point scale of experimental control and to provide a dichotomous response (i.e., “yes” or “no” for experimental control). Results showed high interrater agreement across the three types of graphs, suggesting that visual inspection can lead to consistent interpretation of single‐case data among well‐trained raters.

18.
Users of interobserver agreement statistics have heretofore ignored the problem of autocorrelation in behavior sequences when testing the statistical significance of agreement measures. Because of autocorrelation, traditional reliability tests based on the 2 × 2 contingency-table model (e.g., kappa, phi) are incorrect. Correct tests can be developed by using the bivariate time series as a statistical model. Seen from this perspective, testing the significance of interobserver agreement becomes formally equivalent to testing the significance of the lag-zero cross-correlation between two time series. The robust procedure known as the jackknife is suggested for this purpose.
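One way to picture a jackknife applied to the lag-zero cross-correlation is the rough sketch below. It is not the procedure from the paper: the abstract does not say how the jackknife is set up, so the delete-one-block scheme, the number of blocks, and both function names are assumptions. The lag-zero cross-correlation is computed here as the ordinary Pearson correlation between the two observers' records at matching time points.

```python
import math


def lag0_cross_correlation(x, y):
    """Pearson correlation between two equal-length series at lag zero."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def block_jackknife(x, y, n_blocks):
    """Delete-one-block jackknife estimate and standard error (assumed scheme)."""
    x, y = list(x), list(y)
    n = len(x)
    if not 2 <= n_blocks <= n:
        raise ValueError("n_blocks must be between 2 and the series length")
    full = lag0_cross_correlation(x, y)
    # Integer block boundaries; every block holds at least one time point.
    bounds = [(i * n) // n_blocks for i in range(n_blocks + 1)]
    pseudo_values = []
    for i in range(n_blocks):
        lo, hi = bounds[i], bounds[i + 1]
        r_deleted = lag0_cross_correlation(x[:lo] + x[hi:], y[:lo] + y[hi:])
        pseudo_values.append(n_blocks * full - (n_blocks - 1) * r_deleted)
    estimate = sum(pseudo_values) / n_blocks
    variance = sum((p - estimate) ** 2 for p in pseudo_values) / (
        n_blocks * (n_blocks - 1)
    )
    return estimate, math.sqrt(variance)


# Hypothetical interval records (1 = behavior scored) from two observers.
observer_a = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0]
observer_b = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1]
estimate, std_error = block_jackknife(observer_a, observer_b, n_blocks=4)
print(estimate, std_error)
```

Deleting contiguous blocks rather than single intervals is a design choice made here so that the resampling units are less affected by serial dependence; the ratio of the estimate to its standard error could then be referred to a t distribution with `n_blocks - 1` degrees of freedom, again as an assumption rather than a prescription from the source.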

19.
Prior research has evaluated the reliability and validity of structured criteria for visually inspecting functional‐analysis (FA) results on a post‐hoc basis, after completion of the FA (i.e., post‐hoc visual inspection [PHVI]; e.g., Hagopian et al., 1997). However, most behavior analysts inspect FAs using ongoing visual inspection (OVI) as the FA is implemented, and the validity of applying structured criteria during OVI remains unknown. In this investigation, we evaluated the predictive validity and efficiency of applying structured criteria on an ongoing basis by comparing the interim interpretations produced through OVI with (a) the final interpretations produced by PHVI, (b) the authors’ post‐hoc interpretations (PHAI) reported in the research studies, and (c) the consensus interpretations of these two post‐hoc analyses. Ongoing visual inspection predicted the results of PHVI and the consensus interpretations with a very high degree of accuracy, and PHAI with a reasonably high degree of accuracy. Furthermore, the PHVI and PHAI results involved 32 FA sessions, on average, whereas the OVI required only 19 FA sessions to accurately identify the function(s) of destructive behavior (i.e., a 41% increase in efficiency). We discuss these findings relative to other methods designed to increase the accuracy and efficiency of FAs.

20.
Proposed methods of assessing the statistical significance of interobserver agreements provide erroneous probability values when conducted on serially correlated data. Investigators who wish to evaluate interobserver agreements by means of statistical significance can do so by limiting the analysis to every kth interval of data, or by using Markovian techniques that accommodate serial correlations.
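Restricting the analysis to every kth interval amounts to thinning both observers' records before tabulating agreement, as in the short sketch below. The function names, the example records, and the choice k = 3 are hypothetical; the abstract does not say how k should be chosen.

```python
def thin_every_kth(records, k, start=0):
    """Keep every kth interval, beginning at index `start`."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return list(records[start::k])


def agreement_table(obs_a, obs_b):
    """2 x 2 counts keyed by (observer A code, observer B code), codes 0 or 1."""
    table = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for a, b in zip(obs_a, obs_b):
        table[(a, b)] += 1
    return table


# Hypothetical interval records (1 = behavior scored in that interval).
observer_a = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
observer_b = [1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]

k = 3  # assumed spacing, for illustration only
thinned_a = thin_every_kth(observer_a, k)
thinned_b = thin_every_kth(observer_b, k)
print(agreement_table(thinned_a, thinned_b))
```

The thinned table is what a conventional contingency-table test would then be run on. The cost is a smaller sample: 12 intervals shrink to 4 here, so the spacing trades statistical power against reduced serial dependence.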
