Similar literature
1.
This study assessed the psychometric properties of a parent-reported tic severity measure, the Parent Tic Questionnaire (PTQ), and used the scale to establish guidelines for delineating clinically significant tic treatment response. Participants were 126 children ages 9 to 17 who participated in a randomized controlled trial of Comprehensive Behavioral Intervention for Tics (CBIT). Tic severity was assessed using the Yale Global Tic Severity Scale (YGTSS), Hopkins Motor/Vocal Tic Scale (HM/VTS) and PTQ; positive treatment response was defined by a score of 1 (very much improved) or 2 (much improved) on the Clinical Global Impressions – Improvement (CGI-I) scale. Cronbach’s alpha and intraclass correlations (ICC) assessed internal consistency and test-retest reliability, with correlations evaluating validity. Receiver- and Quality-Receiver Operating Characteristic analyses assessed the efficiency of percent and raw-reduction cutoffs associated with positive treatment response. The PTQ demonstrated good internal consistency (α = 0.80 to 0.86), excellent test-retest reliability (ICC = 0.84 to 0.89), good convergent validity with the YGTSS and HM/VTS, and good discriminant validity from hyperactive, obsessive-compulsive, and externalizing (i.e., aggression and rule-breaking) symptoms. A 55% reduction and 10-point decrease in PTQ Total score were optimal for defining positive treatment response. Findings help standardize tic assessment and provide clinicians with greater clarity in determining clinically meaningful tic symptom change during treatment.

2.
The baseline inter-rater reliability, test-retest reliability, follow-up inter-rater reliability, and follow-up longitudinal reliability (inter-rater reliability between generations of raters) of borderline symptoms and the diagnosis of borderline personality disorder (BPD) were assessed using the Revised Diagnostic Interview for Borderlines (DIB-R). Excellent kappas (> .75) were found in each of these reliability substudies for the diagnosis of BPD itself. Excellent kappas were also found in each of the three inter-rater reliability substudies for the vast majority of borderline symptoms assessed by the DIB-R. Test-retest reliability for these symptoms was somewhat lower but still very good. More specifically, one-third of the BPD symptoms assessed had a kappa in the excellent range and the remaining two-thirds had a kappa in the fair-good range (.57-.73). The dimensional reliability of BPD symptom areas was somewhat higher than for categorical measures of the subsyndromal phenomenology of BPD. More specifically, all five dimensional measures of borderline psychopathology had intraclass correlation coefficients in the excellent range for all four reliability substudies. Taken together, the results of this study suggest that both the borderline diagnosis and the symptoms of BPD can be diagnosed reliably when using the DIB-R. They also suggest that excellent reliability, once achieved, can be maintained over time for both the syndromal and subsyndromal phenomenology of BPD.

3.
The Functional Assessment Rating Scale was developed as a measure of psychiatric symptomatology and psychosocial impairments. This study was designed to report estimates of reliability and validity with a population of schizophrenic patients. The scale showed very good interrater agreement, test-retest reliability, construct validity, and concurrent validity, so the scale seems a useful measure of psychopathology which may be used to assess and monitor patients displaying severe mental illnesses.

4.
This study examined the short-interval test-retest reliability of the Structured Clinical Interview (SCID-II: First, Spitzer, Gibbon, & Williams, 1995) for DSM-IV personality disorders (PDs). The SCID-II was administered to 69 in- and outpatients on two occasions separated by 1 to 6 weeks. The interviews were conducted at three sites by ten raters. Each rater acted as first and as second rater an equal number of times. The test-retest interrater reliability for the presence or absence of any PD was fair to good (kappa = .63) and was higher than values found in previous short-interval test-retest studies with the SCID-II for DSM-III-R. Test-retest reliability coefficients for trait and sumscores were sufficient, except for dependent PD. Values for single criteria were variable, ranging from poor to good agreement. Further large-scale test-retest research is needed to test the interrater reliability of more categorical diagnoses and single traits.

5.
In this study, the authors examined the stability of Minnesota Multiphasic Personality Inventory-2 (J. N. Butcher, W. G. Dahlstrom, J. R. Graham, A. Tellegen, & B. Kaemmer, 1989) code types in a sample of 94 injured workers with a mean test-retest interval of 21.3 months (SD = 14.1). Congruence rates for undefined code types were 34% for high-point codes, 22% for 2-point codes, and 22% for 3-point codes. The data provide tentative evidence suggesting that defined code types are more stable than undefined code types. Cohen's kappa, a statistic that controls for chance agreement, was calculated for each clinical scale for both 2-point and 3-point code types. Only 2 of the 20 kappa coefficients were not significant at the p = .05 level.

6.
Stice E, Telch CF, Rizvi SL. Psychological Assessment, 2000, 12(2): 123-131
This article describes the development and validation of a brief self-report scale for diagnosing anorexia nervosa, bulimia nervosa, and binge-eating disorder. Study 1 used a panel of eating-disorder experts and provided evidence for the content validity of this scale. Study 2 used data from female participants with and without eating disorders (N = 367) and suggested that the diagnoses from this scale possessed temporal reliability (mean kappa = .80) and criterion validity (with interview diagnoses; mean kappa = .83). In support of convergent validity, individuals with eating disorders identified by this scale showed elevations on validated measures of eating disturbances. The overall symptom composite also showed test-retest reliability (r = .87), internal consistency (mean alpha = .89), and convergent validity with extant eating-pathology scales. Results implied that this scale was reliable and valid in this investigation and that it may be useful for clinical and research applications.

7.
The objective of this study was to examine the level of agreement between child- and caregiver-reports of the child’s psychosocial problems presenting to a Pediatric Emergency Department (PED) using a validated screening tool. This was an anonymous, prospective, cross-sectional, multi-informant (child and caregiver) study assessing cognitive, emotional, and behavioral problems and physical complaints in children and adolescents presenting to a PED. Three hundred and fifty-eight children and adolescents (8–18 years old) and their caregivers participated. Children completed the Youth-Pediatric Symptom Checklist (PSC-Y), while their caregivers completed the Pediatric Symptom Checklist–35 (PSC-35) to measure psychosocial impairment. The child’s physical complaints (e.g., chief complaint, chronicity, other medical problems, medications) and demographic information were assessed using an investigator-developed patient background questionnaire completed by the caregivers. Agreement between child- and caregiver-reports was analyzed using Cohen’s kappa coefficient. Differences between child- and caregiver-reported scores were determined by t-tests. Poor to moderate agreement was found between child- and caregiver-reports of attention problems (κ = .355), externalizing problems (κ = .340), internalizing problems (κ = .065), and total PSC score (κ = .410). Both children and caregivers should complete the psychosocial screener to maximize the accuracy of assessment and the identification of impairment.

8.
Both the interrater and test-retest reliability of axis I and axis II disorders were assessed using the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I) and the Diagnostic Interview for DSM-IV Personality Disorders (DIPD-IV). Fair-good median interrater kappas (.40-.75) were found for all axis II disorders diagnosed five times or more, except antisocial personality disorder (1.0). All of the test-retest kappas for axis II disorders, except for narcissistic personality disorder (1.0) and paranoid personality disorder (.39), were also found to be fair-good. Interrater and test-retest dimensional reliability figures for axis II were generally higher than those for their categorical counterparts; most were in the excellent range (> .75). In terms of axis I, excellent median interrater kappas were found for six of the 10 disorders diagnosed five times or more, whereas fair-good median interrater kappas were found for the other four axis I disorders. In general, test-retest reliability figures for axis I disorders were somewhat lower than the interrater reliability figures. Three test-retest kappas were in the excellent range, six were in the fair-good range, and one (for dysthymia) was in the poor range (.35). Taken together, the results of this study suggest that both axis I and axis II disorders can be diagnosed reliably when using appropriate semistructured interviews. They also suggest that the reliability of axis II disorders is roughly equivalent to the reliability found for most axis I disorders.

9.
Some Paradoxical Results for the Quadratically Weighted Kappa   Total citations: 1 (self-citations: 0, citations by others: 1)
The quadratically weighted kappa is the most commonly used weighted kappa statistic for summarizing interrater agreement on an ordinal scale. The paper presents several properties of the quadratically weighted kappa that are paradoxical. For agreement tables with an odd number of categories n it is shown that if one of the raters uses the same base rates for categories 1 and n, categories 2 and n−1, and so on, then the value of quadratically weighted kappa does not depend on the value of the center cell of the agreement table. Since the center cell reflects the exact agreement of the two raters on the middle category, this result questions the applicability of the quadratically weighted kappa to agreement studies. If one wants to report a single index of agreement for an ordinal scale, it is recommended that the linearly weighted kappa be used instead of the quadratically weighted kappa.
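
The center-cell result described in this abstract can be checked numerically. The following is a minimal sketch, not code from the paper: the function name `weighted_kappa` and the 3 × 3 count table are hypothetical, and the table is chosen so that the row rater has equal totals for categories 1 and 3. Weighted kappa is computed here as one minus the ratio of observed to chance-expected disagreement, with disagreement weights |i − j| (linear) or (i − j)² (quadratic).

```python
def weighted_kappa(table, power):
    """Weighted kappa for a square table of counts.

    power=1 gives linear weights |i - j|; power=2 gives quadratic
    weights (i - j)**2. Rows are rater A, columns are rater B.
    """
    n = len(table)
    total = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n)) for j in range(n)]
    # Observed weighted disagreement (as a proportion of all ratings).
    observed = sum(table[i][j] * abs(i - j) ** power
                   for i in range(n) for j in range(n)) / total
    # Disagreement expected by chance from the two sets of marginals.
    expected = sum(row_tot[i] * col_tot[j] * abs(i - j) ** power
                   for i in range(n) for j in range(n)) / total ** 2
    return 1 - observed / expected


def with_center(table, value):
    """Return a copy of the table with the center cell set to `value`."""
    copy = [row[:] for row in table]
    mid = len(table) // 2
    copy[mid][mid] = value
    return copy


# Hypothetical counts: row totals are 15, 30, 15, so rater A uses the
# same base rate for categories 1 and 3.
base = [[10, 3, 2],
        [4, 20, 6],
        [1, 5, 9]]
more_center = with_center(base, 50)

for name, power in (("linear", 1), ("quadratic", 2)):
    print(name,
          round(weighted_kappa(base, power), 4),
          round(weighted_kappa(more_center, power), 4))
```

Under these assumptions the quadratic values agree for both tables while the linear values differ, which is the behavior the abstract calls paradoxical.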

10.
Young people with borderline personality disorder (BPD) commonly seek help but often go unrecognized. Screening offers a means of identifying individuals for more detailed assessment for early intervention and for research. AIMS: This study compared the McLean Screening Instrument for Borderline Personality Disorder (MSI-BPD), Borderline Personality Questionnaire (BPQ), the BPD items from the International Personality Disorder Examination Screening Questionnaire and the BPD items from the Structured Clinical Interview for DSM-IV Axis II disorders (SCID-II) Personality Questionnaire. METHOD: 101 outpatient youth (aged 15-25 years) completed the screening measures and were interviewed, blind to screening status, with the SCID-II BPD module. The screening measures were readministered two weeks later to assess test-retest reliability. RESULTS: All four instruments performed similarly but the BPQ had the best mix of characteristics, with moderate sensitivity (0.68), the highest specificity (0.90), high negative predictive value (0.91) and moderate positive predictive value (0.65). Compared to the other three instruments, the BPQ had the highest overall diagnostic accuracy (0.85), a substantially higher kappa (0.57) with the criterion diagnosis, the highest test-retest reliability (ICC = 0.92) and the highest internal consistency (alpha = 0.92). The only clear difference to emerge in the Receiver Operator Curve (ROC) analysis was that the BPQ significantly outperformed the MSI (p = 0.05). CONCLUSION: Screening for BPD in out-patient youth is feasible but is not a replacement for clinical diagnosis.

11.
Data from a community-based prospective longitudinal study were used to investigate the utility of a structured assessment of the DSM-IV General Diagnostic Criteria for a Personality Disorder (PD). The Structured Clinical Interview for DSM-IV PDs (SCID-II) was administered to 154 adults. After completing the interview, an experienced clinician assessed the General Diagnostic Criteria for a PD using a structured rating scale. PD diagnoses, based solely on the rating scale data, demonstrated strong agreement with diagnoses obtained using the diagnostic thresholds for specific PDs (Kappa = 0.89). The sensitivity, specificity, predictive power, and internal reliability of the rating scale were satisfactory. PD diagnoses, based on both of the assessment procedures, were associated with substantial impairment and distress. These findings suggest that a structured assessment of the DSM-IV General Diagnostic Criteria for a Personality Disorder may constitute a useful alternative or supplement to standard assessments of the diagnostic thresholds for specific DSM-IV PDs.

12.
Cohen's kappa is presently a standard tool for the analysis of agreement in a 2 × 2 reliability study, and weighted kappa is a standard statistic for summarizing a 2 × 2 validity study. The special cases of weighted kappa, for example Cohen's kappa, are chance‐corrected measures of association. For various measures of 2 × 2 association it has been observed in the literature that, after correction for chance, they coincide with a special case of weighted kappa. This paper presents the general function, linear in both numerator and denominator, that becomes weighted kappa after correction for chance.
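
As a small illustration of what "correction for chance" means here, the sketch below applies the usual correction, (value − expected) / (maximum − expected), to the simplest 2 × 2 measure, the proportion of observed agreement, which yields Cohen's kappa. It does not reproduce the paper's general function; the helper names and the cell counts are hypothetical, with cells laid out as a = both raters positive, b and c = the two kinds of disagreement, d = both negative.

```python
def correct_for_chance(value, expected, maximum=1.0):
    """Rescale a measure so that chance level maps to 0 and the maximum to 1."""
    return (value - expected) / (maximum - expected)


def observed_agreement(a, b, c, d):
    """Proportion of cases on which the two raters agree."""
    return (a + d) / (a + b + c + d)


def expected_agreement(a, b, c, d):
    """Agreement expected if the raters were independent, given their marginals."""
    n = a + b + c + d
    return ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2


def cohen_kappa(a, b, c, d):
    """Cohen's kappa as the chance-corrected proportion of agreement."""
    return correct_for_chance(observed_agreement(a, b, c, d),
                              expected_agreement(a, b, c, d))


# Hypothetical 2 x 2 counts: observed agreement 0.70, chance agreement 0.50.
print(round(cohen_kappa(40, 10, 20, 30), 3))
```

With these counts the corrected value is about 0.40: the raters agree on 70% of cases, but half of that agreement would be expected from the marginals alone.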

13.
Antisocial behaviors were systematically classified along a hierarchy of seriousness or severity that also accounted for frequency of occurrence and heterogeneity of behaviors. Items from the CD and ODD schedules of the NIMH DISC-IV and from the Elliot Delinquency Scales were listed at specified frequencies. Nine mental health clinicians rated the level of seriousness of each alternative on a scale from 0 (trivial) to 5 (very serious). Reliability of the ratings was assessed. Over two thirds of the items showed excellent agreement among the raters, 8% showed poor agreement and 21% showed fair to moderate agreement. The overall reliability of a single rater’s score was 0.56 and the reliability of the average was 0.84. The classification meets high psychometric standards for reliability and has face validity. The final output provides a classification along a spectrum that takes into account severity, frequency of occurrence during the previous year, and presence of multiple behaviors. It is useful for classification purposes and for longitudinal tracking of antisocial behavior.

14.
Pingke Li. Psychometrika, 2016, 81(3): 795-801
The linearly and quadratically weighted kappa coefficients are popular statistics in measuring inter-rater agreement on an ordinal scale. It has been recently demonstrated that the linearly weighted kappa is a weighted average of the kappa coefficients of the embedded 2 by 2 agreement matrices, while the quadratically weighted kappa is insensitive to the agreement matrices that are row or column reflection symmetric. A rank-one matrix decomposition approach to the weighting schemes is presented in this note such that these phenomena can be demonstrated in a concise manner.

15.
This study examined the extent to which patterns of psychosocial risk were uniquely associated with long-term outcomes of rheumatoid arthritis (RA), after demographic factors and self-reported symptom severity over time were accounted for. Data were collected over an 8-year period from 561 individuals with RA who were participants in the ongoing UCSF RA Panel Study in 1995. Panel members were interviewed annually, using a comprehensive structured telephone interview. Psychosocial factors assessed included mastery, perceptions about adequacy of social support, the impact of RA and self-assessed ability to cope with RA and satisfaction with health and function. Cluster analysis of psychosocial factors identified three distinctive patterns/levels of psychosocial risk (high, medium and low risk). The unique effects of psychosocial risk status on changes in depressive symptoms, basic functional limitations, global pain ratings and average annual doctor visits over an 8-year period were estimated, using growth curve analyses. Analyses controlled for demographic factors (gender, marital/partner status, education, age and ethnicity), disease duration and year in the panel and time-varying self-reported symptom severity (morning stiffness, swollen joint counts, co-morbid medical conditions, extra-articular RA symptoms and changes in joint appearance), as well as self-reported medications taken over time (disease-modifying antirheumatic drugs [DMARDS], and prednisone). Overall, 32.4% of total variance in depressive symptoms was accounted for by the fully-estimated model, with 12.9% uniquely associated with psychosocial risk status. Half of the total variance (50.0%) in basic functional limitations was explained, with 12.1% of variance uniquely predicted by psychosocial risk status. 
Psychosocial risk status accounted for comparatively little total explained variance in global pain ratings (total = 38.6%, incremental = 3.2%), and average annual total doctor visits (total = 10.9%, incremental = 1.5%). Thus, psychosocial risk factors are more closely linked to depressive symptoms and function over time. Global pain and utilization appear to be more closely related to disease factors.

16.
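Objective: To translate the Attitude Toward Babies Scale (ABS) into Chinese and to test its reliability and validity among married Chinese women of childbearing age. Methods: Using convenience sampling, 700 women of childbearing age from Guizhou, Shanxi, Hubei, and other regions were surveyed. Reliability and validity were evaluated through item analysis, content validity analysis, exploratory factor analysis, confirmatory factor analysis, criterion-related validity, Cronbach's α, split-half reliability, and test-retest reliability. Results: Item analysis showed that the items of the ABS correlated significantly with the dimension total scores and discriminated well. Content validity analysis showed an inter-rater agreement among experts (IR) of 1, I-CVI values between 0.83 and 1, an S-CVI/UA of 0.82, and an S-CVI/Ave of 0.97. Exploratory factor analysis yielded five factors with eigenvalues > 1, with a cumulative variance contribution of 54.399%. Confirmatory factor analysis indicated that the five-factor model fit well (χ²/df = 2.500, CFI = 0.922, TLI = 0.914, RMSEA = 0.048, SRMR = 0.050). Each criterion measure correlated significantly with the scale. Cronbach's α for the total scale was 0.748, split-half reliability was 0.661, and test-retest reliability was 0.639. Conclusion: The revised ABS has good reliability and validity and can serve as an effective measure of fertility motivation among married women of childbearing age.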

17.
Little attention has been paid to evaluating the use of DSM-III-R with preschool children. Children (N = 510) ages 2 to 5 years who were screened at the time of a pediatric visit were selected to participate in an evaluation which included questionnaires, a semistructured interview, developmental testing, and a play observation. Following the evaluation, two clinical child psychologists independently assigned DSM-III-R diagnoses. For each diagnostic category, kappa and Y coefficients were calculated; Y coefficients are less sensitive to base rates of disorders. For overall agreement, the weighted mean kappa (.61) and mean Y (.66) were moderately high. Overall agreement that the child had at least one of the disruptive disorders was substantial (kappa = .64; Y = .65); agreement that there was at least one of the emotional disorders was moderate for kappa (.54), but substantial for Y (.70). Kappa coefficients were higher for major categories of disorder than for specific disorders; however, Y coefficients did not show a decline for specific disorders. Interrater reliability of DSM-III-R appears to be similar for preschoolers and older children. This study was supported by grant MH46089 from the National Institute of Mental Health. A preliminary report was presented at the Fifth Annual NIMH International Research Conference on the Classification and Treatment of Mental Disorders in General Medical Settings, Bethesda, Maryland, September 1991. We gratefully acknowledge the members of the Pediatric Practice Research Group who participated in this study.
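
The remark that Y coefficients are less sensitive to base rates than kappa can be illustrated with a short sketch. It assumes that "Y" refers to Yule's Y, computed from the four cells of a 2 × 2 agreement table as (√(ad) − √(bc)) / (√(ad) + √(bc)); the abstract does not spell out the formula. The two tables are hypothetical: in both, the first rater detects 90% of the second rater's positives and 90% of the negatives, and only the base rate of the disorder differs (50% versus 10%).

```python
from math import sqrt


def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2 x 2 table (a, d = agreements; b, c = disagreements)."""
    n = a + b + c + d
    p_obs = (a + d) / n
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


def yule_y(a, b, c, d):
    """Yule's Y (assumed here to be the abstract's Y coefficient)."""
    return (sqrt(a * d) - sqrt(b * c)) / (sqrt(a * d) + sqrt(b * c))


# Hypothetical tables of 1000 children each, same rater accuracy,
# different base rates of the disorder.
common = (450, 50, 50, 450)   # base rate 50%
rare = (90, 90, 10, 810)      # base rate 10%

for label, cells in (("common", common), ("rare", rare)):
    print(label, round(cohen_kappa(*cells), 3), round(yule_y(*cells), 3))
```

In this constructed example kappa drops from about 0.80 to about 0.59 when the disorder becomes rare, while Y stays at about 0.80, because Y depends only on the odds ratio of the table.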

18.
Dibble JL, Levine TR, Park HS. Psychological Assessment, 2012, 24(3): 565-572
A fundamental dimension along which all social and personal relationships vary is closeness. The Unidimensional Relationship Closeness Scale (URCS) is a 12-item self-report scale measuring the closeness of social and personal relationships. The reliability and validity of the URCS were assessed with college dating couples (N = 192), female friends and strangers (N = 330), friends (N = 170), and family members (N = 155). The results show that the scale is unidimensional, with high reliability across relationship types (M α = .96). Evidence consistent with validity included substantial within-couple agreement for the romantic couples (intraclass correlation = .41), substantial friend-stranger discrimination for the female friends (η² = .82), and measurement invariance across relationship types. Evidence of convergent and divergent validity was obtained for inclusion of other in the self and relational satisfaction, respectively.

19.
Walters GD. Assessment, 2011, 18(2): 227-233
The possibility of combining indicators to improve recidivism prediction was evaluated in a sample of released federal prisoners randomly divided into a derivation subsample (n = 550) and a cross-validation subsample (n = 551). Five incrementally valid indicators were selected from five domains: demographic (age), historical (prior convictions), adjustment (prior incident reports), rating scale (Violation scale of the Lifestyle Criminality Screening Form), and self-report (General Criminal Thinking score from the Psychological Inventory of Criminal Thinking Styles). After converting scores on the five indicators to a common scale (z score), two combined scores were calculated: a simple summed score (unweighted summed score) and a score computed using beta weights from a Cox survival analysis of the derivation subsample (weighted summed score). Correlational and receiver operating characteristic analyses revealed that the unweighted and weighted summed scores produced equivalent results and that both improved significantly on the results of the five contributing indicators.

20.
This study investigated issues related to commonly used socioeconomic status (SES) measures in 140 participants from three cities (Atlanta, Boston, and Toronto) in two countries (United States and Canada). Measures of SES were two from the United States (four-factor Hollingshead scale, Nakao and Treas scale) and one from Canada (Blishen, Carroll, and Moore scale). Reliability was examined both within (interrater agreement) and across (intermeasure agreement) measures. Interrater reliability and classification agreement were high for the total sample (range r = .86 to .91), as were intermeasure correlations and classification agreement (range r = .81 to .88). The weakest agreement across measures was found when families had one wage earner who was female. Validity data for these SES measures with academic and intellectual measures also were obtained. Some support for a simplified approach to measuring SES was found. Implications of these findings for the use of SES in social and behavioral science research are discussed.
