1.

A system for assessing personal responsibility: validity, reliability and rater trainability

Genthner RW Jones DE 《Journal of personality assessment》1976,40(3):269-275

Summary: Sixty-two subjects completed the California Psychological Inventory, the Rotter External-Internal locus of control scale and an audio-taped discussion of their personal problems. The audio-taped problems were rated on a five-point level of personal responsibility scale and were compared with the scores on the California Psychological Inventory and the Internal-External locus of control scale in a correlation matrix which was subjected to a factor analysis. The results from these analyses supported the hypothesis that the Personal Responsibility Rating System has construct validity as a measure of psychological health. Study II assessed the trainability of the Personal Responsibility System. With a four-hour training program it was found that graduate students could be taught to rate personal responsibility in a reliable manner.

2.

Intraclass correlation: Estimation of the reliability of ratings

John Mazzeo Mark Borgstrom George W. Seeley 《Behavior research methods》1982,14(1):45-46

3.

Poisson models for assessing rater agreement in discrete response studies

Michael L. Feldstein Henry T. Davis 《The British journal of mathematical and statistical psychology》1984,37(1):49-61

In this paper we discuss a method for assessing agreement among raters who are scoring the number of times a specified event occurs. In such cases, it seems reasonable to define agreement in terms of raters' behaviours in correctly identifying responses which have in fact occurred, and in their falsely counting responses which have not. We exploit the discrete nature of the response variable, and examine a class of models for mean response assuming an underlying Poisson distribution. Test statistics are given for deciding on the applicability of the models, and whether there is agreement with respect to correctly detecting responses, as well as with falsely scoring responses.

4.

Computing inter‐rater reliability and its variance in the presence of high agreement

Kilem Li Gwet 《The British journal of mathematical and statistical psychology》2008,61(1):29-48

Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. It is a fact that these coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations, and introduces an alternative and more stable agreement coefficient referred to as the AC₁ coefficient. Also proposed are new variance estimators for the multiple‐rater generalized π and AC₁ statistics, whose validity does not depend upon the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte‐Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction, and confirms the value of AC₁ as an improved alternative to existing inter‐rater reliability statistics.
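The "paradox of kappa" mentioned in this abstract can be illustrated with a small two-rater sketch. It is not taken from the paper: it assumes the commonly cited two-rater forms of Cohen's kappa and Gwet's AC₁, and the function name `agreement_coefficients` and the ratings are hypothetical. Both coefficients share the form (p_a − p_e) / (1 − p_e) and differ only in how the chance-agreement term p_e is defined.

```python
from collections import Counter


def agreement_coefficients(ratings_a, ratings_b):
    """Return (percent agreement, Cohen's kappa, Gwet's AC1) for two raters.

    Assumption: the category set is taken from the labels actually observed.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of subjects")
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("at least two observed categories are required")

    # Observed proportion of subjects on which the raters agree.
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)

    # Cohen's kappa: chance agreement from the product of the raters' marginals.
    pe_kappa = sum((count_a[k] / n) * (count_b[k] / n) for k in categories)

    # Gwet's AC1: chance agreement from the averaged marginals pi_k.
    pi = [(count_a[k] + count_b[k]) / (2 * n) for k in categories]
    pe_ac1 = sum(p * (1 - p) for p in pi) / (q - 1)

    kappa = (p_a - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_a - pe_ac1) / (1 - pe_ac1)
    return p_a, kappa, ac1


# Hypothetical data: 100 subjects, very skewed prevalence, 90% raw agreement.
rater_a = ["yes"] * 90 + ["yes"] * 5 + ["no"] * 5
rater_b = ["yes"] * 90 + ["no"] * 5 + ["yes"] * 5
p_a, kappa, ac1 = agreement_coefficients(rater_a, rater_b)
print(f"agreement={p_a:.2f} kappa={kappa:.3f} AC1={ac1:.3f}")
```

With these made-up ratings the raters agree on 90% of subjects, yet kappa comes out slightly negative while AC₁ stays close to the raw agreement, which is the kind of behaviour the abstract describes.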

5.

Percent of agreement among raters and rater reliability of the copying subtest of the Stanford-Binet Intelligence Scale: Fourth Edition.

E M Mason 《Perceptual and motor skills》1992,74(2):347-353

The purpose of this study was to investigate the interrater reliability of the visual-motor portion of the Copying subtest of the Stanford-Binet Intelligence Scale: Fourth Edition. Eight raters independently scored 11 protocols completed by children aged 5 through 10 years, using the scoring criteria and guidelines in the manual. The raters marked each of 10 items pass or fail and computed a total raw score for each protocol. Interrater reliability coefficients were obtained for each child's protocol, and the Kappa coefficient was computed for each item. Significant raters' reliability coefficients ranged from .82 to .91, which were low in comparison to test-retest reliability and Kuder-Richardson-20 coefficients for this and other subtests of the Stanford-Binet in the technical manual. Percent agreement among 8 raters also indicated weak reliability. Although the obtained results suggested some interrater reliability coefficients within acceptable levels, questions were raised about the scoring criteria for individual items. Caution is warranted in the use of cognitive measures which include subjective judgement of the examiner in applying scoring criteria.

6.

Optimal choice of rater teams II: Applications

Janet Dixon Elashoff Donald E. Spiegel 《Psychometrika》1969,34(1):33-44

In a previous paper [Elashoff 1969], we derived optimal rater teams for a particular formulation of the dichotomous rater problem. Here, we describe a computer-based procedure for selecting good rater teams in practice; we apply the procedure to the selection of items for a psychological inventory. This research was supported in part by the author's predoctoral fellowship from the National Institutes of Health and by National Science Foundation Grant GS-341, and National Institutes of Health Grants FR-3 and FR-122.

7.

Optimal choice of rater teams I: Theory

Janet Dixon Elashoff 《Psychometrika》1969,34(1):21-32

How can an investigator choose a good team of raters to use for measuring a continuous variable when each available rater produces only dichotomous responses? We formulate an underlying model, define an index of goodness for rater teams in terms of average mean square error of the estimate, develop a new estimator and derive the optimal rater teams. The optimal raters have characteristic curves which are linear in form and satisfy the requirements for a Guttman scale.

8.

Intraclass correlation associated with therapists: estimates and applications in planning psychotherapy research

Baldwin SA Murray DM Shadish WR Pals SL Holland JM Abramowitz JS Andersson G Atkins DC Carlbring P Carroll KM Christensen A Eddington KM Ehlers A Feaster DJ Keijsers GP Koch E Kuyken W Lange A Lincoln T Stephens RS Taylor S Trepka C Watson J 《Cognitive behaviour therapy》2011,40(1):15-33

It is essential that outcome research permit clear conclusions to be drawn about the efficacy of interventions. The common practice of nesting therapists within conditions can pose important methodological challenges that affect interpretation, particularly if the study is not powered to account for the nested design. An obstacle to the optimal design of these studies is the lack of data about the intraclass correlation coefficient (ICC), which measures the statistical dependencies introduced by nesting. To begin the development of a public database of ICC estimates, the authors investigated ICCs for a variety of outcomes reported in 20 psychotherapy outcome studies. The magnitude of the 495 ICC estimates varied widely across measures and studies. The authors provide recommendations regarding how to select and aggregate ICC estimates for power calculations and show how researchers can use ICC estimates to choose the number of patients and therapists that will optimize power. Attention to these recommendations will strengthen the validity of inferences drawn from psychotherapy studies that nest therapists within conditions.
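To make concrete why the ICC matters when patients are nested within therapists, the sketch below uses the textbook design-effect formula, DEFF = 1 + (m − 1) × ICC, where m is the number of patients per therapist. This is an assumption chosen for illustration, not necessarily the procedure the authors recommend; the function names and the numbers are hypothetical.

```python
def design_effect(patients_per_therapist, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC (equal cluster sizes)."""
    if patients_per_therapist < 1:
        raise ValueError("each therapist needs at least one patient")
    return 1.0 + (patients_per_therapist - 1) * icc


def effective_sample_size(n_therapists, patients_per_therapist, icc):
    """Number of independent patients the nested sample is roughly worth."""
    total_patients = n_therapists * patients_per_therapist
    return total_patients / design_effect(patients_per_therapist, icc)


# Hypothetical comparison: 100 patients in one condition, ICC assumed to be 0.05.
few_therapists = effective_sample_size(n_therapists=10, patients_per_therapist=10, icc=0.05)
many_therapists = effective_sample_size(n_therapists=20, patients_per_therapist=5, icc=0.05)
print(f"10 x 10: {few_therapists:.1f} effective patients")
print(f"20 x 5:  {many_therapists:.1f} effective patients")
```

Under these assumed values, spreading the same 100 patients over more therapists yields a larger effective sample, which is why the split between patients and therapists affects power.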

9.

Why convene rater teams: An investigation of the benefits of anticipated discussion, consensus, and rater motivation

Sylvia G. Roch 《Organizational behavior and human decision processes》2007

This study explores the importance of anticipated group discussion, the consensus decision rule, and rater motivation in determining how well rater teams identify ratee behaviors, i.e., behavioral accuracy. Results, based on 382 raters in 111 teams, suggest that the anticipation of group discussion can improve behavioral accuracy, but it appears that the benefits of discussion-only teams are limited to this anticipation effect. Furthermore, it also appears that rater motivation plays an important role in this type of team. Rater teams required to reach consensus, however, appear to show improved behavioral accuracy, regardless of whether raters can anticipate the consensus discussion and regardless of rater motivation levels. Implications, especially for assessment centers, are discussed.

10.

Source-monitoring training: toward reducing rater expectancy effects in behavioral measurement

Martell RF Evans DP 《The Journal of applied psychology》2005,90(5):956-963

The authors developed a source-monitoring procedure to reduce the biasing effects of rater expectations on behavioral measurement. Study participants (N = 224) were given positive or negative information regarding the performance of a group and, after observing the group, were assigned to a source-monitoring or control condition. Raters in the source-monitoring condition were instructed to report only behaviors that evoked detailed memories (remember judgments) and to avoid reporting behaviors based on feelings of familiarity (know judgments). Results revealed that controlling raters' response strategy reduced (and often eliminated) the biasing effects of performance expectations. These findings advance our understanding of the performance-cue bias and offer a potentially useful technique for decreasing rater bias.

11.

Intraclass Correlation Associated with Therapists: Estimates and Applications in Planning Psychotherapy Research

《Cognitive behaviour therapy》2013,42(1):15-33

It is essential that outcome research permit clear conclusions to be drawn about the efficacy of interventions. The common practice of nesting therapists within conditions can pose important methodological challenges that affect interpretation, particularly if the study is not powered to account for the nested design. An obstacle to the optimal design of these studies is the lack of data about the intraclass correlation coefficient (ICC), which measures the statistical dependencies introduced by nesting. To begin the development of a public database of ICC estimates, the authors investigated ICCs for a variety of outcomes reported in 20 psychotherapy outcome studies. The magnitude of the 495 ICC estimates varied widely across measures and studies. The authors provide recommendations regarding how to select and aggregate ICC estimates for power calculations and show how researchers can use ICC estimates to choose the number of patients and therapists that will optimize power. Attention to these recommendations will strengthen the validity of inferences drawn from psychotherapy studies that nest therapists within conditions.

12.

The test-retest reliability of the Form 90-DWI: an instrument for assessing intoxicated driving.

Jennifer E Hettema William R Miller J Scott Tonigan Harold D Delaney 《Psychology of addictive behaviors》2008,22(1):117-121

Although driving while intoxicated (DWI) is a pervasive problem, reliable measures of this behavior have been elusive. In the present study, the Form 90, a widely utilized alcohol and substance use instrument, was adapted for measurement of DWI and related behaviors. Levels of reliability for the adapted instrument, the Form 90-DWI, were tested among a university sample of 60 undergraduate students who had consumed alcohol during the past 90 days. The authors administered the instrument once during an intake interview and again, 7-30 days later, to determine levels of test-retest reliability. Overall, the Form 90-DWI demonstrated high levels of reliability for many general drinking and DWI behaviors. Levels of reliability were lower for riding with an intoxicated driver and for variables involving several behavioral conjunctions, such as seat belt use and the presence of passengers when driving with a blood alcohol concentration above .08. Overall, the Form 90-DWI shows promise as a reliable measure of DWI behavior in research on treatment outcome and prevention.

13.

The reliability and concurrent validity of alternative methods for assessing ego development

Sutton PM Swensen CH 《Journal of personality assessment》1983,47(5):468-475

This study demonstrates that an unstructured interview (INT) and the Thematic Apperception Test (TAT) are suitable alternatives to Loevinger's Sentence Completion Test of Ego Development. Seventy subjects were solicited from six groups varying widely with respect to age and education level. Each subject was asked to complete the SCT, an INT and the TAT. Two raters using Loevinger and Wessler's self-training exercises and Loevinger, Wessler, and Redmore's scoring manual rated subjects' responses to each instrument. Reliability was found between raters and concurrent validity between instruments. Subjects scoring high on the SCT were found to score higher on the INT and TAT.

14.

The generalizability study as a method of assessing intra- and interobserver reliability in observational research

Cathryn L. Booth Sandra K. Mitchell Frances K. Solin 《Behavior research methods》1979,11(5):491-494

15.

If: Some uses

Samuel Fillenbaum 《Psychological research》1975,37(3):245-260

16.

Understanding the relation between the need and ability to achieve closure: A single paper meta-analysis assessing subscale correlations

《New Ideas in Psychology》2023

The need for closure and the ability to achieve closure are generally thought to be independent from one another. However, previous researchers have found inconsistent relations between these two variables, possibly due to measurement scale modifications that slightly shifted how the underlying constructs were assessed. The present research attempted to address some of these methodological issues with previous research by conducting a single-paper meta-analysis on the correlations between the ability to achieve closure scale and the full need for closure scale and each of its five subscales. Across six university student samples (N = 1983), the full need for closure scale and most of its subscales were significantly negatively correlated with the ability to achieve closure. This finding suggests that the ability to achieve closure affects the costs and benefits of closure and therefore, consistent with lay epistemic theory, the ability to achieve closure predicts individual differences in the need for closure.

17.

Interviewing children with the cognitive interview: assessing the reliability of statements based on observed and imagined events

Larsson AS Granhag PA 《Scandinavian journal of psychology》2005,46(1):49-57

This paper investigated whether criteria stemming from the Reality Monitoring (RM) framework could be trusted to assess the reliability of statements obtained by the use of a cognitive interview (CI). Fifty-eight children, aged 10-11, participated. One-third watched a film about a fakir and were then interviewed according to a CI (n = 19). The remaining two-thirds made up a story about a fakir and were then interviewed according to either a CI (n = 21), or a structured interview (SI) (n = 18). The CI statements based on observed events contained more visual, affective, spatial and temporal information compared to CI statements based on imagined events. The CI statements based on imagined events did not differ from the SI statements based on imagined events. Considerable developmental work is recommended to turn the RM technique to a reliable test that could be used by practitioners.

18.

The passion scale: Aspects of reliability and validity of a new 8-item scale assessing passion.

《New Ideas in Psychology》2020

In this article, the psychometric properties of a new scale aimed at quantifying passion are explored, i.e. passion related to becoming good or achieving in some area/theme/skill. The Passion Scale was designed to be quantitative, simple to administer, applicable for large-group testing, and reliable in monitoring passion. A total of 126 participants between 18 and 47 years of age (mean age = 21.65, SD = 3.45) completed an assessment of Passion Scale, enabling us to investigate its feasibility, internal consistency, construct validity and test-retest reliability. Feasibility: The overall pattern of results suggests that the scale for passion presented here is applicable for the age studied (18–47). Internal consistency: All individual item scores correlated positively with the total score, with correlations ranging from 0.51 to 0.69. The Cronbach's alpha value for the standardized items was 0.86. Construct validity: The Pearson correlation coefficient between the total Passion Scale score and the Grit-S scale was 0.39 for adults, mean age 21.23 (SD = 3.45) (N = 107). Test-retest reliability: The intraclass correlation coefficient (ICC) between test and retest scores for the total score was 0.92. These promising results warrant further development of the passion scale, including normalization based on a large, representative sample.
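For readers unfamiliar with the internal-consistency figure quoted in this abstract, the sketch below computes raw Cronbach's alpha, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It is an illustration only: the abstract reports alpha for standardized items, which is a related but different computation, and the function name `cronbach_alpha` and the score matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Raw Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every respondent needs the same number (>= 2) of items")

    # Sample variance of each item across respondents.
    item_variances = [variance(row[i] for row in scores) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance(sum(row) for row in scores)
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 4 respondents x 3 items (not data from the study).
responses = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Alpha rises as items covary more strongly relative to their individual variances; with identical items it reaches 1.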

19.

Reformulation in the phase plane enhances smoothness rater accuracy in stroke

Wininger M Kim NH Craelius W 《Journal of motor behavior》2012,44(3):149-159

To improve the characterization of motor impairment, we compared the sensitivities of a phase plane metric with temporal domain measures derived from integrated squared jerk (ISJ). Five subjects with stroke and a cohort of 21 neurologically intact volunteers performed self-paced, isolated elbow flexions. Analysis of angular trajectories from the stroke group revealed that temporal domain metrics failed to detect a performance deficit at the p < .05 level, while the phase plane metric did resolve a deficit (p < .01). When applied to a subset of movements with arrest periods, the phase measure also uniquely identified impairment (Wilcoxon rank-sum test, p < .001). Finally, when tested on a data-driven model, the phase measure, but not temporal metrics, increased monotonically with the severity of trajectory distortions. We conclude that motion smoothness can be accurately measured in the phase plane.

20.

A new protocol for assessing action observation and imitation abilities in children with Developmental Coordination Disorder: A feasibility and reliability study

《Human movement science》2021

AIMS. To develop a new protocol for the assessment of action observation (AO) abilities and imitation of meaningful and non-meaningful gestures, to examine its psychometric properties in children with DCD and typically developing (TD) children. BACKGROUND. For learning manual skills, AO and imitation are considered fundamental abilities. Knowledge about these modalities in children with DCD is scarce and an assessment protocol is lacking. METHOD. The protocol consists of 2 tests. The AO test consists of two assembly tasks. The imitation test includes 12 meaningful and 20 non-meaningful gestures. Items of both tests are rated on a 4-point scale. Twelve children with DCD (mean age 8y3m, SD 1.30) and 11 TD children (mean age 8y2m, SD 1.52) were enrolled. For inter-rater reliability, intraclass correlation coefficients (ICC) were calculated for the total score, weighted kappa and percentage agreement for single items. Known group validity was assessed by comparison of DCD and TD group (Wilcoxon rank sum test). For construct validity, the mABC-2 test was used. The protocol was adapted and confirmed by an intra and inter-rater reliability study (new sample of 11 DCD children, mean age 7y5m, SD 1.37). RESULTS. Excellent ICCs were reported for intra and inter-rater reliability for the final protocol. A significant difference between DCD and TD group was found for AO abilities (p < .01), for non-meaningful gestures (p < .001). A significant correlation was reported between the AO test and the mABC-2 test (r = .56; p ≤ 0.0001). No significant correlations were revealed for the imitation tests. DISCUSSION AND CONCLUSION. The results support the psychometric properties of this protocol. When fully validated, it may contribute to map the deficits in AO abilities and imitation, to evaluate treatment effects of imitation and AO interventions.