Similar articles
20 similar articles found (search time: 0 ms)
1.
Summary: Sixty-two subjects completed the California Psychological Inventory, the Rotter External-Internal locus of control scale and an audio-taped discussion of their personal problems. The audio-taped problems were rated on a five-point level of personal responsibility scale and were compared with the scores on the California Psychological Inventory and the Internal-External locus of control scale in a correlation matrix which was subjected to a factor analysis. The results from these analyses supported the hypothesis that the Personal Responsibility Rating System has construct validity as a measure of psychological health. Study II assessed the trainability of the Personal Responsibility System. With a four-hour training program it was found that graduate students could be taught to rate personal responsibility in a reliable manner.

2.
3.
In this paper we discuss a method for assessing agreement among raters who are scoring the number of times a specified event occurs. In such cases, it seems reasonable to define agreement in terms of raters' behaviours in correctly identifying responses which have in fact occurred, and in their falsely counting responses which have not. We exploit the discrete nature of the response variable, and examine a class of models for mean response assuming an underlying Poisson distribution. Test statistics are given for deciding on the applicability of the models, and whether there is agreement with respect to correctly detecting responses, as well as with falsely scoring responses.

4.
Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. It is a fact that these coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations, and introduces an alternative and more stable agreement coefficient referred to as the AC1 coefficient. Also proposed are new variance estimators for the multiple-rater generalized π and AC1 statistics, whose validity does not depend upon the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte-Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction, and confirms the value of AC1 as an improved alternative to existing inter-rater reliability statistics.
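The abstract above gives no formulas, so the following is only a minimal sketch of the two-rater case, assuming the usual textbook definitions: all three coefficients share the form (p_obs - p_e) / (1 - p_e) and differ only in the chance-agreement term p_e. The function name `agreement_coefficients` and the example table are hypothetical, and the multiple-rater variance estimators the paper proposes are not attempted here.

```python
def agreement_coefficients(table):
    """Two-rater agreement on nominal data.

    table[i][j] is the number of subjects that rater A placed in
    category i and rater B placed in category j (hypothetical layout).
    """
    q = len(table)                      # number of categories
    n = sum(sum(row) for row in table)  # number of subjects
    p_obs = sum(table[k][k] for k in range(q)) / n

    # Marginal proportions for each rater, and their average.
    row = [sum(table[k]) / n for k in range(q)]
    col = [sum(table[i][k] for i in range(q)) / n for k in range(q)]
    avg = [(row[k] + col[k]) / 2 for k in range(q)]

    # Chance-agreement terms (assumed standard definitions).
    pe_kappa = sum(row[k] * col[k] for k in range(q))    # Cohen's kappa
    pe_pi = sum(a * a for a in avg)                      # Scott's pi
    pe_ac1 = sum(a * (1 - a) for a in avg) / (q - 1)     # Gwet's AC1

    def chance_corrected(p_e):
        return (p_obs - p_e) / (1 - p_e)

    return {
        "observed": p_obs,
        "kappa": chance_corrected(pe_kappa),
        "pi": chance_corrected(pe_pi),
        "ac1": chance_corrected(pe_ac1),
    }


# Hypothetical, heavily skewed table: the raters agree on 90 of 100
# subjects, yet almost every rating falls in the first category.
result = agreement_coefficients([[90, 5], [5, 0]])
print(result)
```

With this skewed table the observed agreement is 0.90, but kappa and pi come out slightly negative while AC1 stays high, which is the kind of behaviour the abstract calls the paradoxes of kappa.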

5.
The purpose of this study was to investigate the interrater reliability of the visual-motor portion of the Copying subtest of the Stanford-Binet Intelligence Scale: Fourth Edition. Eight raters independently scored 11 protocols completed by children aged 5 through 10 years, using the scoring criteria and guidelines in the manual. The raters marked each of 10 items pass or fail and computed a total raw score for each protocol. Interrater reliability coefficients were obtained for each child's protocol, and the Kappa coefficient was computed for each item. Significant raters' reliability coefficients ranged from .82 to .91, which were low in comparison to test-retest reliability and Kuder-Richardson-20 coefficients for this and other subtests of the Stanford-Binet in the technical manual. Percent agreement among 8 raters also indicated weak reliability. Although the obtained results suggested some interrater reliability coefficients within acceptable levels, questions were raised about the scoring criteria for individual items. Caution is warranted in the use of cognitive measures which include subjective judgement of the examiner in applying scoring criteria.

6.
In a previous paper [Elashoff 1969], we derived optimal rater teams for a particular formulation of the dichotomous rater problem. Here, we describe a computer-based procedure for selecting good rater teams in practice; we apply the procedure to the selection of items for a psychological inventory. This research was supported in part by the author's predoctoral fellowship from the National Institutes of Health and by National Science Foundation Grant GS-341, and National Institutes of Health Grants FR-3 and FR-122.

7.
How can an investigator choose a good team of raters to use for measuring a continuous variable when each available rater produces only dichotomous responses? We formulate an underlying model, define an index of goodness for rater teams in terms of average mean square error of the estimate, develop a new estimator and derive the optimal rater teams. The optimal raters have characteristic curves which are linear in form and satisfy the requirements for a Guttman scale.

8.
It is essential that outcome research permit clear conclusions to be drawn about the efficacy of interventions. The common practice of nesting therapists within conditions can pose important methodological challenges that affect interpretation, particularly if the study is not powered to account for the nested design. An obstacle to the optimal design of these studies is the lack of data about the intraclass correlation coefficient (ICC), which measures the statistical dependencies introduced by nesting. To begin the development of a public database of ICC estimates, the authors investigated ICCs for a variety of outcomes reported in 20 psychotherapy outcome studies. The magnitude of the 495 ICC estimates varied widely across measures and studies. The authors provide recommendations regarding how to select and aggregate ICC estimates for power calculations and show how researchers can use ICC estimates to choose the number of patients and therapists that will optimize power. Attention to these recommendations will strengthen the validity of inferences drawn from psychotherapy studies that nest therapists within conditions.
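To make concrete how an ICC feeds into a power calculation, here is a rough sketch using the standard design-effect approximation for equal-sized clusters, 1 + (m - 1) × ICC, where m is the number of patients per therapist. This is a generic textbook approximation, not necessarily the procedure the authors recommend. The function names and the ICC value of 0.05 are illustrative assumptions only.

```python
def design_effect(patients_per_therapist, icc):
    """Variance inflation from nesting, assuming equal cluster sizes."""
    return 1.0 + (patients_per_therapist - 1) * icc


def effective_sample_size(n_therapists, patients_per_therapist, icc):
    """Number of independent observations the nested sample is worth."""
    total = n_therapists * patients_per_therapist
    return total / design_effect(patients_per_therapist, icc)


# Two hypothetical ways to recruit 80 patients in one condition,
# with an assumed ICC of 0.05.
few_therapists = effective_sample_size(10, 8, icc=0.05)
many_therapists = effective_sample_size(20, 4, icc=0.05)
print(round(few_therapists, 1), round(many_therapists, 1))
```

Under these assumptions the same 80 patients are worth roughly 59 independent observations when spread over 10 therapists and roughly 70 when spread over 20. This illustrates why the split between patients and therapists matters for power.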

9.
This study explores the importance of anticipated group discussion, the consensus decision rule, and rater motivation in determining how well rater teams identify ratee behaviors, i.e., behavioral accuracy. Results, based on 382 raters in 111 teams, suggest that the anticipation of group discussion can improve behavioral accuracy, but it appears that the benefits of discussion-only teams are limited to this anticipation effect. Furthermore, it also appears that rater motivation plays an important role in this type of team. Rater teams required to reach consensus, however, appear to show improved behavioral accuracy, regardless of whether raters can anticipate the consensus discussion and regardless of rater motivation levels. Implications, especially for assessment centers, are discussed.

10.
The authors developed a source-monitoring procedure to reduce the biasing effects of rater expectations on behavioral measurement. Study participants (N = 224) were given positive or negative information regarding the performance of a group and, after observing the group, were assigned to a source-monitoring or control condition. Raters in the source-monitoring condition were instructed to report only behaviors that evoked detailed memories (remember judgments) and to avoid reporting behaviors based on feelings of familiarity (know judgments). Results revealed that controlling raters' response strategy reduced (and often eliminated) the biasing effects of performance expectations. These findings advance our understanding of the performance-cue bias and offer a potentially useful technique for decreasing rater bias.

12.
Although driving while intoxicated (DWI) is a pervasive problem, reliable measures of this behavior have been elusive. In the present study, the Form 90, a widely utilized alcohol and substance use instrument, was adapted for measurement of DWI and related behaviors. Levels of reliability for the adapted instrument, the Form 90-DWI, were tested among a university sample of 60 undergraduate students who had consumed alcohol during the past 90 days. The authors administered the instrument once during an intake interview and again, 7-30 days later, to determine levels of test-retest reliability. Overall, the Form 90-DWI demonstrated high levels of reliability for many general drinking and DWI behaviors. Levels of reliability were lower for riding with an intoxicated driver and for variables involving several behavioral conjunctions, such as seat belt use and the presence of passengers when driving with a blood alcohol concentration above .08. Overall, the Form 90-DWI shows promise as a reliable measure of DWI behavior in research on treatment outcome and prevention.

13.
This study demonstrates that an unstructured interview (INT) and the Thematic Apperception Test (TAT) are suitable alternatives to Loevinger's Sentence Completion Test (SCT) of Ego Development. Seventy subjects were solicited from six groups varying widely with respect to age and education level. Each subject was asked to complete the SCT, an INT and the TAT. Two raters using Loevinger and Wessler's self-training exercises and Loevinger, Wessler, and Redmore's scoring manual rated subjects' responses to each instrument. Reliability was found between raters and concurrent validity between instruments. Subjects scoring high on the SCT were found to score higher on the INT and TAT.

14.
15.
16.
The need for closure and the ability to achieve closure are generally thought to be independent from one another. However, previous researchers have found inconsistent relations between these two variables, possibly due to measurement scale modifications that slightly shifted how the underlying constructs were assessed. The present research attempted to address some of these methodological issues with previous research by conducting a single-paper meta-analysis on the correlations between the ability to achieve closure scale and the full need for closure scale and each of its five subscales. Across six university student samples (N = 1983), the full need for closure scale and most of its subscales were significantly negatively correlated with the ability to achieve closure. This finding suggests that the ability to achieve closure affects the costs and benefits of closure and therefore, consistent with lay epistemic theory, the ability to achieve closure predicts individual differences in the need for closure.

17.
This paper investigated whether criteria stemming from the Reality Monitoring (RM) framework could be trusted to assess the reliability of statements obtained by the use of a cognitive interview (CI). Fifty-eight children, aged 10-11, participated. One-third watched a film about a fakir and were then interviewed according to a CI (n = 19). The remaining two-thirds made up a story about a fakir and were then interviewed according to either a CI (n = 21), or a structured interview (SI) (n = 18). The CI statements based on observed events contained more visual, affective, spatial and temporal information compared to CI statements based on imagined events. The CI statements based on imagined events did not differ from the SI statements based on imagined events. Considerable developmental work is recommended to turn the RM technique into a reliable test that could be used by practitioners.

18.
In this article, the psychometric properties of a new scale aimed at quantifying passion are explored, i.e. passion related to becoming good or achieving in some area/theme/skill. The Passion Scale was designed to be quantitative, simple to administer, applicable for large-group testing, and reliable in monitoring passion. A total of 126 participants between 18 and 47 years of age (mean age = 21.65, SD = 3.45) completed an assessment with the Passion Scale, enabling us to investigate its feasibility, internal consistency, construct validity and test-retest reliability. Feasibility: The overall pattern of results suggests that the scale for passion presented here is applicable for the age range studied (18–47). Internal consistency: All individual item scores correlated positively with the total score, with correlations ranging from 0.51 to 0.69. The Cronbach's alpha value for the standardized items was 0.86. Construct validity: The Pearson correlation coefficient between the total Passion Scale score and the Grit-S scale was 0.39 for adults, mean age 21.23 (SD = 3.45) (N = 107). Test-retest reliability: The intraclass correlation coefficient (ICC) between test and retest scores for the total score was 0.92. These promising results warrant further development of the Passion Scale, including normalization based on a large, representative sample.
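As an illustration of the internal-consistency statistic reported above, here is a small sketch of the common raw-score formula for Cronbach's alpha, k / (k - 1) × (1 - Σ item variances / variance of total score). The abstract reports alpha for standardized items, which is computed from inter-item correlations, so this raw version is a stand-in. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from the Passion Scale study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Raw Cronbach's alpha.

    scores is a list of respondents, each a list of k item scores.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three items on a 1-5 scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 5],
    [3, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
```

When every item gives identical scores for each respondent, this formula returns 1. Items that do not covary pull it toward 0.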

19.
To improve the characterization of motor impairment, we compared the sensitivities of a phase plane metric with temporal domain measures derived from integrated squared jerk (ISJ). Five subjects with stroke and a cohort of 21 neurologically intact volunteers performed self-paced, isolated elbow flexions. Analysis of angular trajectories from the stroke group revealed that temporal domain metrics failed to detect a performance deficit at the p < .05 level, while the phase plane metric did resolve a deficit (p < .01). When applied to a subset of movements with arrest periods, the phase measure also uniquely identified impairment (Wilcoxon rank-sum test, p < .001). Finally, when tested on a data-driven model, the phase measure, but not temporal metrics, increased monotonically with the severity of trajectory distortions. We conclude that motion smoothness can be accurately measured in the phase plane.

20.
AIMS. To develop a new protocol for the assessment of action observation (AO) abilities and imitation of meaningful and non-meaningful gestures, and to examine its psychometric properties in children with DCD and typically developing (TD) children. BACKGROUND. For learning manual skills, AO and imitation are considered fundamental abilities. Knowledge about these modalities in children with DCD is scarce and an assessment protocol is lacking. METHOD. The protocol consists of 2 tests. The AO test consists of two assembly tasks. The imitation test includes 12 meaningful and 20 non-meaningful gestures. Items of both tests are rated on a 4-point scale. Twelve children with DCD (mean age 8y3m, SD 1.30) and 11 TD children (mean age 8y2m, SD 1.52) were enrolled. For inter-rater reliability, intraclass correlation coefficients (ICC) were calculated for the total score, and weighted kappa and percentage agreement for single items. Known group validity was assessed by comparison of the DCD and TD groups (Wilcoxon rank sum test). For construct validity, the mABC-2 test was used. The protocol was adapted and confirmed by an intra- and inter-rater reliability study (new sample of 11 DCD children, mean age 7y5m, SD 1.37). RESULTS. Excellent ICCs were reported for intra- and inter-rater reliability for the final protocol. A significant difference between the DCD and TD groups was found for AO abilities (p < .01) and for non-meaningful gestures (p < .001). A significant correlation was reported between the AO test and the mABC-2 test (r = .56; p ≤ .0001). No significant correlations were revealed for the imitation tests. DISCUSSION AND CONCLUSION. The results support the psychometric properties of this protocol. When fully validated, it may help to map the deficits in AO abilities and imitation and to evaluate treatment effects of imitation and AO interventions.
