Similar articles
 20 similar articles found
1.
The authors tested the assumption that single-item measures have unacceptably low reliability and validity. On 2 occasions 11 weeks apart, college students reported on the frequency and quantity of alcohol consumption, 2 religious behaviors, time of study and of socializing (focal items), and other qualities and characteristics. Most test-retest reliabilities were good to excellent; objective facts were more reliable than subjective evaluations; and target items had good validity when correlated with 2-week nightly log records of corresponding behaviors in a multimethod multitrait matrix. The exception was self-reported study, with relatively low reliability and validity, suggesting the non-trait-like quality of this behavior. Single-item measures may be better than commonly thought.

2.
Although most researchers now agree that situations may affect the assessment of traits, the consequences have so far been neglected: if situations affect the assessment of traits, we have to take this fact into account in studies on the reliability and validity of measurement instruments and in their application. In the theoretical part of this article we provide a more formal exposition of this point, introducing the basic concepts of latent state–trait (LST) theory. LST theory and the associated models allow for the estimation of the situational impact on trait measures in non-experimental, correlational studies. In the empirical part, LST theory is applied to three well-known trait questionnaires: the Freiburg Personality Inventory, the NEO Five-Factor Inventory and the Eysenck Personality Inventory. It is shown that significant proportions of the variances of the scales of these questionnaires are due to situational effects. The following consequences of this finding are discussed: (i) instead of the reliability coefficient, the proportion of variance due to the latent trait, the consistency coefficient, should be used for the estimation of confidence intervals for trait scores; (ii) to reduce the situational effects on trait estimates it may be useful to base such an estimate on several occasions, i.e., to aggregate data across occasions; (iii) reliability and validity studies should not only be based on a sample of persons representative of those to whom the test will be applied; they should also be conducted in situational contexts representative of the intended applications.
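The coefficients named in this abstract can be written out using the variance decomposition that is conventional in the LST literature. The symbols below ($Y_{it}$, $\tau_{it}$, $\xi_i$, $\zeta_{it}$, $\varepsilon_{it}$) are assumed notation for illustration; the abstract itself gives no formulas.

```latex
% Observed score of measure i on occasion t: latent state plus measurement error
Y_{it} = \tau_{it} + \varepsilon_{it},
\qquad
% Latent state: latent trait plus situation-specific (state residual) component
\tau_{it} = \xi_{i} + \zeta_{it}

% With mutually uncorrelated components, the observed variance splits into three parts
\operatorname{Var}(Y_{it}) = \operatorname{Var}(\xi_{i}) + \operatorname{Var}(\zeta_{it}) + \operatorname{Var}(\varepsilon_{it})

% Consistency (trait share), specificity (situation share), and reliability
\operatorname{Con}(Y_{it}) = \frac{\operatorname{Var}(\xi_{i})}{\operatorname{Var}(Y_{it})},
\qquad
\operatorname{Spe}(Y_{it}) = \frac{\operatorname{Var}(\zeta_{it})}{\operatorname{Var}(Y_{it})},
\qquad
\operatorname{Rel}(Y_{it}) = \operatorname{Con}(Y_{it}) + \operatorname{Spe}(Y_{it})
```

Under this decomposition, reliability exceeds consistency whenever the situational variance is nonzero, which is why consequence (i) recommends the consistency coefficient when building confidence intervals for trait scores.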

3.
Large-sample confidence intervals (CIs) for reliability, validity, and unattenuated validity are presented. The CI for unattenuated validity is based on the Bonferroni inequality, which relies on one CI for test-retest reliability and one for validity. Covered are four reliability-validity situations: (a) both estimates were from random samples; (b) reliability was from a random sample but validity was from a selected sample; (c) validity was from a random sample but reliability was from a selected sample; and (d) both estimates were from selected samples. All CIs were evaluated by using a simulation. CIs on reliability, validity, or unattenuated validity are accurate as long as the selection ratio is at least 20% and the selected sample size is 100 or larger. When the selection ratio is less than 20%, estimators tend to underestimate their parameters.
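A minimal sketch of how a Bonferroni-based interval of this kind could be assembled for the simplest case, (a), where both estimates come from random samples. Several things here are assumptions rather than the paper's procedure: the Fisher z interval for each correlation, the correction for unreliability in the predictor only (validity divided by the square root of the reliability), and the function names `fisher_ci` and `unattenuated_validity_ci`. The selected-sample cases (b) to (d) are not handled.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, conf):
    """Large-sample CI for a correlation via Fisher's z transform (assumed method)."""
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(1 - (1 - conf) / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def unattenuated_validity_ci(r_xy, n_xy, r_xx, n_xx, conf=0.95):
    """Bonferroni CI for r_xy / sqrt(r_xx); returns (lower, point, upper)."""
    # Bonferroni: give each of the two component intervals half of alpha,
    # so that both cover their parameters jointly with probability >= conf.
    each_conf = 1 - (1 - conf) / 2
    v_lo, v_hi = fisher_ci(r_xy, n_xy, each_conf)
    rel_lo, rel_hi = fisher_ci(r_xx, n_xx, each_conf)
    if rel_lo <= 0:
        raise ValueError("reliability interval reaches zero; correction undefined")

    point = r_xy / math.sqrt(r_xx)
    # v / sqrt(rel) shrinks toward zero as rel grows, so which reliability
    # bound gives the extreme value depends on the sign of the validity bound.
    lower = v_lo / math.sqrt(rel_hi if v_lo >= 0 else rel_lo)
    upper = v_hi / math.sqrt(rel_lo if v_hi >= 0 else rel_hi)
    return max(lower, -1.0), point, min(upper, 1.0)


# Hypothetical inputs: validity .40 and test-retest reliability .80, n = 200 each
lower, point, upper = unattenuated_validity_ci(0.40, 200, 0.80, 200)
print(f"unattenuated validity {point:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Because the Bonferroni inequality only bounds the joint coverage from below, an interval built this way is conservative: it tends to be wider than one derived from the sampling distribution of the corrected coefficient itself.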

4.
Psychometric properties of the Swedish version of the Body Shape Questionnaire (BSQ) were examined using data from three samples: (1) a sample from the general population (n = 1157), (2) a student sample (n = 124) and (3) a clinical sample (n = 90). Analyses showed that a single-factor solution might be reasonable, as 32 of the 34 items loaded heavily on the first factor. The derived short 14-item version of the BSQ also showed a coherent structure, with all the items loading on a single factor. The BSQ showed high test-retest reliability, very high internal consistency, ranging from 0.94 to 0.97, and high split-half reliability (above 0.93). Furthermore, it showed high validity by correlating highly with the body dissatisfaction subscale of the Eating Disorders Inventory (r = 0.72 and higher), and high discriminant validity. Thus, the Swedish version of the BSQ showed good concurrent and discriminant validity as well as high reliability.

5.
Conceptualizing and measuring global interpersonal mistrust-trust   Total citations: 1, self-citations: 0, citations by others: 1
Global interpersonal mistrust is conceptualized as a general mistrust of the motives of others in situations related to one's well-being: a general tendency to view others as mean, selfish, malevolent, or unreliable people who are, thus, not to be depended on to treat one well. The authors developed an 18-item unidimensional self-report inventory measuring interpersonal mistrust as a negative cognitive orientation toward others. The measure comprises items describing perceptions of specific hypothetical interpersonal situations rather than items asking respondents to describe their own general behavior. The measure was reliable and evidenced construct validity in a heterogeneous sample of Australians.

6.
Although information about individuals' exposure to highly stressful events such as traumatic stressors is often very useful for clinicians and researchers, available measures are too long and complex for use in many settings. The Trauma History Screen (THS) was developed to provide a very brief and easy-to-complete self-report measure of exposure to high magnitude stressor (HMS) events and of events associated with significant and persisting posttraumatic distress (PPD). The measure assesses the frequency of HMS and PPD events, and it provides detailed information about PPD events. Test-retest reliability was studied in four samples, and temporal stability was good to excellent for items and trauma types and excellent for overall HMS and PPD scores. Comprehensibility of items was supported by expert ratings of how well items appeared to be understood by participants with relatively low reading levels. In five samples, construct validity was supported by findings of strong convergent validity with a longer measure of trauma exposure and by correlations of HMS and PPD scores with posttraumatic stress disorder (PTSD) symptoms. The psychometric properties of the THS appear to be comparable to or better than those of longer and more complex measures of trauma exposure.

7.
The potential for applicant response distortion on personality measures remains a major concern in high‐stakes testing situations. Many approaches to understanding response distortion are too transparent (e.g., instructed faking studies) – or are too subtle (e.g., correlations with social desirability measures as indices of faking). Recent research reveals more promising approaches in two methods: using forced‐choice (FC) personality test items and warning against faking. The present study examined effects of these two methods on criterion‐related validity and test‐taker reactions. Results supported incremental validity for an FC and Likert‐scale measure in warning and no‐warning conditions, above and beyond cognitive ability. No clear differences emerged between the FC vs Likert measures or warning vs no‐warning conditions in terms of validity. However, some evidence suggested that FC measures and warnings may produce negative test‐taker reactions. We conclude with implications for implementation in selection settings.

8.
A series of studies was conducted in order to construct and validate a measure of interpersonal conflict communication style, the Conflict Communication Scale (CCS). CCS items were designed to reflect variability in approach to conflict situations and to gather information relevant to conflict-related interventions, such as mediation. The measure comprises 5 subscales: (a) confrontation, (b) public/private behavior, (c) emotional expression, (d) conflict approach/avoidance, and (e) self-disclosure. Psychometric assessment of the CCS focused on test-retest reliability, social desirability influences, convergent and discriminant validity, discrimination between known groups, concurrent validity, and factor analysis. The resulting scale was found to have high reliability, minimal social desirability bias, and sensitivity to some cultural differences, and its validity was supported by the majority of validation studies.

9.
This study investigated various measures commonly employed to assess the person reliability of an individual Minnesota Multiphasic Personality Inventory (MMPI) protocol. Specifically, relationships among indices of person reliability and the standard MMPI validity scales were examined using the responses of 82 subjects who completed the MMPI on two occasions separated by 1 week. Person reliability indices were based on within-occasion responses to identical and to psychologically similar items, and on three across-occasion response consistency measures. The validity scales, namely, the L, F, K, and Cannot Say scales, showed higher test-retest stability than the within-occasion person reliability indices. Further, the validity scales and person reliability indices appeared to reflect multiple facets of dependable responding. Interestingly, an individual's tendency to change responses to MMPI items from the test to the retest was significantly predictable. Clinical implications of these findings were derived.

10.
Global interpersonal mistrust is conceptualized as a general mistrust of the motives of others in situations related to one's well-being: a general tendency to view others as mean, selfish, malevolent, or unreliable people who are, thus, not to be depended on to treat one well. The authors developed an 18-item unidimensional self-report inventory measuring interpersonal mistrust as a negative cognitive orientation toward others. The measure comprises items describing perceptions of specific hypothetical interpersonal situations rather than items asking respondents to describe their own general behavior. The measure was reliable and evidenced construct validity in a heterogeneous sample of Australians.

11.
R. H. Moos. Family Process, 1990, 29(2): 199-208; discussion 209-11
This article focuses on the reliability and validity of the Family Environment Scale (FES). The FES subscales generally show adequate internal consistency reliability and stability over time when applied in samples that are diverse; the items also have good content and face validity. An extensive body of research supports the construct, concurrent, and predictive validity of the FES. More generally, reliability and validity are a joint function of scale items and response formats and of the characteristics and diversity of specific samples. To contribute to further advances in family assessment, researchers need to use both conceptual and psychometric criteria rather than rely too heavily on the pursuit of internal consistency reliability and factor analytic approaches to scale construction and validation.

12.
This article presents the Agoraphobia Scale (AS) and evidence for its reliability, validity, and sensitivity to change after treatment. The scale consists of 20 items depicting various typical agoraphobic situations, which are rated for anxiety/discomfort (0-4) and avoidance (0-2). The results show that the AS has high internal consistency. Regarding concurrent validity, it correlated significantly with other self-report measures of agoraphobia (Mobility Inventory and Fear Questionnaire). The scale's predictive validity was shown by its correlations with avoidance behavior and self-rated anxiety during both an individualized and a standardized behavioral test of agoraphobia. The AS also discriminated an agoraphobic sample from both a normal sample and a sample of simple phobia patients. Finally, it was sensitive to changes after behavioral treatment. The AS is useful as both a state and an outcome self-report measure of agoraphobia.

13.
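Process-outcome research in psychological counseling: current status and prospects   Total citations: 5, self-citations: 0, citations by others: 5
Process-outcome research in psychological counseling examines how counseling process variables affect counseling outcomes. The person-centered, cognitive, behavioral, and psychoanalytic schools, among others, have all provided theoretical foundations for this field. The main research topics are the relationships between outcomes and process variables such as counselor response modes, client behavior in sessions, the working alliance, and important session content. The field has accumulated few findings so far, which may reflect the inherent complexity of the process-outcome relationship. Research methods also have many shortcomings: designs are too simple for the complex relationships under study, inconsistent measurement instruments make results hard to compare, and some instruments are immature and have low reliability and validity. Beyond working to overcome these problems, future research should give more attention to theory building, the moderating and mediating roles of implicit variables, and client variables, and its methods should be more integrated and diverse.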

14.
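This study aimed to develop a questionnaire on adolescents' everyday emotion regulation. By adding descriptions of situations that elicit everyday emotions, it sought to strengthen the ecological validity of measuring adolescents' everyday emotion regulation and to obtain more realistic and practically meaningful results. Typical situations that elicit adolescents' everyday emotions were obtained from interviews (n = 30) and used to write the questionnaire items. Exploratory factor analysis (n = 268) yielded 15 items for the positive emotion regulation questionnaire and 20 items for the negative emotion regulation questionnaire; four factors were extracted for each and named "cognitive reappraisal," "cognitive immersion," "expressive suppression," and "expressive venting." Confirmatory factor analysis (n = 269) and reliability testing showed that all psychometric indices of the questionnaire met requirements, indicating good construct validity and internal consistency. The questionnaire can be used as a measure of adolescents' everyday emotion regulation.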

15.
The strengths and weaknesses of self-reports, reports by knowledgeable informants, and measures based on behavioral observation data for the assessment of personality characteristics are discussed and compared. It is suggested that measurement methods should be evaluated with respect to Loevinger's concept of structural validity. The concept of structural validity is expanded to include evidence that the scores have properties paralleling the theoretical definition of the characteristic with respect to reliability over occasions, range of referent attributes sampled, and generality across situations or the specification of situation parameters. Each of the methods is evaluated for its potential to construct measures with structural validity for different models of constructs. The methods are also discussed in relation to several other common threats to validity, specifically, threats stemming from the respondent, from the investigator, and from sampling errors. It is likely that demonstrations of convergence between these methods will be possible for some but not all constructs. It is also suggested that the examination of structural validity will contribute to the evolution and refinement of constructs appropriate for interactionist theories.

16.
The ratio of item validity to item-total correlation can be used to select items which will tend to yield the maximum correlation with a criterion. Items to be retained are identified by comparing the ratio for each item with the validity of the original test. Further improvement of the validity in the experimental sample can be obtained by adding items to or removing items from the selected nucleus, according to recomputed ratios involving the correlations of the items with the nucleus and evaluated by means of a revised cut-off point. With slight variations, the method may be used for interest and personality tests as well as for aptitude material. The principal advantage over previous methods is that for any cycle of the analysis an exact cut-off point is provided.
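The first cycle of this procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's exact algorithm: it assumes an item is retained when its ratio of item validity to item-total correlation exceeds the validity of the full test, it ignores differences in item standard deviations, and it omits the later cycles with the revised cut-off point. The names `pearson` and `select_items` and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_items(responses, criterion):
    """Return (test_validity, kept_item_indices) for one selection cycle.

    responses: one row per person, each row a list of item scores.
    criterion: one criterion score per person.
    """
    totals = [sum(row) for row in responses]
    # Validity of the original test serves as the cut-off (assumed rule).
    test_validity = pearson(totals, criterion)

    kept = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        item_validity = pearson(item, criterion)
        item_total = pearson(item, totals)
        # Skip items with a non-positive item-total correlation; the ratio
        # is not meaningful for them in this simplified sketch.
        if item_total > 0 and item_validity / item_total > test_validity:
            kept.append(j)
    return test_validity, kept


# Toy data: item 0 tracks the criterion, item 1 is unrelated to it
responses = [[0, 0], [0, 1], [1, 0], [1, 1]]
criterion = [0, 0, 1, 1]
test_validity, kept = select_items(responses, criterion)
print(test_validity, kept)
```

In the toy data, the unrelated item contributes to the total score but not to the criterion correlation, so its ratio falls below the cut-off and it is dropped.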

17.
18.
When subjects attempt to fake psychopathology on the MMPI, scores on subtle subscales tend to be lower than those of nonfaking subjects. Our study hypothesized that this paradox comes about because the subtle subscales have no predictive validity, but their face validity for psychopathology is the opposite of the keyed direction for psychopathology. Subjects who attempt to fake psychopathology do so on the basis of item content and thus achieve lower rather than higher scores. Three groups of 80 undergraduates took the MMPI under regular, faking-good, or faking-bad instructions. As expected, faking-bad subjects scored significantly lower than regular subjects on the 100 most subtle items, and this was due to their responses to those 73 of the items whose face validity was misleading. The results are consistent with other work showing valid uses of subtle items in detecting deception.

19.
This paper describes the development of a behaviorally based performance appraisal system. Blanz and Ghiselli's Mixed Standard Scale was used as the basis for developing the performance appraisal system for assessing the performance of highway patrol personnel. However, the particular developmental procedures described here differ in some respects from those reported in the literature. Rather than developing rating items describing general traits such as "diligence," "initiative," or "enthusiasm" in behavioral terms, the items in the present scale were developed to describe proficiency levels of specific job tasks. This characteristic is expected to enhance the objectivity of the evaluation system for both appraisal and job counseling purposes. The appraisal instrument was subjected to a series of reliability and validity tests that demonstrated its high reliability and validity. Although the content of the appraisal system described here included highway patrol tasks, a similar system could be developed using the procedures described for a wide variety and level of jobs.

20.
In this study we tested the hypothesis that groups of NEO Personality Inventory-Revised (NEO-PI-R; Costa & McCrae, 1992a) protocols identified as potentially invalid by an inconsistency scale (INC; Schinka, Kinder, & Kremer, 1997) would show reduced reliability and validity according to a series of psychometric tests. Data were obtained from 2 undergraduate student samples, a self-report group (n = 132) who provided NEO-PI-R self-ratings on 2 occasions separated by a 7- to 14-day interval and an informant group (n = 109) who provided ratings of well-known friends or relatives on 2 occasions separated by a 6 month interval. INC scores from the Time 1 protocols were used to divide these samples into low, moderate, and elevated inconsistency groups. In both samples, these 3 groups showed equivalent levels of reliability and validity as measured by: contingency coefficients for the 20 INC item responses across occasions; test-retest intraclass correlations of NEO-PI-R domain scores; convergent correlations with Goldberg's (1992) Bipolar Adjective Scale scores; and discriminant correlations between the 5 NEO-PI-R domain scores. The similarity of results across self-report and informant assessment contexts provides additional evidence that semantic consistency approaches to assessing protocol validity may overestimate the prevalence of random or careless response behavior in standard administration conditions. Several theories are discussed that accommodate the existence of valid inconsistency in structured personality assessment.
