Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Measurements of domain knowledge very often use and report Cronbach's alpha or similar indicators of internal consistency for test construction. In this short article, we argue that this approach is often at odds with the theoretical conception of knowledge underlying the measure. While domain knowledge is usually described theoretically as a formative construct (formed by the manifest observations), the use of Cronbach's alpha to construct and evaluate an empirical measure implies a reflective model (the construct is reflected in manifest behaviors). After illustrating the difference between reflective and formative models, we show how this mismatch between theoretical conception and empirical operationalization can have substantial implications for the assessment and modeling of domain knowledge. Specifically, the construct may be operationalized too narrowly or even be misinterpreted when item selection criteria that focus on homogeneity, such as Cronbach's alpha, are applied. Rather than maximizing the items' internal consistency, researchers constructing measures of domain knowledge should, therefore, make strong arguments for the theoretical merit of their items even if the items are not correlated with each other.
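To make the point about homogeneity concrete, the following is a minimal sketch, not taken from the article, of how Cronbach's alpha responds to item intercorrelation. The function name `cronbach_alpha` and both toy data sets are hypothetical illustrations; the formula is the standard one based on item variances and total-score variance.

```python
from itertools import product
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item (column) and of the total score (row sum).
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical "reflective" data: all three items move together.
homogeneous = [[level, level, level] for level in range(5)]

# Hypothetical "formative" data: three unrelated facts, with every
# right/wrong pattern occurring exactly once, so the items are uncorrelated.
independent = [list(pattern) for pattern in product([0, 1], repeat=3)]

alpha_homogeneous = cronbach_alpha(homogeneous)   # close to 1
alpha_independent = cronbach_alpha(independent)   # close to 0
print(round(alpha_homogeneous, 3), round(alpha_independent, 3))
```

In this toy case the uncorrelated items yield an alpha near zero even though each item could be a perfectly legitimate piece of domain knowledge, which is the kind of mismatch the abstract describes.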

2.
Objectives: Doping use is seldom an accident; it is a deliberate action often requiring considerable commitment. Attitudes are known to influence this type of action and hence they are likely to be predictive of doping-related behaviours. To measure ‘doping attitude’, a valid and reliable tool is required. Design: This paper briefly reviews methodological issues in doping attitude research, introduces the Performance Enhancement Attitude Scale (PEAS) and provides a comparative analysis of its reliability and validity as a self-reported measure of a generalized doping attitude. Methods: The scale's reliability was examined with Cronbach's internal consistency coefficient and test–retest correlations using data from 9 independent studies encompassing 7 years. Confirmatory factor analysis was performed to assess the scale's structure. A known-groups validation strategy was employed to examine construct validity in 4 studies. Results: Estimates of the PEAS' internal consistency (ranging from .71 to .91 across various samples) provided good evidence of the scale's simultaneous reliability. The chi-square/df ratio in all cases was below the threshold with an average of 1.85 (ranging from 1.370 to 2.291), indicating an acceptable measurement model fit. The theoretically expected difference in doping attitudes between doping users and non-users, with elevated PEAS scores among users, together with predictable dynamics of PEAS scores across the repeated measures, supported the construct validity of the scale. Conclusion: The psychometric properties of the 17-item unidimensional PEAS suggest that the scale is a useful tool for measuring self-declared attitudes toward doping, with adequate reliability and promising validity estimates. Suggestions are discussed for the continuous scale development and validation process.

3.
4.
ABSTRACT

This study examined psychometric properties of the Kindergarten Reading Engagement Scale (KRES), a brief teacher-report measure of classroom reading engagement. Participants were 27 students with identified reading deficits from a predominantly low-income, African-American community. Data were collected in kindergarten (Time 1) and first grade (Time 2). The KRES demonstrated strong internal consistency (Cronbach's alpha = .96) and modest test-retest reliability (r = .66). KRES ratings were significantly correlated with scores from the Word Reading subtest of the Wechsler Individual Achievement Test-Second Edition and the Sound Matching subtest of the Comprehensive Test of Phonological Processing, measured at Time 1 and Time 2. Strategies for refining the scale and implications for applying the KRES in school-based program evaluations are discussed.

5.
The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmarks, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at random, or not missing at random. Cronbach's alpha, Loevinger's scalability coefficient H, and the item cluster solution from Mokken scale analysis of the complete data were compared with the corresponding results based on the data including imputed scores. Three of the multiple-imputation methods (two-way with normally distributed errors, corrected item-mean substitution with normally distributed errors, and response function) produced discrepancies in Cronbach's coefficient alpha, Loevinger's coefficient H, and the cluster solution from Mokken scale analysis that were smaller than the discrepancies produced by the upper-benchmark multivariate normal imputation.

6.
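Estimating test reliability: from coefficient α to internal consistency reliability   Total citations: 5 (self-citations: 0, citations by others: 5)
Wen Zhonglin (温忠麟), Ye Baojuan (叶宝娟). Acta Psychologica Sinica (《心理学报》), 2011, 43(7): 821-829
Following the classical definition of test reliability, this paper briefly reviews the relationship between reliability and coefficient α, as well as the limitations of coefficient α. To recommend reliability estimation methods that can replace coefficient α, it discusses in depth two quantities closely related to coefficient α: homogeneity reliability and internal consistency reliability. Under very general conditions, it is proved that neither coefficient α nor homogeneity reliability exceeds internal consistency reliability, and that the latter does not exceed test reliability, which shows that internal consistency reliability is relatively close to test reliability. A procedure for analyzing test reliability is summarized, indicating when coefficient α still has reference value and when it is no longer applicable and internal consistency reliability (often called composite reliability in the literature) should be used instead. Programs for computing homogeneity reliability and internal consistency reliability are provided, which applied researchers can use directly.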
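The ordering between coefficient α and composite reliability can be illustrated with a small sketch. This is not the authors' program: it assumes a hypothetical single-factor model with standardized loadings and uncorrelated errors, and the function names and loading values are invented for illustration. Under that assumption, composite reliability is the squared sum of loadings divided by the total-score variance, and α is computed from the model-implied covariance matrix.

```python
def model_covariance(loadings, uniquenesses):
    """Covariance matrix implied by a one-factor model (factor variance 1)."""
    k = len(loadings)
    return [
        [loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]


def alpha_from_cov(cov):
    """Coefficient alpha from an item covariance matrix."""
    k = len(cov)
    trace = sum(cov[i][i] for i in range(k))
    total = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - trace / total)


def composite_reliability(loadings, uniquenesses):
    """Composite reliability under the one-factor assumption."""
    true_part = sum(loadings) ** 2
    return true_part / (true_part + sum(uniquenesses))


# Hypothetical standardized loadings; uniqueness = 1 - loading squared.
unequal = [0.9, 0.7, 0.5, 0.3]
equal = [0.6, 0.6, 0.6, 0.6]

for loadings in (unequal, equal):
    uniq = [1 - l * l for l in loadings]
    alpha = alpha_from_cov(model_covariance(loadings, uniq))
    omega = composite_reliability(loadings, uniq)
    print(round(alpha, 4), round(omega, 4))
```

With unequal loadings α falls below composite reliability; with equal loadings the two coincide, which is consistent with the inequality stated in the abstract.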

7.
This pilot study examined test-retest and internal consistency reliabilities of original and modified formats of the Exercise Self-efficacy Scale in college-age women. Thirty participants completed original and modified versions of the scale. Data from both tests, administered 1 wk. apart, were analyzed using the intraclass correlation coefficient (ICC) to assess test-retest reliability and Cronbach's coefficient alpha to assess internal consistency. Scores for both versions correlated .96. Cronbach's alpha coefficients for the original scale were .96 for Time 1 and .98 for Time 2. Cronbach's alpha coefficients for the revised scale were .95 for Time 1 and .98 for Time 2. Test-retest reliability and internal consistency remained consistently high for both versions of the scale within this sample. Implications for use of this scale and recommendations for research are given.

8.
This paper highlights the development and testing of the Infant Movement Motivation Questionnaire (IMMQ), an instrument designed to evaluate qualities of infant characteristics that relate specifically to early motor development. The measurement development process included three phases: item generation; pilot testing and evaluation of acceptability and feasibility for parents; and exploratory factor analysis. The resultant 27-item questionnaire is designed for completion by parents and contains four factors: Activity, Exploration, Motivation and Adaptability. Overall, the internal consistency of the IMMQ is 0.89 (Cronbach's alpha), with test–retest reliability measured at 0.92 (ICC, with 95% CI 0.83–0.96). Further work could be done to strengthen the individual factors; however, the questionnaire is adequate for use in its full form. The IMMQ can be used for clinical or research purposes, as well as an educational tool for parents.

9.
In this study, we translated and localized the Adult Decision‐making Competence scale (A‐DMC) and tested its reliability and validity with large samples. Results show the Chinese A‐DMC has relatively good reliability (Cronbach's alpha above 0.6 and test–retest reliability coefficients ranging from 0.44 to 0.78 on all subscales), comparable with the original version. Regarding validity, results of exploratory factor analysis and confirmatory factor analysis support the one‐factor model, indicating the A‐DMC has good internal consistency and construct validity. A‐DMC scores correlated positively with cognitive ability, constructive decision‐making styles, and good decision outcomes. Additionally, individuals with higher A‐DMC scores were found to perform better on the Cambridge gambling task and Iowa gambling task. These results confirm the validity of the Chinese version of the A‐DMC, which is suitable for measuring decision‐making competence in Chinese adults.

10.
Introduction: Teaching and nursing are two of the most stressful occupations. The Maslach Burnout Inventory (MBI) has some limitations that require using it with caution outside the American and Anglo-Saxon context. The Spanish Burnout Inventory (SBI) was developed to address the problems associated with the MBI. Objective: This study was designed to assess the psychometric properties of the Portuguese version of the SBI. Method: The sample consisted of 211 teachers and 133 nurses from Portugal. The psychometric properties were examined through the following analyses: confirmatory factor analysis (CFA), reliability (Cronbach's alpha), and concurrent validity with the MBI. Results: The four-factor model obtained an adequate data fit for the sample. The Cronbach's alpha coefficient was adequate for the four scales of the instrument. Results supported the concurrent validity. Conclusion: As a whole, results show that the four-factor model of the SBI possesses adequate psychometric properties for the study of burnout in the Portuguese cultural context.

11.
12.
This discussion paper argues that the use of Cronbach’s alpha, both as a reliability estimate and as a measure of internal consistency, suffers from major problems. First, alpha always has a value, which cannot be equal to the test score’s reliability given the interitem covariance matrix and the usual assumptions about measurement error. Second, in practice, alpha is used more often as a measure of the test’s internal consistency than as an estimate of reliability. However, it can be shown easily that alpha is unrelated to the internal structure of the test. It is further discussed that statistics based on a single test administration do not convey much information about the accuracy of individuals’ test performance. The paper ends with a list of conclusions about the usefulness of alpha.
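One way to see why alpha says little about internal structure is that it depends only on the number of items, the sum of the item variances, and the sum of all covariances, not on how the covariances are arranged. The sketch below is an illustration under that observation, not an example taken from the paper; both correlation matrices and the function name are hypothetical.

```python
def alpha_from_cov(cov):
    """Coefficient alpha from an item covariance (or correlation) matrix."""
    k = len(cov)
    trace = sum(cov[i][i] for i in range(k))
    total = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - trace / total)


# Hypothetical test A: four items, every pair correlated 0.3
# (compatible with a single common factor).
one_factor = [
    [1.0, 0.3, 0.3, 0.3],
    [0.3, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.3],
    [0.3, 0.3, 0.3, 1.0],
]

# Hypothetical test B: two unrelated clusters of two items each,
# correlated 0.9 within a cluster and 0.0 between clusters.
two_clusters = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.9],
    [0.0, 0.0, 0.9, 1.0],
]

alpha_a = alpha_from_cov(one_factor)
alpha_b = alpha_from_cov(two_clusters)
print(round(alpha_a, 4), round(alpha_b, 4))
```

Both matrices have the same off-diagonal sum, so the two values agree to rounding error even though one test looks unidimensional and the other clearly does not.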

13.
One of the central tenets of classical test theory is that scales should have a high degree of internal consistency, as evidenced by Cronbach's α, the mean interitem correlation, and a strong first component. However, there are many instances in which this rule does not apply. Following Bollen and Lennox (1991), I differentiate between questionnaires such as anxiety or depression inventories, which are composed of items that are manifestations of an underlying hypothetical construct (i.e., where the items are called effect indicators) and those such as Scale 6 of the Minnesota Multiphasic Personality Inventory (Hathaway & McKinley, 1943) and ones used to tap quality of life or activities of daily living in which the items or subscales themselves define the construct (these items are called causal indicators). Questionnaires of the first sort, which are referred to as scales in this article, meet the criteria of classical test theory, whereas the second type, which are called indexes here, do not. I discuss the implications of this difference for how items are selected, the relationship among the items, and the statistics that should and should not be used in establishing the reliability of the scale or index.

14.
In this article, the psychometric properties of a new scale aimed at quantifying passion are explored, i.e. passion related to becoming good or achieving in some area/theme/skill. The Passion Scale was designed to be quantitative, simple to administer, applicable for large-group testing, and reliable in monitoring passion. A total of 126 participants between 18 and 47 years of age (mean age = 21.65, SD = 3.45) completed the Passion Scale, enabling us to investigate its feasibility, internal consistency, construct validity and test-retest reliability. Feasibility: The overall pattern of results suggests that the scale for passion presented here is applicable for the age range studied (18–47). Internal consistency: All individual item scores correlated positively with the total score, with correlations ranging from 0.51 to 0.69. The Cronbach's alpha value for the standardized items was 0.86. Construct validity: The Pearson correlation coefficient between the Passion Scale total score and the Grit-S scale was 0.39 for adults, mean age 21.23 (SD = 3.45) (N = 107). Test-retest reliability: The intraclass correlation coefficient (ICC) between test and retest scores for the total score was 0.92. These promising results warrant further development of the Passion Scale, including normalization based on a large, representative sample.

15.
The aim of the present study was to develop a Chinese version of the Cognitive Emotion Regulation Questionnaire (CERQ-C) and to examine its psychometric properties in a sample of Chinese university students. The English version of the Cognitive Emotion Regulation Questionnaire was translated and back-translated prior to its administration to 791 participants recruited from two universities in Changsha, Hunan (China). Internal consistency, test–retest reliability, inter-scale reliability, and factorial validity were analysed. The CERQ-C exhibited: (1) moderate internal consistency (Cronbach's α=.83); (2) a mean inter-class correlation coefficient of .79; (3) a mean inter-item correlation coefficient of .09; and (4) moderate test–retest reliability (.64). Confirmatory factor analyses supported the original CERQ nine-factor model. Finally, with respect to criterion validity, several CERQ-C subscales were uniquely associated with symptoms of depression and anxiety.

16.
Objective: This paper presents the findings of three studies aimed at validating a French version of the situational self-awareness scale (Govern & Marsch, 2001). The 9-item scale measures the extent to which people focus their attention on private or public aspects of themselves or on their surroundings. The scale was translated into French. The first study examined the factor structure, the second study focused on consistency and reliability, and the third study examined the convergent and discriminant validity of the scale. Method: Factor analyses were performed on data collected among 397 students. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and correlation analyses. Finally, we induced public and private self-awareness and assessed awareness to test the validity of the scale. Results: The results show that the scale has a three-factor structure and support the reliability of the scale over time. However, doubts remain over the construct validity of the public and private self-awareness dimensions. As expected, the data indicate that SSAS was sensitive to situational variations, in line with previous studies. Discussion: The discussion focuses on the arguments supporting the use of the original scale and the practical implications of the scale.

17.
In some situations where reliability must be estimated it is impossible to divide the measuring instrument into more than two separately scoreable parts. When this is the case, the parts may be homogeneous in content but clearly unequal in length. The resultant scores will not be essentially τ-equivalent, and hence total test reliability cannot be satisfactorily estimated via Cronbach's coefficient alpha. Limitation on the number of parts rules out Kristof's three-part approach. A technique is developed for estimating reliability in such situations. The approach is shown to function very well when applied to five achievement tests.
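The abstract does not state the formula it develops. As an illustration only, the sketch below uses the two-part coefficient usually attributed to Angoff and Feldt, which assumes that the parts' true scores and error variances are both proportional to part length; whether this matches the technique in the paper is an assumption. The function names and the population values are hypothetical.

```python
def two_part_alpha(var_a, var_b, cov_ab):
    """Coefficient alpha for a test split into two parts A and B."""
    var_total = var_a + var_b + 2 * cov_ab
    return 2 * (1 - (var_a + var_b) / var_total)


def two_part_length_adjusted(var_a, var_b, cov_ab):
    """Two-part reliability allowing unequal part lengths
    (Angoff-Feldt form; assumed here, not quoted from the paper)."""
    var_total = var_a + var_b + 2 * cov_ab
    return 4 * cov_ab / (var_total - (var_a - var_b) ** 2 / var_total)


# Hypothetical population values: part A is twice as long as part B.
# True scores: A = 2T, B = T with Var(T) = 1; error variances 1.0 and 0.5.
var_a = 2 ** 2 * 1.0 + 1.0      # 5.0
var_b = 1 ** 2 * 1.0 + 0.5      # 1.5
cov_ab = 2 * 1 * 1.0            # 2.0

# Reliability of the total score in this constructed example.
true_reliability = (2 + 1) ** 2 / (var_a + var_b + 2 * cov_ab)

print(round(two_part_alpha(var_a, var_b, cov_ab), 4))
print(round(two_part_length_adjusted(var_a, var_b, cov_ab), 4))
print(round(true_reliability, 4))
```

In this constructed case alpha falls below the true reliability, while the length-adjusted coefficient recovers it, which mirrors the abstract's remark that alpha is unsatisfactory when the two parts are not essentially τ-equivalent.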

18.
Abstract

A scale to measure defensiveness about the marital relationship and another to measure defensiveness about the sexual relationship of couples were developed for each sex. Defensiveness was defined as the tendency to endorse socially desirable items which are unlikely to occur and deny socially undesirable items which characterize most honest responders. The social desirability scale value of the items was empirically determined, and a traditional cross-validation design with two independent groups was used in the item analyses. Cronbach's alpha reliability coefficient ranged between .75 and .93 for the male and female versions of the scales. The scales correlated higher with another defensiveness scale than with a social desirability scale. Clinicians' ratings of the items in the scales suggested that the scales were not diagnostic of sexual or marital psychopathology. Evidence is presented to support the claim that these content-specific scales surpass a global defensiveness scale as a measure of defensiveness regarding the sexual or marital relationship of couples.

19.
This study presents data from a cross-national validation of the Kirton (1976) Adaption–Innovation Inventory (KAI) with a Slovak population sample. The results drawn from a large general population sample were very close to those of both the original British work and Italian studies (Prato Previde, 1984; 1991). The internal consistency was high for both the full scale (Cronbach's alpha=0·84) and the subscales (O=0·78; E=0·76; R=0·74). The internal structure of the inventory was also examined using factor analysis. The comparison of English, Italian, and Slovak norms supports the notion that cognitive style is deeply embedded in personality. The Slovak data provide further support for the hypothesis raised by Prato Previde (1991) that cognitive style is independent of culture. While social culture affected responses to some of the KAI items, the overall scale remained unaffected. © 1998 John Wiley & Sons, Ltd.

20.
Abstract

A 32-item questionnaire aimed at assessing patients' satisfaction with everyday life is presented. In the Satisfaction Profile (SAT-P), patients are asked to evaluate their own satisfaction level on 32 aspects of daily life over the last month. A total of 732 participants were enrolled in the study: 490 in-patients suffering from different types of chronic diseases (e.g., chronic heart failure, severe respiratory failure, coronary heart disease) and 242 healthy persons of working age. SAT-P validity was confirmed by comparing its scores with the NHP, EPQ and STAI-X2 scores. The factor analysis extracted 5 factors which corresponded to the hypothesised areas (54% of variance explained). Test-retest reliability and internal consistency were confirmed: Pearson's coefficients ranged from 0.45 to 0.93 and Cronbach's alpha coefficient was 0.92. SAT-P responsiveness, evaluated by comparing baseline and 6-month follow-up scores from 45 chronic heart failure patients, proved to be satisfactory, although further studies are needed. These results, together with the “user-friendly” structure, the brief administration and scoring time, and the simple graphic representation, suggest that the SAT-P is a useful complementary tool in HRQoL assessment. The Italian, English and French versions are available.
