1.

Estimating Coefficients of Equivalence and Stability for Job Performance Ratings: The importance of controlling for transient error on criterion measurement

Jesús F. Salgado 《International Journal of Selection & Assessment》2015,23(1):37-44

Previous research on measurement error in job performance ratings estimated reliability using three coefficients: alpha, test–retest, and interrater correlation. None of these three coefficients controls for the four main sources of error in performance ratings. For this reason, the coefficient of equivalence and stability (CES) has been suggested as the ideal estimate of reliability. This article presents estimates of CES for time intervals of 1, 2, and 3 years. The values obtained for a single rater were .51, .48, and .44, respectively. For two raters, the values were .59, .55, and .51. The findings suggest that previous reliability estimates based on alpha, test–retest, and interrater coefficients overestimated the reliability of job performance ratings. In the present study, the interrater coefficient overestimates reliability by 13.6–25.4% for time intervals of 1–3 years, as it does not control for transient error. Results also showed that the importance of transient error increases as the interval between the measures lengthens. Based on the results, it is suggested that corrected validities based on interrater reliability underestimate the magnitude of the validity. The implications of these findings for future efforts to estimate criterion reliability and predictor validity are discussed.
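
The abstract's last substantive claim (that validities corrected with interrater reliability come out too low) can be illustrated with a minimal sketch. It assumes the standard correction for unreliability in the criterion only, dividing the observed validity by the square root of the criterion reliability. The observed validity of .25 and the interrater value of .60 are hypothetical numbers chosen for illustration; only the CES of .51 comes from the abstract.

```python
import math


def correct_for_criterion_unreliability(observed_validity: float,
                                        criterion_reliability: float) -> float:
    """Disattenuate an observed validity for unreliability in the criterion only."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return observed_validity / math.sqrt(criterion_reliability)


OBSERVED_VALIDITY = 0.25        # hypothetical, not taken from the article
CES_SINGLE_RATER_1YR = 0.51     # reported in the abstract (single rater, 1 year)
HYPOTHETICAL_INTERRATER = 0.60  # hypothetical larger coefficient, for contrast

corrected_with_ces = correct_for_criterion_unreliability(
    OBSERVED_VALIDITY, CES_SINGLE_RATER_1YR)
corrected_with_interrater = correct_for_criterion_unreliability(
    OBSERVED_VALIDITY, HYPOTHETICAL_INTERRATER)

print(f"corrected with CES:        {corrected_with_ces:.3f}")
print(f"corrected with interrater: {corrected_with_interrater:.3f}")
```

Because the larger coefficient sits in the denominator, the interrater-based correction yields the smaller corrected validity, which is the direction of bias the abstract describes.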

2.

Integrative Self-Knowledge Scale: correlations and incremental validity of a cross-cultural measure developed in Iran and the United States

Ghorbani N Watson PJ Hargis MB 《The Journal of psychology》2008,142(4):395-412

The authors used Iranian (N = 723) and American (N = 900) samples to develop an Integrative Self-Knowledge Scale for measuring a temporally integrated understanding of processes within the self. They administered this new instrument, the Mindfulness Scale (K. W. Brown & R. M. Ryan, 2003), the Reflective and Experiential Self-Knowledge Scales (N. Ghorbani, M. N. Bing, P. J. Watson, H. R. Davison, & D. L. Lebreton, 2003), and additional sample-specific measures to 3 separate groups of university students in each society. The Integrative Self-Knowledge Scale displayed internal reliability and measurement equivalence, along with convergent, criterion, discriminant, and incremental validity. This new instrument may be useful in promoting cross-cultural research in positive psychology.

3.

Generalizability Theory: Research and Application Prospects. Total citations: 9; self-citations: 0; citations by others: 9

Liu Ju 《Psychological Science (心理科学)》2003,26(3):433-437

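Since Cronbach and his colleagues proposed generalizability theory in 1972, it has been widely applied in behavioral and psychological measurement. Compared with classical test theory, its advantages have gradually become apparent: (1) multiple sources of measurement error can be estimated separately within a single analysis; (2) it can guide decision makers in choosing the optimal measurement design; (3) it provides reliability coefficients, the generalizability coefficient (G coefficient) and the index of dependability (φ coefficient), for different decision tasks; and (4) it dispenses with the assumption of strictly parallel tests. Because of its precision and 可藏性 [sic], generalizability theory has won favor among researchers in reliability measurement. This article gives a detailed account of the basic framework, origins, development, and application prospects of generalizability theory.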
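
The two indices named in this abstract, the G coefficient and the φ coefficient, can be sketched for the simplest case. The example assumes a one-facet crossed design (persons × items) with variance components that have already been estimated; the function name and the numeric components are hypothetical and are not taken from the article.

```python
def g_and_phi(var_person: float, var_item: float,
              var_residual: float, n_items: int) -> tuple[float, float]:
    """Return (G, phi) for a persons x items design.

    G uses relative error (residual only); phi uses absolute error
    (item main effect plus residual). Both errors shrink with n_items.
    """
    relative_error = var_residual / n_items
    absolute_error = (var_item + var_residual) / n_items
    g = var_person / (var_person + relative_error)
    phi = var_person / (var_person + absolute_error)
    return g, phi


# Hypothetical variance components, for illustration only.
g, phi = g_and_phi(var_person=0.5, var_item=0.1, var_residual=0.4, n_items=10)
print(f"G = {g:.3f}, phi = {phi:.3f}")
```

Since absolute error includes everything in relative error plus the item main effect, φ can never exceed G, which is why the two indices suit different decision tasks (relative versus absolute decisions).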

4.

Estimation of Transient Error in Cognitive Ability Scales

Charlie L. Reeve Eric D. Heggestad Elisa George 《International Journal of Selection & Assessment》2005,13(4):316-320

In response to Schmidt, Le, and Ilies' (2003) call for further assessment of transient error variance in measures of key individual difference variables, the current study examines transient error estimates for eight speeded cognitive ability scales and a composite scale. One hundred twenty‐three participants were tested on two occasions with a 1‐week interval between administrations. Estimates of transient error ranged from a low of .01 to a high of .18, with an average value of .09. The coefficient of equivalence, based on the split‐half method, resulted in overestimates of reliability by as little as 1.90% to as much as 25.96%, with an average of 12.96%.
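
The quantities reported above can be related with a small sketch. It assumes, as in the approach this study responds to, that transient error is estimated as the difference between the coefficient of equivalence (CE) and the coefficient of equivalence and stability (CES), and that the percentage overestimate is that difference relative to CES. The input values are hypothetical, not figures from the study.

```python
def transient_error(ce: float, ces: float) -> float:
    """Transient error estimate, assumed here to be CE minus CES."""
    return ce - ces


def percent_overestimate(ce: float, ces: float) -> float:
    """How much CE overstates reliability relative to CES, in percent."""
    return 100.0 * (ce - ces) / ces


CE = 0.85   # hypothetical split-half coefficient of equivalence
CES = 0.76  # hypothetical coefficient of equivalence and stability

te = transient_error(CE, CES)
pct = percent_overestimate(CE, CES)
print(f"transient error = {te:.2f}, overestimate = {pct:.2f}%")
```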

5.

Functional equivalence of spatial representations derived from vision and language: evidence from allocentric judgments

Avraamides MN Loomis JM Klatzky RL Golledge RG 《Journal of experimental psychology. Learning, memory, and cognition》2004,30(4):804-814

Past research (e.g., J. M. Loomis, Y. Lippa, R. L. Klatzky, & R. G. Golledge, 2002) has indicated that spatial representations derived from spatial language can function equivalently to those derived from perception. The authors tested functional equivalence for reporting spatial relations that were not explicitly stated during learning. Participants learned a spatial layout by visual perception or spatial language and then made allocentric direction and distance judgments. Experiments 1 and 2 indicated allocentric relations could be accurately reported in all modalities, but visually perceived layouts, tested with or without vision, produced faster and less variable directional responses than language. In Experiment 3, when participants were forced to create a spatial image during learning (by spatially updating during a backward translation), functional equivalence of spatial language and visual perception was demonstrated by patterns of latency, systematic error, and variability.

6.

Mental Health Diathesis Assessment System: Development of the Life Beliefs Scale for Chinese Adults

Zhang Xiuge Liang Baoyong 《Studies of Psychology and Behavior (心理与行为研究)》2012,10(5):340-346

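The aim of this study was to develop a life beliefs scale suited to Chinese social and cultural characteristics. Based on a review of the relevant literature, and taking into account the overall structure of the Mental Health Diathesis Assessment System, a two-dimensional theoretical construct was established for the scale: rationality and controllability. An initial scale was formed by drawing on items from comparable foreign scales and by soliciting items from psychology experts, and the final items were selected through pilot testing. Test results showed that the Life Beliefs Scale has high test-retest reliability and internal consistency reliability, as well as high construct, content, convergent, and concurrent validity. Conclusion: the Life Beliefs Scale has satisfactory psychometric properties and can be used to assess the life beliefs of Chinese adults.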

7.

Person–environment congruence, self‐efficacy, and environmental identity in relation to job satisfaction: a career decision theory perspective

Stacie Vernick Perdue Robert C. Reardon Gary W. Peterson 《Journal of Employment Counseling》2007,44(1):29-39

This study explored the relationship of person–environment congruence, self‐efficacy, and environmental identity to job satisfaction. Participants were 198 employees of a multinational telecommunications corporation. The predictor domain included the Iachan Index (R. Iachan, 1984), the Mahalanobis Distance Index (L. J. Cronbach & G. C. Gleser, 1953), the Self‐Efficacy Scale (M. Sherer et al., 1982, 2000), and the Environmental Identity Scale (G. D. Gottfredson & J. L. Holland, 1996; J. L. Holland, 1997). The criterion domain included 6 components of job satisfaction. A canonical correlation analysis identified 2 significant roots labeled organizational mission satisfaction and work task satisfaction. Implications for career decision making are discussed.

8.

Assessing obsessive compulsive symptoms and cognitions on the internet: evidence for the comparability of paper and Internet administration

Coles ME Cook LM Blake TR 《Behaviour research and therapy》2007,45(9):2232-2240

Administration of psychological questionnaires via the Internet has gained popularity in recent years and touts many advantages. However, before questionnaires that were originally developed as paper-and-pencil measures can be confidently administered over the Internet, it is necessary to document the equivalence of the paper and computer-generated versions [American Psychological Association. (1986). Guidelines for computer-based tests and interpretations. Washington, DC: American Psychological Association; Cohen, R.J., Swerdlik, M.E., & Smith, D.K. (1992). Psychological testing and assessment (2nd ed.). Mountain View, CA: Mayfield Publishing; Cronbach, L.J. (1990). Essentials in psychological testing (5th ed.). New York: Harper Collins; Meier, S. (1994). The chronic crisis in psychological measurement and assessment: A historical survey. San Diego: Academic Press; Schulenberg, S.E., & Yutrzenka, B.A. (2001). Equivalence of computerized and conventional versions of the Beck Depression Inventory- II (BDI-II). Current Psychology: Developmental, Learning, Personality, Social, 20, 216-230]. The current study tested this equivalence for the Obsessive Compulsive Inventory [Foa, E.B., Kozak, M.J., Salkovskis, P.M, Coles, M.E., & Amir, N. (1998). The validation of a new obsessive compulsive disorder scale: The obsessive-compulsive inventory. Psychological Assessment, 10(3), 206-214] and the Obsessive Beliefs Questionnaire-44 [Obsessive Compulsive Cognitions Working Group. (2005). Psychometric validation of the obsessive belief questionnaire and interpretation of intrusions inventory-Part 2: Factor analyses and testing of a brief version. Behaviour Research and Therapy, 43, 1527-1543] in an unselected student sample. Study results support the equivalence of these measures of obsessive compulsive disorder (OCD) symptoms and beliefs independent of administration method (paper versus secure project website). 
These findings create new opportunities for conducting OCD-related research online.

9.

Using Generalizability Theory to Disattenuate Correlation Coefficients for Multiple Sources of Measurement Error

Walter P. Vispoel Carrie A. Morris Murat Kilinc 《Multivariate behavioral research》2013,48(4):481-501

Over the years, research in the social sciences has been dominated by reporting of reliability coefficients that fail to account for key sources of measurement error. Use of these coefficients, in turn, to correct for measurement error can hinder scientific progress by misrepresenting true relationships among the underlying constructs being investigated. In the research reported here, we addressed these issues using generalizability theory (G-theory) in both traditional and new ways to account for the three key sources of measurement error (random-response, specific-factor, and transient) that affect scores from objectively scored measures. Results from 20 widely used measures of personality, self-concept, and socially desirable responding showed that conventional indices consistently misrepresented reliability and relationships among psychological constructs by failing to account for key sources of measurement error and correlated transient errors within occasions. The results further revealed that G-theory served as an effective framework for remedying these problems. We discuss possible extensions in future research and provide code from the computer package R in an online supplement to enable readers to apply the procedures we demonstrate to their own research.

10.

Test “reliability”: Its meaning and determination

Lee J. Cronbach 《Psychometrika》1947,12(1):1-16

The concept of test reliability is examined in terms of general, group, and specific factors among the items, and the stability of scores in these factors from trial to trial. Four essentially different definitions of reliability are distinguished, which may be called the hypothetical self-correlation, the coefficient of equivalence, the coefficient of stability, and the coefficient of stability and equivalence. The possibility of estimating each of these coefficients is discussed. The coefficients are not interchangeable and have different values in corrections for attenuation, standard errors of measurement, and other practical applications.

11.

Reliability and validity of a measure of emotional intelligence in an Iranian sample

Yousefi F 《Psychological reports》2006,98(2):541-548

The purpose of this study was to investigate the reliability and validity of the Mayer-Salovey-Caruso Emotional Intelligence Test, Version 2.0, in the Iranian culture. The sample included 353 students (168 male, 185 female) from senior high schools in Shiraz, ranging in age between 16 and 18 years (M = 17.1, SD = .5), and 394 students (113 male, 281 female) from Shiraz University, ranging in age between 19 and 25 years (M = 21.3, SD = 1.7). The subscale-total score correlations were in the upper fifties. Cronbach coefficient alpha was .86 for the full score and ranged from .58 to .86 for the 4 subscales of the test. The factor analysis supported 1- and 2-factor solutions of the emotional intelligence domain. The results generally supported the reliability of the test at the total score level for research in the Iranian culture.

12.

Does computerizing paper-and-pencil job attitude scales make a difference? New IRT analyses offer insight

Donovan MA Drasgow F Probst TM 《The Journal of applied psychology》2000,85(2):305-313

13.

A coefficient alpha for test-retest data

Green SB 《Psychological Methods》2003,8(1):88-101

Transient errors are caused by variations in feelings, moods, and mental states over time. If these errors are present, coefficient alpha is an inflated estimate of reliability. A true-score model is presented that incorporates transient errors for test-retest data, and a reliability estimate is derived. This estimate, referred to as the test-retest alpha, is less than coefficient alpha if transient error is present and is less susceptible to effects due to item recall than a test-retest correlation. An assumption underlying the test-retest alpha is essential tau equivalency of items. A test-retest split-half coefficient is presented as an alternative to the test-retest alpha when this assumption is violated. The test-retest alpha is the mean of all possible test-retest split-half coefficients.

14.

A comparison of linear and nonlinear relations between organizational commitment and work outcomes

Luchak AA Gellatly IR 《The Journal of applied psychology》2007,92(3):786-793

The authors compared linear and nonlinear relations between affective and continuance commitment and 3 commonly studied work outcomes (turnover cognitions, absenteeism, and job performance), observed in 3 separate research settings. Using a linear model, they replicated the common observation in the literature that affective commitment is more strongly related to work outcomes than continuance commitment. Introducing a higher order continuance commitment term into the same equations, however, they found that the linear model seriously understated the magnitude of continuance commitment's effect on all 3 criterion measures. These findings are consistent with recent developments that identify different motivational mindsets associated with affective and continuance commitment (J. P. Meyer, T. E. Becker, & C. Vandenberghe, 2004).

15.

Transient error or specificity? An alternative to the staggered equivalent split-half procedure

Vautier S Jmel S 《Psychological Methods》2003,8(2):225-238

The data-based partial and complete reliability coefficients defined by G. Becker (2000) in his staggered equivalent split-half procedure (SESHP) were compared with the model-based specificity and consistency reliability coefficients defined by R. Steyer, M. J. Schmitt, and M. Eid (1999) in their latent state-trait model (LSTM). Partial reliability is based on coefficient alpha, which contains transient error. State and trait anxiety, measured using the State-Trait Anxiety Inventory on 2 occasions among French adults, illustrated both approaches. Theoretical and empirical analyses demonstrated that the specificity and consistency coefficients offer a testable alternative to the SESHP coefficients. In addition, dynamics of the state residual variance could be modeled and estimated in LSTM.

16.

Reliability of a Longitudinal Sequence of Scale Ratings

Annouschka Laenen Ariel Alonso Geert Molenberghs Tony Vangeneugden 《Psychometrika》2009,74(1):49-64

Reliability captures the influence of error on a measurement and, in the classical setting, is defined as one minus the ratio of the error variance to the total variance. Laenen, Alonso, and Molenberghs (Psychometrika 73:443–448, 2007) proposed an axiomatic definition of reliability and introduced the R_T coefficient, a measure of reliability extending the classical approach to a more general longitudinal scenario. The R_T coefficient can be interpreted as the average reliability over different time points and can also be calculated for each time point separately. In this paper, we introduce a new and complementary measure, the so-called R_Λ, which implies a new way of thinking about reliability. In a longitudinal context, each measurement brings additional knowledge and leads to more reliable information. The R_Λ captures this intuitive idea and expresses the reliability of the entire longitudinal sequence, in contrast to an average or occasion-specific measure. We study the measure's properties using both theoretical arguments and simulations, establish its connections with previous proposals, and elucidate its performance in a real case study. The authors are grateful to J&J PRD for kind permission to use their data. We gratefully acknowledge support from Belgian IUAP/PAI network "Statistical Techniques and Modeling for Complex Substantive Questions with Complex Data."

17.

Connectionist and diffusion models of reaction time. Total citations: 8; self-citations: 0; citations by others: 8

R Ratcliff T Van Zandt G McKoon 《Psychological review》1999,106(2):261-300

Two connectionist frameworks, GRAIN (J. L. McClelland, 1993) and brain-state-in-a-box (J. A. Anderson, 1991), and R. Ratcliff's (1978) diffusion model were evaluated using data from a signal detection task. Dependent variables included response probabilities, reaction times for correct and error responses, and shapes of reaction-time distributions. The diffusion model accounted for all aspects of the data, including error reaction times that had previously been a problem for all response-time models. The connectionist models accounted for many aspects of the data adequately, but each failed to a greater or lesser degree in important ways except for one model that was similar to the diffusion model. The findings advance the development of the diffusion model and show that the long tradition of reaction-time research and theory is a fertile domain for development and testing of connectionist assumptions about how decisions are generated over time.

18.

Test reliability and correction for attenuation

JOHNSON HG 《Psychometrika》1950,15(2):115-119

Evidence is cited to show that specificity, or lack of equivalence, in the comparable forms of tests has a tendency to lower the value of reliability coefficients but has no tendency to lower the value of observed trait coefficients. This implies that the greater the lack of equivalence, the higher will be coefficients corrected for attenuation. Errors of measurement are supposed to reduce the magnitude of observed trait coefficients. Since specificity does not lower the correlation between two tests and since the split-half and equivalent-form reliability coefficients treat specificity as error, it follows that these two coefficients cannot legitimately be used in Spearman's correction-for-attenuation formula.
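
For reference, Spearman's correction-for-attenuation formula mentioned in this abstract is usually written as follows, where the symbols are notation assumed here rather than taken from the article: $r_{xy}$ is the observed correlation between two tests, and $r_{xx'}$ and $r_{yy'}$ are their reliability coefficients.

```latex
\hat{\rho}_{xy} = \frac{r_{xy}}{\sqrt{r_{xx'}\, r_{yy'}}}
```

The abstract's argument follows directly from the denominator: anything that lowers $r_{xx'}$ or $r_{yy'}$ without lowering $r_{xy}$ (here, specificity) inflates the corrected coefficient.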

19.

Test of the cross-cultural generalizability of a model of sexual harassment. Total citations: 3; self-citations: 0; citations by others: 3

Wasti SA Bergman ME Glomb TM Drasgow F 《The Journal of applied psychology》2000,85(5):766-778

Sexual harassment research has been primarily limited to examination of the phenomenon in U.S. organizations; attempts to explore the generalizability of constructs and theoretical models across cultures are rare. This study examined (a) the measurement equivalence of survey scales in U.S. and Turkish samples using mean and covariance structure analysis and (b) the generalizability of the L. F. Fitzgerald, F. Drasgow, C. L. Hulin, M. J. Gelfand, and V. J. Magley (1997) model of sexual harassment to the Turkish context using structural equations modeling. Analyses used questionnaire data from 336 Turkish women and 455 women from the United States. The results indicate that, in general, the survey scales demonstrate measurement equivalence and the pattern of relationships in the Fitzgerald et al. model generalizes to the Turkish culture. These results support the usefulness of the model for explaining sexual harassment experiences in a variety of organizational and cultural contexts.

20.

MEASUREMENT ERROR IN RESEARCH ON HUMAN RESOURCES AND FIRM PERFORMANCE: HOW MUCH ERROR IS THERE AND HOW DOES IT INFLUENCE EFFECT SIZE ESTIMATES? Total citations: 8; self-citations: 1; citations by others: 7

BARRY GERHART PATRICK M. WRIGHT GARY C. MCMAHAN SCOTT A. SNELL 《Personnel Psychology》2000,53(4):803-834

Studies of the relationship between human resource (HR) practices and firm performance typically use a single respondent to assess firm level HR practices or HR effectiveness. However, previous research in other substantive areas suggests that rater differences are a potentially important source of measurement error. We demonstrate analytically the potential consequences of both random and systematic measurement error in research on HR and firm performance. However, our main focus is on random error and we show how generalizability theory can be applied to obtain better estimates of reliability by simultaneously recognizing multiple sources (e.g., items, raters) of random measurement error. These more inclusive reliability estimates, in turn, offer the possibility of more precisely quantifying substantive relationships in the HR and firm performance literature. In our sample, reliabilities (as estimated by generalizability coefficients) for single-rater assessments of HR variables were generally below .50. This degree of measurement error, if present in substantive studies on HR and firm performance, could lead to considerable bias, given that an unstandardized regression coefficient is corrected for measurement error in the independent variable by dividing by its reliability coefficient (not its square root). We also found only limited convergent validity between HR and line managers' ratings of a second type of HR measure, HR effectiveness. In general, our findings suggest that future researchers need to devote greater attention to measurement error and construct validity issues. Our study provides an example of how generalizability theory can be useful in this pursuit.
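
The distinction this abstract draws between dividing by the reliability and dividing by its square root can be made concrete with a short sketch. It assumes measurement error only in the independent variable. The observed slope and correlation of .20 are hypothetical; the reliability of .50 echoes the upper bound the abstract reports for single-rater assessments.

```python
import math


def correct_slope(observed_slope: float, reliability_x: float) -> float:
    """Correct an unstandardized regression slope for error in x: divide by r_xx."""
    return observed_slope / reliability_x


def correct_correlation(observed_r: float, reliability_x: float) -> float:
    """Correct a correlation for error in x only: divide by sqrt(r_xx)."""
    return observed_r / math.sqrt(reliability_x)


RELIABILITY_X = 0.50   # upper bound reported in the abstract for single raters
OBSERVED_VALUE = 0.20  # hypothetical observed slope and correlation

corrected_slope = correct_slope(OBSERVED_VALUE, RELIABILITY_X)
corrected_r = correct_correlation(OBSERVED_VALUE, RELIABILITY_X)
print(f"slope: {OBSERVED_VALUE} -> {corrected_slope:.3f}")
print(f"correlation: {OBSERVED_VALUE} -> {corrected_r:.3f}")
```

At a reliability of .50 the slope doubles while the correlation grows by a factor of about 1.41, which is why the abstract stresses that the bias in unstandardized coefficients can be considerable.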