The assumptions and statistical implications of six methods of test use in employment contexts were examined by using an actual distribution of test scores of 3,377 candidates for jobs as firefighters in a large U.S. city. The six methods examined were: (a) strict top-down referral in order of test score; (b) within-group percentile referral; (c) fixed bands, using random referral within bands; (d) fixed bands, using nonrandom, diversity-based referral within bands; (e) sliding bands, using random referral within bands; and (f) sliding bands, using nonrandom, diversity-based referral within bands. The six strategies yielded significant differences on two related criteria: the percentages of minority and nonminority candidates referred for selection, and the relative level of adverse impact produced by each referral strategy. Average test scores of selectees at two different selection ratios did not differ significantly across the six referral strategies. Implications of the findings are discussed in terms of their impact on merit hiring and the achievement of economic and social goals. In general, use of any strategy other than strict top-down referral results in some loss in utility. However, if the goal is to optimize social and economic objectives simultaneously, then tradeoffs are necessary. In some situations, use of the sliding band (with diversity-based referral) may be best suited to this purpose.

In this article, I first demonstrate that statistical significance testing of differences between predictor scores (whether based on the standard error of measurement or any other statistic) is irrelevant to, and inconsistent with, the traditional, optimizing selection model. Second, I demonstrate that all banding procedures used in (or advocated for) personnel selection, including the sliding-band procedures advocated by Cascio, Outtz, Zedeck, and Goldstein (1991-this issue), are fatally flawed logically. I show that, when the number of applicants is large, all banding procedures logically lead to the absurd conclusion that the only justifiable form of selection is random selection. Third, I present evidence that the empirical data set used by Cascio et al. to evaluate different selection strategies is anomalous and leads to results very different from those to be expected in typical and representative data. Specifically, the effect is to produce misleadingly small differences between the strategies in mean test scores of selectees and, therefore, in selection utility. In particular, selection utility losses from all forms of banding, in comparison to top-down selection, are understated. Finally, I show that, apart from the lethal logical flaw in banding procedures, Cascio et al. misinterpreted the meaning and nature of statistical significance testing.

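A simulation experiment on the limitations of exploratory factor analysis in test construction
刘红云 (Liu Hongyun), 孟庆茂 (Meng Qingmao). 《心理科学》, 2002, 25(2): 177-179
This article uses simulation to examine the limitations of exploratory factor analysis in test construction, generating data that fit a confirmatory factor analysis model well. The results show that exploratory factor analysis, as a purely data-driven statistical method, yields conclusions inconsistent with the theoretical hypotheses when the correlations between factors are high. The article also offers a preliminary discussion of some limits on convergent validity in tests and, with reference to specific cases, describes the nature of the moderate-correlation constraint.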

When using latent class analysis to explore multivariate categorical data, an important question is how many classes are appropriate for the data. An obvious candidate to answer the question is the likelihood ratio test of c_0 against c_1 classes. In this paper this test is investigated by Monte Carlo methods; results confirm that the usually assumed null distribution is inappropriate.

This research demonstrates the effect of framing on applicants' reactions to two personnel selection methods: undergraduate grade point average and personnel interview scores. Presenting a selection situation framed positively (to accept applicants) caused applicants to rate both selection methods more favorably relative to presenting them with an identical selection situation framed negatively (to reject the remaining applicants). Framing affected reactions that emphasized distributive justice aspects of the selection situation and procedural justice aspects. The results are consistent with Prospect theory and with Fairness Heuristic theory. The paper offers a theoretical explanation for the effect of framing on applicants' reactions to personnel selection methods, discusses the implications of this effect, and suggests directions for future research.

This study expands upon Steiner and Gilliland's selection fairness research. Professionals (N = 114) from Mumbai, India rated 12 employee selection methods on favorability and provided the bases for those ratings. In line with previous research, interviews and resumes were rated most favorably, while graphology and honesty tests were rated least favorably. Perceived face validity, opportunity to perform, and widespread use of selection methods were highly correlated with favorability ratings, while interpersonal warmth, scientific evidence, and respect for privacy exhibited weak correlations with favorability ratings. Work sample tests, which have previously been rated favorably, were rated unfavorably. Exploratory analysis showed that participants viewed assessment centers favorably and online information unfavorably. Outcome favorability was highly correlated with favorability ratings.

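Applicant reactions are attitudinal or behavioral outcomes directed at the organization that arise from an individual's fairness perceptions in a selection situation. The first theoretical model of applicant reactions was Gilliland's selection justice model; heuristic, integrative, and trust models followed, greatly enriching the theory of applicant reactions. Measurement of applicant reactions, however, has clearly lagged behind theory, as shown by inconsistent instruments, muddled measurement structures, and a scarcity of reliability and validity studies. Directions for future research are: (1) strengthening the insufficiently supported links in existing models; (2) further enriching cross-cultural research; (3) moving from general questions toward studies of specific contexts; and (4) drawing on other fields to develop in a more pluralistic direction.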

The method of selecting among job applicants using statistically based banding has been proposed over the last 10 years as a way to increase workforce diversity. The method continues to be reviewed by academics and considered by practitioners. Although the goal of increasing workforce diversity is important, statistical banding of scores remains controversial. We present a set of unique, statistically and theoretically based criticisms of a form of banding (top‐score‐referenced banding) that is widely used in hundreds of jobs in the public sector throughout the United States. We suggest that even within the premises of such banding, the wrong formula is used to estimate the standard error of measurement and standard error of the difference. One consequence is that too many individuals are labeled as essentially equal with respect to test scores. A related consequence is that test scores within a single band are statistically different and should therefore be treated as such for selection purposes. A more logically and statistically defensible procedure for responding to diversity concerns is to continue to attend to adverse impact issues at each step of the recruiting and test development process.
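
For orientation, the conventional band-width calculation that this abstract criticizes can be sketched as follows. The formulas (SEM = SD * sqrt(1 - r_xx), SED = SEM * sqrt(2), band width = C * SED) are the ones commonly given in the banding literature and are assumed here; they are not quoted from the abstract, and the abstract's proposed correction is not shown. The function names and all numeric inputs are hypothetical.

```python
import math

def band_width(sd, reliability, c=1.96):
    """Conventional band width: C * SED, with SED = SEM * sqrt(2)."""
    sem = sd * math.sqrt(1.0 - reliability)  # standard error of measurement
    sed = sem * math.sqrt(2.0)               # standard error of the difference
    return c * sed

def top_score_band(scores, sd, reliability, c=1.96):
    """Scores treated as equivalent to the top score under
    top-score-referenced banding (assumed conventional definition)."""
    cutoff = max(scores) - band_width(sd, reliability, c)
    return [s for s in scores if s >= cutoff]

# Hypothetical test: SD = 10, reliability = .90
scores = [95, 93, 90, 88, 84, 80]
print(round(band_width(10, 0.90), 2))      # 8.77
print(top_score_band(scores, 10, 0.90))    # [95, 93, 90, 88]
```

With these made-up inputs, four of the six candidates fall in the top band, which illustrates the abstract's concern about how many individuals a band can label as essentially equal.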

The authors conducted a Monte Carlo simulation of 8 statistical tests for comparing dependent zero-order correlations. In particular, they evaluated the Type I error rates and power of a number of test statistics for sample sizes (Ns) of 20, 50, 100, and 300 under 3 different population distributions (normal, uniform, and exponential). For the Type I error rate analyses, the authors evaluated 3 different magnitudes of the predictor-criterion correlations (rho(y,x1) = rho(y,x2) = .1, .4, and .7). For the power analyses, they examined 3 different effect sizes or magnitudes of discrepancy between rho(y,x1) and rho(y,x2) (values of .1, .3, and .6). They conducted all of the simulations at 3 different levels of predictor intercorrelation (rho(x1,x2) = .1, .3, and .6). The results indicated that both Type I error rate and power depend not only on sample size and population distribution, but also on (a) the predictor intercorrelation and (b) the effect size (for power) or the magnitude of the predictor-criterion correlations (for Type I error rate). When the authors considered Type I error rate and power simultaneously, the findings suggested that O. J. Dunn and V. A. Clark's (1969) z and E. J. Williams's (1959) t have the best overall statistical properties. The findings extend and refine previous simulation research and as such, should have greater utility for applied researchers.
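
One cell of such a design can be sketched as below: a Type I error simulation for Williams's t under a trivariate normal population. This is an illustrative reconstruction, not the authors' code. The statistic is written in the form commonly given in the literature (df = N - 3), and the data generator, function names, seed, and replication count are assumptions. The critical value is passed in by the caller because the Python standard library has no t quantile function.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def williams_t(r_y1, r_y2, r_12, n):
    """Williams's t for H0: rho(y,x1) = rho(y,x2); df = n - 3."""
    det = 1 - r_y1**2 - r_y2**2 - r_12**2 + 2 * r_y1 * r_y2 * r_12
    r_bar = (r_y1 + r_y2) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r_12) ** 3
    return (r_y1 - r_y2) * math.sqrt((n - 1) * (1 + r_12) / denom)

def type1_rate(rho, r_12, n, t_crit, reps=2000, seed=1):
    """Share of replications rejecting a true H0 (two-tailed).

    Builds x1, x2, y from independent standard normals so that
    corr(y,x1) = corr(y,x2) = rho and corr(x1,x2) = r_12.
    Requires rho**2 + b**2 < 1, which holds for moderate values.
    """
    rng = random.Random(seed)
    s = math.sqrt(1 - r_12**2)
    b = rho * (1 - r_12) / s
    c = math.sqrt(1 - rho**2 - b**2)
    rejections = 0
    for _ in range(reps):
        x1, x2, y = [], [], []
        for _ in range(n):
            z1, z2, z3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
            x1.append(z1)
            x2.append(r_12 * z1 + s * z2)
            y.append(rho * z1 + b * z2 + c * z3)
        t = williams_t(pearson(y, x1), pearson(y, x2), pearson(x1, x2), n)
        rejections += abs(t) > t_crit
    return rejections / reps

# N = 50 gives df = 47; 2.0117 is the tabled two-tailed .05 critical value.
print(type1_rate(rho=0.4, r_12=0.3, n=50, t_crit=2.0117))
```

A rate near the nominal .05 would be the expected result under normality. Repeating the call over the sample sizes, correlation magnitudes, and intercorrelations listed in the abstract would reproduce the shape of the design, though not the uniform or exponential conditions.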

The present study explored the effects of 2 variables, affirmative action (AA) attitude and gender, on reactions to 3 test score use (TSU) methods: top‐down selection, banding with random selection, and banding with preferences. In a study of 94 upper‐division and graduate business students, AA attitude was associated with different reactions to TSU methods in terms of fairness and organizational attractiveness. Moreover, women with negative AA attitudes and men rated banding with preferences lower than the other two methods, but women with positive AA attitudes did not. Results are discussed in terms of applicant reactions models, implications for organizations, and future research.

Companies frequently use preselection methods in order to identify eligible candidates before conducting assessment centers (ACs). The present study was the first to investigate current practices of preselection in German companies that use ACs for internal selection. Results of a survey among 109 German companies show that companies typically apply general qualification criteria (e.g., work experience), appraisals, or unstructured interviews for preselection, but rely less on trait‐oriented methods (e.g., intelligence tests) and structured interviews. In their subjective evaluations, however, companies rate structured interviews and trait‐oriented methods as particularly valid methods. The results also show that the number of preselection methods used is positively correlated with company size, diagnostic skills of the responsible persons in the preselection, and DIN (Deutsches Institut für Normung [German Institute for Standardization]) 33430 certification status. The implications of these findings for practice and research are discussed.

Recent research has seen intraindividual variability become a useful technique to incorporate trial-to-trial variability into many types of psychological studies. Intraindividual variability, as measured by individual standard deviations (ISDs), has shown unique prediction to several types of positive and negative outcomes (Ram, Rabbit, Stollery, & Nesselroade, 2005). One unanswered question regarding measuring intraindividual variability is its reliability and the conditions under which optimal reliability is achieved. Monte Carlo simulation studies were conducted to determine the reliability of the ISD as compared with the intraindividual mean. The results indicate that ISDs generally have poor reliability and are sensitive to insufficient measurement occasions, poor test reliability, and unfavorable amounts and distributions of variability in the population. Secondary analysis of psychological data shows that use of individual standard deviations in unfavorable conditions leads to a marked reduction in statistical power, although careful adherence to underlying statistical assumptions allows their use as a basic research tool.

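Faking is widespread at every stage of personnel selection and affects final selection outcomes. Researchers define faking in markedly different ways, mainly because they understand its structure, its sources of variance, and its level differently. The different definitions give rise to a variety of measurement methods, of which four types are in common use: baseline difference scores, cognitive pattern methods, embedded scales, and behavioral pattern methods. An analysis of these four types in terms of measurement indicators, number of administrations, and content shows that they differ in how well they detect faking and in how feasible they are for measuring faking during selection. Future research should refine existing measures of faking, develop instruments for measuring the motivation to fake, strengthen research on controlling faking as a process, and explore individual differences in faking in greater depth.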

Monte Carlo procedures are used to study the sampling distribution of the Hoyt reliability coefficient. Estimates of mean, variance, and skewness are made for the case of the Bower-Trabasso concept identification model. Given the Bower-Trabasso assumptions, the Hoyt coefficient of a particular concept identification experiment is shown to be statistically unlikely.
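
As a rough illustration of the kind of procedure described, the sketch below computes the Hoyt coefficient from a persons-by-items score matrix (1 - MS_error / MS_persons from a two-way ANOVA without replication) and estimates its mean and variance by Monte Carlo. The data generator is a deliberately simple stand-in (each simulated person has a fixed success probability) and is not the Bower-Trabasso model; all names, sample sizes, and parameter values are hypothetical.

```python
import random
import statistics

def hoyt_reliability(scores):
    """Hoyt coefficient for a list of rows (one row of item scores per person)."""
    n = len(scores)       # number of persons
    k = len(scores[0])    # number of items
    grand = sum(map(sum, scores)) / (n * k)
    person_means = [sum(row) / k for row in scores]
    item_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_persons = k * sum((m - grand) ** 2 for m in person_means)
    ss_items = n * sum((m - grand) ** 2 for m in item_means)
    ss_error = ss_total - ss_persons - ss_items
    ms_persons = ss_persons / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return 1 - ms_error / ms_persons

def simulate_hoyt(reps=500, n_persons=30, n_items=10, seed=0):
    """Monte Carlo mean and variance of the coefficient under the stand-in model."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        data = []
        for _ in range(n_persons):
            p = rng.uniform(0.3, 0.9)  # hypothetical per-person success probability
            data.append([1 if rng.random() < p else 0 for _ in range(n_items)])
        values.append(hoyt_reliability(data))
    return statistics.mean(values), statistics.variance(values)

# Small hand-checkable example: 5 persons, 4 dichotomous items.
data = [[1, 1, 1, 0], [1, 0, 1, 0], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(round(hoyt_reliability(data), 3))  # 0.8
```

Comparing an observed coefficient with the simulated distribution from `simulate_hoyt` is the general logic by which a value can be judged statistically unlikely under a model. A faithful replication would replace the stand-in generator with the model's own assumptions and add a skewness estimate.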

The use of propensity scores in psychological and educational research has been steadily increasing in the last 2 to 3 years. However, there are some common misconceptions about the use of different estimation techniques and conditioning choices in the context of propensity score analysis. In addition, reporting practices for propensity score analyses often lack important details that allow other researchers to confidently judge the appropriateness of reported analyses and potentially to replicate published findings. In this article we conduct a systematic literature review of a large number of published articles in major areas of social science that used propensity scores up until the fall of 2009. We identify common errors in estimation, conditioning, and reporting of propensity score analyses and suggest possible solutions.

