Similar literature
 19 similar documents found (search time: 156 ms)
1.
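A review of theoretical research on validity generalization   Total citations: 1 (self-citations: 0, citations by others: 1)
Validity generalization (VG) combines psychometric theory with meta-analysis and is an important application of meta-analysis in personnel assessment. It poses a profound challenge to the situational specificity hypothesis of personnel assessment validity and is a milestone in the development of industrial and organizational psychology. This article systematically introduces the logic, main models, key theoretical terms and applications, and research trends of validity generalization theory.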

2.
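Generalizability theory is widely used in psychological assessment practice. Under a budget constraint, it must address how to design a measurement procedure that is both relatively reliable and relatively feasible, which requires estimating the optimal sample size by some means. The Lagrange multiplier method is a relatively mature approach to estimating the optimal sample size under budget constraints in generalizability theory. This paper examines factors that affect such estimates, such as the rounding of the total budget, and proposes improvements, such as deriving a unified formula for the Lagrange multiplier method.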
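
As a rough illustration of the budget-constrained design problem described in this abstract, the sketch below enumerates integer sample sizes directly. It is not the Lagrange multiplier method the abstract discusses; exhaustive search is swapped in because it sidesteps the rounding of a continuous solution. It assumes a two-facet crossed p × i × r design, a hypothetical cost model (a fixed cost per item-rater combination), and made-up variance components. None of these values come from the paper.

```python
from itertools import product


def relative_error_variance(n_i, n_r, var_pi, var_pr, var_pir_e):
    """Relative error variance for a crossed p x i x r design."""
    return var_pi / n_i + var_pr / n_r + var_pir_e / (n_i * n_r)


def best_design(budget, cost_per_cell, var_pi, var_pr, var_pir_e):
    """Search all integer (n_i, n_r) with cost_per_cell * n_i * n_r <= budget.

    Returns (error_variance, n_i, n_r), or None if no design is affordable.
    The cost model is an assumption made for this sketch.
    """
    max_cells = int(budget // cost_per_cell)
    best = None
    for n_i, n_r in product(range(1, max_cells + 1), repeat=2):
        if n_i * n_r > max_cells:
            continue  # over budget
        err = relative_error_variance(n_i, n_r, var_pi, var_pr, var_pir_e)
        if best is None or err < best[0]:
            best = (err, n_i, n_r)
    return best


# Hypothetical variance components and budget, for illustration only.
error, n_items, n_raters = best_design(
    budget=12, cost_per_cell=1, var_pi=0.6, var_pr=0.2, var_pir_e=0.8
)
print(n_items, n_raters, round(error, 4))
```

Exhaustive search is only practical for small budgets and few facets; for larger problems an analytic or heuristic method would be needed.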

3.
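The assessment center technique is a major form of modern personnel assessment, which evaluates senior managerial talent mainly through multiple simulation tasks. However, few studies have shown that it has satisfactory construct validity. This study used the multitrait-multimethod approach to analyze a real assessment center project at a financial company and found that its construct validity was not satisfactory. Multivariate generalizability theory analysis was then used to explore the reasons, identifying the relationships between assessment center tasks and the dimensions measured, as well as the measurement reliability of each dimension. Finally, based on the results, ways of improving the construct validity and the overall measurement reliability of assessment centers are discussed.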

4.
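The reliability concepts of classical true score theory and of generalizability theory are the two major reliability theories in contemporary psychometrics. This article briefly introduces the basic ideas of both and, on that basis, systematically analyzes and compares their characteristics. It argues that generalizability theory has more advantages than classical true score theory in its theoretical assumptions, its techniques for jointly analyzing sources of measurement error, and its study of generalizing across measurement conditions; however, limitations such as its emphasis on the unidimensionality of the measured psychological trait and negative estimates of variance components greatly restrict its application in practical measurement. The article points out that, at the current state of development, neither reliability theory can subsume or replace the other, and accordingly offers several suggestions for the sound use of contemporary reliability theories.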

5.
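A review of the reliability concepts of classical true score theory and generalizability theory   Total citations: 7 (self-citations: 0, citations by others: 7)
The reliability concepts of classical true score theory and of generalizability theory are the two major reliability theories in contemporary psychometrics. This article briefly introduces the basic ideas of both and, on that basis, systematically analyzes and compares their characteristics. It argues that generalizability theory has more advantages than classical true score theory in its theoretical assumptions, its techniques for jointly analyzing sources of measurement error, and its study of generalizing across measurement conditions; however, limitations such as its emphasis on the unidimensionality of the measured psychological trait and negative estimates of variance components greatly restrict its application in practical measurement. The article points out that, at the current state of development, neither reliability theory can subsume or replace the other, and accordingly offers several suggestions for the sound use of contemporary reliability theories.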

6.
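Research on generalizability theory and its application prospects   Total citations: 9 (self-citations: 0, citations by others: 9)
Liu Ju (刘桔). Psychological Science (《心理科学》), 2003, 26(3): 433-437
Since Cronbach and his colleagues proposed generalizability theory in 1972, it has been widely applied in behavioral and psychological measurement, and its advantages over classical test theory have gradually become apparent: (1) multiple sources of measurement error can be estimated separately within a single analysis; (2) it can guide decision makers in choosing an optimal measurement design; (3) it provides reliability coefficients, the generalizability coefficient (G coefficient) and the index of dependability (Φ coefficient), for different decision tasks; (4) it drops the assumption of strictly parallel tests. Generalizability theory has won favor among researchers in reliability measurement for its precision and 可藏性 [sic]. This article gives a detailed account of the basic framework, origin, development, and application prospects of generalizability theory.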
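
To make the two coefficients named in point (3) concrete, here is a minimal sketch for the simplest case, a one-facet crossed persons × items design with one score per cell. It uses the standard ANOVA expected-mean-square estimators; the function names and the toy score matrix are assumptions made for illustration and do not come from the article.

```python
def variance_components(scores):
    """Estimate variance components for a crossed p x i design.

    scores[p][i] is the score of person p on item i (one observation per cell).
    Estimates are returned untruncated, so they can be negative in small samples.
    """
    n_p, n_i = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_i)
    p_means = [sum(row) / n_i for row in scores]
    i_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]

    # Mean squares for persons, items, and the residual (interaction + error).
    ms_p = n_i * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_i = n_p * sum((m - grand) ** 2 for m in i_means) / (n_i - 1)
    ss_res = sum(
        (scores[p][j] - p_means[p] - i_means[j] + grand) ** 2
        for p in range(n_p)
        for j in range(n_i)
    )
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    return {
        "p": (ms_p - ms_res) / n_i,
        "i": (ms_i - ms_res) / n_p,
        "pi,e": ms_res,
    }


def g_and_phi(vc, n_items):
    """Return (G coefficient, Phi coefficient) for a D study with n_items items."""
    rel_err = vc["pi,e"] / n_items               # error for relative decisions
    abs_err = (vc["i"] + vc["pi,e"]) / n_items   # error for absolute decisions
    return vc["p"] / (vc["p"] + rel_err), vc["p"] / (vc["p"] + abs_err)


# Toy data: 4 persons x 3 items (made up for illustration).
toy_scores = [[7, 8, 6], [5, 6, 4], [9, 9, 8], [4, 6, 3]]
g, phi = g_and_phi(variance_components(toy_scores), n_items=3)
print(round(g, 3), round(phi, 3))
```

The G coefficient ignores the item main effect because it cancels in relative (norm-referenced) comparisons, whereas Φ counts it as error, so Φ never exceeds G when the item variance estimate is non-negative.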

7.
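Multivariate generalizability theory was used to analyze the multiple sources of variation in a structured interview involving 75 applicants and 7 interviewers. The results show that the generalizability coefficients of the 5 assessment factors ranged from about 0.81 to 0.88, indicating that the 7 interviewers' ratings of applicants were not highly consistent but were marginally acceptable; the main source of error was the applicant-by-interviewer interaction. Adding more interviewers would effectively raise the generalizability coefficients but is impractical; uniform training of interviewers on the scoring standards is the best solution. Generalizability theory has advantages that classical test theory cannot match and is well suited to wide use in analyzing structured interview data.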

8.
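Using the many-facet Rasch model (MFRM) from item response theory (IRT), this study analyzed the results of a leaderless group discussion (LGD) in an assessment center, examining examinee ability, rater severity and leniency, internal consistency of ratings, dimension difficulty, and rating scale categories, and then discussing various biases. Analyzing personnel assessment results with MFRM gives insight into true differences in examinee ability, identifies dimension difficulty, and detects sources of assessment error. This in turn helps to improve the design of assessment items, to evaluate or diagnose rater qualification, and to improve the match between assessment dimensions and assessment purposes, offering a distinctive perspective for extending IRT in personnel assessment.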

9.
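A multi-method approach to validating personnel assessment   Total citations: 1 (self-citations: 0, citations by others: 1)
Focusing on methodological problems in personnel assessment research and practice in China, and drawing on recent developments in the theory and methods of personnel psychology, this article discusses several models and specific methods for validating personnel assessment. On this basis, it proposes a multi-method approach to validating personnel assessment and discusses its implications for human resource management.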

10.
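Estimating the variability of variance components based on generalizability theory   Total citations: 2 (self-citations: 0, citations by others: 2)
Li Guangming (黎光明), Zhang Minqiang (张敏强). Acta Psychologica Sinica (《心理学报》), 2009, 41(9): 889-901
Generalizability theory is widely used in psychological and educational measurement practice, and variance component estimation is the key to generalizability analysis. Because variance component estimates are subject to sampling, their variability needs to be examined. Using Monte Carlo data simulation, this study examined how different methods affect the estimation of variance component variability under normally distributed data. The results show that the Jackknife method is not advisable for estimating the variability of variance components. When the "divide and conquer" strategy for the Bootstrap method is not adopted, the Traditional method and the MCMC method with prior information have, on the whole, clear advantages in estimating the two variability measures, standard errors and confidence intervals.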

11.
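Reliability concepts in three psychometric theories   Total citations: 5 (self-citations: 0, citations by others: 5)
At present there are three main theoretical schools in psychometrics. This article briefly introduces the three theories (classical test theory, generalizability theory, and item response theory), with a focus on their views of reliability. It discusses the theoretical foundations and research methods of the three views, compares their similarities and differences, and points out some shortcomings of classical test theory and the improvements made by generalizability theory and item response theory. Generalizability theory extends classical test theory, replacing its reliability coefficient with a multidimensional reliability index (the generalizability coefficient). Item response theory starts from the notion of information and uses indices such as the item information function and the test information function to reflect the measurement reliability of items and tests more specifically and in greater depth.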

12.
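Li Guangming (黎光明), Qin Yue (秦越). Acta Psychologica Sinica (《心理学报》), 2022, 54(10): 1262-1276
Generalizability theory is widely applied in psychological and educational measurement. How to make a measurement procedure achieve good reliability under a budget constraint is an important question for researchers, and it can be recast as a problem of estimating the optimal sample size. This paper proposes a new method based on evolutionary algorithms for estimating the optimal sample size under generalizability theory, the constrained evolutionary algorithm, and uses simulation to compare it with three traditional methods: differential optimization, the Lagrange method, and the Cauchy inequality method. The results show that the constrained evolutionary algorithm performed better in two-facet crossed, two-facet nested, and three-facet crossed designs, and the authors recommend that researchers give it priority in future studies.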

13.
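An analysis of the Putonghua test using multivariate generalizability theory   Total citations: 5 (self-citations: 0, citations by others: 5)
Yang Zhiming (杨志明), Zhang Lei (张雷). Acta Psychologica Sinica (《心理学报》), 2002, 34(1): 51-56
Multivariate generalizability theory (MGT) was used to study the Putonghua test developed by the State Language Commission. In the G study, data from Putonghua testing of Hong Kong residents were used to estimate the variance and covariance components of the various sources of score variation. In the D study, the universe scores of the test's 3 parts and technical indices such as their respective generalizability coefficients were estimated first, followed by the universe composite score and its generalizability coefficient, signal-to-noise ratio, and other indices. The results show that the test's reliability is high overall and that combining the universe scores of the three parts into a composite is reasonable, although the reliability of Part 3 is relatively low. In addition, with 3 raters and 28 items, the reliability of Parts 1 and 2 is already fairly high, so reducing the number of items in these two parts in actual testing would not cause much of a problem.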

14.
SPSS and SAS programs for generalizability theory analyses   Total citations: 1 (self-citations: 0, citations by others: 1)
The identification and reduction of measurement errors is a major challenge in psychological testing. Most investigators rely solely on classical test theory for assessing reliability, whereas most experts have long recommended using generalizability theory instead. One reason for the common neglect of generalizability theory is the absence of analytic facilities for this purpose in popular statistical software packages. This article provides a brief introduction to generalizability theory, describes easy to use SPSS, SAS, and MATLAB programs for conducting the recommended analyses, and provides an illustrative example, using data (N = 329) for the Rosenberg Self-Esteem Scale. Program output includes variance components, relative and absolute errors and generalizability coefficients, coefficients for D studies, and graphs of D study results.

15.
The generalizability of behaviors across observational conditions is a critical issue in behavioral assessment. Generalizability theory was used to examine two aspects of audio recorded parent-child interactions recorded over 6 days of home measurement and 1 day of laboratory measurement in a behavioral treatment program for childhood obesity. Families audiotaped parent-child home meetings during which they reviewed self-monitored diet and exercise records that were coded for the following types of interactions: praise statements, negative statements, prompts for new behaviors, and statements promoting problem solving. A similar meeting was audiotaped in our laboratory. The first question explored was the number of measurements needed to generalize to the universe of the six home measures. Results showed an increase in generalizability over measurements for each behavioral category. Using generalizability coefficients of .60 or more, praise, negative comments and prompts, respectively, could be reliably observed based on 1, 4, or 4 days of measurement. Second, the effects of setting (laboratory versus home) were assessed for 1 day of measurement in each environment. Again using generalizability coefficients of .60, generalizability analysis showed that the lab setting could not be generalized to the home setting based on 1 day of measurement, with generalizability coefficients ranging from .27 for negative comments to .57 for praise. Results suggest that 4 days of behavioral assessment in the home can be used to establish generalizable data for all the dependent measures studied. However, generalizability coefficients suggested that 1 day of laboratory measurement was not adequate to generalize to typical home behavior. This research was supported in part by Grant NIH HD 23713 awarded to the third author.
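
The "how many days are needed to reach .60" question can be sketched as a simple D-study projection. The sketch assumes a design in which days are the only facet and decisions are relative, in which case the projected coefficient for n days takes the Spearman-Brown form. The single-day coefficient used below is hypothetical and is not taken from the study.

```python
def projected_coefficient(g_single, n):
    """Generalizability coefficient for the mean over n days.

    g_single is the coefficient for a single day of measurement.
    """
    return n * g_single / (1 + (n - 1) * g_single)


def min_occasions(g_single, target=0.60, max_n=100):
    """Smallest number of days whose projected coefficient reaches target.

    Returns None if the target is not reached within max_n days.
    """
    for n in range(1, max_n + 1):
        if projected_coefficient(g_single, n) >= target:
            return n
    return None


# Hypothetical single-day coefficient, for illustration only.
days_needed = min_occasions(0.30, target=0.60)
print(days_needed)  # → 4
```

With more than one facet (for example days and coders), the projection would have to be made from the separate variance components rather than from a single coefficient.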

16.
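A generalizability theory study of the "Adolescent Students' Life Satisfaction Scale"   Total citations: 2 (self-citations: 0, citations by others: 2)
He Liguo (何立国), Zhou Aibao (周爱保). Psychological Science (《心理科学》), 2006, 29(5): 1199-1202, 1218
Generalizability theory is a measurement theory that uses statistical adjustment techniques to analyze measurement error. It examines the external validity of measurement at a macro level, focusing on the relationship between the conditions of actual measurement and the scope to which conclusions can be generalized. This paper used generalizability theory to study the Adolescent Students' Life Satisfaction Scale (CASLSS) and obtained the following results: (1) as for the number of life satisfaction dimensions, 6 to 8 dimensions are appropriate for Chinese adolescent students; when only 2 dimensions are used, the CASLSS is suitable only for norm-referenced interpretation, not criterion-referenced interpretation; (2) the subscales and the total scale of the CASLSS have high reliability and are suitable for both norm-referenced and criterion-referenced interpretation; (3) the environment satisfaction factor has somewhat weaker scale properties than the other five factors and is the main direction for future improvement of the scale. Both the individual factors and the total scale of the CASLSS have very good scale properties, and the scale merits wider use in practice and research.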

17.
Generalizability Theory (GT) offers increased utility for assessment research given the ability to concurrently examine multiple sources of variance, inform both relative and absolute decision making, and determine both the consistency and generalizability of results. Despite these strengths, assessment researchers within the fields of education and psychology have been slow to adopt and utilize a GT approach. This underutilization may be due to an incomplete understanding of the conceptual underpinnings of GT, the actual steps involved in designing and implementing generalizability studies, or some combination of both issues. The goal of the current article is therefore two-fold: (a) to provide readers with the conceptual background and terminology related to the use of GT and (b) to facilitate understanding of the range of issues that need to be considered in the design, implementation, and interpretation of generalizability and dependability studies. Given the relevance of this analytic approach to applied assessment contexts, there exists a need to ensure that GT is both accessible to, and understood by, researchers in education and psychology. Important methodological and analytical considerations are presented and implications for applied use are described.

18.
Military Psychology, 2013, 25(2): 91-110
The Armed Services have undertaken a Joint Services Project to develop prototype methods for measuring job performance and, if feasible, to link enlistment standards to on-the-job performance. As one step toward meeting the first purpose of the project, our study assessed the reliability of scores on four measures of job performance of machinist mates in the Navy: a hands-on performance test, a paper-and-pencil job knowledge test, job task performance ratings, and global ratings. Generalizability theory was used to estimate the reliability of the measures and to improve the design of each. The generalizability analyses investigated the consistency of performance scores over observers, tasks (equipment operation, casualty control, preventative maintenance), and types of raters (supervisor, peer, and self). Recommendations about the optimal designs of studies for decision making (e.g., predicting hands-on performance scores from the other measures of job performance) were based on the findings.

19.
This research was a study of the reliability of clinical judgment findings (multitrait) across three different information sources (psychometric tests, structured interview, and psychometric tests and interview). Subjects (N = 74) were middle and senior executives of Western Canadian technical companies; clinicians (N = 3) were trained and experienced industrial psychologists. The study investigated the similarity of clinical evaluation of personological characteristics (based on an 18-factor multitrait paradigm) across the three different information sources. Subjects were independently rated by a single clinician on 18 criterion factors in each of the three information source categories. Test information source categories required the administration of approximately 12 hrs of standardized psychological assessment questionnaires to each of the 74 subjects. Interview source category involved a 1.5-hr structured interview per subject. Combined condition pooled both test and interview conditions. Generalizability of the findings was maximized by the undertaking of the experiment in a natural situation thus increasing ecological validity. Statistical treatments used were designed to assess the similarity of a clinician's evaluation of a subject based on the different category of information available about that client. Convergence (intrarater reliability) indexes range from a high of .64 to a low of .05. Results indicate a varying degree of convergence of multitrait clinical ratings dependent on clinician and trait being rated. Results are discussed in terms of implications for practitioners involved in executive personnel selection.
