Similar articles
 20 similar articles found
1.
SPSS and SAS programs for generalizability theory analyses   Cited by: 1 (self-citations: 0; citations by others: 1)
The identification and reduction of measurement errors is a major challenge in psychological testing. Most investigators rely solely on classical test theory for assessing reliability, whereas most experts have long recommended using generalizability theory instead. One reason for the common neglect of generalizability theory is the absence of analytic facilities for this purpose in popular statistical software packages. This article provides a brief introduction to generalizability theory, describes easy-to-use SPSS, SAS, and MATLAB programs for conducting the recommended analyses, and provides an illustrative example, using data (N = 329) for the Rosenberg Self-Esteem Scale. Program output includes variance components, relative and absolute errors and generalizability coefficients, coefficients for D studies, and graphs of D study results.

2.
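Improving the Comprehensive Ability Test of the College Entrance Examination: A Multivariate Generalizability Theory Perspective   Cited by: 10 (self-citations: 0; citations by others: 10)
杨志明 (Yang Zhiming), 张雷 (Zhang Lei), 马世晔 (Ma Shiye). 《心理学报》 (Acta Psychologica Sinica), 2004, 36(2): 195-200
Using multivariate generalizability theory, this study found that the overall reliability of the Comprehensive Ability Test of the College Entrance Examination (Guangdong, 2001) reached an acceptable level (0.784). However, the contributions of the test's sections to total variance departed considerably from the intended score weights: geography and politics contributed too little, while chemistry and history contributed too much. This indicates that examinees with lopsided strengths (in history and chemistry) obtained higher composite scores. A decision (D) study further showed that adding items to the geography section would, anomalously, lower the overall reliability of the test, which suggests that many high-scoring examinees either answered the geography items incorrectly or deliberately skipped them. How to effectively control the actual contribution of each section and avoid negative washback is therefore a pressing problem for the Comprehensive Ability Test.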

3.
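Estimating variance components is the key to generalizability theory analysis. Using Monte Carlo simulation, this study examines how the distribution of psychological and educational measurement data affects the variance component estimates produced by different methods in generalizability theory. The data distributions are normal, binomial, and multinomial; the estimation methods are the Traditional, Jackknife, Bootstrap, and MCMC methods. The results show that: (1) the Traditional method estimates the variance components of normally and multinomially distributed data relatively well but needs correction for binomially distributed data; the Jackknife method accurately estimates the variance components for all three distributions; and the corrected Bootstrap method and the MCMC method with prior information (MCMCinf) give good estimates for all three distributions; (2) the data distribution affects how all four methods estimate generalizability theory variance components and constrains how well each method performs, so the methods should be chosen with the distribution in mind.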

4.
Generalizability of scores influenced by multiple sources of variance   Cited by: 3 (self-citations: 0; citations by others: 3)
Generalizability theory concerns the adequacy with which a universe score can be inferred from a set of observations. In this paper the theory is applied to a universe in which observations are classifiable according to two independent variable aspects of the measuring procedure. Several types of universe scores are developed and the variance components ascertained for each type. The composition of expected observed-score variance and the adequacy of inference to a particular type of universe score is a function of the procedure used in gathering data. A generalizability study provides estimates of variance components which can be used in designing an efficient procedure for a particular decision purpose. This study was conducted under Grant M-1839 from the National Institute of Mental Health while the authors were on the staff of the University of Illinois. Dr. Rajaratnam shared responsibility for the technical report of July, 1961 on which this paper is based. The present revision was made subsequent to her death in 1963. The present addresses of the other authors are: Goldine C. Gleser, Department of Psychiatry, Central Clinic, Cincinnati, 29, Ohio; Lee J. Cronbach, School of Education, Stanford University.

5.
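Views of Reliability in Three Psychometric Theories   Cited by: 5 (self-citations: 0; citations by others: 5)
At present there are three major theoretical schools in psychometrics. This article briefly introduces the three theories, namely classical test theory, generalizability theory, and item response theory, and focuses on their views of reliability. It discusses the theoretical foundations and research methods behind the three views, compares their similarities and differences, and points out some shortcomings of classical test theory along with the improvements made by generalizability theory and item response theory. Generalizability theory extends classical test theory, replacing its reliability coefficient with multidimensional reliability indices (generalizability coefficients). Item response theory takes an information perspective, using indices such as the item information function and the test information function to reflect the measurement reliability of items and tests more specifically and in greater depth.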

6.
The generalizability of behaviors across observational conditions is a critical issue in behavioral assessment. Generalizability theory was used to examine two aspects of audio-recorded parent-child interactions recorded over 6 days of home measurement and 1 day of laboratory measurement in a behavioral treatment program for childhood obesity. Families audiotaped parent-child home meetings during which they reviewed self-monitored diet and exercise records that were coded for the following types of interactions: praise statements, negative statements, prompts for new behaviors, and statements promoting problem solving. A similar meeting was audiotaped in our laboratory. The first question explored was the number of measurements needed to generalize to the universe of the six home measures. Results showed an increase in generalizability over measurements for each behavioral category. Using generalizability coefficients of .60 or more, praise, negative comments, and prompts could be reliably observed based on 1, 4, and 4 days of measurement, respectively. Second, the effects of setting (laboratory versus home) were assessed for 1 day of measurement in each environment. Again using generalizability coefficients of .60, generalizability analysis showed that the lab setting could not be generalized to the home setting based on 1 day of measurement, with generalizability coefficients ranging from .27 for negative comments to .57 for praise. Results suggest that 4 days of behavioral assessment in the home can be used to establish generalizable data for all the dependent measures studied. However, generalizability coefficients suggested that 1 day of laboratory measurement was not adequate to generalize to typical home behavior. This research was supported in part by Grant NIH HD 23713 awarded to the third author.

7.
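Multivariate generalizability theory was used to evaluate and compare the internal medicine clinical practice competence assessments of two cohorts of clinical medicine master's students. The results show that the reliability of the assessment was fairly high for both cohorts, with indices of dependability of 0.78878 and 0.67985, respectively, and that the assessment content was fairly comprehensive. The comparison found that reliability was higher for the 01 cohort than for the 02 cohort, and that 3 to 5 expert examiners is an appropriate number.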

8.
Previous research on the effect of class size on student ratings of instruction has primarily investigated the effect of class size on the favorableness of these ratings rather than its effect on their reliability (dependability). A few studies have used "generalizability theory" to demonstrate the relative effect of class size on the dependability of student ratings of instruction. The purpose of the present study was to test the validity of the findings of these studies in a different cultural setting using a different student ratings questionnaire. Using a random-effects analysis of variance to estimate the variance components for a design in which students were nested within classes and crossed with items, it was found that the variance component for class size was appreciably larger than that for items. At least 20 students were needed to obtain a generalizability coefficient for relative decisions of .70 or more. Increasing the number of students has a greater effect on generalizability coefficients than increasing the number of items.

9.
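Generalizability theory is one of the modern theories of psychological and educational measurement and can be applied to many kinds of personnel assessment, such as performance assessment, multisource assessment, psychological testing, structured interviews, proficiency testing, job analysis, and assessment centers. Compared with classical test theory, generalizability theory shows clear advantages in personnel assessment: for example, it can examine multiple factors simultaneously and determine the weights of multiple dimensions. Its applications fall into two broad categories, enterprises and institutions. Applying generalizability theory to personnel assessment still faces problems concerning areas of application, sample data, assessment validity, and micro-level analysis.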

10.
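Using the "Questionnaire for Evaluating the Teaching Level of College Teachers", 566 students were asked to evaluate 19 teachers, and the collected data were analyzed under four generalizability designs: t×i, (st)×i, (st)×(iv), and (st)×(iv)×o. Based on generalizability theory and taking budget constraints into account, a unified Lagrange multiplier formula was used to derive optimal sample size formulas for the different designs; combined with the estimated variance components, these formulas yielded the optimal sample sizes for each design. The results show that: (1) the unified Lagrange multiplier formula is broadly applicable and suits a variety of generalizability designs under budget constraints; (2) the evaluation occasion is quite an important factor affecting the evaluation of college teachers' teaching level; (3) (st)×(iv)×o is the optimal generalizability design for evaluating college teachers' teaching level under budget constraints; (4) under budget constraints, the optimal number of student raters per teacher is 20 and the optimal number of items per dimension is 3.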

11.
Although peer raters of personality traits do tend to agree, the strength of their consensus is often modest. This article focuses on methods for analyzing determinants of consensus. Variance components methods adapted from generalizability theory have some untapped potential for understanding gradations in consensus. The methods allow explicit analysis of how social categories of targets might affect judgments of raters from the same or different social categories. Limitations of the variance components approach are also discussed. The methods are illustrated with artificial data.

12.
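Based on generalizability theory, this study explores the factors that influence the results of evaluations of college teachers' teaching level. Evaluation data were collected with the "College Teachers' Teaching Level Evaluation Scale (Student Version)" and analyzed with mGENOVA. The results show that: (1) evaluations conducted at the start of the second semester were more dependable than those conducted at the end of the first semester; (2) sampling only 20 students per teacher is enough to ensure dependable evaluation results; (3) students in different types of majors emphasize different evaluation indicators, which in turn affects the dependability of the results; (4) students' evaluations of science courses were more dependable, and their evaluations of liberal arts courses less so.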

13.
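黎光明 (Li Guangming), 张敏强 (Zhang Minqiang). 《心理学报》 (Acta Psychologica Sinica), 2013, 45(1): 114-124
The Bootstrap method is a resampling method with replacement that can be used to estimate variance components and their variability in generalizability theory. Monte Carlo techniques were used to simulate data from four distributions: normal, binomial, multinomial, and skewed. Based on the p×i design, the study examines whether the corrected Bootstrap method, relative to the uncorrected Bootstrap method, improves the estimation of variance components and their variability for the four simulated distributions. The results show that across all four distributions, from the overall to the component level, and for both point estimates and variability estimates, the corrected Bootstrap method outperforms the uncorrected one; the correction improves the estimation of variance components and their variability in generalizability theory.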

14.
The concept of self is central to personhood, but personality research has largely ignored the relevance of recent advances in self-concept theory: multidimensionality of self-concept (focusing instead on self-esteem, an implicit unidimensional approach), domain specificity (generalizability of trait manifestations over different domains), and multilevel perspectives in which social-cognitive processes and contextual effects drive self-perceptions at different levels (individual, group/institution, and country) aligned to Bronfenbrenner's ecological model. Here, we provide theoretical and empirical support for psychological comparison processes that influence self-perceptions and their relation to distal outcomes. Our meta-theoretical integration of social and dimensional comparison theories synthesizes five seemingly paradoxical frame-of-reference and contextual effects in self-concept formation that occur at different levels. The effects were tested with a sample of 485,490 fifteen-year-old students (68 countries/regions, 18,292 schools). Consistent with the dimensional comparison theory, the effects on math self-concept were positive for math achievement but negative for verbal achievement. Consistent with the social comparison theory, the effects on math self-concept were negative for school-average math achievement (big-fish-little-pond effect), country-average achievement (paradoxical cross-cultural effect), and being young relative to year in school but positive for school-average verbal achievement (big-fish-little-pond effect–compensatory effect). We demonstrate cross-cultural generalizability/universality of support for predictions and discuss implications for personality research. © 2020 European Association of Personality Psychology

15.
The development of an SR-inventory of achievement-related behaviour for the purpose of managerial selection is reported. SR-inventories stem from interactional personality psychology. As the design of an SR-inventory is two-faceted, Cronbach et al.'s generalizability theory forms a suitable framework to investigate it. Using data from 404 Dutch respondents (mostly applicants), several generalizability analyses were performed to determine under which circumstances the inventory can be a useful tool. Furthermore, confirmatory factor analysis was used to substantiate the suggested SR-structure of the instrument. The relationship with other personality factors was investigated to classify the instrument in the domain of personality assessment.

16.
17.
Recently a controversy has arisen among behavioral decision theory researchers concerning the generalizability of research based upon student subject samples to the behavior of expert decision makers. This study compared the influence of framing and performance constraints (goals or limits) on the ability of expert and amateur negotiators to reach integrative agreements in a negotiation task novel to both. The results suggested that while experts did outperform amateurs on comparable competitive market simulations, the patterns of their performance as influenced by framing and performance constraints were consistent.

18.
Some developments in multivariate generalizability   Cited by: 2 (self-citations: 0; citations by others: 2)
This article is concerned with estimation of components of maximum generalizability in multifacet experimental designs involving multiple dependent measures. Within a Type II multivariate analysis of variance framework, components of maximum generalizability are defined as those composites of the dependent measures that maximize universe score variance for persons relative to observed score variance. The coefficient of maximum generalizability, expressed as a function of variance component matrices, is shown to equal the squared canonical correlation between true and observed scores. Emphasis is placed on estimation of variance component matrices, on the distinction between generalizability- and decision-studies, and on extension to multifacet designs involving crossed and nested facets. An example of a two-facet partially nested design is provided. Appreciation is expressed to the Office of Research in Medical Education, University of Texas Medical Branch, for permitting use of their data.

19.
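孙晓敏 (Sun Xiaomin), 张厚粲 (Zhang Houcan). 《心理科学》 (Psychological Science), 2005, 28(3): 646-649
With the advance of quality-oriented education, performance assessment is receiving more and more attention. An important factor affecting the reliability of performance assessment results is inconsistency among raters. Using simulated data, the article compares several methods of estimating rater consistency (the correlation method, the percentage-agreement method, and the generalizability coefficient) and on that basis argues that applying generalizability theory to the problem of rater reliability in performance assessment is a useful direction for both theory and practice.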

20.
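Missing data are everywhere in psychological surveys and experiments. Missing data create a series of problems when generalizability theory is used to estimate the variance components of unbalanced data. Within the generalizability theory framework, a program written in Matlab 7.0 simulated missing data for the random two-facet crossed design p×i×r, and the performance of the formula method, the REML method, the splitting method, and the MCMC method in estimating each variance component was compared. The results show that: (1) for estimating the variance components of missing data in the random two-facet crossed p×i×r design, the MCMC method shows a clear advantage over the other three methods; (2) items and raters are important factors affecting variance component estimation with missing data.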
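
Several of the abstracts above report variance components, relative and absolute error variances, generalizability coefficients, and D-study projections. As a rough illustration only, the following Python sketch estimates these quantities for the simplest case, a fully crossed persons × items (p×i) design with one score per cell and no missing data, using the standard expected-mean-squares (ANOVA) estimators. The function names `g_study_pxi` and `d_study_pxi` and the toy score matrix are hypothetical; they are not taken from any of the listed articles or from the SPSS, SAS, MATLAB, or mGENOVA programs mentioned there.

```python
import numpy as np


def g_study_pxi(scores):
    """G study for a crossed persons x items design (one score per cell).

    Returns ANOVA (expected mean squares) estimates of the variance
    components for persons ("p"), items ("i"), and the residual ("pi,e").
    Estimates can be negative in small samples; they are returned as-is.
    """
    x = np.asarray(scores, dtype=float)
    n_p, n_i = x.shape
    grand = x.mean()

    # Sums of squares for the two main effects and the residual.
    ss_p = n_i * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_i = n_p * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_res = np.sum((x - grand) ** 2) - ss_p - ss_i

    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    return {
        "p": (ms_p - ms_res) / n_i,
        "i": (ms_i - ms_res) / n_p,
        "pi,e": ms_res,
    }


def d_study_pxi(components, n_items):
    """D study: project error variances and coefficients for n_items items."""
    var_p = components["p"]
    rel_error = components["pi,e"] / n_items
    abs_error = (components["i"] + components["pi,e"]) / n_items
    return {
        "relative_error": rel_error,
        "absolute_error": abs_error,
        "g_coefficient": var_p / (var_p + rel_error),  # relative decisions
        "phi": var_p / (var_p + abs_error),            # absolute decisions
    }


if __name__ == "__main__":
    # Hypothetical toy data: 5 persons (rows) scored on 4 items (columns).
    toy = [[4, 5, 3, 4], [2, 3, 2, 1], [5, 5, 4, 5], [3, 4, 2, 3], [1, 2, 1, 2]]
    comps = g_study_pxi(toy)
    print(comps)
    for n in (2, 4, 8):
        print(n, d_study_pxi(comps, n))
```

The D-study function only rescales the G-study components by the projected number of items, which is why one set of estimates can be reused to compare several test lengths. Designs with more facets, nesting, or missing data (as in several abstracts above) need different estimators and are outside this sketch.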
