Similar articles
 Found 18 similar articles; search took 359 ms
1.
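An experimental Rasch analysis of scoring in HSK subjective tests   Total citations: 1, self-citations: 0, citations by others: 1
Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is grounded in item response theory, can be used to identify and remove rater effects and thereby raise the reliability of subjective scoring. This paper introduces the theory and application framework of the many-facet Rasch model, designs a quality-control framework based on the model for scoring HSK subjective tests, and validates the framework experimentally on HSK essay-scoring data.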

2.
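Guan Dandan (关丹丹). 《心理学探新》 (Psychological Exploration), 2014, 34(5): 437-440
To evaluate and improve the scoring of the writing section of the general ability test in the postgraduate entrance examination, the researcher used generalizability theory and many-facet Rasch analysis to examine the sources of scoring error and the scoring reliability for writing samples from 113 examinees. The generalizability study showed that raters and tasks had little effect on scoring accuracy; for a test design with two writing tasks, two raters are enough to keep scoring reliability above 0.75. The many-facet Rasch analysis showed that the estimates of rater severity and their errors were within acceptable ranges, that raters did not differ significantly in severity, and that individual raters were on the whole stable in their own scoring. A few raters, however, showed particular bias toward specific examinees on specific tasks. Generalizability theory and many-facet Rasch analysis enrich the quantitative indicators available for research on writing assessment and confirm that the writing scores of the general ability test in the postgraduate entrance examination are highly reliable.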

3.
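Theory of the many-facet Rasch model and its application to structured interviews   Total citations: 1, self-citations: 0, citations by others: 1
To address the various sources of error that affect interview validity, this paper introduces a novel method for processing interview results: the many-facet Rasch model. Applying the model to structured interviews not only helps measure candidates' ability effectively but also offers new approaches to identifying problematic raters, refining scoring rules, and equating interviews. After reviewing research progress on the reliability and validity of structured interviews, the paper presents the theory of the many-facet Rasch model and a framework for applying it to structured interviews.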

4.
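Liu Hao (刘昊), Liu Xiaocen (刘肖岑), Feng Xiaoxia (冯晓霞). 《心理科学》 (Journal of Psychological Science), 2013, 36(2): 484-488
This study applied the Rasch model to develop and analyze a mathematics school-readiness test, in order to assess the validity and advantages of the model. A self-developed mathematics school-readiness test was administered to 150 children with a mean age of 6.6 years, and the Rasch model was used to revise the items and rating categories and to analyze the results. The revised test showed good reliability and validity and fit the Rasch model well; the rating categories were set appropriately, and the overall difficulty of the test was relatively low. Children's Rasch scores were unrelated to gender but were influenced by age and family socioeconomic status. Compared with classical test theory, the Rasch model offers advantages for developing and analyzing school-readiness tests.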

5.
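Rater effects in creativity assessment are the influences on assessment results that arise from raters' involvement in the assessment process. Rater effects stem essentially from differences in raters' internal cognitive processing and show up as differences in their ratings. This paper first reviews research on rater cognition and the influence of raters, creators, and sociocultural factors on assessment. It then surveys, at the level of rating outcomes, the indices of inter-rater reliability and their limitations, along with the use of generalizability theory and the many-facet Rasch model to quantify and control these effects. Finally, in view of the problems that remain in current research, it points out possible future directions: deepening research on rater cognition, integrating research on rater effects at different levels, and extending the methods and techniques of creativity assessment.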

6.
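Four-dimensional and fifteen-dimensional Rasch models were each used to analyze data from a science test with a within-item multidimensional structure, and the reliability of the dimension scores was estimated under both structures. Compared with the corresponding unidimensional models, both the four-dimensional and the fifteen-dimensional Rasch models greatly improved the reliability of score estimates on each content dimension. A comparison of the fit of the two models showed that, for a test of fixed total length, increasing the number of dimensions can compensate for the loss of reliability caused by shorter subdimensions. This effect, however, requires high correlations among the dimensions.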

7.
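This study analyzed structured-interview data with generalizability theory (GT) and the many-facet Rasch model and offers some recommendations. Using interview data from a recruitment of student counselors, GT was applied to assess, at the macro level, the overall error contributed by candidates, examiners, and items; on that basis, the many-facet Rasch model was applied to examine, at the micro level, examiner severity, differences in candidate ability, item difficulty, and facet bias. The results were as follows: (1) The GT analysis showed that candidates accounted for a large share of the variance (90.65%), indicating high interview reliability, and reliability was already good with two examiners. (2) The many-facet Rasch analysis identified the misfitting elements in each facet and the biased elements in the interactions, indicating that interview error came mainly from differences in severity among examiners and from instability in their self-consistency. Combining GT with the many-facet Rasch model not only detects problem factors in every aspect of the evaluation process but also gives a better overall picture.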

8.
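Yan Zi (晏子). 《心理科学进展》 (Advances in Psychological Science), 2010, 18(8): 1298-1305
The Rasch model is a latent trait model that has received wide attention and intensive study in the international academic community. It offers a highly feasible solution to the problem of objective measurement in psychological science. In China, however, theoretical discussion and applied research on the Rasch model remain scarce. Unlike general item response theory, the Rasch model requires that the collected data meet the model's a priori requirements, rather than using different parameters to accommodate the characteristics of the data. The main features of the Rasch model (a common scale for persons and items, linear data, and parameter separation) ensure that objective measurement is achieved. Future research directions include multidimensional Rasch models, test equating and linking, computerized adaptive testing, and large-scale applied measurement systems (such as the Lexile system).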
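
The common scale and parameter separation mentioned in this abstract can be illustrated with a minimal sketch of the standard dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The function names and the example values below are illustrative assumptions and do not come from the article.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta is the person ability and b is the item difficulty; both are
    expressed in logits on the same scale (hypothetical example values).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def log_odds(p: float) -> float:
    """Convert a probability to log-odds (logits)."""
    return math.log(p / (1.0 - p))


# Parameter separation: the log-odds gap between two items equals the
# difference of their difficulties, whatever the person's ability is.
easy_item, hard_item = 0.5, 1.5
for theta in (-1.0, 0.0, 2.0):
    gap = log_odds(rasch_probability(theta, easy_item)) - log_odds(
        rasch_probability(theta, hard_item)
    )
    print(f"theta={theta:+.1f}  log-odds gap={gap:.3f}")
```

Because the gap does not depend on `theta`, item comparisons are independent of which persons happen to be sampled, which is the property the abstract links to objective measurement.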

9.
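The Rosenberg Self-Esteem Scale (RSES) was administered to 425 university students, and the Rasch model from item response theory was used to analyze the item indices and to test for differential item functioning (DIF). The results showed that the RSES is unidimensional, with a reliability of 0.84. Except for item 8, the items showed good fit indices, and the scale is best suited to distinguishing individuals with medium to low self-esteem. The DIF analysis found DIF on items 1 and 5, with male students showing higher self-esteem than female students. Compared with classical test theory, analyzing the RSES with the Rasch model has advantages, and the results provide a basis for further refining and using the scale.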

10.
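A multidimensional testlet-effect Rasch model   Total citations: 2, self-citations: 0, citations by others: 2
First, this paper explains the nature of a "testlet": a set of items that share a common stimulus. On this basis, testlet effects are divided into within-item unidimensional testlet effects and within-item multidimensional testlet effects. Second, building on the Rasch model, the paper develops multidimensional testlet-effect Rasch models for dichotomous and polytomous scoring, in order to handle within-item multidimensional testlet effects better. Finally, simulation results show that the new model is valid and reasonable. Comparisons with the Rasch testlet model and the partial credit model show that: (1) when a test has within-item multidimensional testlet effects, separating out only the obvious bundled testlet effects while ignoring other latent testlet effects still leads to biased parameter estimates and may even overestimate test reliability; (2) the new model is more general: even when the response data contain no testlet effects or only within-item unidimensional testlet effects, analyzing the test with the new model still gives good parameter estimates.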

11.
Jürgen Rost. Psychometrika, 1988, 53(3): 327-348
A general approach for analyzing rating data with latent class models is described, which parallels rating models in the framework of latent trait theory. A general rating model as well as a two-parameter model with location and dispersion parameters, analogous to Andrich's Disloc model, are derived, including parameter estimation via the EM algorithm. Two examples illustrate the application of the models and their statistical control. Model restrictions through equality constraints are discussed and multiparameter generalizations are outlined.

12.
This paper describes a problem-solving framework in which aspects of mathematical decision theory are incorporated into symbolic problem-solving techniques currently predominant in artificial intelligence. The utility function of decision theory is used to reveal tradeoffs among competing strategies for achieving various goals, taking into account such factors as reliability, the complexity of steps in the strategy, and the value of the goal. The utility function on strategies can therefore be used as a guide when searching for good strategies. It is also used to formulate solutions to the problems of how to acquire a world model, how much planning effort is worthwhile, and whether verification tests should be performed. These techniques are illustrated by application to the classic monkey and bananas problem.

13.
This paper formalizes and provides static and dynamic estimators for a scaling model for rating chess players. The model was suggested by the work of Arpad Elo, the inventor of the chess rating system in current use by both the United States and international chess federations. The model can be viewed as a Thurstone Case V model that permits draws (ties). A related model based on a linear approximation is also analyzed. In the chess application, possibly changing ability parameters are estimated sequentially from sparse data structures that often involve many fewer than M(M − 1)/2 observations on the M players to be rated. In contrast, psychological applications of paired-comparison scaling generally use models with no draw provision to estimate static parameters from a systematically obtained data structure such as a replicated "round robin" involving all M entities to be scaled. In the paper, both static and sequential estimators are provided and evaluated for a number of different data structures. Sampling theory for the estimators is developed. The application of rating systems to track temporally changing ability parameters may prove useful in many areas of psychology.
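
As a rough illustration of sequential estimation with draws, the sketch below uses the widely known logistic form of the Elo update rather than the Thurstone Case V estimators developed in the paper. The function names, the K-factor of 32, and the starting ratings are assumptions chosen for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against B (logistic Elo form, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float,
                   k: float = 32.0) -> tuple[float, float]:
    """Update both ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, and 0.0 for a loss.
    k is an assumed step size; real federations use their own values.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Sequential estimation from a sparse series of games between two players.
a, b = 1500.0, 1500.0
for result in (1.0, 0.5, 1.0):  # A wins, draws, then wins again
    a, b = update_ratings(a, b, result)
print(round(a, 1), round(b, 1))
```

Each game moves the two ratings by equal and opposite amounts, so the ratings track changing ability over time without requiring a full round robin among all M players.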

14.
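Comparison is a core process in social judgment. Recently, Mussweiler proposed the selective accessibility model, which distinguishes two basic hypothesis processes in comparison and integrates the varied outcomes of comparison, offering a new perspective for understanding the nature of judgment. This paper describes the process and consequences of selective accessibility and its pervasiveness, and discusses the differences and connections between the use of reference points and the selective accessibility mechanism.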

15.
The polytomous unidimensional Rasch model with equidistant scoring, also known as the rating scale model, is extended in such a way that the item parameters are linearly decomposed into certain basic parameters. The extended model is denoted as the linear rating scale model (LRSM). A conditional maximum likelihood estimation procedure and a likelihood-ratio test of hypotheses within the framework of the LRSM are presented. Since the LRSM is a generalization of both the dichotomous Rasch model and the rating scale model, the present algorithm is suited for conditional maximum likelihood estimation in these submodels as well. The practicality of the conditional method is demonstrated by means of a dichotomous Rasch example with 100 items, of a rating scale example with 30 items and 5 categories, and in the light of an empirical application to the measurement of treatment effects in a clinical study. Work supported in part by the Fonds zur Förderung der Wissenschaftlichen Forschung under Grant No. P6414.

16.
Two different item response theory model frameworks have been proposed for the assessment and control of response styles in rating data. According to one framework, response styles can be assessed by analysing threshold parameters in Rasch models for ordinal data and in mixture-distribution extensions of such models. A different framework is provided by multi-process item response tree models, which can be used to disentangle response processes that are related to the substantive traits and response tendencies elicited by the response scale. In this tutorial, the two approaches are reviewed, illustrated with an empirical data set of the two-dimensional 'Personal Need for Structure' construct, and compared in terms of multiple criteria. Mplus is used as a software framework for (mixed) polytomous Rasch models and item response tree models as well as for demonstrating how parsimonious model variants can be specified to test assumptions on the structure of response styles and attitude strength. Although both frameworks are shown to account for response styles, they differ on the quantitative criteria of model selection, practical aspects of model estimation, and conceptual issues of representing response styles as continuous and multidimensional sources of individual differences in psychological assessment.

17.
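Cronbach's alpha has many drawbacks as an index of reliability. To overcome them, researchers have proposed a variety of other reliability estimates, but popular statistical software does not yet provide these directly, so they have not been widely adopted in practice. To narrow the gap between theory and practice, this article uses worked examples to give Mplus programs for several commonly used reliability estimates (composite reliability, single-indicator reliability, and ωh).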
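
For reference, the baseline coefficient that this abstract criticizes can be computed directly from item scores. The sketch below implements the standard Cronbach's alpha formula in Python; it is not one of the Mplus programs from the article, and the function name and toy data are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores[i][j] is the score of respondent i on item j. Sample variances
    (n - 1 denominator) are used for the items and for the total score.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Toy data (hypothetical): four respondents, three items.
data = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 5],
]
print(f"alpha = {cronbach_alpha(data):.3f}")
```

Alpha treats every item as contributing equally to the total score, which is one reason model-based estimates such as composite reliability and ωh have been proposed as alternatives.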

18.
In this paper, a mathematical theory of instruction applicable in the educational environment is developed from concepts of psychological learning theory. Within the framework of optimization and control theory, the dynamics of the interaction between instructor and learner are modelled, and the trade-off between instruction cost and learner achievement is formulated so that optimal instruction inputs can be determined. One important aspect of the classroom environment that is characterized by the theory is the interaction between an instructor and a group of learners with various learning abilities. A basic dynamic model that relates learner achievement and instruction cost is developed from learning theory concepts. This model, which applies to the individual learner situation, is analyzed in detail to determine instruction intensity inputs that match the learner's characteristics in order to maximize an objective that measures both achievement and cost. This basic model is used as a building block to describe how individual learner achievement depends on instruction pacing. To determine optimal instruction pacing, the concept of gain, which is essentially learner achievement per unit time, is introduced. In this extended model, instruction pacing is intimately related to the concept of learner aptitude. This relationship leads immediately to the consideration of instruction pacing for a group of learners with various aptitudes, and thus optimal instruction pacing is determined for nonhomogeneous groups. Throughout the development of the theory, hypothetical examples are presented to demonstrate many of the implications of the theory. One of the contributions of the theory is the definition of the concepts of learner aptitude and instruction pacing within a framework that structures the empirical investigation of these concepts by means of experimental research.
