Similar articles
17 similar articles found
1.
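Using the many-facet Rasch model (MFRM) from item response theory (IRT), this study analyzes assessment results from the leaderless group discussion (LGD) used in assessment center techniques. It examines examinee ability levels, rater severity and leniency, the internal consistency of ratings, dimension difficulty, and rating scale categories, and then discusses the various biases. Analyzing personnel assessment results with the MFRM gives deeper insight into true differences in examinee ability, identifies dimension difficulty, and locates sources of measurement error. This in turn helps improve the design of assessment items, evaluate or diagnose rater qualification, and improve the match between assessment dimensions and assessment purposes, offering a distinctive perspective for extending the application of IRT in personnel assessment.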

2.
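An IRT Analysis of Rater Bias in Structured Interviews for National Civil Servants   Total citations: 6, self-citations: 1, citations by others: 6
Sun Xiaomin (孙晓敏), Zhang Houcan (张厚粲). 《心理学报》 (Acta Psychologica Sinica), 2006, 38(4): 614-625
Using the many-facet Rasch model from item response theory (IRT), this study analyzed rater bias among 12 raters, in two groups, in structured interviews for national civil servants. Two kinds of rater bias were proposed and verified: differences in severity between raters, and each rater's self-consistency. Results showed that raters differed significantly in severity, and that they also differed in the self-consistency of their rating behavior across examinees, dimensions, gender, and time. The study shows that analysis at the level of the individual rater overcomes the limitation of classical test theory (CTT), which analyzes raters only as a group, and provides detailed, specific diagnostic information on each rater's bias. It thereby offers a new method from modern measurement theory for targeted rater training and for building rater pools.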

3.
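This study applied generalizability theory (GT) and the many-facet Rasch model to structured interview data and offers some suggestions. For interview data from a counselor recruitment, GT was used to analyze, at the macro level, the overall error contributed by applicants, interviewers, and items. On that basis, the many-facet Rasch model was used to probe further, at the micro level, interviewer severity, differences in applicant ability, item difficulty, and facet bias. Results: (1) The GT analysis showed that applicants accounted for a large share of the variance (90.65%), indicating high interview reliability, and reliability was already good with two interviewers. (2) The many-facet Rasch model identified misfitting elements in the facet effects and bias elements in the interaction effects, showing that interview error came mainly from differences in severity between interviewers and from instability in their self-consistency. Combining GT with the many-facet Rasch model not only detects problem factors in every aspect of the evaluation process but also gives a better overall picture.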

4.
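Application of the Many-Facet Rasch Model in Structured Interviews   Total citations: 1, self-citations: 0, citations by others: 1
Sun Xiaomin (孙晓敏), Xue Gang (薛刚). 《心理学报》 (Acta Psychologica Sinica), 2008, 40(9): 1030-1040
Using the many-facet Rasch model from item response theory, this study analyzed the scores of 66 examinees in a structured interview. It removed from the raw scores the influence of error introduced by raters and other specific measurement-context factors, yielding examinee ability estimates and individual-level rater consistency information. Comparing the decisions based on ability estimates with those based on interview scores showed that measurement error did affect decisions, and for some examinees the effect was very large. Facets bias analysis and Facets analysis of rater severity were then used to trace the sources of error. Results showed that when raw interview scores of examinees from different interview panels are compared directly, both rater self-consistency and differences in severity between raters lead to error. The study shows that using the Facets ability estimates as the basis for decisions will improve the effectiveness of selection. In addition, the examinee-level rater consistency indices from the Facets analysis, together with the rater-by-examinee bias analysis, provide detailed diagnostic information for locating sources of interview error.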

5.
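Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and remove rater effects and thereby improve the reliability of subjective scoring. This paper introduces the theory and application framework of the many-facet Rasch model, presents typical applications from outside China, and discusses the conditions under which the model applies.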
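
For orientation, the many-facet Rasch model referred to in these abstracts is commonly written in the log-odds form below. This is the standard textbook formulation with three facets (examinee, item or dimension, rater), not an equation taken from any of the listed papers, and the symbols are notation assumed here for illustration.

```latex
% Many-facet Rasch model, rating-scale form (notation assumed for illustration)
% P_{nijk}     : probability that rater j gives examinee n category k on item i
% P_{nij(k-1)} : probability of the adjacent lower category k-1
% B_n : ability of examinee n        D_i : difficulty of item (dimension) i
% C_j : severity of rater j          F_k : threshold between categories k-1 and k
\[
  \log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]
```

Under this form a more severe rater (larger C_j) lowers the log-odds of the higher category for every examinee by the same amount, which is why rater severity can be estimated separately from examinee ability.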

6.
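Guan Dandan (关丹丹). 《心理学探新》 (Psychological Exploration), 2014, 34(5): 437-440
To evaluate and improve the scoring of the writing section of the general ability test in the graduate entrance examination, the researcher used generalizability theory and many-facet Rasch analysis to examine sources of scoring error and scoring reliability in the writing samples of 113 examinees. The generalizability study showed that raters and items had little effect on scoring accuracy; for a test design with two writing items, two raters were enough to keep scoring reliability above 0.75. The many-facet Rasch analysis showed that the estimates of rater severity and their errors were within acceptable ranges, that raters did not differ significantly in severity, and that raters were on the whole stable in their own scoring. However, individual raters showed particular bias toward specific examinees on specific items. Generalizability theory and many-facet Rasch analysis enriched the quantitative indices available for research on writing scoring and confirmed that the writing scores of the general ability test have high reliability.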

7.
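A Rasch Experimental Analysis of Scoring in HSK Subjective Tests   Total citations: 1, self-citations: 0, citations by others: 1
Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and remove rater effects and thereby improve the reliability of subjective scoring. This paper introduces the theory and application framework of the many-facet Rasch model, designs a model-based quality control framework for scoring HSK subjective tests, and validates it experimentally with HSK essay scoring data.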

8.
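Examination reforms and large-scale assessment practice in China and abroad increasingly emphasize constructed-response items, so research on rater reliability has again become a topic of wide interest. Building on the generalized multilevel facets model of Wang and Liu (2007), this study proposes and examines a graded response multilevel facets model. Results showed that under both experimental conditions, rater fixed effects and rater random effects, the means and standard deviations of the bias values were small. This indicates that under the current experimental conditions the model's parameter estimates show good recovery and robustness, and that the model can detect rater effects. Factors influencing rater effects can therefore be added in later work, developing it into a complete model that detects rater effects and their influencing factors at the same time.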

9.
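Four-dimensional and fifteen-dimensional Rasch models were each used to analyze science test data with a within-item multidimensional structure, and the reliability of the dimension scores was estimated under both structures. Results showed that, compared with the corresponding unidimensional models, both the four-dimensional and the fifteen-dimensional Rasch models greatly improved the reliability of score estimates on each content dimension. A comparison of the fit of the two models showed that, for a test of fixed total length, increasing the number of dimensions can compensate for the loss of reliability caused by shorter subdimensions. This effect, however, depends on relatively high correlations between dimensions.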

10.
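This study examined the parameter recovery and applicability of the graded response multilevel facets model (GR-MLFM) proposed by Kang Chunhua, Sun Xiaojian, and Zeng Pingfei (2016) when predictors at the examinee and rater levels are included (the complete model). Results showed that: (1) the complete GR-MLFM is logically and mathematically sound, can be used in the scoring of constructed-response items, and detects rater effects, their influencing factors, and the size of those influences reasonably well; (2) in scoring mathematical problem solving, raters showed two types of rating tendency (leniency and severity effects), but for the great majority of raters the severity or leniency was not pronounced. Raters' conscientiousness positively predicted their severity and their self-confidence positively predicted their leniency, whereas emotional stability and rating experience were not significant predictors.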

11.
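Whether the latent structure of a variable is continuous or categorical should not be specified arbitrarily, because a wrong specification can lead to incorrect conclusions. This study explores the latent structure of Internet addiction empirically. Using valid responses to Young's Internet addiction scale from 2511 junior high school students in Hangzhou, China, it compared the fit of the Rasch model, the latent class model, and the mixed Rasch model to the data. Results showed that a mixed Rasch model with two latent classes best reflected the latent structure of Internet addiction, indicating that Internet addiction comprises two qualitatively different groups, with quantitative differences among individuals within each group. The study further compared classification based on the mixed Rasch model with classification by the traditional cutoff score. Results showed that Young's Internet addiction criteria may have a very low false-positive rate and a relatively high miss rate, and that revisions of Young's Internet addiction test should consider adding some targeted items.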

12.
    
This study examines the associations between prenatal attachment and child development, socioemotional behavioral problems, and competence at early childhood. It also inquires whether maternal depression and anxiety at the prenatal period and at early childhood are associated with child outcomes. The study consisted of 83 mothers and their children. Data regarding the prenatal attachment, depression, and anxiety were collected during Weeks 28 to 40 of gestation. When the children were 21 to 31 months old, the Brief Infant and Toddler Social Emotional Assessment (BITSEA) and the Ankara Developmental Screening Inventory (ADSI) were applied to children, along with the Beck Depression Inventory (BDI) and the Beck Anxiety Inventory (BAI) administered to mothers. Results showed that prenatal attachment scores significantly correlated with BITSEA-Competency subscale scores and ADSI total scores at early childhood, r(83) = 0.246, P = .025, and r(82) = 0.316, P = .004, respectively. Prenatal attachment levels were found to be the predictors of both behavioral and emotional competence and development at early childhood, b = 0.081, t(83) = 2.273, P = .014, and b = 0.281, t(83) = 3.225, P = .002, respectively. In addition, prenatal attachment was shown to be an even stronger predictor of development than was worsening maternal depression at early childhood, b = −0.319, t(83) = 2.140, P = .035. Our results indicate that fostering prenatal attachment may be beneficial for better infant outcomes at early childhood.

13.
This study assessed how exposure to domestic violence (DV) during early childhood and increases in exposure over time influenced toddlers' behavior and peer problems, physical health, and cognitive abilities in middle childhood. Data from three waves of the survey component of “Welfare, Children, and Families: A Three-City Study” were assessed. Thirty-five percent of the 2- to 4-year-olds had mothers who reported DV victimization; 16% reported an increase in DV victimization over 2 years. Contrary to past literature, none of the middle childhood outcomes were significantly influenced by early DV exposure. However, increases in mother's DV victimization from 1999 to 2001 significantly increased children's internalizing and externalizing problems and marginally decreased their school engagement in middle childhood in 2005.

14.
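Liu Hao (刘昊), Liu Xiaocen (刘肖岑), Feng Xiaoxia (冯晓霞). 《心理科学》 (Journal of Psychological Science), 2013, 36(2): 484-488
This study applies the Rasch model to develop and analyze a mathematics school readiness test, and thereby examines the effectiveness and advantages of the Rasch model. A self-developed mathematics school readiness test was administered to 150 children with a mean age of 6.6 years, and the Rasch model was used to revise the items and rating categories and to analyze the results. Results showed that the revised test had good reliability and validity, fit the Rasch model well, and had appropriately set rating categories, and that the overall difficulty of the test was relatively low. Children's Rasch scores were unrelated to gender but were affected by age and family socioeconomic status. Compared with classical test theory, the Rasch model has advantages for developing and analyzing school readiness tests.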

15.
    
This article attempts to present emotioncy as a potential source of test bias to inform the analysis of test item performance. Emotioncy is defined as a hierarchy, ranging from exvolvement (auditory, visual, and kinesthetic) to involvement (inner and arch), to emphasize the emotions evoked by the senses. This study hypothesizes that when individuals have high levels of emotioncy for specific words, their test performance may systematically change, resulting in test bias. To this end, 355 individuals were asked to take a 40-item vocabulary test along with the emotioncy scale. A mixed Rasch model was employed to flag differential item functioning items. Results illustrated that the test takers with high emotioncy toward specific words outperformed the ones in the low-emotioncy group, characterizing emotioncy as a potential source of test bias.

16.
    
Little research has been done on the effects of peer raters’ quality characteristics on peer rating qualities. This study aims to address this gap and investigate the effects of key variables related to peer raters’ qualities, including content knowledge, previous rating experience, training on rating tasks, and rating motivation. In an experiment where training and motivation interventions were manipulated, 24 classes with 838 high school students were randomly assigned to study conditions. Inter-rater error, intra-rater error, and criterion error indices for peer ratings on four selected essays were analyzed using hierarchical linear models. Results indicated that peer raters’ content knowledge, previous rating experience, and rating motivation were associated with rating errors. This study also found some significant interactions between peer raters’ quality characteristics. Implications for in-person and online peer assessments as well as future directions are discussed.

17.