Found 20 similar documents (search time: 15 ms)
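Vertical equating is the process of placing tests that measure the same psychological trait at different levels onto a common score scale. Item response theory (IRT) and multidimensional item response theory (MIRT) are the main methods for vertical equating. IRT does not need to assume a distribution for examinee ability, and its parameter estimates do not depend on the sample, which makes it an effective method for building vertical scales; its use is limited, however, when a test violates the unidimensionality assumption. MIRT extends IRT by combining features of IRT and factor analysis, estimates item parameters and examinee ability parameters for multidimensional tests more effectively, and has important applications in vertical equating. Existing research has mainly examined the applicability, calibration methods, and parameter estimation methods of IRT and MIRT in vertical equating, and has compared the properties of the two methods. Future research should include more variable conditions in comparative studies and extend the applications of these methods.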

Luo Lian (罗莲). 《心理学探新》, 2008, 28(2): 69-74
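This article introduces a new equating method: kernel equating. It first describes the development of kernel equating, its main features, and its five steps (presmoothing, estimation of score probabilities, continuization, equating, and calculation of the standard error of equating). It then describes how kernel equating differs from traditional observed-score equating methods and closes with an evaluation of the method.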
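The continuization and equating steps listed above can be sketched in Python. This is a minimal illustration, not the procedure evaluated in the article: it assumes a Gaussian kernel, skips presmoothing by taking the score probabilities as given, uses a fixed bandwidth of 0.6 chosen only for illustration, and leaves out the standard error of equating. All function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def continuized_cdf(x, scores, probs, h):
    """Gaussian-kernel continuization of a discrete score distribution.

    The shrinkage factor a keeps the mean and variance of the continuized
    distribution equal to those of the discrete one.
    """
    mu = sum(s * p for s, p in zip(scores, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(scores, probs))
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
        for s, p in zip(scores, probs)
    )


def kernel_equate(x, scores_x, probs_x, scores_y, probs_y, h_x=0.6, h_y=0.6):
    """Equate x to the Y scale: solve G_h(y) = F_h(x) for y by bisection."""
    target = continuized_cdf(x, scores_x, probs_x, h_x)
    lo, hi = float(min(scores_y)), float(max(scores_y))
    # Widen the bracket until it contains the target probability.
    while continuized_cdf(lo, scores_y, probs_y, h_y) > target:
        lo -= 1.0
    while continuized_cdf(hi, scores_y, probs_y, h_y) < target:
        hi += 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if continuized_cdf(mid, scores_y, probs_y, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A quick sanity check on the sketch: if form Y has the same score probabilities as form X but every score is two points higher, the equated score should be the X score plus two.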

Liu Tiechuan (刘铁川), Dai Haiqi (戴海琦), Zhao Yu (赵玉). 《心理科学》, 2012, 35(2): 446-451
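Using anchor items to link different test forms is a common equating design. Because of exposure and other factors, however, anchor items may function differently across administrations. This study used the Mantel-Haenszel (MH) test and logistic regression to examine the quality of the anchor items used to equate a large-scale examination in China. Twenty-two anchor items showed parameter drift, which may affect both difficulty and discrimination parameters; most of these items failed the model-fit test when used a second time. Leaving drifting anchor items in place produces a large systematic equating error, so a check for anchor-item parameter drift should be a required step in equating.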
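As a rough illustration of the Mantel-Haenszel approach to screening one anchor item, the Python sketch below computes the MH common odds ratio across matched total-score strata and converts it to the ETS delta metric (MH D-DIF = -2.35 ln α). It assumes the first administration is treated as the reference group and the second as the focal group, which is a choice made here and not stated in the abstract. It covers only the effect-size part of the procedure; the MH chi-square significance test and the logistic regression analysis are omitted. The function name and record layout are hypothetical.

```python
import math
from collections import defaultdict


def mantel_haenszel(records):
    """Return (alpha_MH, MH D-DIF) for one item.

    records: iterable of (group, stratum, correct), where group is "ref" or
    "focal", stratum is the matching score level, and correct is a bool.
    """
    # Per stratum: [ref right, ref wrong, focal right, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, stratum, correct in records:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        tables[stratum][idx] += 1

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if num == 0.0 or den == 0.0:
        raise ValueError("odds ratio undefined for these data")

    alpha = num / den
    # Negative values mean the item is relatively harder for the focal group.
    return alpha, -2.35 * math.log(alpha)
```

An odds ratio near 1 (delta near 0) suggests the item behaves the same way in both administrations after matching on total score; values far from that flag possible drift.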

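This study examined how bandwidth selection method, sample size, number of items, equating design, and data simulation method affect item response theory observed-score kernel equating. Data were generated with two simulation methods, and local and global evaluation indices were computed. Under the random groups design, the bandwidth selection methods performed similarly, and examinee sample size and number of items had little effect. Under the nonequivalent groups design, the penalty method and Silverman's rule of thumb performed best; increasing the number of items reduced both percent relative error and random error, whereas increasing the sample size increased percent relative error and reduced random error. The data simulation method can affect the evaluation of equating. Future work should focus on the evaluation of equating systems.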

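Population invariance of equating for parallel tests under different definitions   Total citations: 1 (self-citations: 0, citations by others: 1)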
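Population invariance is an important assumption of equating: the equating function should be the same for different examinee subgroups. This study presents a theoretical analysis and a simulation study of the population invariance of linear equating under different definitions of parallel tests; in the simulation, the REMSD index was computed with six weighting schemes. The results show that, for strictly parallel tests, REMSD is larger when reliability is lower; that subgroup mean differences and reliability differences have a clear interaction effect on REMSD; and that REMSD is largest when the expectation weights are equal and smallest when the score weights are proportional to subgroup size. The results are discussed, and recommendations are given for the choice of REMSD weights and for further research.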
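For orientation, REMSD (root expected mean square difference) is commonly written in the population-invariance literature in the form below. The notation is an assumption made here and is not taken from the abstract: P is the total population, P_j its subgroups with weights w_j, e_P and e_{P_j} the equating functions from X to Y computed in the total group and in subgroup j, and σ_{Y_P} the standard deviation of Y in P. The six weighting schemes mentioned in the abstract are not specified there, so only generic weights are shown.

```latex
\[
\mathrm{RMSD}(x)
  = \frac{\sqrt{\sum_{j} w_j \,\bigl[e_{P_j}(x) - e_P(x)\bigr]^2}}{\sigma_{Y_P}},
\qquad
\mathrm{REMSD}
  = \frac{\sqrt{\sum_{j} w_j \, E_P\!\left\{\bigl[e_{P_j}(X) - e_P(X)\bigr]^2\right\}}}{\sigma_{Y_P}}
\]
```

RMSD(x) measures subgroup dependence at a single score x; REMSD averages the squared differences over the distribution of X in P, so a value of 0 means the subgroup equating functions coincide with the total-group function.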

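Equating by the item characteristic curve method under the graded response model   Total citations: 2 (self-citations: 0, citations by others: 2)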
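Building an item response theory item bank for tests that combine subjective and objective items requires equating item parameters under a polytomous model. This study derives an item characteristic curve equating method for the graded response model and applies it successfully in an actual equating trial.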

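In the nonequivalent groups anchor test design, what proportion of the test length should the anchor items make up, and can this proportion change as the test gets longer? These questions matter a great deal to practitioners and researchers. With the number of examinees and the test length held fixed, this article examines how the ratio of anchor items to test length (the anchor ratio) affects equating accuracy, and discusses how to balance equating accuracy against the anchor ratio in operational equating. It also presents several indices, obtained under simulation conditions, that reflect actual equating accuracy.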

Mattick and Clarke's (1998) Social Interaction Anxiety Scale (SIAS) and Social Phobia Scale (SPS) are commonly used self-report measures that assess 2 dimensions of social anxiety. Given the need for short, readable measures, this research proposes short forms of both scales. Item-level analyses of readability characteristics of the SIAS and SPS items led to the selection of 6 items from each scale for use in the short forms. The SIAS and SPS short forms had reading levels at approximately the 6th and 5th grade level, respectively. Results using nonclinical (Study 1: N = 469) and clinical (Study 2: N = 145) samples identified these short forms as being factorially sound, possessing adequate internal consistency, and having strong convergence with their full-length counterparts. Moreover, these short forms showed convergence with other measures of social anxiety, showed divergence from measures assessing related constructs, and predicted concurrent interpersonal functioning. Recommendations for the use of these short forms are discussed.

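Using data collected under the nonequivalent groups anchor test design, this study compared four equating methods based on classical test theory. The data came from the TIMSS 1999 database; the standard error of equating and cross-validation served as the criteria for comparison, and the data were analyzed with the CIPE program. For the equating situation set up in this study, linear equating outperformed equipercentile equating. Among the linear methods, the Tucker method performed somewhat better than the Levine observed-score method, and the Braun-Holland method is not recommended; the frequency estimation equipercentile method had large equating error and is likewise not recommended.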

A study of equating error in classical test theory   Total citations: 3 (self-citations: 0, citations by others: 3)
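1 Introduction  Equating uses an anchor test or an anchor group of examinees as a bridge to establish a comparison between the results of two tests of the same trait. Many factors affect the accuracy of equating; the error that examinee sampling introduces is called equating sampling error. Because the examinee sample used for equating is drawn from its population with an unavoidable degree of bias, the equating relationship built from it is also biased to some degree, and this bias is the equating sampling error. If samples are drawn repeatedly from the population and equated with a method that fully fits the data conditions, the mean of the distribution of equating results is the true equated score, and the standard deviation of that distribution is the sampling standard error of equating. This article examines the problem of equating sampling error.  2 Method  2 …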
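The repeated-sampling idea described above can be approximated in Python with a bootstrap. This sketch rests on several assumptions that differ from the article: it resamples the observed samples with replacement in place of drawing fresh samples from the population, it uses simple linear equating for randomly equivalent groups instead of an anchor design, and the function names, replication count, and seed are hypothetical.

```python
import random
from statistics import mean, pstdev, stdev


def linear_equate(x, scores_x, scores_y):
    """Linear equating for randomly equivalent groups (match mean and SD)."""
    slope = pstdev(scores_y) / pstdev(scores_x)
    return mean(scores_y) + slope * (x - mean(scores_x))


def bootstrap_equating_se(x, scores_x, scores_y, n_rep=200, seed=0):
    """Return (mean, SD) of the equated score for x over bootstrap resamples.

    The SD of the resampled equated scores estimates the sampling standard
    error of equating at score x.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        boot_x = rng.choices(scores_x, k=len(scores_x))
        boot_y = rng.choices(scores_y, k=len(scores_y))
        if len(set(boot_x)) < 2:
            continue  # slope is undefined when the resample has no spread
        estimates.append(linear_equate(x, boot_x, boot_y))
    return mean(estimates), stdev(estimates)
```

With larger examinee samples the spread of the resampled equated scores shrinks, which is the sense in which the standard error reflects sampling error only and not any systematic bias in the equating method.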

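Equating methods based on classical test theory (CTT) are mainly of two kinds: linear equating and equipercentile equating. Different methods yield different equating results in different situations. Taking true-score equating as the criterion, this study used Monte Carlo simulation to compare the results of the two CTT equating methods across item difficulty distributions and sample sizes. The results show that (1) the error of linear equating is strongly affected by the item difficulty distribution, whereas the error of equipercentile equating is almost unaffected by it; (2) the error of linear equating is almost unaffected by sample size, whereas the error of equipercentile equating is strongly affected by it; and (3) whatever the item difficulty distribution, equipercentile equating outperforms linear equating provided the sample is large enough.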
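The two CTT methods compared above can be sketched as follows. This is a minimal Python illustration that assumes integer number-correct scores and randomly equivalent groups, with no smoothing; it is not the simulation code used in the study, and the function names are hypothetical. Linear equating matches the mean and standard deviation of the two forms, while equipercentile equating maps a score to the form-Y score with the same percentile rank.

```python
from bisect import bisect_left, bisect_right
from statistics import mean, pstdev


def linear_equate(x, scores_x, scores_y):
    """Map score x on form X to the form-Y scale by matching mean and SD."""
    slope = pstdev(scores_y) / pstdev(scores_x)
    return mean(scores_y) + slope * (x - mean(scores_x))


def percentile_rank(x, scores):
    """Percent of scores below x plus half the percent exactly at x."""
    ordered = sorted(scores)
    below = bisect_left(ordered, x)
    at = bisect_right(ordered, x) - below
    return 100.0 * (below + 0.5 * at) / len(ordered)


def equipercentile_equate(x, scores_x, scores_y, max_y):
    """Find the form-Y score whose percentile rank equals that of x on form X.

    Each integer score y is treated as uniform on [y - 0.5, y + 0.5].
    """
    p = percentile_rank(x, scores_x) / 100.0
    n = len(scores_y)
    cum_prev = 0.0
    for y in range(max_y + 1):
        prop = scores_y.count(y) / n
        cum = cum_prev + prop
        if cum > p:
            # Interpolate inside the interval belonging to score y.
            return y - 0.5 + (p - cum_prev) / prop
        cum_prev = cum
    return max_y + 0.5
```

The linear function needs only two moments per form, which fits the study's finding that it is insensitive to sample size; the equipercentile function depends on the whole frequency distribution, so sparse score points in small samples make it noisier.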

Hongwen Guo. Psychometrika, 2010, 75(3): 438-453
After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and Levine, under the nonequivalent groups with anchor test (NEAT) design. A recursive formula for the ASEE is provided for a series of equatings that makes use of only historical summary statistics. This formula can serve as a new tool to measure the magnitude of equating errors that have accumulated over a series of equatings, and to help monitor and design testing programs.

Li Jing (李晶), Zhang Kan (张侃). 《心理科学》, 2007, 30(2): 268-271
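After participants learned spatial layout materials presented as text or as pictures, a spatial localization task was used to test how judgments of imagined spatial direction differed across facing directions and relative positions. Participants who learned the layout from pictures and then performed the localization task in the same presentation format responded faster than those who used text; in addition, both the text and the picture formats showed an orientation effect and a relative position effect during imagined turning.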

A situational judgment test (SJT) and a Big 5 personality test were administered to 203 participants under instructions to respond honestly and to fake good using a within‐subjects design. Participants indicated both the best and worst response (i.e., Knowledge) and the most likely and least likely response (i.e., Behavioral Tendency) to each situation. Faking effect size for the SJT Behavioral Tendency response format was (d=.34) when participants responded first under honest instructions and (d=.15) when they responded first under faking instructions. Those for the Big 5 dimensions ranged from d=.26 to d=1.0. For the Knowledge response format results were inconsistent. Honest condition Knowledge SJT scores were more highly correlated with cognitive ability (r=.56) than were Behavioral Tendency SJT scores (r=.38). Implications for researchers and practitioners are discussed.

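This study examined how three types of self-evaluation affect the writing of second-year senior high school students. Self-evaluation improved writing performance to some extent but had no significant short-term effect on general learning motivation or self-efficacy. Self-evaluation of the writing product helped improve the accuracy of product self-evaluation; self-evaluation of the writing process alone did not significantly promote writing in the short term; and combined self-evaluation of product and process significantly improved writing.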

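Within the framework of item response theory, and building on the existing literature, this article proposes an approach to developing new test equating criteria: many criteria can be viewed as derived by transforming the distribution of response probabilities on the anchor items. This view reveals the connection between two well-known equating criteria, the Haebara method and the Stocking-Lord method, and yields a new one, the cosine equating criterion. A series of Monte Carlo simulation studies examined how the cosine criterion behaves. Under the polytomous GPCM, the cosine criterion performed better than both the Haebara and Stocking-Lord methods; under the GRM and the 2PLM, it performed worse than the Haebara method but was comparable to the Stocking-Lord method. This finding is a reminder that whether an equating criterion is appropriate depends not only on the range in which the equating coefficients fall but also, and more closely, on the item response function (IRF).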
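For readers unfamiliar with the two established criteria, a simplified one-directional form for the 2PL model is given below. The notation is an assumption made here and is not the article's: V is the set of anchor items, θ_q are ability points on the target (Y) scale, P_j is the item response function, and A and B are the slope and intercept of the scale transformation θ_Y = Aθ_X + B, under which a_j becomes a_j/A and b_j becomes A b_j + B. The cosine criterion is not reproduced because the abstract does not define it.

```latex
\[
P_j(\theta; a, b) = \frac{1}{1 + \exp\{-a(\theta - b)\}}
\]
\[
Q_{\mathrm{Haebara}}(A, B)
  = \sum_{q} \sum_{j \in V}
    \Bigl[ P_j\bigl(\theta_q; a_{Yj}, b_{Yj}\bigr)
         - P_j\bigl(\theta_q; \tfrac{a_{Xj}}{A},\, A\,b_{Xj} + B\bigr) \Bigr]^2
\]
\[
Q_{\mathrm{SL}}(A, B)
  = \sum_{q}
    \Bigl[ \sum_{j \in V} P_j\bigl(\theta_q; a_{Yj}, b_{Yj}\bigr)
         - \sum_{j \in V} P_j\bigl(\theta_q; \tfrac{a_{Xj}}{A},\, A\,b_{Xj} + B\bigr) \Bigr]^2
\]
```

Both criteria choose A and B to minimize Q. The Haebara criterion sums squared differences item by item, whereas the Stocking-Lord criterion squares the difference between the two test characteristic curves, so item-level discrepancies of opposite sign can cancel in the latter.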

To reduce faking on personality tests, applicants may be warned that a social desirability scale is embedded in the test. Although this procedure has been shown to substantially reduce faking, there is no data that addresses how such a warning may influence applicant reactions toward the selection procedure or the relationships among personality constructs. Using an organizational justice framework, this study examines the effect of warning on procedural justice perceptions. Additionally, the extent to which warning changes the relationships among personality variables, socially desirable responding, and organizational justice variables, was explored. The results suggest that warning did not negatively affect test‐taker reactions. However, the relationships among the justice measures and the personality variables and socially desirable responding differed across the warned and unwarned groups. The organizational justice model fit best and there was less multicollinearity among the personality variables in the warned condition, compared to the unwarned condition. Thus, providing a warning appears to have positive consequences when using personality measures.

The purpose of this study was to gather evidence on the validity of the Vando R-A Scale, a paper-and-pencil measure of perceptual reactance. The Vando R-A Scale and Petrie's kinesthetic aftereffect measure of perceptual reactance were administered to 46 participants drawn from university undergraduates. The Vando R-A Scale was not a valid measure of perceptual reactance. The continued use of the Vando R-A Scale as an alternate measure of perceptual reactance is contraindicated.

Little is known about how assessment center exercises might be designed to better elicit job-relevant behavior. This study uses trait activation theory as a theoretical lens for increasing the number of behaviors that can be observed in assessment centers. Two standardized exercise stimuli (specific exercise instructions and role-player prompts) are proposed, and their effects on the observability of candidate behavior are examined. Results showed a significant effect of role-player prompts in increasing both the general number of behavioral observations and the number of behavioral observations related to three out of four dimensions. Specific exercise instructions did not have effects on observability. Implications for trait activation theory and assessment center practice are discussed.
