Found 20 similar articles (search time: 15 ms)
1.
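Vertical equating is the process of placing tests that measure the same psychological trait at different levels onto a common score scale. IRT and MIRT are the main methods for vertical equating. IRT requires no assumption about the examinees' ability distribution, and its parameter estimates do not depend on the sample, which makes it an effective way to build vertical scales; its use is limited, however, when a test violates the unidimensionality assumption. MIRT extends IRT by combining features of IRT and factor analysis, estimates the item parameters and examinee ability parameters of multidimensional tests more effectively, and has important applications in vertical equating. Existing research has mainly examined the applicability, calibration methods, and parameter estimation methods of IRT and MIRT in vertical equating, and has compared the properties of the two approaches. Future research should include more variable conditions in comparative studies and extend the applications of both methods.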
2.
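This article introduces a new equating method, kernel equating. It first describes how kernel equating was developed, its main features, and its five steps (presmoothing, estimating the score probabilities, continuization, equating, and computing the standard error of equating). It then describes how kernel equating differs from traditional observed-score equating methods, and closes with an evaluation of the method.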
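The continuization and equating steps named in this abstract can be sketched as follows. This is a minimal illustration, not the article's procedure: it assumes a Gaussian kernel, takes the score probabilities as given (so presmoothing is skipped), uses fixed bandwidths `h_x` and `h_y` rather than selecting them, and omits the standard error of equating. All function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def continuized_cdf(x, scores, probs, h):
    """Gaussian-kernel continuization of a discrete score distribution.

    The shrinkage factor `a` keeps the mean and variance of the
    continuized distribution equal to those of the discrete one.
    """
    mu = sum(p * s for s, p in zip(scores, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(scores, probs))
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
        for s, p in zip(scores, probs)
    )


def kernel_equate(x, scores_x, probs_x, scores_y, probs_y, h_x, h_y):
    """Equate x to the Y scale: solve G_h(y) = F_h(x) for y by bisection."""
    target = continuized_cdf(x, scores_x, probs_x, h_x)
    lo = min(scores_y) - 10.0 * h_y  # continuized CDF is ~0 here
    hi = max(scores_y) + 10.0 * h_y  # continuized CDF is ~1 here
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if continuized_cdf(mid, scores_y, probs_y, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed toy data: form Y has the same distribution as form X, shifted up by one point.
scores_x = [0, 1, 2, 3, 4]
scores_y = [1, 2, 3, 4, 5]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
equated = kernel_equate(2, scores_x, probs, scores_y, probs, h_x=0.6, h_y=0.6)
```

Because the two toy distributions differ only by a one-point shift, the equated value of a form-X score of 2 should land very close to 3 on the form-Y scale.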
3.
4.
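This study examines how the bandwidth selection method, sample size, number of items, equating design, and data simulation method affect IRT observed-score kernel equating. Data were generated with two simulation methods, and local and global evaluation indices were computed. In the random groups design, the bandwidth selection methods performed similarly, and examinee sample size and number of items had little effect. In the nonequivalent groups design, the penalty method and Silverman's rule of thumb performed best; increasing the number of items reduced both the percent relative error and the random error; increasing the sample size increased the percent relative error but reduced the random error. The data simulation method can affect the evaluation of equating. Future work should focus on the evaluation of equating systems.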
5.
6.
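A study of item characteristic curve equating under the graded response model  Total citations: 2, self-citations: 0, citations by others: 2
Building an item response theory item bank for tests that combine subjective and objective items requires equating item parameters under polytomous models. This study derives an item characteristic curve equating method for the graded response model and applies it successfully in an actual equating experiment.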
7.
8.
Thomas A. Fergus David P. Valentiner Patrick B. McGrath Stephanie L. Gier-Lonsway Hyun-Soo Kim 《Journal of personality assessment》2013,95(3):310-320
Mattick and Clarke's (1998) Social Interaction Anxiety Scale (SIAS) and Social Phobia Scale (SPS) are commonly used self-report measures that assess 2 dimensions of social anxiety. Given the need for short, readable measures, this research proposes short forms of both scales. Item-level analyses of readability characteristics of the SIAS and SPS items led to the selection of 6 items from each scale for use in the short forms. The SIAS and SPS short forms had reading levels at approximately the 6th and 5th grade level, respectively. Results using nonclinical (Study 1: N = 469) and clinical (Study 2: N = 145) samples identified these short forms as being factorially sound, possessing adequate internal consistency, and having strong convergence with their full-length counterparts. Moreover, these short forms showed convergence with other measures of social anxiety, showed divergence from measures assessing related constructs, and predicted concurrent interpersonal functioning. Recommendations for the use of these short forms are discussed.
9.
10.
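A study of equating error in classical test theory  Total citations: 3, self-citations: 0, citations by others: 3
1 Introduction. Equating uses anchor tests or anchor examinee groups as a bridge to establish a comparable relationship between the results of two tests that measure the same trait. Many factors affect the accuracy of equating; the error that examinee sampling introduces into equating is called equating sampling error. That is, because the examinee sample used for equating is drawn from its population by sampling that is unavoidably biased to some degree, the equating relationship built on it is also biased to some degree, and this bias is the equating sampling error. If samples are drawn repeatedly from the population and equated with a method that fully fits the data conditions, the mean of the distribution of equating results is the true equated score and the standard deviation of that distribution is the sampling standard error of equating. This article examines the problem of equating sampling error. 2 Research method 2 …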
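The repeated-sampling definition in this abstract can be illustrated with a short simulation. The sketch below is an assumption-laden example, not the article's method: it uses linear equating under a random groups design, and the function names, sample sizes, and toy populations are all hypothetical.

```python
import random
import statistics


def linear_equate(x, scores_x, scores_y):
    """Linear equating: match the mean and SD of form X to those of form Y."""
    mu_x, mu_y = statistics.fmean(scores_x), statistics.fmean(scores_y)
    sd_x, sd_y = statistics.pstdev(scores_x), statistics.pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


def equating_sampling_error(x, pop_x, pop_y, n, reps, seed=0):
    """Draw `reps` pairs of samples of size n, equate x in each, and summarize.

    Returns (mean of equated scores, SD of equated scores). The SD is the
    empirical sampling standard error of equating at score x.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(reps):
        sample_x = rng.sample(pop_x, n)  # sampling without replacement
        sample_y = rng.sample(pop_y, n)
        results.append(linear_equate(x, sample_x, sample_y))
    return statistics.fmean(results), statistics.stdev(results)


# Assumed toy populations: form Y scores run two points higher than form X scores.
pop_x = [i % 41 for i in range(4100)]
pop_y = [s + 2 for s in pop_x]
mean_small, se_small = equating_sampling_error(20, pop_x, pop_y, n=50, reps=200, seed=1)
mean_large, se_large = equating_sampling_error(20, pop_x, pop_y, n=400, reps=200, seed=1)
```

With these populations the mean equated score should sit near 22, and the standard error for samples of 400 should be smaller than for samples of 50.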
11.
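The main equating methods based on classical test theory (CTT) are linear equating and equipercentile equating. Different equating methods yield different results in different settings. Taking true-score equating as the criterion, this study used Monte Carlo simulation to compare the results of the two CTT equating methods across a range of item difficulty distributions and sample sizes. The results show that: (1) the error of linear equating is strongly affected by the item difficulty distribution, whereas the error of equipercentile equating is almost unaffected by it; (2) the error of linear equating is almost unaffected by sample size, whereas the error of equipercentile equating is strongly affected by it; (3) whatever the item difficulty distribution, equipercentile equating outperforms linear equating as long as the sample is large enough.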
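For readers unfamiliar with the two CTT methods compared here, the following is a minimal sketch of each under a random groups design. It is not the study's simulation code. The equipercentile version assumes integer scores from 0 to `max_score` and uses the usual convention that each integer score is spread uniformly over half a point on either side; the function names and toy data are hypothetical.

```python
import statistics


def linear_equate(x, scores_x, scores_y):
    """Map score x on form X to the form-Y scale by matching mean and SD."""
    mu_x, mu_y = statistics.fmean(scores_x), statistics.fmean(scores_y)
    sd_x, sd_y = statistics.pstdev(scores_x), statistics.pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


def percentile_rank(x, scores):
    """Percent of scores below x plus half the percent of scores equal to x."""
    below = sum(1 for s in scores if s < x)
    at = sum(1 for s in scores if s == x)
    return 100.0 * (below + 0.5 * at) / len(scores)


def equipercentile_equate(x, scores_x, scores_y, max_score):
    """Find the form-Y score with the same percentile rank as x has on form X."""
    target = percentile_rank(x, scores_x) / 100.0
    n = len(scores_y)
    cum = 0.0  # proportion of form-Y scores strictly below the current y
    for y in range(max_score + 1):
        f = sum(1 for s in scores_y if s == y) / n
        if cum + f > target:
            # The target falls inside the interval [y - 0.5, y + 0.5).
            return y - 0.5 + (target - cum) / f
        cum += f
    return max_score + 0.5


# Assumed toy data: form Y scores run one point higher than form X scores.
form_x = [0, 1, 2, 3, 4]
form_y = [1, 2, 3, 4, 5]
linear_result = linear_equate(2, form_x, form_y)
equipercentile_result = equipercentile_equate(2, form_x, form_y, max_score=5)
```

On this toy data both methods should map a form-X score of 2 to roughly 3 on form Y. The methods diverge when the two score distributions differ in shape, since linear equating matches only the first two moments.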
12.
Hongwen Guo 《Psychometrika》2010,75(3):438-453
After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and Levine, under the nonequivalent groups with anchor test (NEAT) design. A recursive formula for the ASEE is provided for a series of equatings that makes use of only historical summary statistics. This formula can serve as a new tool to measure the magnitude of equating errors that have accumulated over a series of equatings, and to help monitor and design testing programs.
13.
14.
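After participants learned spatial layout materials presented as text or as pictures, a spatial localization task was used to test how judgments of imagined spatial direction differed across orientation and relative-position conditions. The results show that when a layout was learned from pictures and the localization task was then performed from memory in the same display format, judgments were faster than when text was used. In addition, whether the material was presented as text or as pictures, imagined reorientation showed both an orientation effect and a relative-position effect.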
15.
Nhung T. Nguyen Michael D. Biderman Michael A. McDaniel 《International Journal of Selection & Assessment》2005,13(4):250-260
A situational judgment test (SJT) and a Big 5 personality test were administered to 203 participants under instructions to respond honestly and to fake good using a within‐subjects design. Participants indicated both the best and worst response (i.e., Knowledge) and the most likely and least likely response (i.e., Behavioral Tendency) to each situation. Faking effect size for the SJT Behavioral Tendency response format was (d=.34) when participants responded first under honest instructions and (d=.15) when they responded first under faking instructions. Those for the Big 5 dimensions ranged from d=.26 to d=1.0. For the Knowledge response format results were inconsistent. Honest condition Knowledge SJT scores were more highly correlated with cognitive ability (r=.56) than were Behavioral Tendency SJT scores (r=.38). Implications for researchers and practitioners are discussed.
16.
17.
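Within the framework of item response theory and building on the existing literature, this paper proposes an approach for developing new test equating criteria: many criteria can be viewed as derived by transforming the response probability distributions on the anchor items. On this basis it reveals the connection between two well-known equating criteria, the Haebara method and the Stocking-Lord method, and derives a new one, the cosine equating criterion. A series of Monte Carlo simulation studies was run to examine how the cosine criterion behaves. The simulations show that under the polytomous GPCM the cosine criterion outperforms both the Haebara and Stocking-Lord methods, whereas under the GRM and 2PLM it performs worse than Haebara but comparably to Stocking-Lord. This finding suggests that whether an equating criterion is appropriate depends not only on the range in which the equating coefficients fall but also, and more closely, on the item response function (IRF).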
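The two established criteria mentioned in this abstract can be written as loss functions over the linking coefficients. The sketch below is an illustration under stated assumptions, not the paper's code: it uses the 2PL model without the 1.7 scaling constant, evaluates the losses on a fixed grid of ability points with equal weights, and considers only one direction of the transformation. The names `A`, `B`, and all functions are hypothetical, and the cosine criterion is not shown because the abstract does not give its formula.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response (no 1.7 scaling constant)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def transform(items, A, B):
    """Move (a, b) item parameters onto the base scale: a / A and A * b + B."""
    return [(a / A, A * b + B) for a, b in items]


def haebara_loss(base_items, new_items, A, B, thetas):
    """Sum of squared differences between item characteristic curves."""
    moved = transform(new_items, A, B)
    return sum(
        (p_2pl(t, a0, b0) - p_2pl(t, a1, b1)) ** 2
        for t in thetas
        for (a0, b0), (a1, b1) in zip(base_items, moved)
    )


def stocking_lord_loss(base_items, new_items, A, B, thetas):
    """Sum of squared differences between test characteristic curves."""
    moved = transform(new_items, A, B)
    total = 0.0
    for t in thetas:
        tcc_base = sum(p_2pl(t, a, b) for a, b in base_items)
        tcc_moved = sum(p_2pl(t, a, b) for a, b in moved)
        total += (tcc_base - tcc_moved) ** 2
    return total


# Assumed anchor items on the new scale, and the same items on the base scale
# generated with known coefficients A = 1.2 and B = 0.3.
new_items = [(1.0, 0.0), (1.5, -0.5), (0.8, 1.0)]
base_items = transform(new_items, 1.2, 0.3)
thetas = [-3.0 + 0.5 * k for k in range(13)]

h_true = haebara_loss(base_items, new_items, 1.2, 0.3, thetas)
h_wrong = haebara_loss(base_items, new_items, 1.0, 0.0, thetas)
sl_true = stocking_lord_loss(base_items, new_items, 1.2, 0.3, thetas)
sl_wrong = stocking_lord_loss(base_items, new_items, 1.0, 0.0, thetas)
```

Both losses are zero at the generating coefficients and positive elsewhere in this example. The difference between them is where the squaring happens: Haebara squares item by item, while Stocking-Lord squares after summing over items, so item-level discrepancies of opposite sign can cancel in the latter.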
18.
Lynn A. Mcfarland 《International Journal of Selection & Assessment》2003,11(4):265-276
To reduce faking on personality tests, applicants may be warned that a social desirability scale is embedded in the test. Although this procedure has been shown to substantially reduce faking, there is no data that addresses how such a warning may influence applicant reactions toward the selection procedure or the relationships among personality constructs. Using an organizational justice framework, this study examines the effect of warning on procedural justice perceptions. Additionally, the extent to which warning changes the relationships among personality variables, socially desirable responding, and organizational justice variables, was explored. The results suggest that warning did not negatively affect test‐taker reactions. However, the relationships among the justice measures and the personality variables and socially desirable responding differed across the warned and unwarned groups. The organizational justice model fit best and there was less multicollinearity among the personality variables in the warned condition, compared to the unwarned condition. Thus, providing a warning appears to have positive consequences when using personality measures.
19.
The purpose of this study was to gather evidence on the validity of the Vando R-A Scale, a paper-and-pencil measure of perceptual reactance. The Vando R-A Scale and Petrie's kinesthetic aftereffect measure of perceptual reactance were administered to 46 participants drawn from university undergraduates. The Vando R-A Scale was not a valid measure of perceptual reactance. The continued use of the Vando R-A Scale as an alternate measure of perceptual reactance is contraindicated.
20.
Little is known about how assessment center exercises might be designed to better elicit job-relevant behavior. This study uses trait activation theory as a theoretical lens for increasing the number of behaviors that can be observed in assessment centers. Two standardized exercise stimuli (specific exercise instructions and role-player prompts) are proposed, and their effects on the observability of candidate behavior are examined. Results showed a significant effect of role-player prompts in increasing both the general number of behavioral observations and the number of behavioral observations related to three out of four dimensions. Specific exercise instructions did not have effects on observability. Implications for trait activation theory and assessment center practice are discussed.