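Item position effect (IPE) refers to the change in an item's parameters across tests that results from a change in the item's position, after the influence of random error has been removed. The presence of IPE seriously threatens applications that rely on the parameter-invariance property of item response theory, such as test equating and computerized adaptive testing. Current research in this area focuses mainly on detecting IPE; explaining the detected effects further should be a priority for future research. In addition, examining IPE in depth across different research contexts is important for both basic research and practice.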

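Conditional reasoning tests focus on how people solve problems that superficially resemble traditional inductive reasoning items. Their real purpose is to assess respondents' personality tendencies according to whether they regard solutions based on a particular implicit bias as reasonable. Evidence suggests that this method can effectively guard against problems such as the deliberate distortion of self-report questionnaires and yield more reliable results. The approach has achieved initial success in two research domains, achievement motivation and aggression, where the tests show satisfactory reliability and validity. It should be noted, however, that this new approach is still developing, and issues such as administration methods, test construction, and extension of the approach require further study.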

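Computer-based tests can record examinees' item response times (RT). As an important source of auxiliary information, RT is valuable for test development and administration, particularly in computerized adaptive testing (CAT). This paper briefly introduces and comments on applications of RT to item selection in CAT and analyzes the practical feasibility of these techniques. Finally, it discusses the current problems in applying RT to CAT item selection and directions for further research.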

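Person-fit indices are a new method for examining aberrant score patterns in psychological tests. The study examined the effects of eight fit indices (G, C, MCI, U3, U, W, ECI6, and L) on the reliability and validity of the Eysenck Personality Questionnaire, as well as the correlation between each index and the number of inconsistent responses to positively and negatively worded items. The results showed that after removing different proportions of poorly fitting individuals, the reliability and validity of the test improved markedly. Person-fit statistics (PFS) could also identify acquiescent response bias in personality tests. Among the indices, L produced the best improvement in reliability and validity.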

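The frame-of-reference effect in personality testing refers to the phenomenon whereby adding a specific frame of reference to a general personality test improves the test's criterion-related validity. Over the past decade or so, the focus of research on the effect has gradually shifted from gathering validity evidence toward exploring its underlying mechanisms. Researchers have tried to explain the psychometric principles behind the phenomenon through the logical link between the frame of reference and the criterion, and through between-person and within-person variability across frames of reference. At the construct level, a "hierarchical model of personality and role identity" has been proposed to account for the personality mechanism behind the effect. However, exploration of this topic is still at an early stage; future research could seek breakthroughs in areas such as the operational paradigms for frames of reference and the moderating mechanisms of the effect.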

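By administering positively and negatively worded versions of the NEO-FFI and EPQ, the study examined the characteristics of acquiescent, extreme, midpoint, and flexible response styles among Chinese high school students, and the effect of changing item wording direction on the reliability and validity of personality tests. The results showed that response styles do exist among Chinese high school students: the midpoint and flexible styles affected the tests most severely, followed by the extreme style, whereas the acquiescent style may not constitute a bias. The reliability and validity of the NEO-FFI declined when negatively worded items were used, suggesting that high school students, because of their limited educational level, have difficulty understanding negatively worded items.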

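The item attribute vector balance (IAVB) strategy, designed for cognitive diagnostic tests, emphasizes that a test must reflect the cognitive model and takes the item attribute vector, rather than the single attribute, as the unit of examination. The strategy overcomes the limitation that the strict attribute balance (SAB) strategy applies only to independent structures. When each item measures (roughly) the same number of attributes, and with the pattern match ratio (PMR) as the criterion, it outperforms non-IAVB strategies. In particular, when the attribute hierarchy is an independent structure, tests using the IAVB strategy perform best, tests using the SAB strategy come second, and tests using neither strategy perform worst. In addition, the IAVB matrix can significantly improve the PMR.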

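The response options of multiple-choice items can provide additional diagnostic information. To make full use of option information, the study proposes two nonparametric item selection strategies and variable-length termination rules for handling option information from multiple-choice items in cognitive diagnostic computerized adaptive testing (CD-CAT). The simulation results showed that: (1) under fixed-length conditions, the two nonparametric item selection strategies were generally more accurate in classification than the parametric strategies; (2) the two nonparametric strategies yielded more balanced item bank usage than the parametric strategies; (3) the nonparametric strategies achieved higher classification accuracy under the two new variable-length termination rules; (4) both nonparametric strategies are suitable for multiple-choice CD-CAT, and users may choose either one for test analysis.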

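Methods for Controlling Faking in Personality Tests
Respondents can easily fake on personality tests, which seriously undermines the tests' validity. Assessment experts have proposed a number of methods for dealing with faking, which fall into two broad categories: proactive control techniques and post hoc detection techniques. The former include forced-choice scales, warnings, and the bogus pipeline technique; the latter include faking-detection scales, IRT, and response-time detection techniques. At present, embedding faking-detection scales in personality tests and adding warnings to test instructions are the two more effective methods, and the development of forced-choice scales is also promising. Because researchers know little about the internal mechanisms by which faking occurs, the development of IRT and response-time detection techniques has been constrained.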

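With the development of psychological and educational measurement research and advances in technology, computerized (large-scale) testing has attracted growing attention. This study aims to explore how response time data can be used to help assess multidimensional latent abilities in computerized multidimensional tests, and to provide theoretical support, in terms of data analysis methods, for monitoring educational quality at the compulsory education stage in China. Using data from the computerized mathematics tests of the 2012 and 2015 Programme for International Student Assessment (PISA) as examples, it proposes a joint multidimensional Rasch model of responses and response times that uses response time and response accuracy data simultaneously. Analysis of the PISA data with the new model indicates that introducing response time data not only helps improve the precision of model parameter estimates but also helps data analysts use examinees' response time information for further decisions and interventions (e.g., diagnosing aberrant response behavior or prerequisite knowledge).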

Using confirmatory factor analyses, we examined method effects on Rosenberg's Self-Esteem Scale (RSES; Rosenberg, 1965) in a sample of older European adults. Nine hundred forty-nine community-dwelling adults 60 years of age or older from 5 European countries completed the RSES as well as measures of depression and life satisfaction. The 2 models that had an acceptable fit with the data included method effects. The method effects were associated with both positively and negatively worded items. Method effects models were invariant across gender and age, but not across countries. Both depression and life satisfaction predicted method effects. Individuals with higher depression scores and lower life satisfaction scores were more likely to endorse negatively phrased items.

Including equal numbers of positively and negatively keyed items is common in Five-Factor Model (FFM) personality measures. Much literature has demonstrated the presence of positive and negative keying factors in low-stakes testing situations, but there is a dearth of research investigating these factors in high-stakes testing. To address this gap, we investigated whether an FFM measure used in high-stakes testing was influenced by positive and negative keying factors. We also examined the overlap of the positive and negative keying factors with social desirability, rule-consciousness, acquiescence, and cognitive ability. Confirmatory factor analysis supported the inclusion of distinct factors associated with positively and negatively keyed items and suggested that the keying factors accounted for a substantial portion of variation in responses to FFM items. Social desirability and rule-consciousness were found to have significant relations with both keying factors, whereas acquiescence was only related to the negative keying factor. Implications for the construct validity of FFM measures used in high-stakes testing and directions for future research are discussed.

The aim of this study was to investigate the relationship between different personality variables and pathological gambling (PG). The NEO-FFI and measures of impulsivity and sensation-seeking were administered to a sample of pathological gamblers (n = 90) and to a contrast group of non-pathological gamblers (n = 66) matched on sex and age. Gender, age, education level and the personality variables were entered into crude and adjusted logistic regression analyses with PG-status as the dependent variable. The results showed that educational level and all personality variables were significant predictors of PG in the crude analyses; however, only four of the 12 significant predictor variables (Neuroticism, Openness, Impulsivity, and need for Stimulus Intensity) remained significant in the adjusted analysis. All predictor variables accounted for 71% of the variance in PG-status. Clinical implications of the findings are discussed.

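Previous studies have found that psychological states such as college students' mental health change with social change. How, then, does personality, as an internal trait of the individual, change over time? This study conducted a cross-temporal meta-analysis of 65 studies from 2004 to 2013 that used five-factor personality inventories, to reveal trends in the personality traits of 47,029 college students. The results showed that: (1) From 2004 to 2013, college students' scores on all five personality factors were significantly positively correlated with year, indicating an overall change in personality traits. (2) Scores on neuroticism, extraversion, openness, and conscientiousness each rose by more than one standard deviation over the 10 years, with d values between 1.06 and 1.30, and agreeableness rose by 0.57 standard deviations. College students became more extraverted, open, conscientious, and agreeable, but also less emotionally stable. (3) Changes in the personality traits of male and female students showed both common trends and clear differences: male students' scores rose significantly on all five factors, whereas female students' openness did not change and their agreeableness declined somewhat.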
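
The abstract above reports change over time as d values but does not say how they were computed. As a rough illustration only, the sketch below follows one approach often used in cross-temporal meta-analysis: regress study-level mean scores on year of data collection, weighting by sample size, then divide the predicted change between the first and last year by the average study-level standard deviation. The function names and the input numbers are hypothetical and are not taken from the study.

```python
from statistics import fmean


def weighted_linear_fit(years, means, weights):
    """Weighted least-squares fit of means on years; returns (slope, intercept)."""
    w_total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, years)) / w_total
    y_bar = sum(w * y for w, y in zip(weights, means)) / w_total
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, years, means))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, years))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def cross_temporal_d(years, means, sds, ns, start, end):
    """Predicted change in mean score from start to end, in average-SD units.

    Assumption: the average of the study-level SDs is used as the
    denominator; the source abstract does not confirm this choice.
    """
    slope, intercept = weighted_linear_fit(years, means, ns)
    m_start = intercept + slope * start
    m_end = intercept + slope * end
    return (m_end - m_start) / fmean(sds)


# Illustrative study-level inputs (made up, not data from the abstract).
years = [2004, 2007, 2010, 2013]
means = [30.0, 33.0, 36.0, 39.0]
sds = [9.0, 9.0, 9.0, 9.0]
ns = [200, 150, 300, 250]
print(cross_temporal_d(years, means, sds, ns, start=2004, end=2013))
```

Weighting by sample size lets larger studies contribute more to the fitted trend; an unweighted fit would be obtained by passing equal weights.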

Reference-group effects (discovered in cross-cultural settings) occur when responses to self-report items are based not on respondents’ absolute level of a construct but rather on their level relative to a salient comparison group. In this article, we examine the impact of reference-group effects on the assessment of self-reported personality and attitudes. Two studies illustrate that a reference-group effect can be induced by small changes to instruction sets, changes that mirror the instruction sets of commonly used measures of personality. Scales that specified different reference groups showed substantial reductions in criterion-related validities for academic performance, self-reported counterproductive behaviors, and self-reported health outcomes relative to reference-group-free versions of those scales.


The authors investigated 3 aspects of the learned helplessness (LH) phenomenon: the induction of helplessness in humans by a new instrumental task, the effects of a therapy technique that relies on direct retroactive reevaluation of the helplessness experience, and the role of personality characteristics in both helplessness induction and therapy. The sample consisted of 92 Turkish Boğaziçi University undergraduates, 42 men and 50 women. The authors exposed 2 experimental groups to an LH induction by presenting them with an unsolvable maze task; 1 group received therapy afterward, and the other group did not. There were also 2 control groups: a group that received only a solvable version of the maze and another group that received no treatment. Before the experimental procedure, all participants completed the Turkish version of the NEO-Five Factor Inventory (FFI). The authors evaluated picture-rating and anagram-solving performances to differentiate the cognitive and emotional deficits of LH. Results of the factorial analyses of variance and the Wilcoxon signed ranks test supported the success of both the helplessness induction and the therapy technique. Although no significant gender differences were found in the effects of the helplessness-induction and therapy procedures, correlation analyses revealed that individual differences, particularly in the interaction between gender and personality characteristics, can have an important impact on LH and on the capacity to benefit from therapy.

Two studies examined whether the middle response option in graphic rating scales indicates a moderate standing on a trait/item, or rather a “dumping ground” for unsure or non-applicable (N/A) responses. Study One identified middle response-option dysfunction. Study Two indicated that respondents use the middle response option as an N/A proxy, even under implicit ‘skip if you do not know’ instructional sets. Although middle response category ‘misuse’ did not adversely affect reliability and validity in these studies, it is recommended that assessment developers (especially in on-line administration contexts) regularly include an N/A response option when administering graphic rating scales.
