Similar Literature
 Found 20 similar articles; search time 31 ms
1.
This paper discusses the item selection problem when the item responses follow a linear multiple factor model. Because of this restrictive assumption, not too unrealistic in situations such as mental testing, it is possible to select optimal sets of items without going through all possible combinations. A method proposed by Elfving to accomplish this is analyzed and then demonstrated through the use of two illustrations. The common and often used procedure of observing the magnitude of the correlation coefficient as an index in item selection is shown to have some merit in the single-factor case. Work performed under contract AF 41(657)-244 with the School of Aviation Medicine, Randolph AFB, Texas.

2.
Two studies examined faking of a 25-item biodata questionnaire. The first study investigated potential and actual faking of the form using three groups: a group told to make themselves look as good as possible, a group told to complete the form honestly, and a group completing the instrument in a real selection situation. Subjects were 58 current employees and 231 job applicants. Results indicated that subjects could fake the instrument when instructed to do so. Also, some faking appeared to be occurring in practice, although results depended upon the composition of the comparison group. Only eight items appeared to be fakable, and only three of these seemed to be faked in practice. In Study 2, 26 business majors rated the biodata items on eight dimensions of item type. Results showed that the three items faked in practice were less historical, objective, discrete, verifiable, and external than other items, and were more job relevant.

3.
With reference to a questionnaire aimed at assessing the performance of Italian nursing homes on the basis of the health conditions of their patients, we investigate two relevant issues: dimensionality of the latent structure and discriminating power of the items composing the questionnaire. The approach is based on a multidimensional item response theory model, which assumes a two-parameter logistic parameterization for the response probabilities. This model represents the health status of a patient by latent variables having a discrete distribution and, therefore, it may be seen as a constrained version of the latent class model. On the basis of the adopted model, we implement a hierarchical clustering algorithm aimed at assessing the actual number of dimensions measured by the questionnaire. These dimensions correspond to disjoint groups of items. Once the number of dimensions is selected, we also study the discriminating power of every item, so that it is possible to select the subset of these items which is able to provide an amount of information close to that of the full set. We illustrate the proposed approach on the basis of the data collected on 1,051 elderly people hosted in a sample of Italian nursing homes.
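
As a minimal sketch of the two-parameter logistic parameterization named in abstract 3, the Python function below computes a response probability from an ability value and an item's discrimination and difficulty. It is a simplification: it is unidimensional and treats ability as a single number, whereas the abstract describes a multidimensional model with discrete latent variables. The names `p_2pl`, `theta`, `a`, and `b` are illustrative assumptions, not taken from the paper.

```python
import math


def p_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a positive response under a unidimensional 2PL model.

    theta: latent trait level (assumed continuous here)
    a:     discrimination (how sharply the item separates trait levels)
    b:     difficulty (trait level at which the probability is 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Two hypothetical items with the same difficulty but different discrimination.
for theta in (-1.0, 0.0, 1.0):
    low = p_2pl(theta, a=1.0, b=0.0)
    high = p_2pl(theta, a=2.0, b=0.0)
    print(f"theta={theta:+.1f}  a=1.0: {low:.3f}  a=2.0: {high:.3f}")
```

The item with the larger `a` moves further from 0.5 on either side of its difficulty, which is the sense in which discrimination measures how much an item tells apart respondents at different trait levels.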

4.
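Multidimensional Testlet-Effect Rasch Model   Cited by: 2 (self-citations: 0, citations by others: 2)
First, this paper interprets the essence of a "testlet" as a set of items that share a common stimulus, and on that basis divides testlet effects into within-item unidimensional testlet effects and within-item multidimensional testlet effects. Second, building on the Rasch model, it develops multidimensional testlet-effect Rasch models for dichotomous and polytomous scoring, with the aim of handling within-item multidimensional testlet effects better. Finally, simulation results show that the new model is effective and reasonable. Comparisons with the Rasch testlet model and the partial credit model indicate that: (1) when a test contains within-item multidimensional testlet effects, separating out only the obvious bundled testlet effects while ignoring other latent testlet effects still biases parameter estimates and can even overestimate test reliability; (2) the new model is more general: even when the response data contain no testlet effect, or only a within-item unidimensional testlet effect, analyzing the test with the new model still yields good parameter estimates.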

5.
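Yu Jiayuan. Acta Psychologica Sinica (《心理学报》), 2002, 34(5): 80-86
The cascade-correlation model from connectionism was used to estimate item parameters and examinee abilities for a continuous-scoring item response theory (IRT) model under small-sample conditions. The response matrix of a group of examinees to a set of items served as the input to the cascade-correlation model, and either the abilities θ of those examinees or the parameters a, b, and c of those items served as its output; the neural network was trained until it could estimate θ, a, b, or c. Computer simulation experiments showed that, if a small number of items in a test are drawn from an item bank, the connectionist method can estimate IRT parameters and examinee abilities well.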

6.
An experiment was conducted to investigate the effects of item order and questionnaire content on faking good or intentional response distortion. It was hypothesized that intentional response distortion would either increase towards the end of a long questionnaire, as learning effects might make it easier to adjust responses to a faking good schema, or decrease because applicants' will to distort responses is reduced if the questionnaire lasts long enough. Furthermore, it was hypothesized that certain types of questionnaire content are especially vulnerable to response distortion. Eighty‐four pre‐selected pilot applicants filled out a questionnaire consisting of 516 items including items from the NEO five factor inventory (NEO FFI), NEO personality inventory revised (NEO PI‐R) and business‐focused inventory of personality (BIP). The positions of the items were varied within the applicant sample to test if responses are affected by item order, and applicants' response behaviour was additionally compared to that of volunteers. Applicants reported significantly higher mean scores than volunteers, and results provide some evidence of decreased faking tendencies towards the end of the questionnaire. Furthermore, it could be demonstrated that lower variances or standard deviations in combination with appropriate (often higher) mean scores can serve as an indicator for faking tendencies in group comparisons, even if effects are not significant.

7.
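Gu Honglei, Wang Caikang. Journal of Psychological Science (《心理科学》), 2012, 35(5): 1247-1253
Taking the Chinese version of the Life Orientation Test-Revised (LOT-R) as an example, this study used the CTCM and CTCU modeling approaches to examine whether item wording effects exist in scales written in Chinese and which trait factors influence them. A questionnaire comprising the Chinese LOT-R and other scales was administered to 334 university students. The results showed that the Chinese LOT-R exhibits an item wording effect, specifically a wording effect of the reverse-keyed items, which may also be called a language labeling effect. After the reverse-keyed wording effect was separated out, the Chinese LOT-R contained only an optimism factor and no longer a pessimism factor. This indicates that optimism and pessimism belong to a single personality trait rather than two distinct traits, and that the reverse-keyed wording effect is a structural error that distorts the structure of the scale. Concern over mistakes and parental expectations were both significantly negatively correlated with the reverse-keyed wording effect: the more participants worried about mistakes, or the higher their parents' expectations, the smaller the wording effect (language labeling effect) of the reverse-keyed items. Social desirability had no influence on the reverse-keyed wording effect.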

8.
He, Yinhong; Chen, Ping. Psychometrika, 2020, 85(1): 35-55

Maintaining the item bank is essential for continuously implementing adaptive tests. Calibration of new items online provides an opportunity to efficiently replenish items for the operational item bank. In this study, a new optimal design for online calibration (referred to as D-c) is proposed by incorporating the idea of the original D-optimal design into the reformed D-optimal design proposed by van der Linden and Ren (Psychometrika 80:263–288, 2015) (denoted as D-VR design). To deal with the dependence of design criteria on the unknown item parameters of new items, Bayesian versions of the locally optimal designs (e.g., D-c and D-VR) are put forward by adding prior information to the new items. In the simulation implementation of the locally optimal designs, five calibration sample sizes were used to obtain different levels of estimation precision for the initial item parameters, and two approaches were used to obtain the prior distributions in Bayesian optimal designs. Results showed that the D-c design performed well and retired a smaller number of new items than the D-VR design at almost all levels of examinee sample size; the Bayesian version of D-c using the prior obtained from the operational items worked better than that using the default priors in BILOG-MG and PARSCALE; and Bayesian optimal designs generally outperformed locally optimal designs when the initial item parameters of the new items were poorly estimated.

9.
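Ma Jie, Liu Hongyun. Journal of Psychological Science (《心理科学》), 2018, (6): 1374-1381
Using empirical data from a high-school English reading test, this study compared the two-parameter logistic model (2PL-IRT) with two-parameter logistic testlet models (2PL-TRT) that include different numbers of testlets, in order to examine how the number of testlets affects parameter estimation and model fit. The results showed that: (1) under the 2PL-IRT model, ability estimates were substantially biased for examinees whose ability lay between -1.50 and 0.50; (2) entering testlets whose testlet effect exceeded 0.50 into the model as locally independent items led to underestimation of the discrimination parameters of some items and overestimation of the difficulty parameters of most items; (3) the larger the testlet effect, the larger the bias in item parameter estimates when the testlet's items were entered into the model as locally independent items.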

10.
Various definitions and different approaches for assessing the complex construct of parental involvement (PI) have led to inconsistent findings regarding the impact of PI on child development. To date, limited information is available regarding the measurement invariance of PI measures across time and groups (e.g., children's gender, ethnicity, and socio-economic status), leaving a concern that group differences in PI might reflect item bias instead of true differences in PI. The present study aimed to obtain a set of optimal items for measuring PI from kindergarten through the elementary school years and investigate whether they could be used for parents from different groups. A Rasch measurement model was implemented to investigate item difficulty, step calibrations, and measurement invariance (here, differential item functioning, DIF). The results from the Early Childhood Longitudinal Study, Kindergarten Class of 1998–1999 data set showed that 20 items can be used to measure three dimensions of PI, namely school/home involvement, family educational investment, and family routines, across four time points. Administration time, children's gender, ethnicity, and socio-economic status showed different levels of effect on item difficulty for half of these items. Practitioners and researchers should be cautious when using these items and are advised to freely estimate the item parameters of DIF items as well as add more items to the PI scale to improve reliability.

11.
Responding to items on a personality questionnaire can evoke a variety of feelings, from discomfort to indifference to pleasure. Harrison Gough reported that when he wrote items for the California Psychological Inventory (CPI; Gough & Bradley, 1996), he tried to make the items as ego-syntonic as possible. Ego-syntonic items are those "which a respondent finds congenial, and on which giving an opinion is a rewarding act" (Gough & Bradley, 1996, p. 10). The present study asked 79 respondents to report how they felt after answering each CPI item. Average affect ratings were above neutral for a majority of items, indicating that Gough had some success in writing ego-syntonic items. Differences in item ego-syntonicity were attributable to other item characteristics. Respondents disliked responding to relatively odd and ambiguous items, items with linguistic negations, and items referring to negative feelings and situations. As predicted by Gough, respondents enjoyed responding to items on the communality scale, items with which most people agree. They also enjoyed items that referred to positive emotions and attitudes and items indicating extraversion, conscientiousness, low neuroticism, and openness to experience. Highly ego-syntonic items were found to be more valid than less ego-syntonic items. Individuals who reported disliking many items were found to be socially anxious. The relations among reports of liking or disliking items, identity, and reputation are discussed, and further research on item response dynamics and validity is proposed.

12.
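Factors Influencing Group Decision Quality under Free Discussion   Cited by: 6 (self-citations: 2, citations by others: 4)
A laboratory experiment examined the factors that influence the quality of group decisions under free discussion and tested the information sampling model proposed by Stasser. The results showed that: (1) Stasser's information sampling model was partially supported. When information was unshared and group members' pre-discussion preferences were fairly consistent, groups did tend to discuss shared information and information about the candidate the group preferred; but when pre-discussion preferences were inconsistent or task difficulty was low, this conclusion did not hold. (2) Under free discussion, increasing group size increased the amount of discussion of shared information but had no significant effect on the extent to which unshared information was discussed. As for task difficulty, shared information held a discussion advantage only when the task was relatively difficult.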

13.
14.
This paper describes several simulation studies that examine the effects of capitalization on chance in the selection of items and the ability estimation in CAT, employing the 3-parameter logistic model. In order to generate different estimation errors for the item parameters, the calibration sample size was manipulated (N = 500, 1000 and 2000 subjects) as was the ratio of item bank size to test length (banks of 197 and 788 items, test lengths of 20 and 40 items), both in a CAT and in a random test. Results show that capitalization on chance is particularly serious in CAT, as revealed by the large positive bias found in the small sample calibration conditions. For broad ranges of theta, the overestimation of the precision (asymptotic Se) reaches levels of 40%, something that does not occur with the RMSE (theta). The problem is greater as the item bank size to test length ratio increases. Potential solutions were tested in a second study, where two exposure control methods were incorporated into the item selection algorithm. Some alternative solutions are discussed.
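
To make the mechanism in abstract 14 concrete, the sketch below shows maximum-information item selection under the 3-parameter logistic model, written in Python. It is an illustration under stated assumptions, not the simulation design of the study: the logistic form omits any scaling constant, the three-item bank is invented, and the names `p_3pl`, `info_3pl`, and `select_item` are hypothetical.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (no scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_item(theta_hat, bank, administered):
    """Pick the not-yet-administered item with the largest information."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: info_3pl(theta_hat, *bank[i]))


# Hypothetical calibrated bank: (a, b, c) estimates for three items.
bank = [(1.0, 0.0, 0.2), (1.8, 0.1, 0.2), (0.8, -0.5, 0.2)]
print(select_item(0.0, bank, administered=set()))
```

Because information grows with the square of the estimated discrimination, a rule like this favors items whose `a` estimates happen to be too high, which is one way to read the capitalization on chance that the abstract reports for small calibration samples.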

15.
Noventa, Stefano; Spoto, Andrea; Heller, Jürgen; Kelava, Augustin. Psychometrika, 2019, 84(2): 395-421
Knowledge space theory (KST) structures are introduced within item response theory (IRT) as a possible way to model local dependence between items. The aim of this paper is...

16.
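To develop a questionnaire for measuring marital attachment in older adults, items from 18 commonly used adult attachment questionnaires served as the initial item pool, and preliminary screening in a pilot test yielded 85 items. These were administered to 434 community-dwelling older adults aged 60 to 88, and item analysis and exploratory factor analysis of the results produced the initial Marital Attachment Questionnaire for Older Adults (《老年人夫妻依恋问卷》). The questionnaire contains 18 items across three dimensions: security, anxiety, and rejection. It has good reliability and validity and can be used to measure marital attachment among older adults in urban communities. Applying the questionnaire to the attachment patterns in the sample showed the following distribution: the secure pattern was the most common, followed by the anxious pattern, with the rejecting pattern the least common.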

17.
Ul Hassan, Mahmood; Miller, Frank. Psychometrika, 2019, 84(4): 1101-1128

Item calibration is a technique to estimate characteristics of questions (called items) for achievement tests. In computerized tests, item calibration is an important tool for maintaining, updating and developing new items for an item bank. To efficiently sample examinees with specific ability levels for this calibration, we use optimal design theory assuming that the probability of answering correctly follows an item response model. Locally optimal unrestricted designs usually have only a few design points for ability. In practice, it is hard to sample examinees from a population with these specific ability levels due to unavailability or limited availability of examinees. To counter this problem, we use the concept of optimal restricted designs and show that this concept fits item calibration naturally. We prove an equivalence theorem needed to verify optimality of a design. Locally optimal restricted designs provide intervals of ability levels for optimal calibration of an item. When assuming a two-parameter logistic model, several scenarios with D-optimal restricted designs are presented for calibration of a single item and simultaneous calibration of several items. These scenarios show that the naive way to sample examinees around unrestricted design points is not optimal.
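
The remark in abstract 17 that locally optimal unrestricted designs have only a few ability points can be illustrated numerically. The Python sketch below evaluates the D-criterion (the determinant of the Fisher information for discrimination `a` and difficulty `b`) of a two-parameter logistic item and grid-searches over two-point designs. It is a simplification under stated assumptions: only symmetric, equally weighted two-point designs are searched, the item parameters are invented, and the function names are hypothetical rather than the paper's.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def d_criterion(thetas, weights, a, b):
    """Determinant of the 2x2 Fisher information for (a, b) of a 2PL item."""
    m_aa = m_ab = m_bb = 0.0
    for theta, w in zip(thetas, weights):
        p = logistic(a * (theta - b))
        pq = p * (1.0 - p)
        g_a = theta - b  # derivative of a * (theta - b) with respect to a
        g_b = -a         # derivative of a * (theta - b) with respect to b
        m_aa += w * pq * g_a * g_a
        m_ab += w * pq * g_a * g_b
        m_bb += w * pq * g_b * g_b
    return m_aa * m_bb - m_ab * m_ab


def best_symmetric_offset(a, b, steps=4000, z_max=4.0):
    """Grid search for z, with design points at theta = b - z/a and b + z/a."""
    best_z, best_det = None, -1.0
    for k in range(1, steps + 1):
        z = z_max * k / steps
        det = d_criterion([b - z / a, b + z / a], [0.5, 0.5], a, b)
        if det > best_det:
            best_z, best_det = z, det
    return best_z


z = best_symmetric_offset(a=1.2, b=0.3)
print(f"offset on the a*(theta - b) scale: {z:.3f}")
print(f"response probabilities at the two points: {logistic(-z):.3f}, {logistic(z):.3f}")
```

Within this restricted family the maximizer sits where `a * (theta - b)` is roughly plus or minus 1.54, so the design asks for examinees at just two exact ability levels. That is the practical difficulty which motivates the restricted designs described in the abstract.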

18.
Most studies using personality inventories do not take individual, subjective understandings of the items into account. The present study is one of the few to have investigated the quality of individuals’ psychological processes when making the Likert-like responses often used in psychological inventories. Respondents were asked to elaborate verbally on their Likert item responses to the 10-item short version of the Big Five Inventory. A common assumption about personality inventories is that there is a relatively homogenous understanding of the items and, in particular, the rating scales across respondents. However, our results suggest that the same item responses to a given item can reflect a variety of qualities across individuals’ understandings. At the same time, similar understandings and ways of relating to an item can lead to different item responses. Such findings have substantial implications for quantitative personality studies as well as quantitative survey or questionnaire studies, in general.

19.
A linking design typically consists of a data collection procedure together with an item linking procedure that places item parameters calibrated from multiple test forms onto a common scale. This study considered 2 potentially useful item response theory linking designs. The first one is characterized by selecting a single set of common items across all multiple test forms, the precalibrated item parameters of which are kept fixed while the unknown parameters of the other items are being estimated. This linking design will be referred to as the fixed common-precalibrated item parameter design. However, data collected under this design could also be analyzed by the characteristic curve method, which constituted an alternative linking procedure. In this study, the relative merits of the 2 linking designs were examined with respect to their robustness against 3 manipulated conditions, namely, when the common items have imprecise estimates, when there is a noticeable difference in the average item difficulty between the common and the noncommon items, and when the examinees are heterogeneous in terms of their abilities. A parameter recovery study was conducted to achieve this purpose. The results indicated that both linking designs were capable of producing accurate linking of items and equivalent estimation of ability parameters under the 3 conditions. When the 2 designs were actually utilized in the development of an item bank, it was found that both linking designs produced quite consistent solutions despite minor differences on some item and ability estimates. Conditions under which one linking design is preferred over the other are also provided in the Discussion section of this article.
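
Abstract 19 does not spell out how parameters are placed on a common scale, so the following Python sketch uses the mean/sigma method as a simpler stand-in. It is neither the fixed common-precalibrated item parameter design nor the characteristic curve method that the study compares; it only illustrates the linear scale transformation that common-item linking estimates. The function names and the difficulty values are hypothetical.

```python
import statistics


def mean_sigma_constants(b_source, b_target):
    """Slope A and intercept B mapping the source scale onto the target scale,
    estimated from the difficulties of the same common items on both scales."""
    A = statistics.pstdev(b_target) / statistics.pstdev(b_source)
    B = statistics.mean(b_target) - A * statistics.mean(b_source)
    return A, B


def rescale_item(a, b, A, B):
    """Transform 2PL parameters so that a * (theta - b) is unchanged
    when abilities are transformed as theta_target = A * theta_source + B."""
    return a / A, A * b + B


# Invented difficulties of three common items on two separately calibrated forms.
b_source = [-1.0, 0.0, 1.0]
b_target = [-0.5, 1.0, 2.5]
A, B = mean_sigma_constants(b_source, b_target)
print(A, B)
print(rescale_item(1.2, 0.0, A, B))
```

The abstract's first manipulated condition, imprecise estimates for the common items, corresponds here to noise in `b_source` and `b_target`, which feeds directly into `A` and `B` and therefore into every rescaled item.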

20.
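Gu Honglei, Wen Zhonglin. Journal of Psychological Science (《心理科学》), 2014, 37(5): 1245-1252
Item wording effects are systematic variance, unrelated to the content being measured, that arises from differences in how items are worded; statistically, an item wording effect model is essentially a bifactor model. Taking the Core Self-Evaluations Scale (CSES) as an example, this study examined how item wording effects influence the reliability and validity of personality tests. The CSES, a life satisfaction scale, and a positive and negative affect scale were administered to 340 members of the "ant tribe" (蚁族). The results showed that, beyond the core self-evaluation trait, the CSES also contains a wording-effect factor for the reverse-keyed items. Ignoring the item wording effect substantially affects the homogeneity reliability and criterion-related validity of the CSES: it overestimates the homogeneity reliability, underestimates the positive correlations of core self-evaluation with life satisfaction and positive affect, and overestimates its negative correlation with negative affect.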


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号