1.

The area between two item characteristic curves (total citations: 1; self-citations: 0; citations by others: 1)

Nambury S. Raju 《Psychometrika》1988,53(4):495-502

Formulas for computing the exact signed and unsigned areas between two item characteristic curves (ICCs) are presented. It is further shown that when the c parameters are unequal, the area between two ICCs is infinite. The significance of the exact area measures for item bias research is discussed. The author expresses his appreciation to Jeffrey A. Slinde, Stephen Steinhaus, Audrey Qualls-Payne, Ivo Molenaar, and two anonymous reviewers for their very helpful and constructive comments.
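
As a rough illustration of what such area measures look like, the sketch below assumes the three-parameter logistic ICC with a common lower asymptote c and the scaling constant D = 1.7. The closed forms are the ones commonly cited for this setting; they are not given in the abstract, so treat them as assumptions and check them against the paper. The function names are hypothetical, and a plain trapezoidal integral is included as an independent check.

```python
import math

D = 1.7  # conventional logistic scaling constant (an assumption of this sketch)


def icc(theta, a, b, c):
    """Three-parameter logistic item characteristic curve."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def signed_area(b1, b2, c):
    """Closed-form integral of (ICC1 - ICC2) when both items share c."""
    return (1.0 - c) * (b2 - b1)


def unsigned_area(a1, b1, a2, b2, c):
    """Closed-form integral of |ICC1 - ICC2| when both items share c."""
    if a1 == a2:
        return (1.0 - c) * abs(b2 - b1)
    k = D * a1 * a2 * (b2 - b1) / (a2 - a1)
    # ln(1 + e^k), written so that large |k| does not overflow
    softplus = max(k, 0.0) + math.log1p(math.exp(-abs(k)))
    inner = 2.0 * (a2 - a1) / (D * a1 * a2) * softplus - (b2 - b1)
    return (1.0 - c) * abs(inner)


def numeric_area(item1, item2, lo=-40.0, hi=40.0, n=80000, absolute=False):
    """Trapezoidal integral of the ICC difference; each item is (a, b, c)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        theta = lo + i * h
        diff = icc(theta, *item1) - icc(theta, *item2)
        if absolute:
            diff = abs(diff)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * diff
    return total * h
```

With equal c the numerical integral settles as the bounds widen. With unequal c the curves differ by a constant in the lower tail, so `numeric_area` keeps growing with the integration range, which is the finite-range counterpart of the infinite area mentioned in the abstract.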

2.

Optimal item difficulty for the three-parameter normal ogive response model

John H. Wolfe 《Psychometrika》1981,46(4):461-464

In tailored testing, it is important to determine the optimal difficulty of the next item to present to the examinee. This paper shows that the difference θ – b that maximizes information for the three-parameter normal ogive response model is approximately 1.7 times the optimal difference θ – b for the three-parameter logistic model. Under the normal model, calculation of the optimal difficulty for minimizing the Bayes risk is equivalent to maximizing an associated information function. The views expressed herein are those of the author and do not necessarily reflect those of the Department of the Navy.
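
The logistic side of this comparison can be sketched numerically. The code below assumes the standard three-parameter logistic information function with scaling constant D = 1.7 and the textbook closed form for the information-maximizing difference between ability θ and difficulty b; neither formula appears in the abstract, and the function names are hypothetical. The normal ogive case is not reproduced here.

```python
import math

D = 1.7  # conventional logistic scaling constant (an assumption of this sketch)


def info_3pl(theta, a, b, c):
    """Item information of the three-parameter logistic model."""
    p = c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def optimal_difference_closed_form(a, c):
    """Textbook value of theta - b at which 3PL information peaks."""
    return math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / (D * a)


def optimal_difference_grid(a, c, lo=-4.0, hi=4.0, step=0.001):
    """Brute-force check: scan theta - b on a grid with b fixed at 0."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda x: info_3pl(x, a, 0.0, c))
```

When c = 0 the optimal difference is zero, so the most informative item is the one whose difficulty equals the current ability estimate; a positive guessing parameter shifts the optimum toward easier items.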

3.

Same face, same place, different memory: manner of presentation modulates the associative deficit in older adults

Amy A. Overman, Nancy A. Dennis, John M. McCormick-Huhn, Abigail B. Steinsiek, Luisa B. Cesar 《Neuropsychology, development, and cognition. Section B, Aging, neuropsychology and cognition》2019,26(1):44-57

One of the more severe and consequential memory impairments experienced by older adults is the loss of the ability to form and remember associations. Although the associative deficit is often assumed to be unitary, memory episodes may contain different types of associations (e.g., item–item, item–context). Research in younger adults suggests that these different association types may involve different neural mechanisms. This raises the possibility that different association types are not equally affected by aging. To investigate this, the current study directly compared memory across item–item and item–context associations in younger and older adults by manipulating the manner of presentation of the associations. Results indicate that the associative deficit in aging is not uniform and that aging has a greater impact on item–context than on item–item associations. The results have implications for theories of associative memory, age-related cognitive decline, and the functional organization of the medial temporal lobe in aging.

4.

Anchor Point Selection: Scale Alignment Based on an Inequality Criterion

Carolin Strobl, Julia Kopf, Lucas Kohler, Timo von Oertzen, Achim Zeileis 《应用心理检测》2021,45(3):214

For detecting differential item functioning (DIF) between two or more groups of test takers in the Rasch model, their item parameters need to be placed on the same scale. Typically this is done by choosing a set of so-called anchor items based on statistical tests or heuristics. Here the authors suggest an alternative strategy: By means of an inequality criterion from economics, the Gini Index, the item parameters are shifted to an optimal position where the item parameter estimates of the groups best overlap. Several toy examples, extensive simulation studies, and two empirical application examples are presented to illustrate the properties of the Gini Index as an anchor point selection criterion and compare its properties to those of the criterion used in the alignment approach of Asparouhov and Muthén. In particular, the authors show that, in addition to the globally optimal position for the anchor point, the criterion plot contains valuable additional information and may help discover unaccounted DIF-inducing multidimensionality. They further provide mathematical results that enable an efficient sparse grid optimization and make it feasible to extend the approach, for example, to multiple group scenarios.
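
A toy version of this idea can be written in a few lines. The sketch below is not the authors' implementation: it assumes that the criterion is the Gini index of the absolute between-group differences in item parameters after shifting one group, that a larger index (a few large differences, many near zero) is better, and that candidate shifts can be limited to the per-item differences. All names are hypothetical.

```python
def gini(values):
    """Gini index of non-negative values: 0 if all equal, near 1 if concentrated."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    pair_sum = sum(abs(x - y) for x in values for y in values)
    return pair_sum / (2.0 * n * total)


def best_shift(beta_ref, beta_foc):
    """Return (shift, criterion) maximizing the Gini index of |differences|.

    beta_ref and beta_foc are item parameter estimates for the reference and
    focal group; the shift is added to every focal-group parameter.
    """
    # Assumption of this sketch: only per-item differences are tried as shifts.
    candidates = sorted({r - f for r, f in zip(beta_ref, beta_foc)})

    def criterion(shift):
        return gini([abs(r - (f + shift)) for r, f in zip(beta_ref, beta_foc)])

    shift = max(candidates, key=criterion)
    return shift, criterion(shift)
```

For example, if four of five items differ between groups by the same constant and the fifth differs by more, the shift that aligns the four items concentrates all remaining difference on the fifth item, which gives the highest index among the candidates.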

5.

Weakly parallel tests in latent trait theory with some criticisms of classical test theory

Fumiko Samejima 《Psychometrika》1977,42(2):193-198

A new concept of weakly parallel tests, in contrast to strongly parallel tests in latent trait theory, is proposed. Some criticisms of the fundamental concepts in classical test theory, such as the reliability of a test and the standard error of estimation, are given.

6.

Copula Functions for Residual Dependency

Johan Braeken, Francis Tuerlinckx, Paul De Boeck 《Psychometrika》2007,72(3):393-411

7.

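A study of item context effects in self-report measures

冯明 《心理学探新》1999,(4)

This paper points out the pervasiveness and the harm of item context effects in self-report methods. It discusses cognitive explanations of item context effects from an information-processing perspective, as well as the key features of measurement instruments that give rise to such effects. The role of an item's serial position is also discussed.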

8.

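Building an item bank for a figural reasoning test using item response theory

肖玮, 苗丹民, 朱宁宁, 张青华 《心理学报》2006,38(6):934-940

The authors wrote 235 figural reasoning items. Using an anchor-test equating design, with 72 items from the Combined Raven's Test as anchor items, they tested 1,733 males at ability levels ranging from junior high school to university. The response data were analyzed with BILOG-MG 3.0 (marginal maximum likelihood estimation) under the three-parameter logistic model. After removing items that fit the model poorly and items whose maximum information was below 0.3, an item bank of 181 items was established. The bank can be used to screen out military applicants of lower intelligence.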

9.

Decomposition of a Rasch partial credit item into independent binary and indecomposable trinary items

Huynh Huynh 《Psychometrika》1996,61(1):31-39

For each Rasch (Masters) partial credit item, there exists a set of independent Rasch binary and indecomposable trinary items for which the sum of the scores and the partial credit score have identical probability density functions. If each indecomposable trinary item is further expressed as the sum of two binary items, then the binary items are positively dependent and cannot both be of the Rasch type. This paper was written while the author was working with Steve Ferrara and Hillary Michaels on some technical aspects of the Maryland School Performance Assessment Program. The author had been puzzled by the fact that most MSPAP assessment items have three or fewer score categories. With a psychometric justification now being apparent, this paper is dedicated to both of them.

10.

A general Bayesian multilevel multidimensional IRT model for locally dependent data

Ken A. Fujimoto 《The British journal of mathematical and statistical psychology》2018,71(3):536-560

Many item response theory (IRT) models take a multidimensional perspective to deal with sources that induce local item dependence (LID), with these models often making an orthogonal assumption about the dimensional structure of the data. One reason for this assumption is the indeterminacy issue in estimating the correlations among the dimensions in structures often specified to deal with sources of LID (e.g., bifactor and two-tier structures), and the assumption usually goes untested. Unfortunately, the mere fact that assessing these correlations is a challenge for some estimation methods does not mean that data seen in practice support such an orthogonal structure. In this paper, a Bayesian multilevel multidimensional IRT model for locally dependent data is presented. This model can test whether item response data violate the orthogonal assumption that many IRT models make about the dimensional structure of the data when addressing sources of LID, and this test is carried out at the dimensional level while accounting for sampling clusters. Simulations show that the model presented is effective at carrying out this task. The utility of the model is also illustrated on an empirical data set.

11.

A Speeded Item Response Model with Gradual Process Change

Yuri Goegebeur, Paul De Boeck, James A. Wollack, Allan S. Cohen 《Psychometrika》2008,73(1):65-87

An item response theory model for dealing with test speededness is proposed. The model consists of two random processes, a problem solving process and a random guessing process, with the random guessing gradually taking over from the problem solving process. The involved change point and change rate are considered random parameters in order to model examinee differences in both respects. The proposed model is evaluated on simulated data and in a case study. The research reported in this paper was supported by IAP P5/24 and GOA/2005/04, both awarded to Paul De Boeck and Iven Van Mechelen, and by IAP P6/03, awarded to Iven Van Mechelen. Yuri Goegebeur’s research was supported by a grant of the Danish Natural Science Research Council.

12.

Item information and discrimination functions for trinary PCM items

Wies Akkermans, Eiji Muraki 《Psychometrika》1997,62(4):569-578

For trinary partial credit items the shape of the item information and the item discrimination function is examined in relation to the item parameters. In particular, it is shown that these functions are unimodal if δ₂ – δ₁ < 4 ln 2 and bimodal otherwise. The locations and values of the maxima are derived. Furthermore, it is demonstrated that the value of the maximum is decreasing in δ₂ – δ₁. Consequently, the maximum of a unimodal item information function is always larger than the maximum of a bimodal one, and similarly for the item discrimination function. The work reported herein was partially supported under the National Assessment of Educational Progress (Grant No. R999G30002; CFDA No. 84.999G) as administered by the Office of Educational Research and Improvement, US Department of Education.
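
The 4 ln 2 threshold (about 2.77) for the item information function can be checked numerically. The sketch below assumes a three-category partial credit item with unit slope and two step parameters, called d1 and d2 here, and uses the fact that under this model the item information equals the conditional variance of the item score. The function names are hypothetical, and the item discrimination function is not covered.

```python
import math


def pcm3_information(theta, d1, d2):
    """Item information of a trinary PCM item, computed as Var(score | theta)."""
    z1 = theta - d1               # log-weight of score 1 relative to score 0
    z2 = 2.0 * theta - d1 - d2    # log-weight of score 2 relative to score 0
    m = max(0.0, z1, z2)          # subtract the maximum for numerical stability
    w0, w1, w2 = math.exp(-m), math.exp(z1 - m), math.exp(z2 - m)
    s = w0 + w1 + w2
    p0, p1, p2 = w0 / s, w1 / s, w2 / s
    mean = p1 + 2.0 * p2
    return p0 * mean ** 2 + p1 * (1.0 - mean) ** 2 + p2 * (2.0 - mean) ** 2


def count_modes(d1, d2, lo=-10.0, hi=10.0, step=0.01):
    """Count local maxima of the information function on a grid."""
    n = int(round((hi - lo) / step))
    vals = [pcm3_information(lo + i * step, d1, d2) for i in range(n + 1)]
    return sum(
        1
        for i in range(1, n)
        if vals[i] > vals[i - 1] and vals[i] >= vals[i + 1]
    )
```

Calling `count_modes` with step parameters whose difference is below 4 ln 2 should report a single peak, and with a larger difference two peaks, one near each step parameter.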

13.

Ordinal test fidelity estimated by an item sampling model

Norman Cliff, John R. Donoghue 《Psychometrika》1992,57(2):217-236

A test theory using only ordinal assumptions is presented. It is based on the idea that the test items are a sample from a universe of items. The sum across items of the ordinal relations for a pair of persons on the universe items is analogous to a true score. Using concepts from ordinal multiple regression, it is possible to estimate the tau correlations of test items with the universe order from the taus among the test items. These in turn permit the estimation of the tau of total score with the universe. It is also possible to estimate the odds that the direction of a given observed score difference is the same as that of the true score difference. The estimates of the correlations between items and universe and between total score and universe are found to agree well with the actual values in both real and artificial data. Part of this paper was presented at the June 1989 Meeting of the Psychometric Society. The authors wish to thank several reviewers for their suggestions. This research was mainly done while the second author was a University Fellow at the University of Southern California.

14.

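Item selection methods for the initial stage of CD-CAT

高椿雷, 罗照盛, 郑蝉金, 喻晓锋, 彭亚风, 郭小军 《心理科学》2017,40(2):485-491

CD-CAT, the combination of cognitive diagnostic assessment (CDA) and computerized adaptive testing (CAT), is suited to classroom teaching and is an important tool for teachers' remedial instruction and students' self-study. The item selection method used in the initial stage, an important component of CD-CAT, strongly affects the test's correct classification rate. Building on existing research and on item discrimination indices in CDA, this paper proposes four new initial-stage item selection methods: CTTID, CDI, CTTIDR*, and CDIR*. Simulation studies show that, in fixed-length CD-CAT with an HD-HV item bank, the PCCR of CTTIDR* at the end of the initial stage is .2999 higher than that of the existing T-matrix method and .1707 higher than that of PWKL; the trend is the same for the other item banks. At the end of the whole test, CTTIDR* still has the highest correct classification rate. In variable-length CD-CAT, with the maximum posterior probability required to exceed .7, .8, and .9, the average test length under CTTIDR* is shorter than under the T-matrix method by 2.6170, 2.2347, and 1.7470 items, respectively.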

15.

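The block item pocket method for allowing item review in CAT

林喆, 陈平, 辛涛 《心理学报》2015,47(9):1188-1198

Allowing item review can promote the practical use of computerized adaptive testing (CAT). Provided that it does not compromise the precision of ability estimation or test fairness, allowing item review in CAT can ease examinees' test anxiety and reduce measurement error caused by irrelevant factors. The block item pocket method combines the successive block method with the item pocket method; it not only allows item review in CAT but also compensates for the shortcomings of the item pocket method. The results show that: (1) under a reasonable answering strategy, the block item pocket method yields more precise estimates than the item pocket method at low ability levels; (2) against Wainer-like answering strategies, the block item pocket method yields more precise estimates than the item pocket method at all ability levels; (3) as the number of blocks increases, the ability estimation precision of the block item pocket method approaches the no-revision baseline.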

16.

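Item selection strategies for CD-CAT that balance test efficiency and item bank usage

汪文义, 丁树良, 宋丽红 《心理科学》2014,37(1):212-216

Existing item selection strategies in CD-CAT emphasize test efficiency and pay too little attention to item bank usage. To address this, two new item selection strategies, KLED and RHA, are introduced under the DINA model, and HA is also examined in a simulation study. The results show that PWKL and KLED have an advantage only in test efficiency; when KLED is stratified by attribute vector, item bank usage improves somewhat, and KLED generalizes more readily than ED to other diagnostic models with explicit expressions; HA, RHA, and RP-PWKL balance test validity and item bank usage well, although RP-PWKL requires setting a maximum exposure rate threshold for items. Both new item selection methods are of practical value in fixed-length and variable-length CD-CAT.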

17.

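Application of the item characteristic curve equating method under the graded response model in a large-scale examination (total citations: 2; self-citations: 1; citations by others: 1)

周骏, 欧东明, 徐淑媛, 戴海琦, 漆书青 《心理学报》2005,37(6):832-838

In the Economics Professional Qualification Examination, one of the largest qualification examinations in China, the item characteristic curve equating method under the graded response model of item response theory was applied with an anchor-test equating design in order to ensure comparability across years, build an item bank, and prepare for computerized adaptive testing. Item and ability parameters from four years of examination data were equated, and an economics item bank was successfully assembled. On this basis, equating was used to compare the cut scores of test forms from different years, providing empirical evidence for setting the pass standard of the examination and ensuring its fairness.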

18.

A semiparametric approach for item response function estimation to detect item misfit

Carmen Köhler, Alexander Robitzsch, Katharina Fährmann, Matthias von Davier, Johannes Hartig 《The British journal of mathematical and statistical psychology》2021,74(Z1):157-175

When scaling data using item response theory, valid statements based on the measurement model are only permissible if the model fits the data. Most item fit statistics used to assess the fit between observed item responses and the item responses predicted by the measurement model show significant weaknesses, such as the dependence of fit statistics on sample size and number of items. In order to assess the size of misfit and to thus use the fit statistic as an effect size, dependencies on properties of the data set are undesirable. The present study describes a new approach and empirically tests it for consistency. We developed an estimator of the distance between the predicted item response functions (IRFs) and the true IRFs by semiparametric adaptation of IRFs. For the semiparametric adaptation, the approach of extended basis functions due to Ramsay and Silverman (2005) is used. The IRF is defined as the sum of a linear term and a more flexible term constructed via basis function expansions. The group lasso method is applied as a regularization of the flexible term, and determines whether all parameters of the basis functions are fixed at zero or freely estimated. Thus, the method serves as a selection criterion for items that should be adjusted semiparametrically. The distance between the predicted and semiparametrically adjusted IRF of misfitting items can then be determined by describing the fitting items by the parametric form of the IRF and the misfitting items by the semiparametric approach. In a simulation study, we demonstrated that the proposed method delivers satisfactory results in large samples (i.e., N ≥ 1,000).

19.

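Applications of IRT and MIRT in vertical test equating

王怡, 唐文清, 刘晶, 张敏强, 李明, 黎光明 《心理科学进展》2014,22(5):881-888

Vertical test equating is the process of placing tests that measure the same psychological trait at different levels onto a common score scale. IRT and MIRT are the main methods for vertical equating. IRT does not require assuming an ability distribution for examinees, and its parameter estimates do not depend on the sample, which makes it an effective method for constructing vertical scales; however, its application is limited when a test violates the unidimensionality assumption. MIRT extends IRT by combining features of IRT and factor analysis; it estimates the item parameters and examinee ability parameters of multidimensional tests more effectively and has important applications in vertical equating. Existing research has mainly examined the applicability, calibration methods, and parameter estimation methods of IRT and MIRT in vertical equating, and has compared the properties of the two approaches. Future research should include more variable conditions in comparative studies and extend the application of these methods.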

20.

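Item selection methods for computerized adaptive testing that incorporate item response time

郭治辰, 汪大勋, 蔡艳, 涂冬波 《心理科学》2021,(5):1241-1248

Computer-based tests can record examinees' item response times (RT). As an important source of auxiliary information, RT is valuable for test development and administration, particularly in computerized adaptive testing (CAT). This paper briefly introduces and comments on applications of RT to item selection in CAT, and analyzes the practical feasibility of these techniques. Finally, it discusses current problems in applying RT to CAT item selection and directions for further research.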