
1.

Test theory without an answer key (total citations: 2; self-citations: 0; citations by others: 2)

William H. Batchelder A. Kimball Romney 《Psychometrika》1988,53(1):71-92

A general model is presented for homogeneous, dichotomous items when the answer key is not known a priori. The model is structurally related to the two-class latent structure model with the roles of respondents and items interchanged. For very small sets of respondents, iterative maximum likelihood estimates of the parameters can be obtained by existing methods. For other situations, new estimation methods are developed and assessed with Monte Carlo data. The answer key can be accurately reconstructed with relatively small sets of respondents. The model is useful when a researcher wants to study objectively the knowledge possessed by members of a culturally coherent group to which the researcher does not belong. This research was supported by NSF Grant No. SES-8320173 to the authors. We gratefully acknowledge comments and suggestions from John Boyd, Tarow Indow, and Kathy Maher as well as the editor and several anonymous referees.
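The idea that an answer key can be recovered from responses alone can be illustrated with a small simulation. The sketch below is not the authors' estimation procedure: it assumes a simplified setup in which every respondent knows each item with the same probability and otherwise guesses, and it uses a plain majority vote as a stand-in estimator. The function names, the competence value 0.7, and the sample sizes are all illustrative assumptions.

```python
import random


def simulate_responses(key, competences, guess_bias=0.5, seed=0):
    """Simulate true/false answers (assumed setup, not the paper's data).

    Respondent i knows each item with probability competences[i];
    otherwise the respondent guesses 1 with probability guess_bias.
    """
    rng = random.Random(seed)
    data = []
    for d in competences:
        row = []
        for truth in key:
            if rng.random() < d:
                row.append(truth)
            else:
                row.append(1 if rng.random() < guess_bias else 0)
        data.append(row)
    return data


def majority_key(data):
    """Estimate each item's answer as the majority response (ties go to 1)."""
    n = len(data)
    return [1 if 2 * sum(col) >= n else 0 for col in zip(*data)]


# Hypothetical example: 40 items, 15 respondents with competence 0.7.
key_rng = random.Random(1)
true_key = [key_rng.randint(0, 1) for _ in range(40)]
responses = simulate_responses(true_key, competences=[0.7] * 15)
estimated = majority_key(responses)
accuracy = sum(e == t for e, t in zip(estimated, true_key)) / len(true_key)
print(f"proportion of key recovered: {accuracy:.2f}")
```

Majority voting ignores differences in respondent competence, which the model in the abstract estimates explicitly; it is used here only because it is the simplest estimator that shows the effect.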

2.

Separability of item and person parameters in response time models

Gerard J. P. Van Breukelen 《Psychometrika》1997,62(4):525-544

This paper discusses two forms of separability of item and person parameters in the context of response time (RT) models. The first is separate sufficiency: the existence of sufficient statistics for the item (person) parameters that do not depend on the person (item) parameters. The second is ranking independence: the likelihood of the item (person) ranking with respect to RTs does not depend on the person (item) parameters. For each form, a theorem stating sufficient conditions is proved. The two forms of separability are shown to include several (special cases of) models from the psychometric and biometric literature. Ranking independence restricts not the general form of the distribution but its parametrization. An estimation procedure based upon ranks and pseudolikelihood theory is discussed, as well as the relation of ranking independence to the concept of double monotonicity. I am indebted to Wim van der Linden for bringing Thissen's (1983) paper to my notice, and to Martijn Berger, Frans Tan, and the anonymous reviewers for their constructive comments on earlier drafts of this paper.

3.

The role of drivers’ social interactions in their driving behavior: Empirical evidence and implications for car-following and traffic flow

《Transportation Research Part F: Traffic Psychology and Behaviour》2021

Models of microscopic driving behavior rarely consider "social effects" on drivers' decisions. However, interactions with surrounding vehicles can generate a social effect that changes driving behavior, for example when drivers imitate the behavior of peer drivers. The social environment and peer influence can therefore affect a driver's instantaneous behavior and shift the individual's driving state. This study looks for empirical evidence of a social effect: when a fast-moving vehicle passes a subject vehicle, does the driver mimic the behavior of the passing vehicle? A high-resolution Basic Safety Message data set (N = 151,380,578) from the Safety Pilot Model Deployment program in Ann Arbor, Michigan, is used to explore the issue. The data cover positions, speeds, and accelerations of 63 host vehicles traveling in connected vehicles, with detailed information on the surrounding environment, at a frequency of 10 Hz. Random parameter logit models are estimated to capture heterogeneity among the observations and to explore whether the correlates of the social effect can vary both positively and negatively. Results show that subject drivers do mimic the behavior of passing vehicles: in 16 percent of passing events (N = 18,099 total passings on freeways), subject vehicle drivers were observed to accelerate after the passing vehicle. A speed change of at least 10 km/hr within 10 s is taken as the acceleration threshold. Only 1.2 percent of drivers normally sped up by this amount during their trips when they were not passed by other vehicles, whereas 16.0 percent did so when passed by a high-speed vehicle. Furthermore, acceleration of the subject vehicle is more likely when the subject driver's speed is higher and more surrounding vehicles are present. Interestingly, if the speed difference with the passing vehicle is high, the likelihood of the subject driver's acceleration is lower, consistent with the expectation that the subject driver may be minimally affected when such differences are too large. The study provides new evidence that drivers' social interactions can change traffic flow, and implications of the results are discussed.
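The acceleration threshold in this abstract (a speed gain of at least 10 km/hr within 10 s, on data sampled at 10 Hz) can be checked on a speed trace with a sliding-window minimum. This is a minimal sketch under assumptions: the function name, the input format (a plain list of speeds in km/hr at a fixed sampling rate), and the use of a monotonic deque are illustrative choices, not details from the study.

```python
from collections import deque


def sped_up(speeds_kmh, hz=10, window_s=10.0, threshold_kmh=10.0):
    """Return True if speed rises by >= threshold_kmh within any window_s span.

    speeds_kmh is assumed to be evenly sampled at hz samples per second.
    """
    window = int(round(hz * window_s))  # window length in samples
    mins = deque()  # indices whose speeds are increasing; front = window minimum
    for j, v in enumerate(speeds_kmh):
        # Drop earlier samples that can no longer be the window minimum.
        while mins and speeds_kmh[mins[-1]] >= v:
            mins.pop()
        mins.append(j)
        # Drop samples that fall outside the look-back window.
        while mins[0] < j - window:
            mins.popleft()
        if v - speeds_kmh[mins[0]] >= threshold_kmh:
            return True
    return False


# Hypothetical traces: steady cruising, a gentle ramp, and a sharp ramp.
steady = [60.0] * 300
gentle = [60.0 + 0.05 * k for k in range(301)]  # +5 km/hr per 10 s
sharp = [60.0 + 0.15 * k for k in range(101)]   # +15 km/hr per 10 s
print(sped_up(steady), sped_up(gentle), sped_up(sharp))
```

The deque keeps each sample's processing cost constant on average, which matters for traces as long as the ones described in the abstract.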

4.

New explorations in the teaching content of Psychological Statistics

Hu Zhujing (胡竹菁), Dong Shenghong (董圣鸿), Zhang Kuo (张阔) 《心理学探新》 (Psychological Exploration) 2013,(5):402-408

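Based on a comparison of the main psychological statistics textbooks currently used in China and abroad, this paper points out that, relative to the textbooks of the 1980s, the new developments in content fall into three areas: (1) statistical indices and estimation methods for "statistical power" and "effect size", developed out of the material on hypothesis testing; (2) the introduction of the general linear model to unify analysis of variance and regression analysis; and (3) a moderate increase in material on multivariate statistical analysis. The paper briefly reviews the new content in the first two areas and proposes new ideas for how textbook content should be organized.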

5.

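Parameter estimation for the compensatory multidimensional item response theory model: An approach based on an ensemble of generalized regression neural networks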


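Wang Peng (王鹏), Meng Weixuan (孟维璇), Zhu Gancheng (朱干成), Zhang Denghao (张登浩), Zhang Lihui (张利会), Dong Yixuan (董一萱), Si Yingdong (司英栋) 《心理学探新》 (Psychological Exploration) 2019,(3):244-249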

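A generalized regression neural network (GRNN) approach is used to estimate the item parameters of the compensatory multidimensional item response theory (MIRT) model with small samples, in an attempt to address the large sample sizes that traditional parameter estimation methods require. The MIRT two-parameter logistic compensatory model was set up as a two-dimensional model with dichotomous scoring. First, two-dimensional ability parameters, item parameter values, and the examinee response matrix were simulated. Second, the loadings of each item on the first two factors obtained by principal component analysis were used as initial values for discrimination, and item pass rates as initial values for difficulty; these initial values served as the inputs to the neural network. One hundred neural networks were combined into an ensemble, and the mean of their outputs was taken as the MIRT item parameter estimates. Finally, a 2×2 design (ability correlation level: 0.3 and 0.7; estimation method: GRNN and MCMC) was used to compare the parameter recovery of the GRNN and MCMC methods. The results show that, with small samples, parameter estimates from the GRNN ensemble method are better than those from the MCMC method.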
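The first step described here (simulating abilities, item parameters, and a response matrix for a two-dimensional compensatory 2PL model) can be sketched as follows. The slope-intercept parameterization, the parameter ranges, and the sample sizes are assumptions made for illustration; the abstract does not give the study's actual settings, and the GRNN ensemble itself is not reproduced.

```python
import math
import random


def p_correct(theta, a, d):
    """Compensatory 2PL: logistic of the weighted ability sum plus an intercept."""
    z = sum(ak * tk for ak, tk in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n_persons=200, n_items=20, rho=0.3, seed=0):
    """Simulate abilities, items, and 0/1 responses (all settings hypothetical)."""
    rng = random.Random(seed)
    thetas = []
    for _ in range(n_persons):
        # Two standard-normal abilities with correlation rho.
        t1 = rng.gauss(0, 1)
        t2 = rho * t1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        thetas.append((t1, t2))
    # Each item: two discriminations (a1, a2) and an intercept d.
    items = [((rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)), rng.gauss(0, 1))
             for _ in range(n_items)]
    responses = [[1 if rng.random() < p_correct(th, a, d) else 0
                  for a, d in items] for th in thetas]
    return thetas, items, responses


thetas, items, responses = simulate()
# Item pass rates, which the abstract uses as initial values for difficulty.
pass_rates = [sum(col) / len(responses) for col in zip(*responses)]
print([round(p, 2) for p in pass_rates[:5]])
```

The model is called compensatory because a high ability on one dimension can offset a low ability on the other within the weighted sum.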

6.

Item Parameter Drift in Context Questionnaires from International Large-Scale Assessments

HyeSun Lee Kurt F. Geisinger 《International Journal of Testing》2019,19(1):23-51

The purpose of the current study was to examine the impact of item parameter drift (IPD) occurring in context questionnaires from an international large-scale assessment and to determine the most appropriate way to address IPD. Focusing on psychometric and educational research in which scores from context questionnaires composed of polytomous items are used to classify examinees, the study investigated the impact of IPD on the estimation of questionnaire scores and on classification accuracy with five manipulated factors: the length of a questionnaire, the proportion of items exhibiting IPD, the direction and magnitude of IPD, and three decisions about IPD. The results indicated that IPD occurring in a short context questionnaire had a substantial impact on the accuracy of score estimation and of examinee classification. Classification accuracy decreased considerably, especially at the lowest and highest categories of a trait. Contrary to recommendations in the educational testing literature, the study demonstrated that keeping items exhibiting IPD and removing them only for transformation were appropriate when IPD occurred in relatively short context questionnaires. Using 2011 TIMSS data from Iran, an applied example demonstrated how the guidance can be applied in making appropriate decisions about IPD.

7.

Developing new online calibration methods for multidimensional computerized adaptive testing


Ping Chen Chun Wang Tao Xin Hua‐Hua Chang 《The British journal of mathematical and statistical psychology》2017,70(1):81-117

Multidimensional computerized adaptive testing (MCAT) has received increasing attention over the past few years in educational measurement. As in all other formats of CAT, item replenishment is an essential part of MCAT item bank maintenance and management: it governs retiring overexposed or obsolete items over time and replacing them with new ones. Moreover, the calibration precision of the new items directly affects the estimation accuracy of examinees’ ability vectors. In unidimensional CAT (UCAT) and cognitive diagnostic CAT, online calibration techniques have been developed to calibrate new items effectively. However, there has been very little discussion of online calibration in MCAT in the literature. This paper therefore proposes new online calibration methods for MCAT based upon some popular methods used in UCAT. Three representative methods, Method A, the ‘one EM cycle’ method and the ‘multiple EM cycles’ method, are generalized to MCAT. Three simulation studies were conducted to compare the three new methods by manipulating three factors (test length, item bank design, and level of correlation between coordinate dimensions). The results showed that all the new methods were able to recover the item parameters accurately, and the adaptive online calibration designs showed some improvements compared to the random design under most conditions.

8.

Calibration methods in vertical scaling and a comparison of their performance

Ye Meng (叶萌), Xin Tao (辛涛) 《心理科学进展》 (Advances in Psychological Science) 2014,22(10):1669-1678

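In the context of scaling with item response theory, the calibration method is a crucial factor affecting the results of vertical scaling. Existing calibration research has reached fairly consistent conclusions about the relative performance of some calibration methods and has proposed many new calibration methods aimed at better scaling. Besides continuing to explore within the existing framework so as to form a complete body of research, future studies should draw on related disciplines, examine the performance of calibration methods on the basis of a deeper understanding of the nature of academic growth, and investigate the best match between particular calibration methods and particular research conditions and scaling contexts.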

9.

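The graded response model in multidimensional item response theory (total citations: 2; self-citations: 0; citations by others: 2)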

Du Wenjiu (杜文久), Xiao Hanmin (肖涵敏) 《心理学报》 (Acta Psychologica Sinica) 2012,44(10):1402-1407

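Multidimensional item response theory, which builds on factor analysis and unidimensional item response theory, is one of the new directions in measurement theory. However, it is still at an immature stage of development, and most studies deal only with dichotomous scoring. This paper first introduces the logistic form of the multidimensional graded response model and, taking the two-dimensional graded response model as an example, analyzes the graph of the model's function and its properties. It then derives the item information function of the multidimensional graded response model and discusses it with examples. Further, the paper explains the ideas behind estimating the model's parameters with joint maximum likelihood estimation and Markov chain Monte Carlo methods. Finally, some problems that remain to be studied are pointed out.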
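For orientation, one commonly used logistic form of a multidimensional graded response model is written out below. The notation is an assumption (item j, category k, discrimination vector a_j, ordered thresholds b_jk, ability vector θ), and the exact parameterization used in the paper may differ.

```latex
% Cumulative probability of responding in category k or higher on item j
P^{*}_{jk}(\boldsymbol{\theta})
  = \frac{1}{1 + \exp\!\left[-\left(\mathbf{a}_j^{\top}\boldsymbol{\theta} - b_{jk}\right)\right]},
  \qquad k = 1, \dots, m_j,
\qquad P^{*}_{j0}(\boldsymbol{\theta}) = 1,
\quad P^{*}_{j,m_j+1}(\boldsymbol{\theta}) = 0 .

% Probability of responding exactly in category k
P_{jk}(\boldsymbol{\theta})
  = P^{*}_{jk}(\boldsymbol{\theta}) - P^{*}_{j,k+1}(\boldsymbol{\theta}),
  \qquad k = 0, \dots, m_j .
```

With a single ability dimension, the inner product reduces to a scalar product and the expressions become an ordinary logistic graded response model in slope-threshold form.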

10.

New progress in item response theory: A mixed model based on 3PLM and GRM


Tu Dongbo (涂冬波), Cai Yan (蔡艳), Dai Haiqi (戴海琦), Ding Shuliang (丁树良) 《心理科学》 (Journal of Psychological Science) 2011,34(5):1189-1194

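IRT offers many measurement models, each suited to data with different characteristics, and practitioners should choose an appropriate IRT model according to the situation at hand. China makes heavy use of examinations and assessments, with a rich variety of item types; in practical applications of IRT, a single model often cannot reflect the characteristics of all the data, and in such cases several IRT models (a "mixed model") can be applied together to achieve the best fit to the data. This paper studies the ideas and principles of the mixed model, the implementation of parameter estimation, and model performance, and finds that: (1) Mix_Tu, the mixed-model parameter estimation program developed by the authors, has high parameter recovery, comparable to that of the internationally known measurement software Parscale. (2) Under "aberrant item" conditions, Mix_Tu's estimates of parameters b and c are affected by the degree of data aberrance more than Parscale's, its estimates of parameter a are affected less than Parscale's, and the two programs are comparable on theta. (3) Under "aberrant examinee" conditions, Mix_Tu's estimates of all parameters are affected by the degree of data aberrance less than Parscale's, so Mix_Tu is the more robust of the two.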
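The mixed-model idea, scoring dichotomous items with a 3PLM and polytomous items with a GRM inside one likelihood, can be sketched as below. This is not the Mix_Tu program: the item dictionary format, the function names, and the omission of a scaling constant are all assumptions made for illustration.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def p_3pl(theta, a, b, c):
    """3PLM: probability of a correct answer with guessing parameter c."""
    return c + (1.0 - c) * logistic(a * (theta - b))


def grm_probs(theta, a, thresholds):
    """GRM: probabilities of categories 0..len(thresholds).

    thresholds must be strictly increasing.
    """
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def response_prob(theta, item, x):
    """Probability of response x under whichever model the item uses."""
    if item["model"] == "3pl":
        p = p_3pl(theta, item["a"], item["b"], item["c"])
        return p if x == 1 else 1.0 - p
    return grm_probs(theta, item["a"], item["b"])[x]


def log_likelihood(theta, items, responses):
    """Joint log-likelihood of one examinee's responses across both item types."""
    return sum(math.log(response_prob(theta, it, x))
               for it, x in zip(items, responses))


# Hypothetical test: one multiple-choice item and one 4-category item.
test_items = [
    {"model": "3pl", "a": 1.0, "b": 0.0, "c": 0.2},
    {"model": "grm", "a": 1.2, "b": [-1.0, 0.0, 1.0]},
]
ll = log_likelihood(0.5, test_items, [1, 2])
print(round(ll, 4))
```

Because both item types share the same ability parameter theta, maximizing this joint log-likelihood uses all items at once rather than fitting each item type separately.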