Similar literature
 20 similar documents found (search time: 31 ms)
1.
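Although multistage testing (MST) retains the advantages of adaptive testing while allowing test developers to build each module and panel under specified constraints, local item dependence (LID) can arise among items when latent factors are overlooked during test construction, and this can harm MST results. To investigate the harm LID does to MST, this study first introduces MST, LID, and related concepts. It then examines the question through a simulation study, whose results show that LID reduces the precision of examinee ability estimates while estimation bias remains small, and that this harm is not limited to any particular routing rule. To remove the harm, a testlet response model was then used as the analysis model during MST administration; the results show that this approach removes part of the harm but that its effect is limited. These findings indicate that the harm LID does to the precision of ability estimation in MST deserves attention, and that methods for eliminating LID-induced harm in MST warrant further study.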

2.
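Abstract: Computerized multistage adaptive testing is a computer-based test format that uses sets of items as the unit of administration and tests and scores examinees adaptively across multiple stages. Recent research on various test formats has found that it has notable advantages over both computerized adaptive testing and traditional paper-and-pencil tests. Compared with traditional paper-and-pencil tests, it offers parameter invariance and more precise ability estimation. Compared with computerized adaptive testing, it allows control over item characteristics and lets examinees review items. Reducing measurement error and making its application more convenient and effective are directions for future research.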

3.
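马洁 (Ma Jie), 刘红云 (Liu Hongyun). 《心理科学》, 2018, (6): 1374-1381
Using empirical data from a high school English reading test, this study compares the two-parameter logistic model (2PL-IRT) with two-parameter logistic testlet models (2PL-TRT) that include different numbers of testlets, to examine how the number of testlets affects parameter estimation and model fit. The results show that: (1) for examinees with ability between -1.50 and 0.50, the 2PL-IRT model yields relatively large bias in ability estimates; (2) treating testlets whose testlet effect exceeds 0.50 as locally independent items leads to underestimation of some item discrimination parameters and overestimation of most item difficulty parameters; (3) the larger the testlet effect, the larger the bias in item parameter estimates when the testlet is treated as a set of locally independent items.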

4.
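Multidimensional testlet-effect Rasch model   Total citations: 2, self-citations: 0, citations by others: 2
First, this paper interprets the essence of a "testlet" as a set of items sharing a common stimulus, and on this basis divides testlet effects into within-item unidimensional testlet effects and within-item multidimensional testlet effects. Second, building on the Rasch model, it develops multidimensional testlet-effect Rasch models for dichotomous and polytomous scoring, with the aim of handling within-item multidimensional testlet effects well. Finally, simulation results show that the new model is valid and reasonable. Comparison with the Rasch testlet model and the partial credit model shows that: (1) when a test has within-item multidimensional testlet effects, separating out only the obvious bundle-type testlet effects while ignoring other latent testlet effects still leads to biased parameter estimates and may even overestimate test reliability; (2) the new model is more general: even when the response data contain no testlet effects or only within-item unidimensional testlet effects, analyzing the test with the new model still yields good parameter estimates.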

5.
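刘玥 (Liu Yue), 刘红云 (Liu Hongyun). 《心理学报》, 2012, 44(2): 263-275
Testlet models can address the parameter estimation bias that arises in traditional IRT models when the assumption of local independence among items is violated. To explore the range of applicability of the testlet random effects model, a Monte Carlo simulation study fitted data with a 2-PL Bayesian testlet random effects model (BTRM) and a 2-PL Bayesian model (BM), considering the influence of testlet effect, testlet length, number of items, and proportion of locally independent items. The results show that: (1) BTRM is unaffected by testlet effect and testlet length, whereas the parameter estimation error of BM increases as testlet effect and testlet length increase. (2) BTRM is fairly general, and using it reduces estimation error when the testlet effect is large, testlets are long, and the number of items is large; however, when the number of items is small, both models yield large errors in ability estimates. (3) When the proportion of locally independent items is large, the two models yield similar parameter estimates.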
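As a rough illustration of the kind of data a Monte Carlo study of this sort generates, the sketch below simulates dichotomous responses under a 2-PL testlet model. The logistic form `1 / (1 + exp(-a * (theta - b - gamma)))`, the function names `p_correct` and `simulate_responses`, and every parameter value are assumptions made for illustration; none of them is taken from the paper.

```python
import math
import random


def p_correct(theta, a, b, gamma=0.0):
    """Probability of a correct response under an assumed logistic 2-PL testlet model.

    theta: examinee ability; a: discrimination; b: difficulty;
    gamma: person-specific testlet effect (0 for a locally independent item).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b - gamma)))


def simulate_responses(n_persons, items, testlet_sd, seed=0):
    """Simulate a 0/1 response matrix.

    items: list of (a, b, testlet_id); testlet_id None marks a locally independent item.
    testlet_sd: standard deviation of the person-by-testlet random effect.
    """
    rng = random.Random(seed)
    testlet_ids = sorted({t for _, _, t in items if t is not None})
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        # One random effect per person and testlet, shared by all items in that testlet.
        gamma = {t: rng.gauss(0.0, testlet_sd) for t in testlet_ids}
        row = []
        for a, b, t in items:
            g = gamma[t] if t is not None else 0.0
            row.append(1 if rng.random() < p_correct(theta, a, b, g) else 0)
        data.append(row)
    return data


# Hypothetical design: two testlets of three items each plus two independent items.
items = [(1.0, -0.5, 0), (1.2, 0.0, 0), (0.8, 0.5, 0),
         (1.0, -1.0, 1), (1.1, 0.3, 1), (0.9, 1.0, 1),
         (1.0, 0.0, None), (1.3, 0.2, None)]
responses = simulate_responses(n_persons=500, items=items, testlet_sd=1.0)
```

Setting `testlet_sd` to 0 gives locally independent data, so varying it (together with testlet length and item count) would mimic the manipulated factors the abstract lists. Fitting the two Bayesian models is outside the scope of this sketch.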

6.
In educational practice, a test assembly problem is formulated as a system of inequalities induced by test specifications. Each solution to the system is a test, represented by a 0–1 vector, where each element corresponds to an item included (1) or not included (0) in the test. Therefore, the size of a 0–1 vector equals the number of items n in a given item pool. All solutions form a feasible set, a subset of the 2^n vertices of the unit cube in an n-dimensional vector space. Test assembly is uniform if each test from the feasible set has an equal probability of being assembled. This paper demonstrates several important applications of uniform test assembly for educational practice. Based on Slepian’s inequality, a binary program was analytically studied as a candidate for uniform test assembly. The results of this study establish a connection between combinatorial optimization and probability inequalities. They identify combinatorial properties of the feasible set that control the uniformity of the binary programming test assembly. Computer experiments illustrating the concepts of this paper are presented.
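To make the definitions concrete, the sketch below enumerates the feasible set for a tiny item pool by brute force and then draws one test uniformly from it. The specifications (a fixed test length and a cap on total difficulty points), the item values, and the function names are hypothetical. Enumeration is workable only for very small n, and it is not the binary-programming approach the paper studies.

```python
import itertools
import random


def feasible_tests(difficulty, test_length, max_total_difficulty):
    """Enumerate every 0-1 vector that satisfies the (hypothetical) test specifications."""
    n = len(difficulty)
    feasible = []
    for x in itertools.product((0, 1), repeat=n):  # all 2^n vertices of the unit cube
        if sum(x) != test_length:
            continue
        total = sum(d for d, xi in zip(difficulty, x) if xi)
        if total <= max_total_difficulty:
            feasible.append(x)
    return feasible


def assemble_uniformly(feasible, rng):
    """Draw one test so that every feasible test is equally likely."""
    return rng.choice(feasible)


# Hypothetical pool of six items with integer difficulty points.
difficulty = [2, 5, 9, 14, 18, 23]
tests = feasible_tests(difficulty, test_length=3, max_total_difficulty=30)
chosen = assemble_uniformly(tests, random.Random(1))
```

With these made-up specifications, 7 of the 64 candidate vectors are feasible, and each has probability 1/7 of being assembled.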

7.
A Bayesian random effects model for testlets   Total citations: 4, self-citations: 0, citations by others: 4
Standard item response theory (IRT) models fit to dichotomous examination responses ignore the fact that sets of items (testlets) often come from a single common stimulus (e.g. a reading comprehension passage). In this setting, all items given to an examinee are unlikely to be conditionally independent (given examinee proficiency). Models that assume conditional independence will overestimate the precision with which examinee proficiency is measured. Overstatement of precision may lead to inaccurate inferences such as prematurely ending an examination in which the stopping rule is based on the estimated standard error of examinee proficiency (e.g., an adaptive test). To model examinations that may be a mixture of independent items and testlets, we modified one standard IRT model to include an additional random effect for items nested within the same testlet. We use a Bayesian framework to facilitate posterior inference via a Data Augmented Gibbs Sampler (DAGS; Tanner & Wong, 1987). The modified and standard IRT models are both applied to a data set from a disclosed form of the SAT. We also provide simulation results that indicate that the degree of precision bias is a function of the variability of the testlet effects, as well as the testlet design. The authors wish to thank Robert Mislevy, Andrew Gelman and Donald B. Rubin for their helpful suggestions and comments, Ida Lawrence and Miriam Feigenbaum for providing us with the SAT data analyzed in section 5, and the two anonymous referees for their careful reading and thoughtful suggestions on an earlier draft. We are also grateful to the Educational Testing Service for providing the resources to do this research.

8.
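A passage-based reading test is a typical testlet-based test, and differential item functioning (DIF) detection for it requires a matching DIF detection method. DIF detection methods based on testlet response models can genuinely handle testlet effects and provide a measure of the DIF effect for each item in a testlet. Among testlet DIF detection methods they have a relatively strong theoretical advantage, and the main method in use is the Rasch testlet DIF detection method. This study introduces the Rasch testlet DIF detection method into DIF detection for passage-based reading tests and applies it to a reading achievement test. The results show that items with significant DIF effects appeared on some subdimensions of the content dimension and the ability dimension, and the study offers suggestions, from the perspective of test fairness, for further revising and developing the test. The study further compares the results of the Rasch testlet DIF detection method with those of a DIF detection method based on the traditional Rasch model and of adapted testlet DIF detection methods; the comparison demonstrates the necessity and superiority of testlet DIF detection. The results indicate that, in passage-based reading tests, testlet DIF detection methods that can genuinely handle testlet effects have a greater theoretical advantage and matter more for developing reading tests and improving their quality.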

9.
For testlet response data, traditional item response theory (IRT) models are often not appropriate due to local dependence present among items within a common testlet. Several testlet‐based IRT models have been developed to model examinees' responses. In this paper, a new two‐parameter normal ogive testlet response theory (2PNOTRT) model for dichotomous items is proposed by introducing testlet discrimination parameters. A Bayesian model parameter estimation approach via a data augmentation scheme is developed. Simulations are conducted to evaluate the performance of the proposed 2PNOTRT model. The results indicated that the estimation of item parameters is satisfactory overall from the viewpoint of convergence. Finally, the proposed 2PNOTRT model is applied to a set of real testlet data.

10.
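This paper applies the multidimensional testlet response model (MTRM) to differential item functioning (DIF) detection in multidimensional testlet-based tests. Through a simulation study and an applied study, it examines the accuracy, effectiveness, and influencing factors of MTRM in DIF detection, and compares it with the multidimensional random coefficients multinomial logit model (MRCMLM), which ignores testlet effects. The results show that: (1) as sample size increases, MTRM detects valid DIF values at a higher rate and with a lower error rate, and its results are more stable across conditions; (2) compared with MRCMLM, the MTRM-based DIF detection model has a higher detection rate and is less affected by other factors; (3) when testlet effects in the test are small, MTRM and MRCMLM give similar results, but MTRM fits better.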

11.
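吴锐 (Wu Rui), 丁树良 (Ding Shuliang), 甘登文 (Gan Dengwen). 《心理学报》, 2010, 42(3): 434-442
Testlets appear more and more often in all kinds of examinations. Equating tests that contain testlets with standard IRT models may distort the equating results because the local dependence within testlets is ignored. To address this problem, we used the testlet-based 2PTM model and the IRT characteristic curve method for equating, took the size of the error in the estimated equating coefficients as the evaluation criterion, and, relying on the Wilcoxon signed-rank test, ran a large number of Monte Carlo simulation experiments under several different conditions. The results show that the testlet model 2PTM, which accounts for local dependence, produced smaller equating error than 2PLM in the vast majority of cases, and that the difference was significant. In addition, 2PTM equating was carried out under six different equating criteria, and the relative merits of the criteria under different conditions were evaluated.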

12.
13.
This paper focuses on the divergence behaviour of the successive geometric mean (SGM) method used to generate pairwise comparison matrices while solving a multiple stage, multiple objective (MSMO) optimization problem. The SGM method can be used in the matrix generation phase of our three‐phase methodology to obtain a pairwise comparison matrix at each stage of an MSMO optimization problem, which can be subsequently used to obtain the weight vector at the corresponding stage. The weight vectors across the stages can be used to convert an MSMO problem into a multiple stage, single objective (MSSO) problem, which can be solved using dynamic programming‐based approaches. To obtain a practical set of non‐dominated solutions (also referred to as Pareto optimal solutions) to the MSMO optimization problem, it is important to use a solution approach that has the potential to allow for a better exploration of the Pareto optimal solution space. To accomplish a more exhaustive exploration of the Pareto optimal solution space, the weight vectors that are used to scalarize the MSMO optimization problem into its corresponding MSSO optimization problem should vary across the stages. Distinct weight vectors across the stages are tied directly with distinct pairwise comparison matrices across the stages. A pairwise comparison matrix generation method is said to diverge if it can generate distinct pairwise comparison matrices across the stages of an MSMO optimization problem. In this paper, we demonstrate the SGM method's divergence behaviour when the three‐phase methodology is used in conjunction with an augmented high‐dimensional, continuous‐state stochastic dynamic programming method to solve a large‐scale MSMO optimization problem. Copyright © 2013 John Wiley & Sons, Ltd.
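The step from a pairwise comparison matrix to a weight vector, and from a weight vector to a single scalarized objective, can be sketched as follows. The sketch uses the common normalized row geometric mean rule, which is an assumption here: it is not the paper's SGM matrix generation method, and the abstract does not say which rule the authors use to derive weights. The matrix, objective values, and function names are hypothetical.

```python
import math


def geometric_mean_weights(matrix):
    """Weights from a pairwise comparison matrix via normalized row geometric means."""
    n = len(matrix)
    row_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(row_means)
    return [m / total for m in row_means]


def scalarize(objective_values, weights):
    """Collapse several objective values at one stage into a weighted sum."""
    return sum(w * f for w, f in zip(weights, objective_values))


# Hypothetical, perfectly consistent 3x3 comparison matrix for one stage:
# objective 1 is twice as important as objective 2 and four times objective 3.
stage_matrix = [[1.0, 2.0, 4.0],
                [0.5, 1.0, 2.0],
                [0.25, 0.5, 1.0]]
weights = geometric_mean_weights(stage_matrix)
stage_value = scalarize([10.0, 20.0, 30.0], weights)
```

Supplying a different matrix at each stage yields a different weight vector per stage, which is the property the abstract ties to a fuller exploration of the Pareto optimal solution space.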

14.
Comments on The dissemination and implementation of evidence-based psychological treatments: A review of current efforts (see record 2010-02208-010) by Kathryn R. McHugh and David H. Barlow. The lead article in the February–March issue by McHugh and Barlow (2010) emphasized the need for “dissemination and implementation of evidence-based psychological treatments.” The authors identified a number of intervention programs as evidence based and in need of dissemination. One is multisystemic therapy (MST). They claimed that this program is among “the most successful dissemination efforts...pursued by treatment developers” (p. 79). McHugh and Barlow’s (2010) discussion of the implementation of MST in Hawaii is troubling, because it neglected to mention concerns about the perceived lack of cultural sensitivity of the MST program in that state.

15.
The challenges of specifying a complex and individualized treatment model and measuring fidelity thereto are described, using multisystemic therapy (MST) as an example. Relations between therapist adherence to MST principles and instrumental and ultimate outcome variables are examined, as are relations between clinical supervision and therapist adherence. The findings provide modest support for the associations between MST adherence measures and instrumental and ultimate outcomes. Results also show that adherence can be altered when clinical supervision and adherence monitoring procedures are fortified. The modest associations between adherence measures and youth outcomes argue for further refinement and validation of the MST adherence measure, especially in light of the well-established effectiveness of MST with challenging clinical populations and the increasing dissemination of MST programs.

16.
New findings regarding the mechanisms of action of electroconvulsive therapy (ECT) have led to novel developments in treatment technique to further improve this highly effective treatment for major depression. These new approaches include novel placements, optimization of electrical stimulus parameters, and new methods for inducing more targeted seizures (e.g., magnetic seizure therapy [MST]). MST is the use of transcranial magnetic stimulation to induce a seizure. Magnetic fields pass through tissue unimpeded, providing more control over the site and extent of stimulation than can be achieved with ECT. This enhanced control represents a means of focusing the treatment on target cortical structures thought to be essential to antidepressant response and reducing spread to medial temporal regions implicated in the cognitive side effects of ECT. MST is at an early stage of development. Preliminary results suggest that MST may have some advantages over ECT in terms of subjective side effects and acute cognitive functioning. Studies designed to address the antidepressant efficacy of MST are underway. As with all attempts to improve convulsive therapy technique, the clinical value of MST will need to be established through controlled clinical trials. This article reviews the experience to date with MST, and places this work in the broader context of other means of optimizing convulsive therapy in the treatment of depression.

17.
Veterans with military sexual trauma (MST) are at risk for a variety of psychiatric conditions, including posttraumatic stress disorder (PTSD) and depression. Survivors of MST are also likely to experience diminished quality of life (QoL). Individuals with higher lifetime incidence of sexual trauma may also be at increased risk for poorer outcomes in QoL and psychiatric symptomatology. The differences in psychological sequelae among those who have experienced sexual trauma as children, and those whose sexual trauma exposure is limited to adulthood are relatively understudied. The majority of sexual trauma literature has focused primarily on civilian trauma, and comparatively few studies have specifically examined psychosocial sequelae (e.g., QoL) in veterans with MST. This study examined how childhood sexual abuse (CSA) affects overall QoL as well as severity of PTSD and depressive symptoms. Veterans who reported CSA had significantly greater depression symptom severity than veterans who did not. No significant differences in PTSD symptom severity or QoL were found between veterans who did and did not report CSA. Results highlight the need for further examination of the relationship between CSA and depression in veterans with MST-related PTSD who also report CSA.

18.
Several programming models are introduced with the consideration of available unascertained information. In this case, the so‐called unascertained information consists of numerical values whose ranges are known but whose exact values are not. These models resolve several vital weaknesses of the traditional programming methods to a certain degree. Our study includes considerations of linear and non‐linear programming models with grey parameters, grey 0–1 programming, and satisfactory and quasi‐optimal solutions of grey linear programs. Finally, some practical applications are given in order to test the applicability of our theory. Copyright © 1999 John Wiley & Sons, Ltd.

19.
In this article, we present a global optimization approach for generating efficient points for multiobjective concave fractional programming problems. The main work of the approach involves solving an instance of a concave multiplicative fractional program (W̄). Problem (W̄) is a global optimization problem for which no known algorithms are available. Therefore, to render the approach practical, we develop and validate a branch and bound algorithm for globally solving problem (W̄). To illustrate the performance of the global optimization approach, we use it to generate efficient points for a sample multiobjective concave fractional program. Copyright © 2006 John Wiley & Sons, Ltd.

20.
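Passage-based reading tests play an increasingly important role in Chinese language examinations and language proficiency tests. A passage-based reading test is a typical testlet-based test and therefore needs to be analyzed with statistical methods that can handle testlet effects. Differential item functioning (DIF) detection likewise requires a matching DIF detection method. Current DIF detection methods that can handle testlet effects mainly comprise adapted testlet DIF detection methods and DIF detection methods based on testlet response models; the latter, because they are cumbersome to implement, remain at the stage of theoretical discussion. This study introduces adapted testlet DIF detection methods and their effect size indices into DIF detection for passage-based reading tests, which addresses the problem of detecting and measuring DIF in such tests; the effect size indices provide an important basis for deciding how to handle testlet items that show DIF. The study first applies a non-testlet DIF detection method and an adapted testlet DIF detection method to one test paper; comparing the two demonstrates the necessity and superiority of testlet DIF detection. It then applies four representative adapted testlet DIF detection methods to a reading achievement test. The results indicate that, in passage-based reading tests, DIF detection methods that can handle testlet effects are considerably superior to traditional DIF detection methods.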
