Similar literature
 20 similar articles found
1.
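Game-based psychological assessment refers to the quantitative measurement of a person's abilities, personality, and other psychological characteristics and behaviors through games or gamified activities. It was initially used mainly to evaluate the effects of education and training and later developed into the assessment of psychological characteristics. As a new technique, game-based assessment has advantages in assessment format, assessment process, and assessment results. Game-based assessment has now formed a paradigm grounded in evidence-centered design, which guides the construction of assessment tools and the conduct of empirical research, and it has been applied to measuring both cognitive and non-cognitive abilities. However, the technique is still in its infancy, and future research could extend and deepen it in task design, result analysis, and practical application.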

2.
    
In a cognitive diagnostic assessment (CDA), attributes refer to fine-grained knowledge points or skills. The Q-matrix is a central component of CDA, which specifies the relationship between items and attributes. Oftentimes, attributes and the Q-matrix are defined by subject-matter experts and assumed to be appropriate without any misspecifications. However, this assumption does not always hold in real applications. To address this concern, this paper proposes a residual-based statistic for validating the Q-matrix. Its performance is evaluated in a simulation study and compared against that of an existing method proposed in Liu, Xu and Ying (2012, Applied Psychological Measurement, 36, 548). Simulation results indicate that the proposed method leads to a higher recovery rate of the Q-matrix and is computationally more efficient. The advantage in computational efficiency is particularly pronounced when the number of attributes measured by the test reaches five or more. Results also suggest that the two methods have different tendencies in estimating the attribute vector for each item. In cases where the methods fail to recover the correct Q-matrix, the method in Liu et al. (2012, Applied Psychological Measurement, 36, 548) tends to overestimate the number of attributes measured by the items, whereas our method does not show that bias.
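As a rough illustration of what a Q-matrix encodes, the sketch below builds a small item-by-attribute matrix and derives the response pattern an examinee would produce with no guessing or slipping. The 4-item, 3-attribute matrix and the function names are hypothetical, and the conjunctive (DINA-style) rule is an assumption, since the abstract does not name a specific model.

```python
from itertools import product

# Hypothetical Q-matrix: 4 items x 3 attributes (1 = the item requires the attribute).
Q = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
]


def ideal_response(alpha, q_row):
    """Conjunctive rule (assumed): 1 only if every required attribute is mastered."""
    return int(all(a >= q for a, q in zip(alpha, q_row)))


def ideal_pattern(alpha, q_matrix):
    """Ideal 0/1 response pattern of one attribute profile across all items."""
    return [ideal_response(alpha, row) for row in q_matrix]


if __name__ == "__main__":
    # Enumerate all 2^3 attribute profiles and show their ideal patterns.
    for alpha in product([0, 1], repeat=3):
        print(alpha, ideal_pattern(alpha, Q))
```

A misspecified row in `Q` changes the ideal patterns, which is why validation methods such as the one described above compare model-implied and observed responses.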

3.
    
Game-based assessment (GBA) is a specific use of educational games that employs game activities to elicit evidence for educationally valuable skills and knowledge. While this approach can provide individualized and diagnostic information about students, the design and development of assessment mechanics for a GBA is a nontrivial task. In this article, we describe the 10-step procedure that the design team of Physics Playground (formerly known as Newton's Playground) has established by adapting evidence-centered design to address unique challenges of GBA. The scaling method used for Physics Playground was Bayesian networks; thus this article describes specific actions taken for the iterative process of constructing and revising Bayesian networks in the context of the game Physics Playground.

4.
The assessment of higher-education student learning outcomes is an important component in understanding the strengths and weaknesses of academic and general education programs. This study illustrates the application of diagnostic classification models, a burgeoning set of statistical models, in assessing student learning outcomes. To facilitate understanding and future applications of diagnostic modeling, the log-linear cognitive diagnosis model used in this study is presented in a didactic manner. The model is applied in a context where undergraduate students were assessed along four learning outcomes related to psychosocial research across two time points. Results focus on implications and methods to aid stakeholders’ interpretation of the analyses. Contrasts to traditional measurement models and potential future applications are also discussed.

5.
丁树良  毛萌萌  汪文义  罗芬  CUI Ying 《心理学报》2012,44(11):1535-1546
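Constructing a correct cognitive model is one of the keys to successful cognitive diagnosis. If a cognitive diagnostic test cannot completely and accurately represent the cognitive model, the validity of the test is in question. Attributes and their hierarchy can represent a cognitive model. Assuming the cognitive model is correct, a quantitative formula is given to measure the extent to which a cognitive diagnostic test represents the cognitive model; equivalence classes containing more than one knowledge state, and the reasons they arise, are analyzed; and modifications to the Hierarchy Consistency Index (HCI) of Cui et al. are suggested to better examine the consistency between the data and the expert-specified cognitive model.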

6.
刘彦楼  吴琼琼 《心理学报》2023,55(1):142-158
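The Q-matrix is one of the core elements of a CDM. It reflects the internal structure and content design of a test and is usually specified subjectively by domain experts based on experience, so possible errors need to be corrected. This study proposes a new Q-matrix correction method, the Wald-XPD method, based on the complete empirical cross-product information matrix. A Monte Carlo simulation examined the performance of the new method and compared it with similar methods. The results show that the newly developed Wald-XPD method performs well on three main indices: the Q-matrix recovery rate, the proportion of correctly specified attributes retained, and the proportion of misspecified attributes corrected. Overall it outperforms the other methods, especially in correcting misspecified attributes. An empirical data set illustrates the good performance of the Wald-XPD method in Q-matrix correction. In sum, this study provides an effective method for Q-matrix correction.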

7.
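The Q-matrix is the foundation and core element of cognitive diagnostic assessment; it reflects the construct and content design of a test and directly affects the test's diagnostic classification. Using Monte Carlo simulation, this paper studies the effect of different Q-matrix designs on cognitive diagnosis under six attribute hierarchies. The mean and standard deviation of the pattern correct classification rate are used to evaluate diagnostic performance in terms of classification accuracy and stability, respectively. The results show that: (1) under different attribute hierarchies, classification accuracy improves as test length increases, but a "ceiling effect" appears once test length reaches a certain point; (2) the number of R* in the Q-matrix (NR*) affects classification accuracy and stability: the larger NR* is, the higher the classification stability, and classification accuracy is highest when test length is an integer multiple of the number of attributes and NR* is the largest odd multiple that the test length allows relative to the number of attributes; (3) the number of attributes measured by items other than R* in the Q-matrix affects classification accuracy and stability differently depending on the attribute hierarchy. Based on the results, the study offers suggestions for optimal Q-matrix design in diagnostic assessment.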

8.
陈孚  辛涛  刘彦楼  刘拓  田伟 《心理科学进展》2016,24(12):1946-1960
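Cognitive diagnosis models define the relationship between test items and the attributes they measure, and use examinees' responses to infer their mastery of attributes or knowledge and skills. Model-data fit testing for cognitive diagnosis models can be carried out in terms of item fit, absolute model fit, relative model fit, and person fit. A detailed introduction and evaluation of fit testing methods and statistics can serve as a reference for cognitive diagnosis practice. Future research could evaluate and compare the performance of the statistics under richer conditions, improve existing fit testing methods, and propose new fit statistics.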

9.
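The development of polytomous cognitive diagnosis models plays an important role in the development of cognitive diagnosis, but Q-matrix correction under polytomous models remains to be studied. This study investigates Q-matrix correction for polytomous cognitive diagnosis, focusing on the more diagnostically valuable correction at the item-category level. Relative fit statistics are applied to Q-matrix correction for polytomous cognitive diagnosis and compared with the existing Stepwise method (Ma & de la Torre, 2019). The results show that the BIC method achieves high pattern and attribute correct classification rates for Q-matrix correction in polytomous cognitive diagnosis models, its Q-matrix recovery rate is also higher than that of the Stepwise method, and the Q-matrix corrected by the BIC method fits the data better; in complex models, the relative fit index BIC performs better than AIC and -2LL, and in practice users can choose the BIC method to correct a test's Q-matrix; Q-matrix correction is affected by sample size, and increasing the number of examinees raises the accuracy of the correction. In sum, this study provides important methodological support for Q-matrix correction in polytomous cognitive diagnosis.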
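The three relative fit indices named in this abstract (-2LL, AIC, BIC) follow standard definitions, so a short sketch may help show how candidate Q-matrices could be compared. The function names, the candidate labels, and the log-likelihood values are hypothetical; this is not the authors' correction procedure, only the index arithmetic it relies on.

```python
import math


def relative_fit(log_likelihood, n_params, n_examinees):
    """Standard relative fit indices computed from a fitted model's log-likelihood."""
    neg2ll = -2.0 * log_likelihood
    aic = neg2ll + 2 * n_params
    bic = neg2ll + n_params * math.log(n_examinees)
    return {"-2LL": neg2ll, "AIC": aic, "BIC": bic}


def pick_by_bic(candidates, n_examinees):
    """candidates maps a label to (log_likelihood, n_params); smaller BIC is preferred."""
    return min(
        candidates,
        key=lambda name: relative_fit(*candidates[name], n_examinees)["BIC"],
    )


if __name__ == "__main__":
    # Made-up fits for two candidate Q-matrices on the same data.
    fits = {"q_a": (-1000.0, 20), "q_b": (-995.0, 30)}
    print(pick_by_bic(fits, n_examinees=500))
```

Because the BIC penalty grows with the log of the sample size, it penalizes extra parameters more heavily than AIC for all but very small samples.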

10.
Generating items during testing: Psychometric issues and models   Total citations: 2, self-citations: 0, citations by others: 2
On-line item generation is becoming increasingly feasible for many cognitive tests. Item generation seemingly conflicts with the well established principle of measuring persons from items with known psychometric properties. This paper examines psychometric principles and models required for measurement from on-line item generation. Three psychometric issues are elaborated for item generation. First, design principles to generate items are considered. A cognitive design system approach is elaborated and then illustrated with an application to a test of abstract reasoning. Second, psychometric models for calibrating generating principles, rather than specific items, are required. Existing item response theory (IRT) models are reviewed and a new IRT model that includes the impact on item discrimination, as well as difficulty, is developed. Third, the impact of item parameter uncertainty on person estimates is considered. Results from both fixed content and adaptive testing are presented. Editor's note: This article is based on the Presidential Address Susan E. Embretson gave on June 26, 1999 at the 1999 Annual Meeting of the Psychometric Society held at the University of Kansas in Lawrence, Kansas.

11.
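A Discussion of Trends in Cognitive Development Research   Total citations: 3, self-citations: 0, citations by others: 3
This paper explains the influence of cognitive science, constructivism, and the neo-Piagetian school on cognitive development research, focuses on the new trends and main views of the neo-Piagetian school in cognitive development research, and reviews these trends in some detail, in the hope of offering insight to cognitive developmental psychologists as they re-examine the soundness of existing theoretical assumptions about cognitive development and propose new hypotheses and research directions.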

12.
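Guided by the idea of "assessment for learning" and aiming to promote student learning, this study analyzes the effect of individualized remedial instruction based on cognitive diagnostic assessment. First, taking the chapter on linear equations in one variable as an example, two parallel cognitive diagnostic tests were developed. Then, administering them to seventh-grade students from different areas (urban and rural) showed that urban students had better attribute mastery than rural students. Next, rural students were selected for remediation; comparing individualized remedial instruction based on cognitive diagnostic assessment with that based on traditional teaching showed that both improved achievement, but the former was significantly more effective. In sum, the results indicate that individualized remedial instruction based on cognitive diagnostic assessment can effectively promote student learning, and they provide practical evidence for practitioners applying cognitive diagnostic assessment to that end.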

13.
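An Attribute Hierarchy Method for Polytomous Scoring with Unequal Attribute Weights   Total citations: 1, self-citations: 1, citations by others: 0
This paper presents an Attribute Hierarchy Method (AHM) for the Graded Response Model (GRM) with unequal attribute weights, abbreviated as the unequal-attribute-weight GRM-AHM. Under an attribute hierarchy, the paper uses two methods, Bayesian networks and least squares, to propose ways of computing the conditional probability that an examinee has mastered an attribute and the attribute weights, and it identifies and resolves the problem that an attribute's weight may differ across items. The study further extends cognitive diagnosis to polytomous scoring. Experiments show that the unequal-attribute-weight GRM-AHM achieves a high correct classification rate.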

14.
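Polytomous-attribute cognitive diagnosis models (CDMs) provide more detailed diagnostic feedback than traditional dichotomous-attribute CDMs, but most existing polytomous-attribute CDMs cannot directly analyze polytomously (or mixed-format) scored data. Based on the graded response model, this paper extends the reparameterized polytomous-attribute DINA model to polytomous scoring, developing a graded-response polytomous-attribute DINA model that can handle polytomously scored data. An empirical data analysis first demonstrates the practical applicability of the new model; a simulation study then examines its parameter recovery. The results show that the new model meets the practical need to handle polytomous attributes and polytomously scored data simultaneously and has good psychometric properties, but it places some demands on test quality (e.g., high item quality and a complete Qp-matrix).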

15.
郭磊  周文杰 《心理学报》2021,53(9):1032-1043
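Fully exploiting the diagnostic information in multiple-choice (MC) items has received considerable attention, and taking distractor information into account can improve diagnostic accuracy. To address the limitation that parametric models need large samples for reliable estimates, and to suit small-sample diagnostic testing at the classroom level, this study proposes a nonparametric diagnostic method for multiple-choice items. Simulation and empirical results show that: (1) when item parameters in an MC test do not differ greatly, the $d_{\text{ph-MC}}$ method outperforms parametric diagnostic models in most cases; (2) when item parameters in an MC test differ greatly, the $d_{\text{ph-MC}}$ method performs best; (3) in the empirical study, classifications from the nonparametric method and the parametric models agree closely, and examinees' overall attribute mastery estimated by the $d_{\text{ph-MC}}$ distance method correlates most highly with total score. Finally, several research directions are proposed based on the characteristics of MC diagnostic tests.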
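The general idea behind nonparametric diagnostic classification can be sketched as picking the attribute profile whose ideal response pattern lies closest to the observed one. The version below uses plain Hamming distance on 0/1 scores with a conjunctive ideal-response rule; it is a simplified stand-in, not the authors' distance for MC items, which also uses distractor information. All names and the small Q-matrix in the usage lines are hypothetical.

```python
from itertools import product


def ideal_pattern(alpha, q_matrix):
    """Ideal 0/1 responses under an assumed conjunctive rule."""
    return tuple(
        int(all(a >= q for a, q in zip(alpha, row))) for row in q_matrix
    )


def hamming(x, y):
    """Number of positions where two equal-length patterns differ."""
    return sum(a != b for a, b in zip(x, y))


def classify(observed, q_matrix):
    """Return the attribute profile with the nearest ideal pattern.

    Ties are broken by enumeration order (the first minimal profile wins).
    """
    n_attributes = len(q_matrix[0])
    profiles = product([0, 1], repeat=n_attributes)
    return min(
        profiles,
        key=lambda alpha: hamming(observed, ideal_pattern(alpha, q_matrix)),
    )


if __name__ == "__main__":
    q = [[1, 0], [0, 1], [1, 1]]
    print(classify((1, 0, 0), q))
```

No item parameters are estimated here, which is why approaches of this kind remain usable with classroom-sized samples.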

16.
    
The Scaling Individuals and Classifying Misconceptions (SICM) model is an advanced psychometric model that can provide feedback on examinees’ misconceptions and a general ability simultaneously. These two types of feedback are represented by a discrete and a continuous latent variable, respectively, in the SICM model. The complex structure of the SICM model makes it difficult to estimate both the misconception profile and ability efficiently in a linear test. To overcome this challenge, this study proposes a flexible computerized adaptive test (FCAT) design as a new test delivery method to increase test efficiency by administering an individualized test to examinees. We propose three item selection methods and two transition criteria to determine adaptive steps based on the needs of estimating one or two latent variables. Through two simulation studies, we demonstrate how to select an appropriate item selection method for an adaptive step and what transition criterion should be used between two adaptive steps. Results reveal that the combination of the item selection method and the transition criterion could improve the estimation accuracy of a specific latent variable to a different extent and thus provide further guidance in designing an FCAT.

17.
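Cognitive diagnostic assessment aims to explore the structure of an individual's knowledge mastery and to provide detailed diagnostic information about students' strengths and weaknesses, so as to promote their all-round development. Researchers have developed many cognitive diagnosis models for 0-1 (dichotomous) scoring, but research on polytomous cognitive diagnosis models is still relatively scarce. This paper summarizes existing polytomous cognitive diagnosis models, introducing their assumptions, psychometric characteristics, and scope of application, to provide a reference for practitioners and researchers comparing and choosing among them. Finally, future research directions for polytomous diagnosis models are discussed.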

18.
    
The Reduced Reparameterized Unified Model (Reduced RUM) is a diagnostic classification model for educational assessment that has received considerable attention among psychometricians. However, the computational options for researchers and practitioners who wish to use the Reduced RUM in their work, but do not feel comfortable writing their own code, are still rather limited. One option is to use a commercial software package that offers an implementation of the expectation maximization (EM) algorithm for fitting (constrained) latent class models like Latent GOLD or Mplus. But using a latent class analysis routine as a vehicle for fitting the Reduced RUM requires that it be re-expressed as a logit model, with constraints imposed on the parameters of the logistic function. This tutorial demonstrates how to implement marginal maximum likelihood estimation using the EM algorithm in Mplus for fitting the Reduced RUM.

19.
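The Q-matrix plays an important role in model parameter estimation and diagnostic classification in cognitive diagnosis. Building on the method of Liu et al., this paper designs a joint estimation algorithm that estimates item parameters and the Q-matrix simultaneously. A simulation study was conducted under the DINA model with unknown item parameters. The study assumed 20 items measuring 3, 4, or 5 attributes, with 3, 4, or 5 items having misspecified attributes in the initial Q-matrix. The results show that the joint estimation algorithm recovers the correct Q-matrix with very high probability starting from an erroneous initial Q-matrix. In addition, when the number of attributes specified by experts is wrong, the Q-matrix and model parameters derived by the method provide useful information for detecting errors in the Q-matrix.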
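For readers unfamiliar with the DINA model used in this simulation, the sketch below shows its item response function and how an item's log-likelihood changes with the q-vector assigned to it. It is a minimal illustration, not the joint estimation algorithm of the paper: attribute profiles are treated as known here (in practice they are latent), and the guess and slip values and function names are hypothetical.

```python
import math


def dina_prob(alpha, q_row, guess, slip):
    """DINA: P(correct) is 1 - slip if all required attributes are mastered, else guess."""
    mastered_all = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if mastered_all else guess


def item_log_likelihood(responses, profiles, q_row, guess, slip):
    """Log-likelihood of one item's 0/1 responses, given each examinee's profile."""
    total = 0.0
    for x, alpha in zip(responses, profiles):
        p = dina_prob(alpha, q_row, guess, slip)
        total += math.log(p if x == 1 else 1.0 - p)
    return total


if __name__ == "__main__":
    # Four examinees with assumed-known profiles and their responses to one item.
    profiles = [(1, 0), (1, 1), (0, 1), (0, 0)]
    responses = [0, 1, 0, 0]
    for q_row in [(1, 1), (1, 0)]:
        print(q_row, item_log_likelihood(responses, profiles, q_row, 0.2, 0.1))
```

In this toy data set the q-vector `(1, 1)` yields the higher log-likelihood, which is the kind of signal a Q-matrix estimation procedure can exploit.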

20.
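Classification consistency and accuracy are important indices in cognitive diagnostic assessment; the former reflects reliability and the latter validity. Indices proposed in previous studies are all based on dichotomous attributes, but the posterior probability distribution and the marginal attribute probability distribution for polytomous attributes differ from those for dichotomous attributes, so new indices are needed to measure reliability and validity in the polytomous-attribute setting. Building on the idea of dichotomization, this study constructs binary information indices for computing reliability and validity in polytomous-attribute tests, examines their performance under a range of influencing factors through an experimental design, and verifies their effectiveness. Finally, suggestions are offered for constructing polytomous-attribute diagnostic tests, and future research directions are proposed.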
