Similar literature
 20 similar articles found.
1.
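A review of item selection strategies in computerized adaptive testing   Total citations: 2, self-citations: 0, citations by others: 2
Mao Xiuzhen (毛秀珍), Xin Tao (辛涛). Advances in Psychological Science (《心理科学进展》), 2011, 19(10): 1552-1562
Computerized adaptive testing (CAT) is a testing mode built on measurement theory and computer technology. It selects test items adaptively according to the examinee's responses. The item selection strategy is one of the key components of CAT and bears on important issues such as measurement efficiency, test security, and test reliability and validity. This review classifies and introduces the item selection strategies of traditional CAT and cognitive diagnostic CAT according to whether the CAT has nonstatistical constraints. Future research should further improve the overall performance of item selection strategies and examine in depth the item selection strategies for polytomously scored items and for cognitive diagnostic CAT.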
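
The abstract above says that a CAT picks each item from the examinee's responses so far, but it does not name a specific rule. As a rough illustration only, the sketch below uses maximum Fisher information under a two-parameter logistic (2PL) model, a common baseline rule. The function names, the `(a, b)` item tuples, and the toy item bank are all assumptions made for this example, not anything taken from the reviewed paper.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, item_bank, administered):
    """Return the index of the unused item with maximum information at theta.

    item_bank is a list of hypothetical (a, b) tuples; administered is a set
    of indices that have already been given to the examinee.
    """
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    return max(candidates, key=lambda i: fisher_information(theta, *item_bank[i]))


# Toy bank with equal discriminations, so the item whose difficulty is
# closest to the current ability estimate is the most informative one.
bank = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.5)]
next_item = select_next_item(0.1, bank, administered=set())
```

In a real CAT the ability estimate `theta` would be updated after every response, and the rule would usually be combined with exposure control and content constraints, which this sketch leaves out.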

2.
Content balancing is one of the most important issues in computerized classification testing. To adapt to variable-length forms, special treatments are needed to successfully control content constraints without knowledge of test length during the test. To this end, we propose the notions of ‘look-ahead’ and ‘step size’ to adaptively control content constraints in each item selection step. The step size gives a prediction of the number of items to be selected at the current stage, that is, how far we will look ahead. Two look-ahead content balancing (LA-CB) methods, one with a constant step size and another with an adaptive step size, are proposed as feasible solutions to balancing content areas in variable-length computerized classification testing. The proposed LA-CB methods are compared with conventional item selection methods in variable-length tests and are examined with different classification methods. Simulation results show that, integrated with heuristic item selection methods, the proposed LA-CB methods result in fewer constraint violations and can maintain higher classification accuracy. In addition, the LA-CB method with an adaptive step size outperforms that with a constant step size in content management. Furthermore, the LA-CB methods generate higher test efficiency while using the sequential probability ratio test classification method.
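
The abstract above mentions the sequential probability ratio test (SPRT) as one classification method for variable-length tests. The following is a minimal sketch of a Wald SPRT pass/fail decision, assuming a Rasch model and an indifference region of half-width `delta` around the cut score. The model choice, the parameter names, and the default values are assumptions for illustration; the sketch does not reproduce the LA-CB methods themselves.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of a correct response to an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def sprt_decision(responses, difficulties, cut, delta=0.5, alpha=0.05, beta=0.05):
    """Classify an examinee as "pass", "fail", or "continue" testing.

    Tests H0: theta = cut - delta against H1: theta = cut + delta using the
    log likelihood ratio of the scored responses (1 = correct, 0 = incorrect).
    """
    theta_lo, theta_hi = cut - delta, cut + delta
    log_lr = 0.0
    for x, b in zip(responses, difficulties):
        p_hi, p_lo = rasch_p(theta_hi, b), rasch_p(theta_lo, b)
        if x == 1:
            log_lr += math.log(p_hi / p_lo)
        else:
            log_lr += math.log((1.0 - p_hi) / (1.0 - p_lo))
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this bound
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this bound
    if log_lr >= upper:
        return "pass"
    if log_lr <= lower:
        return "fail"
    return "continue"
```

Because the test stops as soon as either bound is crossed, test length varies across examinees, which is why content constraints cannot simply be allocated against a known total length.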

3.
Item calibration is an essential issue in modern psychological or educational testing based on item response theory. Due to the popularity of computerized adaptive testing, methods to efficiently calibrate new items have become more important than they were when paper-and-pencil test administration was the norm. Many calibration processes have been proposed and discussed from both theoretical and practical perspectives. Among them, online calibration may be one of the most cost-effective. In this paper, under a variable-length computerized adaptive testing scenario, we integrate the methods of adaptive design, sequential estimation, and measurement error models to solve online item calibration problems. The proposed sequential estimate of item parameters is shown to be strongly consistent and asymptotically normally distributed with a prechosen accuracy. Numerical results show that the proposed method is very promising in terms of both estimation accuracy and efficiency. The results of using calibrated items to estimate the latent trait levels are also reported.

4.
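Multistage testing (MST) retains the advantages of adaptive testing while allowing test developers to construct each module and panel under a given set of constraints. However, if certain latent factors are overlooked during test construction and local item dependence (LID) arises among items, MST results can also be harmed. To investigate the harm that LID does to MST, this study first introduces MST, LID, and related concepts, and then examines the question through a simulation study. The results show that LID affects the precision of examinee ability estimates, although the estimation bias remains small, and that this harm is not limited to any particular routing rule. To eliminate the harm, a testlet response model was then used as the analysis model during MST administration; the results show that this approach removes part of the harm but that its effect is limited. On the one hand, this indicates that the harm LID does to the precision of ability estimation in MST deserves attention; on the other hand, it indicates that methods for eliminating the harm caused by LID in MST merit further study.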

5.
A survey of microcomputer use in psychology showed equal frequency of use for teaching, research, and administration. Respondents with computer experience evaluated microcomputer contributions more highly than did those respondents without experience but with an interest in using computer systems. Apple IIes were the most popular machines, word processing the most popular use, and experimental and statistical psychology the most popular courses for using computers. Sixty percent of the users wrote their own software.

6.
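Compared with the traditional paper-and-pencil test (Paper And Pencil Based Test, P&P), computerized adaptive testing (Computerized Adaptive Testing, CAT) selects items adaptively according to the examinee's responses, which not only shortens the test but also greatly improves its accuracy. However, the vast majority of current CATs do not allow examinees to change their answers, mainly because researchers worry that answer changes would reduce the validity of CAT. Allowing answer changes is consistent with examinees' usual test-taking habits, and scores obtained after revision better reflect examinees' true level, which can further promote the use of CAT in practice. Existing research has proposed methods for controlling reviewable CAT from three angles: test design, improved item selection strategies, and model construction. Future research should further explore comparisons and combinations of these methods, as well as reviewable cognitive diagnostic CAT (Cognitive Diagnostic CAT, CD-CAT).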

7.
To date, applications of automated assessment techniques in personality testing have largely been limited to objective personality instruments with text stimuli; few assessment applications have involved graphic stimuli. Although projective personality instruments generally include ambiguous graphic or pictorial stimuli, computer applications with these procedures have been limited to automated scoring and interpretation, administration of sentence completion devices employing text stimuli, and the use of mechanical methods rather than computer graphics to display visual stimuli. In the present report, we describe a Macintosh HyperCard application for administering an objective personality test with visual stimuli, the Barron-Welsh Revised Art Scale of the Welsh Figure Preference Test. This test consists of a series of figural stimuli and a binary “like”/“dislike” response format, and it thus represents an administration procedure between standard objective self-report inventories involving text stimuli and a “true”/“false” response or variant, and tests such as the Rorschach or TAT that are both figural and free-response. The HyperCard language provides a variety of promising techniques useful for microcomputer test administration.

8.
Computerized adaptive testing in personality assessment can improve efficiency by significantly reducing the number of items administered to answer an assessment question. Two approaches have been explored for adaptive testing in computerized personality assessment: item response theory and the countdown method. In this article, the authors review the literature on each and report the results of an investigation designed to explore the utility, in terms of item and time savings, and validity, in terms of correlations with external criterion measures, of an expanded countdown method-based research version of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2), the MMPI-2 Computerized Adaptive Version (MMPI-2-CA). Participants were 433 undergraduate college students (170 men and 263 women). Results indicated considerable item savings and corresponding time savings for the adaptive testing modalities compared with a conventional computerized MMPI-2 administration. Furthermore, computerized adaptive administration yielded comparable results to computerized conventional administration of the MMPI-2 in terms of both test scores and their validity. Future directions for computerized adaptive personality testing are discussed.

9.
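Reviewable cognitive diagnostic computerized adaptive testing (Reviewable Cognitive Diagnostic Computerized Adaptive Testing, RCD-CAT), which allows answer changes, helps diagnose examinees' knowledge states more accurately. The item pocket (Item Pocket, IP) method gives examinees the opportunity to set responses aside and revise them, and the modified item pocket (Modified IP, MIP) method rescores the items revised within the IP. A simulation study compared the performance of IP, MIP, stocking Ⅰ, and stocking Ⅱ in RCD-CAT. The results showed that the stocking designs performed best, with stocking Ⅱ slightly better than stocking Ⅰ; that the IP and MIP methods had lower correct classification rates than traditional CD-CAT; and that the stocking designs have good application prospects in RCD-CAT.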

10.
Multidimensional computerized adaptive testing (MCAT) has received increasing attention over the past few years in educational measurement. As in all other formats of CAT, item replenishment is an essential part of item bank maintenance and management in MCAT, which governs retiring overexposed or obsolete items over time and replacing them with new ones. Moreover, calibration precision of the new items will directly affect the estimation accuracy of examinees’ ability vectors. In unidimensional CAT (UCAT) and cognitive diagnostic CAT, online calibration techniques have been developed to effectively calibrate new items. However, there has been very little discussion of online calibration in MCAT in the literature. Thus, this paper proposes new online calibration methods for MCAT based upon some popular methods used in UCAT. Three representative methods, Method A, the ‘one EM cycle’ method and the ‘multiple EM cycles’ method, are generalized to MCAT. Three simulation studies were conducted to compare the three new methods by manipulating three factors (test length, item bank design, and level of correlation between coordinate dimensions). The results showed that all the new methods were able to recover the item parameters accurately, and the adaptive online calibration designs showed some improvements compared to the random design under most conditions.

11.
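Multistage testing (MST) combines the advantages of paper-and-pencil testing and CAT and is currently used in many large-scale examinations in the United States. Drawing on the ideas of MST, cognitive diagnosis, CD-CAT, and OMST, this paper studies the feasibility of CD-MST. CD-MST offers both cognitive diagnosis and adaptivity, and can provide examinees with immediate, accurate, and rich diagnostic information using relatively few items. It also computes quickly, allows examinees to go back to review and change answers, fits real testing situations better, and makes test construction easier to control. This study examined the effects of item selection strategy and item bank quality on CD-MST under different test designs and compared CD-MST with CD-CAT. The simulation study found that the MPWKL, GDI, and SHE item selection strategies are also suitable for CD-MST, and that with a high-quality item bank the correct classification rates of these three strategies are on par with those of CD-CAT. CD-MST shortens testing time by more than two thirds relative to CD-CAT.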

12.
Methods of cognitive diagnostic computerized adaptive testing (CD-CAT) under higher-order cognitive diagnosis models have been developed to simultaneously provide estimates of the attribute mastery statuses of examinees for formative assessment and estimates of a latent continuous trait for overall summative evaluation. In a typical CD-CAT environment, examinees are often subject to a time limit, and the examinees’ response times (RTs) for specific test items can be routinely recorded by custom-made programs. Because examinees are individually administered tailored sets of test items from the item pool, they may experience different levels of speededness during testing and different levels of risk of running out of time. In this study, RTs were considered during the item-selection procedure to control the test speededness and the RTs were treated as useful information for improving latent trait estimation in CD-CAT under the higher-order deterministic input, noisy ‘and’ gate (DINA) model. A modified posterior-weighted Kullback–Leibler (PWKL) method that maximizes the item information per time unit and a shadow-test method that assembles a provisional test subject to a specified time constraint were developed. Two simulation studies were conducted to assess the effects of the proposed methods on the quality of CD-CAT for fixed- and variable-length exams. The results show that, compared with the traditional PWKL method, the proposed methods preserve a lower risk of running out of time while ensuring satisfactory attribute estimation and providing more accurate estimates of the latent trait and speed parameters. Finally, several suggestions for future research are proposed.

13.
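Tu Dongbo (涂冬波), Cai Yan (蔡艳), Dai Haiqi (戴海琦). 《心理科学》 (Psychological Science), 2013, 36(1): 210-215
Cognitive diagnosis and automatic item generation are important areas of development in modern psychometrics, and combining the two is an important topic that urgently needs study. Taking the automatic generation of cognitive diagnostic items for elementary school mathematics problem solving as an example, this study explores item generation techniques and algorithms for cognitive diagnosis. The study found that: (1) the parameters of computer-generated items were highly consistent with those of the original templates; (2) the psychometric properties of different items generated from the same item template remained essentially unchanged; (3) for the same group of examinees, the ability ( ) values on the two automatically generated test forms, administered as pretest and posttest, were highly correlated (r=0.811), and the consistency of the diagnostic results across the two administrations reached 86.5%. This indicates that the automatic item generation technique and algorithm designed in this paper for cognitive diagnostic tests are basically feasible, and that the automatic generation of cognitive diagnostic items for elementary school mathematics problem solving works well. It also provides technical reference and support for automatic item generation in other cognitive diagnostic domains.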

14.
Computer adaptive testing (CAT) is a relatively recent innovation in large scale testing programs, but has had very limited application in private industry. This paper describes the development of a CAT for use by a large insurance company in selecting computer programmer trainees. Incumbents provided the calibration and evaluation data. The CAT led to increased item security, but did not decrease required testing time. Further, the CAT was found to be similar to a conventional, fixed-item test in reliability and validity. In addition to actual test results, computer simulated test data were used in a more detailed evaluation of the CAT's effectiveness. The concluding discussion notes the advantages and disadvantages observed from the use of adaptive testing.

15.
As the usage of unproctored Internet testing (UIT) rises, new methods of mitigating challenges associated with UIT have been proposed. We suggest that one of the most promising methods is computer adaptive testing (CAT), which is a major advancement in pre-employment testing. CAT combines science and technology to help deliver a targeted and secure testing experience. In this article, we describe the use of CAT in organizations and highlight examples of how CAT has been applied to the measurement of cognitive ability, knowledge, and personality traits. We also set out a research agenda that will advance the development and implementation of future CATs.

16.
Several articles in the past fifteen years have suggested various models for analyzing dichotomous test or questionnaire items which were constructed to reflect an assumed underlying structure. This paper shows that many models are special cases of latent class analysis. A currently available computer program for latent class analysis allows parameter estimates and goodness-of-fit tests not only for the models suggested by previous authors, but also for many models which they could not test with the more specialized computer programs they developed. Several examples are given of the variety of models which may be generated and tested. In addition, a general framework for conceptualizing all such models is given. This framework should be useful for generating models and for comparing various models.

17.
A study was undertaken to examine the relationship between response latencies to verbal ability test items administered by computer and overall verbal intelligence test scores. Sixty-four undergraduate students responded to a test of verbal ability under four conditions of alternate test forms (A or B) and modes of administration (computerized vs paper-and-pencil). The response latencies recorded during computerized testing, averaged for each subject, showed a negative correlation with overall test scores as would be predicted from a speed-of-information-processing perspective of human intelligence (Jensen, 1982a, b; Vernon, 1983). This inverse relationship was evident in every condition of test form and mode of administration, thereby demonstrating the generalizability of these findings. Discussion considered the implications of test speededness for the results of this study and provided suggestions for future research employing response latency data as a means for studying the cognitive processes underlying intelligent behaviour.

18.
Efforts to develop a viable short form of the MMPI (Hathaway & McKinley, 1943) span more than 50 years, with more recent attempts to significantly shorten the item pool focused on the use of adaptive computerized test administration. In this article, we report some psychometric properties of an MMPI-Adolescent version (MMPI-A; Butcher et al., 1992) short form based on administration of the first 150 items of this test instrument. We report results for both the MMPI-A normative sample of 1,620 adolescents and a clinical sample of 565 adolescents in a variety of treatment settings. We summarize results for the MMPI-A basic scales in terms of Pearson product-moment correlations generated between full administration and short-form administration formats and mean T-score elevations for the basic scales generated by each approach. In this investigation, we also examined single-scale and 2-point congruences found for the MMPI-A basic clinical scales as derived from standard and short-form administrations. We present the relative strengths and weaknesses of the MMPI-A short form and discuss the findings in terms of implications for attempts to shorten the item pool through the use of computerized adaptive assessment approaches.

19.
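Two new methods for balancing attribute convergence in cognitive diagnostic computerized adaptive testing (MABI and RTA) are proposed, and a simulation study systematically examines and compares their performance with that of existing methods (ABI, IABI, and RABI). The results show that: (1) compared with methods that do not consider attribute convergence, the new methods have higher accuracy and more balanced item usage; (2) compared with ABI and RABI, the new methods have slightly lower accuracy but more balanced item usage; (3) in accuracy and item usage, the new methods and IABI each have their own advantages under different item selection strategies. Overall, the two new methods strike a good balance among measurement accuracy, item usage, and item bank exposure.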

20.
Alzheimer’s disease, the most common form of dementia, is a neurodegenerative brain disorder that currently has no cure. Hence, early diagnosis of the disease using computer-aided systems is a subject of great importance and extensive research. Nowadays, deep learning, particularly the convolutional neural network (CNN), is getting more attention due to its state-of-the-art performance in a variety of computer vision tasks such as visual object classification, detection and segmentation. Several recent studies that have used brain MRI scans and deep learning have shown promising results for diagnosis of Alzheimer’s disease. However, the most common issue with deep learning architectures such as CNNs is that they require a large amount of data for training. In this paper, a mathematical model PFSECTL based on transfer learning is used, in which a CNN architecture, VGG-16 trained on the ImageNet dataset, serves as a feature extractor for the classification task. Experimentation is performed on data collected from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. The accuracy of the 3-way classification using the described method is 95.73% for the validation set.
