Similar literature
20 similar articles found
1.
The remote association test (RAT) has been applied in various fields; however, evidence of construct validity for the original version and subsequent extensions of the RAT remains limited. This study aimed to elucidate the dimensionality and the relationship between item features and item difficulties for the RAT-Chinese Version (RAT-C) using the Rasch model and the linear logistic test model (LLTM). The revised 30-item RAT-C was administered to 475 undergraduates (263 women and 212 men) in 8 universities in Taiwan. Item features (including types of associations among stimulus words, and frequency and concreteness of target words) were recoded. The analysis found that the RAT-C measured a single latent construct, with all 30 items conforming to the Rasch model's expectation. Furthermore, according to the LLTM analysis, most item features predicted Rasch item difficulty, suggesting that these features can explain why some items were more difficult than others and can be used to create new items with known item difficulty to tailor the difficulty level for different groups of participants in the future.
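As a rough illustration of the two models named in this abstract, the sketch below computes a Rasch success probability and an LLTM-style item difficulty built as a weighted sum of item feature codes. The feature codes and weights are hypothetical values chosen for illustration; they are not estimates from the RAT-C study.

```python
import math
from typing import Sequence


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch model,
    for person ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def lltm_difficulty(features: Sequence[float],
                    weights: Sequence[float],
                    intercept: float = 0.0) -> float:
    """LLTM-style difficulty: a linear combination of item feature codes."""
    return intercept + sum(q * eta for q, eta in zip(features, weights))


# Hypothetical feature weights and item coding (assumptions, not study results).
weights = [0.8, -0.5, 0.3]
item_features = [1.0, 0.0, 1.0]

b = lltm_difficulty(item_features, weights)   # predicted item difficulty
p = rasch_probability(theta=0.0, b=b)         # success chance for an average person
```

Under this formulation, a new item whose features are coded in advance gets a predicted difficulty before any response data are collected, which is the use the abstract points to.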

2.
College students' ability to judge whether a studied item had been learned well enough to be recalled on a later test was examined in three experiments with self-paced learning procedures. Generally, these learners compensated for item difficulty when allocating study time, studying hard items longer than easy items, but they still recalled more easy items than hard items and tended to drop items out too soon. When provided with test opportunities during study or a delay between study and judgment, learners compensated significantly more for item difficulty and recalled substantially more. Paradoxically, good and poor learners compensated similarly for item difficulty and benefited similarly from testing during study and from delayed decision making. Thus, although the ability to make metamemory decisions was shown to be important for effective learning, these decisions were made equally well by good and poor associative learners. An analysis of tasks used to investigate metamemory-memory relationships in adult learning may provide an account for this apparent learning ability paradox.

3.
This study examines item bias on Forms L and M of the Peabody Picture Vocabulary Test-Revised (PPVT-R) for a sample of Anglo-American and Mexican-American children. Analyses of variance (ANOVA) were employed to assess item bias as defined by items × ethnicity interactions. Follow-up analyses were performed using a Bonferroni-type procedure on individual item contrasts. Bias as measured by differences in item difficulty was found in both groups; however, there was no clear pattern of items that were more difficult for either group. The small number of items that were more difficult for one ethnic group than for the other, coupled with the high reliability of performance overall for both groups, suggests that bias in content of the PPVT-R is minimal.

4.
Beller, Michal; Gafni, Naomi. Sex Roles, 2000, 42(1-2): 1-21
The purpose of this study was to investigate differential performance of boys and girls on open-ended (OE) and multiple-choice (MC) items on the 1988 and 1991 International Assessment of Educational Progress (IAEP) mathematics test. In the 1988 mathematics assessment, a representative sample of approximately 1,000 13-year-olds in each of the six participating countries was assessed. In the 1991 mathematics assessment, a representative sample of 9- and 13-year-olds (approximately 1,650 from each age group) in some 20 participating countries was assessed. Analyses of both assessments yielded results that indicated that boys generally performed better than girls in mathematics. In the 1988 assessment, gender effects were larger on MC items than on OE items, corresponding to results of earlier studies. However, the 1991 IAEP assessment produced contrary results: gender effects tended to be larger for OE items than for MC items. These inconsistent results challenge the assertion that girls perform relatively better on OE test items, and suggest that item format alone cannot account for gender differences in mathematics performance. Further investigation of the data revealed that the inconsistent patterns of gender effects with regard to item format were related to the difficulty level of the items, regardless of item format. Correlations between item difficulty and item gender effect size were computed for age 13 in the 1988 assessment and for ages 9 and 13 in the 1991 assessment. The correlations obtained were 0.26, 0.47, and 0.53, respectively, suggesting that the more difficult the items, the better boys perform relative to girls.

5.
Differential item functioning (DIF) statistics were computed using items from the Peabody Individual Achievement Test (PIAT)-Reading Comprehension subtest for children of the same age group (ages 7 through 12 respectively). The pattern of observed DIF items was determined by comparing each cohort across age groups. Differences related to race and gender were also identified within each cohort. Characteristics of DIF items were identified based on sentence length, vocabulary frequency, and density of a sentence. DIF items were more frequently associated with short sentences than with long sentences. This study explored the potential limitation in the longitudinal use of items in an adaptive test.

6.
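Under polytomous scoring models, an item has multiple difficulty parameters or step parameters. When items are selected under a polytomous model, the multiple difficulty parameters of an item are usually summarized by a single composite index. Once each item's difficulty parameters have been effectively combined, the distribution of the composite difficulty parameters changes. If a suitable number of items with relatively high or relatively low average difficulty are then added to the item bank, both measurement precision and item exposure rates improve to some extent.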

7.
In test operations using IRT (item response theory), items are included in a test before being used to rate subjects and the response data is used to estimate their item parameters. However, this method of test operation may lead to item content leakage and an adequate test operation can become difficult. To address this problem, Ozaki and Toyoda (2005, 2006) developed item difficulty parameter estimation methods that use paired comparison data from the perspective of the difficulty of items as judged by raters familiar with the field. In the present paper, an improved method of item difficulty parameter estimation is developed. In this new method, an item for which the difficulty parameter is to be estimated is compared with multiple items simultaneously, from the perspective of their difficulty. This is not a one-to-one comparison but a one-to-many comparison. In the comparisons, raters are informed that items selected from an item pool are ordered according to difficulty. The order will provide insight to improve the accuracy of judgment.

8.
This study examined how specific features of adaptive tests are related to test takers' reactions. Participants took a computer-adaptive test in which 2 features, difficulty of the initial item and difficulty of subsequent items, were manipulated, then responded to questionnaires assessing their reactions to the test. The data show that the relationship between a test's objective difficulty, which was determined by the 2 manipulated test characteristics, and reactions was fully mediated by perceived performance. Additional analyses evaluated the impact of feedback on reactions to the adaptive test. In general, feedback that was consistent with perceptions of performance was positively related to reactions. The results suggest that minor changes to the design of an adaptive test may potentially enhance examinees' reactions.

9.
Asking people to discover the identity of a recognition test probe immediately before making a recognition judgment increases the probability of an old judgment. To inform theories of this "revelation effect," event-related potentials (ERPs) were recorded for revealed and intact test items across two experiments. In Experiment 1, we used a revelation effect paradigm where half of the test probes were presented as anagrams (i.e., a related task) and the other items were presented intact. The pattern of ERP results from this experiment suggested that revealing an item decreases initial familiarity levels and causes the revealed items to elicit similar levels of activity. In Experiment 2, half of the probes were preceded by an addition task (i.e., an unrelated task). The pattern of ERP effects in this study was distinct from that observed in Experiment 1. More specifically, revealed item ERPs were more negative than intact ERPs at frontal electrodes and more positive at parietal electrodes early in the interval. Later in the epoch, revealed item ERPs were more negative than intact item ERPs. These data suggest that related tasks decrease familiarity and alter the signal-to-noise ratio of old and new items, whereas unrelated tasks affect processing in a different way (perhaps by changing decision processes) that also results in the revelation effect. The implications for current theories of the revelation effect are discussed.

10.
The purpose of this investigation was to examine the extent to which item and text characteristics predict item difficulty on the comprehension portion of the Gates-MacGinitie Reading Tests for the 7th–9th and 10th–12th grade levels. Detailed item-based analyses were performed on 192 comprehension questions on the basis of the cognitive processing model framework proposed by Embretson and colleagues (Embretson & Wetzel, 1987). Item difficulty was analyzed in terms of various passage features (e.g., word frequency and number of propositions) and individual-question characteristics (e.g., abstractness and degree of inferential processing), using hierarchical linear modeling. The results indicated that the difficulty of the items in the test for the 7th–9th grade level is primarily influenced by text features, in particular vocabulary difficulty, whereas the difficulty of the items in the test for the 10th–12th grade level is less systematically influenced by text features.

11.
The speeded performance on simple mental addition problems of 6- and 7-year-old children with and without mild mental retardation is modeled from a person perspective and an item perspective. On the person side, it was found that a single cognitive dimension spanned the performance differences between the two ability groups. However, a discontinuity, or "jump," was observed in the performance of the normal ability group on the easier items. On the item side, the addition problems were almost perfectly ordered in difficulty according to their problem size. Differences in difficulty were explained by factors related to the difficulty of executing nonretrieval strategies. All findings were interpreted within the framework of Siegler's (e.g., R. S. Siegler & C. Shipley, 1995) model of children's strategy choices in arithmetic. Models from item response theory were used to test the hypotheses.

12.
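This study explored how item difficulty and point value affect self-paced study time, and the mechanism underlying study-time allocation. Experiments 1a and 1b tested the effects of item difficulty and point value, respectively, on self-paced study time and found that learners tended to allocate more study time to difficult or high-value items. Experiment 2 set up two conditions: "hard 1-point items, medium 5-point items, easy 5-point items" and "hard 1-point items, medium 1-point items, easy 5-point items." In the former, self-paced study time for hard 1-point items and medium 5-point items was significantly longer than for easy 5-point items. In the latter, self-paced study time for hard 1-point items was significantly longer than for medium 1-point items and easy 5-point items. These results indicate that learners engage in a trade-off process during self-paced study.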

13.
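Cognitive diagnosis computerized adaptive testing (CD-CAT) combines cognitive diagnostic assessment with computerized adaptive testing and has the features of both. To date, research on CD-CAT has focused almost exclusively on dichotomously (0-1) scored data. In practical educational and psychological assessment, however, polytomously scored data are abundant. This study therefore examined techniques for implementing polytomous CD-CAT (PCD-CAT) and proposed two new item selection methods. A simulation study compared the new and traditional item selection methods in PCD-CAT. The results showed that under fixed-length PCD-CAT the two new methods achieved the highest pattern classification accuracy, and under variable-length PCD-CAT the two new methods also achieved the highest test efficiency.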

14.
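Item selection has long been a central issue in research on metacognitive control. Using arithmetic problems of varying difficulty and point value as materials, this study examined the psychological reality of learning rate and its effect on item selection in two experiments. In Experiment 1, under untimed conditions, participants completed arithmetic problems of varying difficulty and assigned point values to them. In Experiment 2, under timed conditions, three kinds of items with different learning rates were constructed by varying the difficulty and point value of the problems, and participants could choose only one kind of item to work on in order to earn a higher score. The results showed that (1) as the time spent on an item increased, the point value participants assigned to it increased, while the ratio of time (difficulty) to point value remained constant, that is, the learning rate was the same; and (2) participants tended to choose items with a higher learning rate, and when learning rates were equal they tended to choose high-value difficult items first. The study confirmed the psychological reality of learning rate and identified learning rate as the main basis for item selection.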
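The selection rule reported in this abstract can be sketched in a few lines. The sketch assumes that learning rate means points earned per unit of time, which is one reading of the abstract rather than the authors' stated definition; the `Item` class and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Item:
    name: str
    points: float    # value earned for solving the item
    seconds: float   # expected time needed to solve it


def learning_rate(item: Item) -> float:
    """Assumed definition: points gained per second of work."""
    return item.points / item.seconds


def choose_item(items: Sequence[Item]) -> Item:
    """Pick the item with the highest learning rate; on a tie,
    prefer the higher-value (and here harder) item."""
    return max(items, key=lambda it: (learning_rate(it), it.points))


# Hypothetical items: 'easy' and 'hard' share the same rate (0.1 points/s).
items = [
    Item("easy", points=1, seconds=10),
    Item("medium", points=2, seconds=40),
    Item("hard", points=5, seconds=50),
]
chosen = choose_item(items)
```

With these values the tie between the easy and hard items is resolved in favor of the hard, high-value item, matching the preference described in finding (2).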

15.
Three experiments are reported on the relation between children's intramodal and crossmodal visual and kinesthetic performance under conditions varying the difficulty of the input patterns. Crossmodal recognition errors exceeded intramodal errors on distance patterns with two and four constituents by ten-year-olds (Expt 1), and by 5.5-year-olds on single and double distance patterns (Expt 2). Preschoolers showed no significant crossmodal deficits in the recognition of single and double patterns (Expt 2), or in recall of single lengths presented in blocked or alternating order (Expt 3). No interactions between crossmodal errors and pattern difficulty were found. Order of presenting the patterns (Expts 1 and 2), and blocked versus alternating presentations (Expt 3) had significant effects on the relation between intramodal and crossmodal errors. The results are discussed with reference to explanations of crossmodal matching.

16.
This study investigated the relationship between semantic and episodic memory as they support lexical access by healthy younger and older adults and individuals with Alzheimer's disease (AD). In particular, we were interested in examining the pattern of semantic and episodic memory declines in AD (i.e., word-finding difficulty and impaired recent memory) vis-à-vis more preserved remote memories. We administered a picture naming task in which the episodic period of the pictures and whether the pictured items were unique to one period or commonly used across periods were varied. Groups of younger adults (N=40), healthy older adults (N=20) and older adults with AD (N=18) were asked to name drawings of objects in four conditions: dated unique, contemporary unique, dated common, and contemporary common. The results indicated that all participants named items that were common to both episodic periods more successfully than items unique to one period. An interaction was observed such that the healthy older and AD groups were more successful in retrieving names of objects presented in the dated compared to contemporary unique conditions, whereas the younger adults showed the reverse pattern. These results indicate that naming ability is affected both by the cumulative frequency of using an item over a lifetime and by when an item was first acquired. The findings support a theoretical stance which proposes an enduring reciprocal link between semantic and episodic memory. This theoretical relationship has practical implications for the development of intervention strategies when interacting with persons who have AD.

17.
Production frequency has often been used to identify central and peripheral information, under the assumption that high frequency implies that the item is central. However, no research to date has tested the relationship between centrality and frequency. Participants watched a video of a bank robbery and completed a free recall test, from which frequency for recalled items was computed. Two groups then watched the same video and rated centrality and forensic relevance for each item. Results showed that most, but not all, items with high frequency were rated as central and forensically relevant but that low frequency items were not diagnostic of either item centrality or forensic relevance. Forensic relevance was a better indicator of item centrality than frequency. We concluded that frequency measures should be avoided to determine centrality. Also, if centrality ratings cannot be collected, forensic relevance ratings may be more appropriate for this purpose.

18.
Under assumptions that will hold for the usual test situation, it is proved that test reliability and variance increase (a) as the average inter-item correlation increases, and (b) as the variance of the item difficulty distribution decreases. As the average item variance increases, the test variance will increase, but the test reliability will not be affected. (It is noted that as the average item variance increases, the average item difficulty approaches .50). In this development, no account is taken of the effect of chance success, or the possible effect on student attitude of different item difficulty distributions. In order to maximize the reliability and variance of a test, the items should have high intercorrelations, all items should be of the same difficulty level, and the level should be as near to 50% as possible. The desirability of determining this relationship has been indicated by previous writers. Work on the present paper arose out of some problems raised by Dr. Herbert S. Conrad in connection with an analysis of aptitude tests. On leave for Government war research from the Psychology Department, University of Chicago.
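Two of the relationships stated in this abstract can be checked numerically. The sketch below uses the standard Spearman-Brown form for the reliability of a test of equally weighted standardized items, which is an assumption made here for illustration and not the paper's own derivation. It also uses the variance of a dichotomous item, p(1 - p), where p is the proportion answering correctly.

```python
def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Reliability of a sum of n standardized items with average
    inter-item correlation mean_r (Spearman-Brown form)."""
    return n_items * mean_r / (1.0 + (n_items - 1) * mean_r)


def item_variance(p: float) -> float:
    """Variance of a dichotomous (0/1) item with proportion correct p."""
    return p * (1.0 - p)


# Reliability rises with the average inter-item correlation (10 items).
low_r = standardized_alpha(10, 0.2)    # about 0.71
high_r = standardized_alpha(10, 0.4)   # about 0.87

# Item variance peaks when half of the examinees answer correctly.
variances = {p: item_variance(p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
```

The second calculation shows why the average item difficulty moves toward .50 as the average item variance increases: p(1 - p) reaches its maximum of 0.25 at p = 0.5.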

19.
This study linked nonlinear profile analysis (NPA) of dichotomous responses with an existing family of item response theory models and generalized latent variable models (GLVM). The NPA method offers several benefits over previous internal profile analysis methods: (a) NPA is estimated with maximum likelihood in a GLVM framework rather than relying on the choice of different dissimilarity measures that produce different results, (b) item and person parameters are computed during the same estimation step with an appropriate distribution for dichotomous variables, (c) the model estimates profile coordinate standard errors, and (d) additional individual-level variables can be included to model relationships with the profile parameters. An application examined experimental differences in topographic map comprehension among 288 subjects. The model produced a measure of overall test performance or comprehension in addition to pattern variables that measured the correspondence between subject response profiles and an item difficulty profile and an item-discrimination profile. The findings suggested that subjects who used 3-dimensional maps tended to correctly answer more items in addition to correctly answering items that were more discriminating indicators of map comprehension. The NPA analysis was also compared with results from a multidimensional item response theory model.

20.
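Estimation of item parameters for raw items in computerized adaptive testing   Total citations: 1 (self-citations: 1, citations by others: 0)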
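Computerized adaptive testing (CAT) faces new security challenges, and small item banks are especially vulnerable. How to build a large, high-quality item bank has become a very important topic in CAT research. Current CAT item bank construction has several problems, such as high cost and poor confidentiality. In particular, equating techniques are relatively complex, and repeated use of anchor items easily leads to leakage. If items whose parameters have not yet been estimated (raw items) can be inserted during CAT administration and their parameters estimated at the same time, the value for building large, high-quality CAT item banks is self-evident. This study investigated the problem under the 1PLM and 2PLM, proposed a new method for online estimation of raw items, and derived a formula for the initial iteration value of the discrimination parameter a. The results showed that, in both simulation and empirical studies, the number of times a raw item was answered affected the item parameter estimates, and the more examinees answered a raw item, the more precise its parameter estimates were.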
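To make the idea of calibrating a raw item during CAT administration concrete, here is a minimal sketch under the 1PLM. It is plain maximum likelihood by Newton-Raphson with examinee abilities treated as known (for example, taken from their CAT ability estimates), which is an assumption made for illustration. It is not the new online estimation method proposed in the paper, and the function names are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def estimate_difficulty(thetas: Sequence[float],
                        responses: Sequence[int],
                        start: float = 0.0,
                        tol: float = 1e-10,
                        max_iter: int = 50) -> float:
    """ML estimate of a 1PLM item difficulty b from 0/1 responses,
    with examinee abilities (thetas) treated as known."""
    if len(thetas) == 0 or len(thetas) != len(responses):
        raise ValueError("thetas and responses must be non-empty and equal in length")
    total = sum(responses)
    if total == 0 or total == len(responses):
        # All-wrong or all-right patterns have no finite ML estimate.
        raise ValueError("responses must contain both 0s and 1s")
    b = start
    for _ in range(max_iter):
        probs = [sigmoid(t - b) for t in thetas]
        gradient = sum(p - u for p, u in zip(probs, responses))  # dlogL/db
        information = sum(p * (1.0 - p) for p in probs)          # -d2logL/db2
        step = gradient / information                            # Newton step
        b += step
        if abs(step) < tol:
            break
    return b


# Hypothetical data: three examinees of known ability answer one raw item.
b_hat = estimate_difficulty([-1.0, 0.0, 1.0], [0, 0, 1])
```

The information term grows with the number of examinees who answer the item, so the standard error of the estimate, roughly one over the square root of the information, shrinks as more responses accumulate. That is consistent with the abstract's finding that precision improves as more examinees answer a raw item.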
