1.

A graphical method for the rapid calculation of biserial and point biserial correlation in test research

GOHEEN HW DAVIDOFF MD 《Psychometrika》1951,16(2):239-242

2.

Exploring the relationship between new word learning and short-term memory for serial order recall,item recall,and item recognition

Steve Majerus Martine Poncelet Bruno Elsen Martial van der Linden 《Journal of Cognitive Psychology》2013,25(6):848-873

We reexplored the relationship between new word learning and verbal short-term memory (STM) capacities, by distinguishing STM for serial order information, item recall, and item recognition. STM capacities for order information were estimated via a serial order reconstruction task. A rhyme probe recognition task assessed STM for item recognition. Item recall capacities were derived from the proportion of item errors in an immediate serial recall task. In Experiment 1, strong correlations were observed between item recall and item recognition, but not between the item STM tasks and the serial order task, supporting recent theoretical positions that consider that STM for item and serial order rely on distinct capacities. Experiment 2 showed that only the serial order reconstruction task predicted independent variance in a paired associate word–nonword learning task. Our results suggest that STM capacities for serial order play a specific and causal role in learning new phonological information.

3.

Sequential Detection of Compromised Items Using Response Times in Computerized Adaptive Testing

Edison M. Choe Jinming Zhang Hua-Hua Chang 《Psychometrika》2018,83(3):650-673

Item compromise persists in undermining the integrity of testing, even secure administrations of computerized adaptive testing (CAT) with sophisticated item exposure controls. In ongoing efforts to tackle this perennial security issue in CAT, a couple of recent studies investigated sequential procedures for detecting compromised items, in which a significant increase in the proportion of correct responses for each item in the pool is monitored in real time using moving averages. In addition to actual responses, response times are valuable information with tremendous potential to reveal items that may have been leaked. Specifically, examinees who have preknowledge of an item would likely respond more quickly to it than those who do not. Therefore, the current study proposes several augmented methods for the detection of compromised items, all involving simultaneous monitoring of changes in both the proportion correct and average response time for every item using various moving average strategies. Simulation results with an operational item pool indicate that, compared to the analysis of responses alone, utilizing response times can afford marked improvements in detection power with fewer false positives.
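The monitoring idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `flag_compromised`, the fixed-size window, and the thresholds `p_margin` and `t_ratio` are all hypothetical choices made for illustration, and the baseline values `base_p` and `base_t` are assumed to come from an earlier, uncompromised calibration period.

```python
from collections import deque


def flag_compromised(responses, times, base_p, base_t,
                     window=50, p_margin=0.15, t_ratio=0.8):
    """Return the index of the first response at which the item is flagged.

    responses: iterable of 0/1 scores for one item, in administration order
    times:     matching response times (any consistent unit)
    base_p:    baseline proportion correct for the item (assumed known)
    base_t:    baseline mean response time for the item (assumed known)
    Returns None if the item is never flagged.
    """
    recent_correct = deque(maxlen=window)  # moving window of scores
    recent_times = deque(maxlen=window)    # moving window of response times
    for i, (u, t) in enumerate(zip(responses, times)):
        recent_correct.append(u)
        recent_times.append(t)
        if len(recent_correct) < window:
            continue  # wait until the window is full
        p_hat = sum(recent_correct) / window
        t_hat = sum(recent_times) / window
        # Flag only when accuracy rises AND responses get faster together.
        if p_hat > base_p + p_margin and t_hat < base_t * t_ratio:
            return i
    return None
```

Requiring both signals at once is one simple way to use response times to cut false positives; a real procedure would replace the ad hoc thresholds with a statistical test.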

4.

Bayesian item fit analysis for unidimensional item response theory models

《The British journal of mathematical and statistical psychology》2006,59(2):429-449

Assessing item fit for unidimensional item response theory models for dichotomous items has always been an issue of enormous interest, but there exists no unanimously agreed item fit diagnostic for these models, and hence there is room for further investigation of the area. This paper employs the posterior predictive model‐checking method, a popular Bayesian model‐checking tool, to examine item fit for the above‐mentioned models. An item fit plot, comparing the observed and predicted proportion‐correct scores of examinees with different raw scores, is suggested. This paper also suggests how to obtain posterior predictive p‐values (which are natural Bayesian p‐values) for the item fit statistics of Orlando and Thissen that summarize numerically the information in the above‐mentioned item fit plots. A number of simulation studies and a real data application demonstrate the effectiveness of the suggested item fit diagnostics. The suggested techniques seem to have adequate power and reasonable Type I error rate, and psychometricians will find them promising.

5.

Estimating the π* goodness of fit index for finite mixtures of item response models

Javier Revuelta 《The British journal of mathematical and statistical psychology》2008,61(1):93-113

Testing the fit of finite mixture models is a difficult task, since asymptotic results on the distribution of likelihood ratio statistics do not hold; for this reason, alternative statistics are needed. This paper applies the π* goodness of fit statistic to finite mixture item response models. The π* statistic assumes that the population is composed of two subpopulations – those that follow a parametric model and a residual group outside the model; π* is defined as the proportion of population in the residual group. The population was divided into two or more groups, or classes. Several groups followed an item response model and there was also a residual group. The paper presents maximum likelihood algorithms for estimating item parameters, the probabilities of the groups and π*. The paper also includes a simulation study on goodness of recovery for the two‐ and three‐parameter logistic models and an example with real data from a multiple choice test.
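The two-subpopulation definition in this abstract is usually written as a two-component mixture. The symbols below are assumptions, not taken from the abstract: P is the true distribution of response patterns, Φ is a distribution inside the parametric model family H, and Ψ is an unrestricted distribution for the residual group. In that notation, π* is the smallest residual proportion for which the decomposition is possible:

```latex
P = (1 - \pi)\,\Phi + \pi\,\Psi, \qquad \Phi \in \mathcal{H}, \quad 0 \le \pi \le 1,
\qquad
\pi^{*} = \inf\bigl\{\pi : P = (1 - \pi)\,\Phi + \pi\,\Psi,\ \Phi \in \mathcal{H}\bigr\}.
```

Under this reading, π* = 0 means the model fits the whole population, and larger values mean a larger share of the population cannot be described by the model.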

6.

Immediate serial recall of words and nonwords: Tests of the retrieval-based hypothesis

Saint-Aubin J Poirier M 《Psychonomic bulletin & review》2000,7(2):332-340

In two experiments, the immediate serial recall of lists of words or nonwords was investigated under quiet and articulatory suppression conditions. The results showed better item recall for words but better order recall for nonwords, as measured with proportion of order errors per item recalled. Articulatory suppression hindered the recall of item information for both types of lists and of order information for words. These results are interpreted in light of a retrieval account in which degraded phonological traces must undergo a reconstruction process calling on long-term knowledge of the to-be-remembered items. The minimal long-term representations for nonwords are thought to be responsible for their lower item recall and their better order recall. Under suppression, phonological representations are thought to be minimal, producing trace interpretation problems responsible for the greater number of item and order errors, relative to quiet conditions. The very low performance for nonwords under suppression is attributed to the combination of degraded phonological information and minimal long-term knowledge.

7.

A note on the normal ogive or logistic curve in item analysis

Frederic M. Lord 《Psychometrika》1965,30(3):371-372

It is common to assume that the proportion of correct answers to an item has a normal-ogive or logistic relationship to total test score. However, this is shown to be a mistaken and an undesirable notion.

8.

Verbal coding and the storage of form-position associations in visual-spatial short-term memory

Dent K Smyth MM 《Acta psychologica》2005,120(2):113-140

Short-term memory for form-position associations was assessed using an object relocation task. Participants attempted to remember the positions of either three or five Japanese Kanji characters, presented on a computer monitor. Following a short blank interval, participants were presented with 2 alternative Kanji, only 1 of which was present in the initial stimulus, and the set of locations occupied in the initial stimulus. They attempted to select the correct item and relocate it back to its original position. The proportion of correct item selections showed effects of both articulatory suppression and memory load. In contrast, the conditional probability of location given a correct item selection showed an effect of load but no effect of suppression. These results are consistent with the proposal that access to visual memory is aided by verbal recoding, but that there is no verbal contribution to memory for the association between form and position.

9.

Familiarity and recollection in item and associative recognition

William E. Hockley Angela Consoli 《Memory & cognition》1999,27(4):657-664

Recognition memory for item information (single words) and associative information (word pairs) was tested immediately and after retention intervals of 30 min and 1 day (Experiment 1) and 2 days and 7 days (Experiment 2) using Tulving's (1985) remember/know response procedure. Associative recognition decisions were accompanied by more "remember" responses and fewer "know" responses than item recognition decisions. Overall recognition performance and the proportion of remember responses declined at similar rates for item and associative information. The pattern of results for item recognition was consistent with Donaldson's (1996) single-factor signal detection model of remember/know responses, as comparisons based on A' between overall item recognition and remember item recognition showed no significant differences. For associative recognition, however, A' for remember responses was reliably greater than for overall recognition. The results show that recollection plays a significant role in associative recognition.

10.

The role of familiarity in item recognition, associative recognition, and plurality recognition on self-paced and speeded tests

Westerman DL 《Journal of experimental psychology. Learning, memory, and cognition》2001,27(3):723-732

Four experiments compare the effect of familiarity on item, associative, and plurality recognition on self-paced and speeded tests. The familiarity of test items was enhanced by presenting a prime that matched the subsequent test item. On item and plurality recognition tests, participants were more likely to respond "old" to primed than to unprimed test items. In associative recognition, priming increased the proportion of old responses on a speeded test, but not on a self-paced test. This suggests that familiarity plays a larger role in item and plurality recognition than in associative recognition on self-paced tests. On speeded tests, priming has a similar effect on item, associative, and plurality recognition. Results suggest that item and associative recognition rely differentially on familiarity and recollection. They are also consistent with recent evidence suggesting that different processes underlie plurality and associative recognition.

11.

Toward Increasing Fairness in Score Scale Calibrations Employed in International Large-Scale Assessments

Maria Elena Oliveri Matthias von Davier 《International Journal of Testing》2014,14(1):1-21

In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often ignore item misfit in score scale calibrations. We also seek to obtain improved model-data fit estimates when calibrating international score scales. To this end, we examine the use of two alternative score scale calibration procedures: (a) a language-based score scale and (b) a more parsimonious international scale wherein a large proportion of international parameters are used with a subset of country-based parameters for items that misfit in the international scale. In our analyses, we used data from all 40 countries participating in the Progress in International Reading Literacy Study. Our findings revealed that current score scale calibration procedures yield large numbers of misfitting items (higher than 25% for some countries). Our proposed approach diminished the effects of proportion of item misfit on score scale calibrations and also yielded enhanced model-data fit estimates. These results lead to enhancing confidence in measurements obtained from international large-scale assessments.

12.

MMPI and nightmare reports in women addicted to alcohol and other drugs

Z Z Cernovsky 《Perceptual and motor skills》1986,62(3):717-718

In a sample of 78 female alcohol and drug addicts, 24.4% marked Item 31 of the MMPI ("I have nightmares every few nights") True. This proportion is significantly higher than in the normative MMPI data for normal US Midwest women published by Coligan, in which only 8.2% marked the item True. The female alcohol and drug addicts who marked the item True differed from those responding False by higher scores on the Schizophrenia, Psychasthenia, Paranoia, Anxiety, Depression, Psychopathic Deviate, and Social Introversion scales and by lower scores on the Ego Strength scale. Nightmare sufferers consistently scored in a more pathological direction.

13.

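Application of differential item functioning in the analysis of cross-cultural personality questionnaires (cited by: 2; self-citations: 0; citations by others: 2)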

Cao Yiwei (曹亦薇) 《Acta Psychologica Sinica》2003,35(1):120-126

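Using a graded IRT model, this study examined differential item functioning (DIF) between Chinese and Japanese respondents on the "environmental sensitivity" scale of the SHIBA brief personality inventory. The findings were: (1) a large proportion of the scale's items showed DIF (3/4); (2) DIF was related to item content and thresholds but had little relation to the size of the discrimination parameter; (3) among the DIF items, the characteristic curves for the Japanese group were more coherent than those for the Chinese group. The study used the DIF results to make a new attempt at cross-cultural personality comparison. Finally, new topics for deepening DIF research are proposed.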

14.

List-level transfer effects in temporal learning: Further complications for the list-level proportion congruent effect

《Journal of Cognitive Psychology》2013,25(4):373-385

Congruency effects are larger when most trials are congruent relative to incongruent. According to the conflict adaptation account, this proportion congruent effect is due to the decreased attention to words when most of the trials are conflicting. This paper extends previous work arguing that list-level (contingency-unbiased) proportion congruent effects might be explainable by temporal learning biases. That is, congruency effects are larger in an easier task (i.e., mostly congruent) due to the faster pace of the task. Two non-conflict analogues of the proportion congruent effect are presented, one with a contrast manipulation and another with a contingency manipulation. Critically, both experiments control for potential item-specific temporal learning biases by intermixing biased context and unbiased transfer items. Results show a proportion congruent-like interaction for both item types, supporting the notion of task-wide temporal learning as an explanation for list-level proportion congruency effects. Distributional analyses lend further credence to the temporal learning account by showing that proportion congruent and proportion congruent-like effects are localised in the fastest and intermediate responses.

15.

Effective warnings in the Deese-Roediger-McDermott false-memory paradigm: the role of identifiability

Neuschatz JS Benoit GE Payne DG 《Journal of experimental psychology. Learning, memory, and cognition》2003,29(1):35-41

These experiments document that warnings can substantially reduce false memories in the Deese-Roediger-McDermott (DRM) paradigm when the critical items are easily identifiable. Participants in a norming study identified the critical item after hearing a list of words. The lists with critical items that could be identified by the largest proportion of participants (high identifiable [HI] lists) and the smallest proportion of participants (low identifiable [LI] lists) were used in the experiment. Participants heard lists of words (e.g., bed, rest, doze) related to a critical item (e.g., sleep) and were warned about the nature of the lists before the study phase. The results indicated that warnings reduced false recognition of critical items for HI lists but not LI lists.

16.

Sex Differences in Mathematics Components of the Iowa Tests of Basic Skills

Barbara S. Plake Brenda H. Loyd H. D. Hoover 《Psychology of women quarterly》1981,5(S5):780-784

The Mathematics Problem Solving (MPS) and Mathematics Concepts (MC) subtests of the Iowa Tests of Basic Skills were investigated for content and psychometric item bias at grades 3, 6, and 8. A small proportion of items were identified in each subtest which significantly favored either males or females. No skill classification, item content or location trends could be found for the mathematics subtests at each grade level. Across the grade levels, items in the MC subtest favored males for grades 3 and 6, but females were favored at grade 8. The procedure used in the study is generalizable to other groups (minority or grade levels). Test consumers have the right to know whether the test they use is fair for selected groups of students. Results from empirical investigations should appear in the Test Manual that accompanies the test battery.

17.

An experimental study of the effects on item-analysis data of changing item placement and test time limit

MOLLENKOPF WG 《Psychometrika》1950,15(3):291-315

Item-analysis data are usually obtained from a single test administration, with a given item sequence and time limit. Questions can be raised as to the effects upon item data resulting from changes in item-position and test-timing. In this study, two forms of a verbal test and two forms of a mathematics test were used. In each case, both forms of each test contained the same items, but items coming early in one form were placed late in the other. Each of these forms was administered once with a short time limit and once with generous timing to comparable groups of high school students. The relationships of various speed and power scores were determined, and the changes which occurred during the added time were studied. Values of the item indices p (proportion right), (another difficulty index), and the item-test biserial correlation coefficient were obtained for both the speed and the power conditions and were systematically compared. The proportion right of those attempting the item, the index, and the biserial r were all found to have undesirable characteristics for items appearing late in a speeded test. The author gratefully acknowledges the suggestions and criticisms of Dr. Harold Gulliksen, Research Adviser at the Educational Testing Service.
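Two of the item indices named in this abstract, p (proportion right) and the item-test biserial correlation, can be computed with standard textbook formulas, as in the minimal sketch below. It is an illustration only, not the study's procedure: the function name `item_indices` is hypothetical, the total-score standard deviation is taken as the population form, and the biserial conversion assumes a normally distributed latent trait underlying the right/wrong score.

```python
from statistics import NormalDist, mean, pstdev


def item_indices(item, total):
    """Return (p, point-biserial r, biserial r) for one dichotomous item.

    item:  list of 0/1 item scores, one per examinee
    total: list of total test scores, in the same examinee order
    Assumes at least one right and one wrong answer and a non-constant total.
    """
    p = mean(item)                 # proportion answering the item right
    q = 1 - p
    right = [t for u, t in zip(item, total) if u == 1]
    wrong = [t for u, t in zip(item, total) if u == 0]
    s = pstdev(total)              # population SD of total scores
    # Point-biserial: Pearson r between the 0/1 item and the total score.
    r_pb = (mean(right) - mean(wrong)) / s * (p * q) ** 0.5
    # Ordinate of the standard normal density at the point cutting off p.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    # Biserial: rescales the point-biserial under the normality assumption.
    r_bis = r_pb * (p * q) ** 0.5 / y
    return p, r_pb, r_bis
```

The biserial value is always larger in magnitude than the point-biserial and can exceed 1 in small or non-normal samples, which is one reason the two indices are usually reported separately.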

18.

Use of multilevel logistic regression to identify the causes of differential item functioning

Balluerka N Gorostiaga A Gómez-Benito J Hidalgo MD 《Psicothema》2010,22(4):1018-1025

Given that a key function of tests is to serve as evaluation instruments and for decision making in the fields of psychology and education, the possibility that some of their items may show differential behaviour is a major concern for psychometricians. In recent decades, important progress has been made as regards the efficacy of techniques designed to detect this differential item functioning (DIF). However, the findings are scant when it comes to explaining its causes. The present study addresses this problem from the perspective of multilevel analysis. Starting from a case study in the area of transcultural comparisons, multilevel logistic regression is used: 1) to identify the item characteristics associated with the presence of DIF; 2) to estimate the proportion of variation in the DIF coefficients that is explained by these characteristics; and 3) to evaluate alternative explanations of the DIF by comparing the explanatory power or fit of different sequential models. The comparison of these models confirmed one of the two alternatives (familiarity with the stimulus) and rejected the other (the topic area) as being a cause of differential functioning with respect to the compared groups.

19.

A minimum χ²/EM parameter estimation method in IRT

Zhu Wei (朱玮) Ding Shuliang (丁树良) Chen Xiaopan (陈小攀) 《Acta Psychologica Sinica》2006,38(3):453-460

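For the problem of estimating the unknown parameters of the two-parameter logistic model (2PLM) in IRT, a new estimation method is presented: minimum χ²/EM estimation. Taking full account of the differences between item response theory (IRT) and classical test theory (CTT), the new method improves Berkson's minimum χ² estimation from the standpoint of statistical computation. It removes the unrealistic requirement that ability parameters be known when Berkson's minimum χ² estimation is applied, which broadens its range of application. Experimental results show that the new method's ability parameter estimates are more accurate than those of BILOG and that, when the sample size exceeds 2000, its item parameter estimates are also better than BILOG's. The experiments also show that the new method is robust.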
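To make the "known ability" prerequisite concrete, the sketch below fits one 2PLM item by Berkson-style minimum logit χ² when examinees are already grouped by known ability. It is not the χ²/EM method of the paper, which is designed to avoid this requirement. The function names, the grouping scheme, and the omission of the scaling constant D = 1.7 are assumptions made for illustration.

```python
import math


def p_2pl(theta, a, b):
    """2PLM probability of a correct response (no D = 1.7 scaling)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def min_logit_chisq(thetas, n, r):
    """Estimate (a, b) for one item from groups with KNOWN ability.

    thetas: ability level of each group
    n:      number of examinees in each group
    r:      number of correct responses in each group
    Minimizes sum_j n_j p_j (1 - p_j) (logit(p_j) - a (theta_j - b))^2,
    which is a weighted least-squares line fitted to the observed logits.
    """
    xs, ls, ws = [], [], []
    for th, nj, rj in zip(thetas, n, r):
        pj = rj / nj
        if pj <= 0.0 or pj >= 1.0:
            continue  # observed logit undefined; skip this group
        xs.append(th)
        ls.append(math.log(pj / (1.0 - pj)))
        ws.append(nj * pj * (1.0 - pj))
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    lbar = sum(w * l for w, l in zip(ws, ls)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxl = sum(w * (x - xbar) * (l - lbar) for w, x, l in zip(ws, xs, ls))
    a = sxl / sxx            # slope of the logit line = discrimination
    c = lbar - a * xbar      # intercept = -a * b
    return a, -c / a
```

Because the logit of the 2PLM is linear in ability, the fit has a closed form once the abilities are given; in practice they are not, which is the limitation the abstract says the new method removes.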

20.

A comparison of item selection methods with nonstatistical constraints in cognitive diagnostic CAT

Mao Xiuzhen (毛秀珍) Xin Tao (辛涛) 《Acta Psychologica Sinica》2014,46(12):1910-1922

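Item exposure control and content constraints bear on test security and on test reliability and validity, and are two important kinds of nonstatistical constraints in computerized adaptive testing (CAT). To meet content-constraint and item exposure control requirements in cognitive diagnostic CAT, this paper selects test items using five methods: (1) the Monte Carlo method combined with the item eligibility method, denoted MC-IE; (2) the Monte Carlo method combined with the maximum priority index method, denoted MC-MPI; (3) the Monte Carlo method combined with the restrictive threshold method, denoted MC-RT; (4) the Monte Carlo method combined with the restrictive progressive method, denoted MC-RPG; and (5) the Monte Carlo method combined with the maximum posterior probability method, denoted MC-PP. Item banks were then constructed under five attribute structures (linear, convergent, divergent, unstructured, and independent), and examinee responses were simulated with the reparameterized unified model to compare the methods' item selection performance. The study found that (1) for the same item selection method, the distributions of item exposure rates were similar across attribute structures, while measurement precision decreased in the order linear, convergent, divergent, unstructured, independent; (2) within the same attribute structure, measurement precision from highest to lowest was MC-PP, MC-IE, MC-RT, MC-MPI, MC-RPG, whereas uniformity of item exposure from best to worst was MC-RPG, MC-MPI, MC-RT, MC-IE, MC-PP. Values placed on a common scale indicate that MC-RPG performed best overall, followed by MC-MPI.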