1.

A Bayesian predictive analysis of test scores

Hidetoki Ishii & Hiroshi Watanabe 《The Japanese psychological research》2001,43(1):25-36

In the classical test theory, a high-reliability test always leads to a precise measurement. However, when it comes to the prediction of test scores, it is not necessarily so. Based on a Bayesian statistical approach, we predicted the distributions of test scores for a new subject, a new test, and a new subject taking a new test. Under some reasonable conditions, the predicted means, variances, and covariances of predicted scores were obtained and investigated. We found that high test reliability did not necessarily lead to small variances or covariances. For a new subject, higher test reliability led to larger predicted variances and covariances, because high test reliability enabled a more accurate prediction of test score variances. Regarding a new subject taking a new test, in this study, higher test reliability led to a large variance when the sample size was smaller than half the number of tests. The classical test theory is reanalyzed from the viewpoint of predictions and some suggestions are made.

2.

Concurrent validity of tests to measure the coordinated exertion of force by computerized target pursuit

Nagasawa Y Demura S Kitabayashi T 《Perceptual and motor skills》2004,98(2):551-560

The purpose of this study was to examine concurrent validity of a new test for coordinated exertion of force. Coordinated exertion of force was measured using computerized target pursuit from the following viewpoints: the relations between the new test, a pursuit-rotor test, and a pegboard test. College students (24 men and 24 women) were required to change their grip exertion to match changing demand values (displayed in either a bar chart or a wave form) appearing on the display of a personal computer. The sum of the differences between the demanded values and grip-exertion values for 25 sec. was a parameter to evaluate the new test. The reliabilities of the new test, the pursuit-rotor, and the pegboard test were acceptable (ICC = .70 to .99). Scores on the new test showed low correlations with the pursuit-rotor and the pegboard test. The relation between the two different displays in the new test was significant but low (r = .49, p < .05). It was inferred that the new test measures a somewhat different ability than that measured by the pursuit-rotor and pegboard test and that the abilities tested by the types of displayed demand values are somewhat different.

3.

The Internationalization of Testing and New Models of Test Delivery on the Internet

《International Journal of Testing》2013,13(2):121-131

The Internet has opened up a whole new set of opportunities for advancing the science of psychometrics and the technology of testing. It has also created some new challenges for those of us involved in test design and testing. In particular, we are seeing impacts from internationalization of testing and new models for test delivery. These are changing the traditional balance of power between test producers, test users, test-takers, and the consumers of the results of testing. The use of a relatively new and immature technology for both low and high stakes assessment poses a number of challenges that need to be addressed. This article endeavors to clarify the role and function of test administration. It describes 4 modes of test administration and discusses the implications each has for the degree of control that can be exercised over testing. In this context, it considers issues such as security and confidentiality, including data protection and copyright; technical performance and bandwidth constraints and their impact on testing; and what constitutes "good practice" in the remote delivery of testing.

4.

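Construction of a group intelligence test for children: purpose, guidelines, and evaluation

Jin Yu 《Psychological Science (心理科学)》1994,(3)

This paper explains the reasons for choosing to construct a group-administered intelligence test for children that resembles the world-famous, individually administered Wechsler Intelligence Scale for Children; discusses the five guidelines that directed the construction of the new test and the item selection process; and reports the results of several successive factor analyses of the trial version, along with other reliability and validity checks.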

5.

The effects on the predictive variance of a new subject’s score for a new test

Hidetoshi Ishii Hiroshi Watanabe 《The Japanese psychological research》2002,44(2):113-119

It is often necessary to predict scores of interest or their variation. Ishii and Watanabe (2001) investigated, in the context of psychological measurement, the Bayesian predictive distribution of a new subject’s scores for tests and of subjects’ scores for a new test. In this paper, the Bayesian posterior predictive distribution of a new subject’s score for a new parallel test was considered, and the effects of the number of subjects, the number of tests, and the test reliability were investigated. It was found that, under the assumption that the (co)variance parameters are known, the predictive variance of a new subject’s score for a new test was equal to the predictive variances of the new subject’s scores for the existing tests. It was also found that the effect of the number of subjects was relatively large and the effect of the number of tests was relatively small when the new subject’s scores for the existing tests were not observed.

6.

Azimian-Faridani N Wilding EL 《Psychonomic bulletin & review》2004,11(5):926-931

Event-related potentials (ERPs) were recorded during a verbal recognition memory task in order to investigate whether changes in familiarity are part of the explanation for the revelation effect. For half of the test words, participants solved an anagram prior to making the old/new recognition judgment. A revelation effect was obtained: When test words were preceded by the anagram task, a higher probability of an old response was associated with the items than was otherwise the case. The ERPs recorded time-locked to the onset of the test words were separated according to old/new status and the presence/absence of the anagram task. The ERP index of familiarity was of lower amplitude for both old and new items that were preceded by the anagram task. These findings are consistent with the view that part of the explanation for the revelation effect is a reduction in the familiarity of the critical test items.

7.

The Glasgow Face Matching Test

A. Mike Burton David White Allan McNeill 《Behavior research methods》2010,42(1):286-291

We describe a new test for unfamiliar face matching, the Glasgow Face Matching Test (GFMT). Viewers are shown pairs of faces, photographed in full-face view but with different cameras, and are asked to make same/different judgments. The full version of the test comprises 168 face pairs, and we also describe a shortened version with 40 pairs. We provide normative data for these tests derived from large subject samples. We also describe associations between the GFMT and other tests of matching and memory. The new test correlates moderately with face memory but more strongly with object matching, a result that is consistent with previous research highlighting a link between object and face matching, specific to unfamiliar faces. The test is available free for scientific use.

8.

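Estimating the reliability of unidimensional tests in two-level research

Ye Baojuan, Wen Zhonglin 《Psychological Science (心理科学)》2013,36(3):728-733

In research fields such as psychology, education, and management, two-level (two-layer) data structures are common, for example students nested within classes or employees nested within companies. In two-level research, subjects are usually not independent, and estimating reliability directly with a single-level formula overestimates test reliability. Previous studies have discussed how to estimate the reliability of unidimensional tests in two-level research more accurately. This study points out the shortcomings of the existing estimation formulas, derives a new reliability formula using two-level confirmatory factor analysis, demonstrates the calculation with an example, and provides a simple computation program.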

9.

Measuring perceptual speed in complex everyday situations

Arendasy M Sommer M 《Perceptual and motor skills》2004,98(2):615-626

This paper describes first results of the development of a new test to measure perceptual speed in everyday situations. The new items are best described as images depicting typical situations in everyday life, which have some picture elements in common. The pictures were presented to the subjects for a short time, and their task was to indicate which of 5 goal stimuli were present in the respective pictures seen before. In two studies, the scalability of the new items according to the Rasch model was investigated. In Study 1 data from 316 subjects with heterogeneous educational backgrounds were used in the empirical analysis. The results of Study 1 indicate that Rasch homogeneity and thus positive psychometric properties for the new test could be obtained. This result was further validated using a second sample (N = 198). The validation study replicates the results obtained in Study 1. These results represent a substantial condition sine qua non for following studies on the construct and criterion validity of the new test.

10.

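A new framework for the measurement process: rethinking the introduction of cognitive processing models (Total citations: 1; self-citations: 0; citations by others: 1)

Sun Juan 《Psychological Exploration (心理学探新)》2002,22(2):41-45

Test design that starts from cognitive task analysis centers on explaining, from a cognitive processing perspective, the response mechanisms underlying items. Among such accounts, models that can characterize task difficulty are the most likely to be combined with psychometric models. Consequently, small-scale theories or general theories that have mature difficulty laws can enter the measurement process, making possible a shift from measurement that purely controls error structure to content-guided measurement. Within such a framework, whether a theory's understanding of the response process on a task is appropriate, and whether the relation it reveals between stimulus feature variables and task difficulty (that is, the difficulty law) is accurate, can be subjected to falsification during measurement. Validity questions that item response theory has not yet answered well may be satisfactorily resolved within this extended framework. This paper provides a schematic diagram of the new measurement process framework.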

11.

The influence of standard-opponent tests on blood androgen and corticoid levels of high- and low-ranking swordtail males (Xiphophorus helleri) before and after social isolation

Ralph-P. Hannes 《Aggressive behavior》1985,11(1):9-15

The blood androgen levels of both high- and low-ranking swordtail males show a reduction to one third of initial levels after social isolation but are returned to normal following a 20-minute exposure to a small male of the same species behind a transparent partition (standard-opponent test). Experiments to determine the cause of this effect revealed that the social contact involved in the test was not responsible, but that rather the presence of the fish in a new environment (the test-aquarium) for 20 hours itself sufficed to restore the normal androgen concentrations. The blood corticoid levels of both high- and low-ranking males are also reduced to one third of initial levels by social isolation. The normal level of this hormone was, however, restored following a standard-opponent test only in the case of the high-ranking males; the corticoid levels of the low-ranking males remained depressed. Transfer to a new environment in itself did not account for the effect on the high-ranking males. This result suggests that the pituitary-adrenal systems of high- and low-ranking males are differentially responsive to the social situation represented by the standard-opponent test.

12.

An improved portmanteau test for autocorrelated errors in interrupted time-series regression models

Huitema BE McKean JW 《Behavior research methods》2007,39(3):343-349

A new portmanteau test for autocorrelation among the errors of interrupted time-series regression models is proposed. Simulation results demonstrate that the inferential properties of the proposed Q(H-M) test statistic are considerably more satisfactory than those of the well-known Ljung-Box test and moderately better than those of the Box-Pierce test. These conclusions generally hold for a wide variety of autoregressive (AR), moving average (MA), and ARMA error processes that are associated with time-series regression models of the form described in Huitema and McKean (2000a, 2000b).
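
For orientation, the two comparison tests named in this abstract can be sketched as follows. This is a minimal illustration of the standard Box-Pierce and Ljung-Box statistics only; it does not implement the proposed Q(H-M) statistic, whose definition is not given here. The function names and the toy residual series are assumptions made for the example.

```python
def autocorr(resid, lag):
    """Lag-k sample autocorrelation of a residual series."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom


def box_pierce(resid, max_lag):
    """Box-Pierce statistic: n times the sum of squared autocorrelations."""
    n = len(resid)
    return n * sum(autocorr(resid, k) ** 2 for k in range(1, max_lag + 1))


def ljung_box(resid, max_lag):
    """Ljung-Box statistic: each squared autocorrelation is weighted by 1/(n - k)."""
    n = len(resid)
    weighted = sum(autocorr(resid, k) ** 2 / (n - k) for k in range(1, max_lag + 1))
    return n * (n + 2) * weighted


# Hypothetical residuals that alternate in sign (strong negative lag-1 autocorrelation).
residuals = [1, -1, 1, -1, 1, -1]
print(box_pierce(residuals, 1), ljung_box(residuals, 1))
```

Both statistics are usually referred to a chi-square distribution; the degrees of freedom depend on the fitted model and are left out of this sketch.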

13.

Randomization test for coupled data

Bruno D. Zumbo 《Attention, perception & psychophysics》1996,58(3):471-478

Coupled data arise in perceptual research when subjects are contributing two scores to the data pool. These two scores, it can be reasonably argued, cannot be assumed to be independent of one another; therefore, special treatment is needed when performing statistical inference. This paper shows how the Type I error rate of randomization-based inference is affected by coupled data. It is demonstrated through Monte Carlo simulation that a randomization test behaves much like its parametric counterpart except that, for the randomization test, a negative correlation results in an inflation in the Type I error rate. A new randomization test, the couplet-referenced randomization test, is developed and shown to work for sample sizes of 8 or more observations. An example is presented to demonstrate the computation and interpretation of the new randomization test.

14.

Correction of item-total correlations in item analysis

Sten Henrysson 《Psychometrika》1963,28(2):211-218

The biserial correlation between an item and the total test of which the item is a part tends to be misleadingly high when used in item analysis, since the item is included in the total test. Two formulas with correction for this overlap are derived and compared with Zubin's and Guilford's formulas. One of the new coefficients is invariant to test length.
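
The overlap problem described in this abstract can be illustrated with a small sketch. It is not Henrysson's formula: it uses the simpler, widely used item-rest correction (correlating the item with the total score minus that item) and a Pearson correlation rather than a biserial one. The function names and the score matrix are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_total_correlations(scores, item):
    """Return (uncorrected, item-rest) correlations for one item.

    scores: one list of item scores per subject.
    """
    item_scores = [row[item] for row in scores]
    totals = [sum(row) for row in scores]
    rests = [t - i for t, i in zip(totals, item_scores)]  # total without the item
    return pearson(item_scores, totals), pearson(item_scores, rests)


# Hypothetical 0/1 responses of five subjects to three items.
scores = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 0], [1, 1, 1]]
uncorrected, corrected = item_total_correlations(scores, 0)
print(uncorrected, corrected)
```

In this toy data set the uncorrected correlation is clearly positive while the item-rest correlation is essentially zero, which shows how much of the apparent relation can come from the item being part of its own total.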

15.

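Development of a single-chip microcomputer for testing children's attention and trial use of dynamic analysis indices

Liu Chenggang, Lan Gongrui, Liu Xiping, Gai Xiaosong 《Psychological Science (心理科学)》2007,30(4):929-933

Approaches to measuring children's attention include parent or teacher ratings, behavioral observation, physiological measurement, and laboratory measurement. Among laboratory methods, the continuous performance test (CPT) is the main paradigm. Traditional CPT techniques mainly use the miss rate and the false alarm rate as evaluation indices. However, because attention is not static but a highly dynamic and fluctuating process, the miss rate and false alarm rate are insufficient to reflect dynamic changes in attentional state. Recent research abroad has found that evaluating attention as a dynamic, ongoing process, with the fluctuation of children's response states during the task as a new index, provides more valuable information. On this basis, the present study developed a CPT-based single-chip microcomputer device for testing children's attention, the SCM-CPT tester, applied new dynamic analysis indices in processing the test results, tried the device with 98 ordinary primary school pupils, and established reference evaluation norms for the new indices.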

16.

General ability measurement: An application of multidimensional item response theory

Daniel O. Segall 《Psychometrika》2001,66(1):79-97

17.

Becoming a written word: Eye movements reveal order of acquisition effects following incidental exposure to new words during silent reading

Holly S.S.L. Joseph Elizabeth Wonnacott Paul Forbes Kate Nation 《Cognition》2014

We know that from mid-childhood onwards most new words are learned implicitly via reading; however, most word learning studies have taught novel items explicitly. We examined incidental word learning during reading by focusing on the well-documented finding that words which are acquired early in life are processed more quickly than those acquired later. Novel words were embedded in meaningful sentences and were presented to adult readers early (day 1) or later (day 2) during a five-day exposure phase. At test adults read the novel words in semantically neutral sentences. Participants’ eye movements were monitored throughout exposure and test. Adults also completed a surprise memory test in which they had to match each novel word with its definition. Results showed a decrease in reading times for all novel words over exposure, and significantly longer total reading times at test for early than late novel words. Early-presented novel words were also remembered better in the offline test. Our results show that order of presentation influences processing time early in the course of acquiring a new word, consistent with partial and incremental growth in knowledge occurring as a function of an individual’s experience with each word.

18.

Weakly parallel tests in latent trait theory with some criticisms of classical test theory

Fumiko Samejima 《Psychometrika》1977,42(2):193-198

A new concept of weakly parallel tests, in contrast to strongly parallel tests in latent trait theory, is proposed. Some criticisms of the fundamental concepts in classical test theory, such as the reliability of a test and the standard error of estimation, are given.

19.

Testing global memory models using ROC curves.

R Ratcliff C F Sheu S D Gronlund 《Psychological review》1992,99(3):518-535

Global memory models are evaluated by using data from recognition memory experiments. For recognition, each of the models gives a value of familiarity as the output from matching a test item against memory. The experiments provide ROC (receiver operating characteristic) curves that give information about the standard deviations of familiarity values for old and new test items in the models. The experimental results are consistent with normal distributions of familiarity (a prediction of the models). However, the results also show that the new-item familiarity standard deviation is about 0.8 that of the old-item familiarity standard deviation and independent of the strength of the old items (under the assumption of normality). The models are inconsistent with these results because they predict either nearly equal old and new standard deviations or increasing values of old standard deviation with strength. Thus, the data provide the basis for revision of current models or development of new models.
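
The link between ROC curves and the ratio of standard deviations can be made concrete with a short sketch. Under the normality assumption stated in the abstract, plotting z-transformed hit rates against z-transformed false-alarm rates gives a straight line whose slope equals the new-item standard deviation divided by the old-item standard deviation. The means, the decision criteria, and the function names below are hypothetical values chosen only for illustration; only the 0.8 ratio comes from the abstract.

```python
from statistics import NormalDist


def zroc_points(mu_old, sd_old, mu_new, sd_new, criteria):
    """Return (z(false alarm), z(hit)) pairs for a set of decision criteria."""
    old = NormalDist(mu_old, sd_old)
    new = NormalDist(mu_new, sd_new)
    std = NormalDist()  # standard normal, used for the z-transform
    points = []
    for c in criteria:
        hit = 1 - old.cdf(c)          # P("old" response | old item)
        false_alarm = 1 - new.cdf(c)  # P("old" response | new item)
        points.append((std.inv_cdf(false_alarm), std.inv_cdf(hit)))
    return points


def zroc_slope(points):
    """Slope of the line through the first and last z-ROC points."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    return (y1 - y0) / (x1 - x0)


# Hypothetical familiarity distributions with a new/old SD ratio of 0.8.
points = zroc_points(mu_old=1.0, sd_old=1.0, mu_new=0.0, sd_new=0.8,
                     criteria=[-0.5, 0.0, 0.5, 1.0])
print(zroc_slope(points))
```

A slope below 1 therefore signals that old-item familiarity is more variable than new-item familiarity, which is the pattern the abstract reports.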

20.

The multiple selection vocabulary test (MSVT-B): an accelerated intelligence test

J Merz S Lehrl V Galster H Erzigkeit 《Psychiatrie, Neurologie, und medizinische Psychologie》1975,27(7):423-428

A new accelerated intelligence test is presented which is known as the multiple selection vocabulary test B. This is a very objective and reliable test measuring the general level of intelligence. The time of testing will be about three to five minutes in the case of persons that are free of any psychiatric disorders. Test results will be only negligibly influenced by psychic and mental disorders.