Efficiently measuring recognition performance with sparse data

Authors: Lael J. Schooler, Richard M. Shiffrin

Institutions: (1) Max Planck Institute for Human Development, Center for Adaptive Behavior and Cognition, Lentzeallee 94, 14195 Berlin, Germany; (2) Indiana University, Bloomington, Indiana

Abstract: We examine methods for measuring performance in signal-detection-like tasks when each participant provides only a few observations. Monte Carlo simulations demonstrate that standard statistical techniques applied to a d′ analysis can lead to large numbers of Type I errors (incorrectly rejecting a hypothesis of no difference). Various statistical methods were compared in terms of their Type I and Type II error (incorrectly accepting a hypothesis of no difference) rates. Our conclusions are the same whether these two types of errors are weighted equally or Type I errors are weighted more heavily. The most promising method is to combine an aggregate d′ measure with a percentile bootstrap confidence interval, a computer-intensive nonparametric method of statistical inference. Researchers who prefer statistical techniques more commonly used in psychology, such as a repeated measures t test, should use γ (Goodman & Kruskal, 1954), since it performs slightly better than or nearly as well as d′. In general, when repeated measures t tests are used, γ is more conservative than d′: It makes more Type II errors, but its Type I error rate tends to be much closer to that of the traditional .05 α level. It is somewhat surprising that γ performs as well as it does, given that the simulations that generated the hypothetical data conformed completely to the d′ model. Analyses in which H − FA (hits minus false alarms) was used had the highest Type I error rates. Detailed simulation results can be downloaded from www.psychonomic.org/archive/Schooler-BRM-2004.zip.
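The recommended method — an aggregate d′ with a percentile bootstrap confidence interval — can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the log-linear correction for zero counts, the tuple layout `(hits, fas, n_signal, n_noise)`, and the choice to resample participants are assumptions made for the sketch.

```python
import random
from statistics import NormalDist


def dprime(hits, fas, n_signal, n_noise):
    """d' = z(hit rate) - z(false-alarm rate), with a log-linear
    correction so sparse counts of 0 or n never give infinite z-scores.
    (The correction is an assumed detail, common in practice.)"""
    h = (hits + 0.5) / (n_signal + 1)
    f = (fas + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf
    return z(h) - z(f)


def aggregate_dprime(participants):
    """Pool raw counts over all participants, then compute a single d'.
    Each participant is a tuple (hits, fas, n_signal, n_noise)."""
    hits = sum(p[0] for p in participants)
    fas = sum(p[1] for p in participants)
    ns = sum(p[2] for p in participants)
    nn = sum(p[3] for p in participants)
    return dprime(hits, fas, ns, nn)


def percentile_bootstrap_ci(participants, n_boot=2000, alpha=0.05, seed=0):
    """Resample participants with replacement, recompute the aggregate d'
    each time, and take the alpha/2 and 1 - alpha/2 percentiles of the
    resulting distribution as the confidence interval."""
    rng = random.Random(seed)
    boot = sorted(
        aggregate_dprime(rng.choices(participants, k=len(participants)))
        for _ in range(n_boot)
    )
    lo = boot[int(n_boot * alpha / 2)]
    hi = boot[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


# Hypothetical sparse data: five participants, 10 signal and 10 noise
# trials each, stored as (hits, false alarms, n_signal, n_noise).
data = [(8, 2, 10, 10), (7, 3, 10, 10), (9, 4, 10, 10),
        (6, 2, 10, 10), (8, 3, 10, 10)]
lo, hi = percentile_bootstrap_ci(data)
```

Pooling counts before computing d′ avoids the unstable per-participant estimates that sparse data produce, and resampling at the participant level keeps each participant's hits and false alarms paired in every bootstrap replicate.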
| |