Similar Documents
20 similar documents found (search time: 15 ms)
1.
We present a new database of lexical decision times for English words and nonwords, for which two groups of British participants each responded to 14,365 monosyllabic and disyllabic words and the same number of nonwords for a total duration of 16 h (divided over multiple sessions). This database, called the British Lexicon Project (BLP), fills an important gap between the Dutch Lexicon Project (DLP; Keuleers, Diependaele, & Brysbaert, Frontiers in Language Sciences. Psychology, 1, 174, 2010) and the English Lexicon Project (ELP; Balota et al., 2007), because it applies the repeated measures design of the DLP to the English language. The high correlation between the BLP and ELP data indicates that a high percentage of variance in lexical decision data sets is systematic variance rather than noise, and that the results of megastudies are rather robust with respect to the selection and presentation of the stimuli. Because of its design, the BLP makes possible the same analyses as the DLP, offering researchers a new and interesting data set of word-processing times for mixed-effects analyses and mathematical modeling. The BLP data are available at and as Electronic Supplementary Materials.

2.
A largely overlooked side effect in most studies of morphological priming is a consistent main effect of semantic transparency across priming conditions. That is, participants are faster at recognizing stems from transparent sets (e.g., farm) in comparison to stems from opaque sets (e.g., fruit), regardless of the preceding primes. This suggests that semantic transparency may also be consistently associated with some property of the stem word. We propose that this property might be traced back to the consistency, throughout the lexicon, between the orthographic form of a word and its meaning, here named Orthography-Semantics Consistency (OSC), and that an imbalance in OSC scores might explain the “stem transparency” effect. We exploited distributional semantic models to quantitatively characterize OSC, and tested its effect on visual word identification relying on large-scale data taken from the British Lexicon Project (BLP). Results indicated that (a) the “stem transparency” effect is solid and reliable, insofar as it holds in BLP lexical decision times (Experiment 1); (b) an imbalance in terms of OSC can account for it (Experiment 2); and (c) more generally, OSC explains variance in a large item sample from the BLP, proving to be an effective predictor in visual word access (Experiment 3).

3.
A simple stochastic model is formulated in order to determine the optimal time between the first test and the second test when the test-retest method of assessing reliability is used. A forgetting process and a change in true score process are postulated. The optimal time between tests is derived by maximizing the probability that the respondent has not remembered the response on the first test and has not had a change in true score. The resulting test-retest correlation is then found to be a linear function of the true reliability of the test, where the slope of this function is the key probability of not remembering and having no change in true score. Some numerical examples and suggestions for using the results in empirical studies are given. Specific recommendations are presented for improved design and analysis of intentions data. This research was made possible by a grant from the Center for Food Policy Research, Graduate School of Business, Columbia University, New York, New York, 10027.
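The abstract's central result can be written compactly; the notation below is ours, not the paper's:

```latex
% Observed test-retest correlation as a linear function of true reliability.
% Symbols are illustrative:
%   p           = P(respondent neither remembers the first response
%                   nor undergoes a true-score change between tests)
%   \rho_{XX'}  = true reliability of the test
\rho_{12} = p \, \rho_{XX'}
```

Because forgetting improves (the recall component of) $p$ as the interval grows while true-score stability degrades it, maximizing $p$ over the inter-test interval yields the interior optimum the paper derives.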

4.
Maximum validity of a test with equivalent items
It is assumed that a scale of true scores on a function exists and that the probability of answering an item correctly is a curve of the type of the integral of the normal curve. The product moment correlation between the test score and true score is derived for a normal distribution of subjects and a test composed of equivalent items. Numerical examples demonstrate that the maximum correlation between test scores and true scores occurs for a one hundred item test when the point correlation between items is less than three tenths.
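The item response curve described here, "the integral of the normal curve," is what later literature calls the normal-ogive model; in modern (assumed, not the paper's own) notation:

```latex
% Probability that a subject with true score t answers item i correctly,
% where \Phi is the standard normal CDF and a_i, b_i are item
% discrimination and difficulty parameters:
P_i(t) = \Phi\bigl( a_i (t - b_i) \bigr)
```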

5.
The misinformation effect is a well-established phenomenon in the false memory literature, although the mechanisms that underlie it are debated. In the present study, we explored one aspect of the controversy, the fate of the original memory. We began from an activation-based view of memory, capitalizing on the well-understood processes of associative priming and spreading activation, to test the hypothesis that true and suggested information can coexist in memory. After exposure to misinformation, participants were unknowingly primed with associates of either the true or a suggested item. Misled participants who were primed for the true item performed better on a final memory test than did misled participants primed for neutral information. The results indicated that true and suggested information coexist and that retrieval is influenced by each concept's activation level at test. Implications for theories of the misinformation effect were discussed.

6.
The Rev. is a BASIC computer program for IBM-PC-compatible systems that provides Bayesian estimates of “true” scores from multiple scores measuring the same construct. Psychological reports often include test scores from earlier evaluations without objectively incorporating them into current findings. Using Thorndike’s formulas for objectively combining test scores while providing estimates with reduced standard errors, the Rev. is an interactive program that facilitates test interpretation by combining information from many test administrations. The user provides four easily obtainable pieces of information for each test administration. The output includes an estimated “true” score and the standard error of the estimate.

7.
In the theory of test validity it is assumed that error scores on two distinct tests, a predictor and a criterion, are uncorrelated. The expected-value concept of true score in the classical test-theory model as formulated by Lord and Novick, Guttman, and others, implies mathematically, without further assumptions, that true scores and error scores are uncorrelated. This concept does not imply, however, that error scores on two arbitrary tests are uncorrelated, and an additional axiom of “experimental independence” is needed in order to obtain familiar results in the theory of test validity. The formulas derived in the present paper do not depend on this assumption and can be applied to all test scores. These more general formulas reveal some unexpected and anomalous properties of test validity and have implications for the interpretation of validity coefficients in practice. Under some conditions there is no attenuation produced by error of measurement, and the correlation between observed scores sometimes can exceed the correlation between true scores, so that the usual correction for attenuation may be inappropriate and misleading. Observed scores on two tests can be positively correlated even when true scores are negatively correlated, and the validity coefficient can exceed the index of reliability. In some cases of practical interest, the validity coefficient will decrease with increase in test length. These anomalies sometimes occur even when the correlation between error scores is quite small, and their magnitude is inversely related to test reliability. The elimination of correlated errors in practice will not enhance a test's predictive value, but will restore the properties of the validity coefficient that are familiar in the classical theory.
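For reference, the classical correction for attenuation whose breakdown this paper analyzes, in its standard textbook form (which presupposes the very assumption at issue):

```latex
% Classical correction for attenuation: the correlation between true scores
% recovered from the observed correlation and the two reliabilities.
% Valid only under "experimental independence", i.e. uncorrelated error
% scores on X and Y:
\rho_{T_X T_Y} = \frac{\rho_{XY}}{\sqrt{\rho_{XX'} \, \rho_{YY'}}}
```

When the error scores on $X$ and $Y$ are correlated, the numerator absorbs shared error variance, which is why the "corrected" value can overshoot or undershoot the true-score correlation in the ways the abstract lists.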

8.
The author compared simulations of the "true" null hypothesis (z) test, in which σ was known and fixed, with the t test, in which s, an estimate of σ, was calculated from the sample because the t test was used to emulate the "true" test. The true null hypothesis test bears exclusively on calculating the probability that a sample distance (mean) is larger than a specified value. The results showed that the value of t was sensitive to sampling fluctuations in both distance and standard error. Large values of t reflect small standard errors when n is small. The value of t achieves sensitivity primarily to distance only when the sample sizes are large. One cannot make a definitive statement about the probability or "significance" of a distance solely on the basis of the value of t.

9.
In a multiple (or multivariate) regression model where the predictors are subject to errors of measurement with a known variance-covariance structure, two-sample hypotheses are formulated for (i) equality of regressions on true scores and (ii) equality of residual variances (or covariance matrices) after regression on true scores. The hypotheses are tested using a large-sample procedure based on maximum likelihood estimators. Formulas for the test statistic are presented; these may be avoided in practice by using a general purpose computer program. The procedure has been applied to a comparison of learning in high schools using achievement test data.

10.
Escorial S, Navas MJ. Psicothema, 2006, 18(2), 319-325
The aim of this work is to analyze gender differences in the scales of a recently constructed test, the EDTC, which measures the following traits: sensation seeking, fearlessness, and impulsivity. Gender differences are studied using Differential Item Functioning (DIF) techniques, in order to determine whether these differences are true differences in the assessed dimensions or, on the contrary, a mere artefact of the measuring instrument used. The methods used to study DIF are standardization, SIBTEST, logistic regression, Lord's chi-square test, and indices based on the DFIT model. Although some items with DIF exist, the observed gender differences appear to reflect true differences in the measured personality constructs rather than being artificially produced by bias in the test items.

11.
By proposing that the latent or true nature of subjects is identified with a limited number of response patterns (the Guttman scale patterns), the probability of an observed response pattern can be written as the sum of products of the probability of the true type multiplied by the chance of sufficient response error to cause the observed pattern to appear. This model contains the proportions of the true types as parameters plus some misclassification probabilities as parameters. Maximum likelihood methods are used to make estimates and test the fit for some examples.

12.
13.
Abstract— A commonly used method for comparing groups of individuals is the analysis of variance (ANOVA) F test. When the assumptions underlying the derivation of this test are true, its power, meaning its probability of detecting true differences among the groups, competes well with all other methods that might be used. But when these assumptions are false, its power can be relatively low. Many new statistical methods have been proposed—ones that are aimed at achieving about the same amount of power when the assumptions of the F test are true but which have the potential of high power in situations where the F test performs poorly. A brief summary of some relevant issues and recent developments is provided. Some related issues are discussed and implications for future research are described.

14.
A general one-way analysis of variance components with unequal replication numbers is used to provide unbiased estimates of the true and error score variance of classical test theory. The inadequacy of the ANOVA theory is noted and the foundations for a Bayesian approach are detailed. The choice of prior distribution is discussed and a justification for the Tiao-Tan prior is found in the particular context of the “n-split” technique. The posterior distributions of reliability, error score variance, observed score variance and true score variance are presented with some extensions of the original work of Tiao and Tan. Special attention is given to simple approximations that are available in important cases and also to the problems that arise when the ANOVA estimate of true score variance is negative. Bayesian methods derived by Box and Tiao and by Lindley are studied numerically in relation to the problem of estimating true score. Each is found to be useful and the advantages and disadvantages of each are discussed and related to the classical test-theoretic methods. Finally, some general relationships between Bayesian inference and classical test theory are discussed. Supported in part by the National Institute of Child Health and Human Development under Research Grant 1 PO1 HDO1762. Reproduction, translation, use or disposal by or for the United States Government is permitted.
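A minimal sketch of the classical one-way ANOVA estimators of true- and error-score variance that this paper takes as its starting point (not its Bayesian machinery). Persons play the role of "groups" and the replicate scores per person are the "n-splits"; equal replication is assumed here for simplicity, and all names are illustrative:

```python
from statistics import mean

def variance_components(scores):
    """scores: list of per-person lists of replicate test scores
    (equal replication per person assumed in this sketch)."""
    k = len(scores[0])                       # replicates per person
    n = len(scores)                          # number of persons
    person_means = [mean(s) for s in scores]
    grand = mean(person_means)
    # within-person mean square -> estimate of error-score variance
    ms_within = mean(
        sum((x - m) ** 2 for x in s) / (k - 1)
        for s, m in zip(scores, person_means)
    )
    # between-person mean square
    ms_between = k * sum((m - grand) ** 2 for m in person_means) / (n - 1)
    error_var = ms_within
    true_var = (ms_between - ms_within) / k  # can be negative, as the paper notes
    return true_var, error_var
```

The possibility that `true_var` comes out negative is exactly the pathology of the ANOVA estimator that motivates the paper's Bayesian treatment.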

15.
The author compared simulations of the “true” null hypothesis (z) test, in which σ was known and fixed, with the t test, in which s, an estimate of σ, was calculated from the sample because the t test was used to emulate the “true” test. The true null hypothesis test bears exclusively on calculating the probability that a sample distance (mean) is larger than a specified value. The results showed that the value of t was sensitive to sampling fluctuations in both distance and standard error. Large values of t reflect small standard errors when n is small. The value of t achieves sensitivity primarily to distance only when the sample sizes are large. One cannot make a definitive statement about the probability or “significance” of a distance solely on the basis of the value of t.

16.
A Simulation Study Comparing Two Angoff Methods
A simulation study compared the accuracy and stability of cutoff scores set by two Angoff methods, the probability method and the right/wrong method. The results showed that: (1) when true ability is below the mean difficulty of the test, the probability method overestimates the cutoff score while the right/wrong method underestimates it; conversely, when true ability is above the mean test difficulty, the probability method underestimates and the right/wrong method overestimates; (2) when true ability is close to the mean test difficulty, the probability method is more accurate than the right/wrong method; conversely, when true ability is far above or below the mean test difficulty, the right/wrong method is more accurate; (3) under all experimental conditions, the probability method is more stable than the right/wrong method.

17.
The true intra‐individual change model is generalized by defining individual method effects. This allows the analysis of non‐congeneric test–retest variables assumed to measure a common, possibly (temporally) transient, attribute. Temporal change in the attribute between different times of measurement is modelled by the true‐change variable. Individual causal method effects, due to heterogeneity of the measurement methods, account for the imperfect correlation of the true‐score variables at each time of measurement. The reliability of the composite scores, at each time of measurement, and the reliability of the difference composite score may be estimated with appropriate coefficients derived from the model. Measurements of daily life tension in adult females serve to illustrate how the model can be used empirically.

18.
Thirty high- and 30 low-hypnotizable subjects saw slides of a purse snatching and then imagined seeing the slides in hypnosis or waking conditions. The experimenter suggested the offender had a moustache (true), wore a scarf (false), and picked up flowers (false). Memory was tested by the experimenter after the suggestion, by another experimenter during an inquiry session, and again by the 2nd experimenter after the experimenter appeared to end the session. Hypnotizability, but not hypnosis, was associated with false memory reports; more high-than low-hypnotizable subjects reported false memories. The context of testing influenced true and false memory reports; fewer reports occurred in an informal rather than a formal test context.

19.
When subjects give higher confidence or memory ratings to a test word in a recognition test, do they simply raise their criterion without making better discrimination, or do they raise both criterion and true discrimination between the studied words (SW) and the lures? Given that previous studies found subjects’ false alarm responses to lures slower than to SW, and recognition latency inversely correlated with the confidence rating, can the latency difference between the lures and SW be accounted for by confidence or memory ratings? The present results showed that when subjects gave higher confidence or memory ratings, both their bias and sensitivity were raised, indicating that they could consciously distinguish the lures from the SW. However, a latency difference between true and false recognitions persisted after confidence and memory ratings were held constant, suggesting an unconscious source of discrimination between the two types of memory.
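The bias/sensitivity decomposition this abstract relies on is standard equal-variance signal detection theory. A minimal sketch (standard textbook formulas, not taken from the paper) showing how sensitivity d′ and criterion c are separated from hit and false-alarm rates at a given rating level:

```python
from statistics import NormalDist

def dprime_criterion(hit_rate, fa_rate):
    """Equal-variance SDT: hit_rate = P(old | studied word),
    fa_rate = P(old | lure); rates must lie strictly in (0, 1)."""
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)    # discrimination of SW from lures
    c = -(z(hit_rate) + z(fa_rate)) / 2   # response bias (criterion)
    return d_prime, c
```

Computing d′ and c separately at each confidence level is what lets one say whether higher ratings reflect only a criterion shift or a genuine gain in sensitivity, as the abstract reports.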

20.
Bradley thought that there is a connexion between the theory of reality and the theory of truth. The theory of reality to which he subscribed, Monism, rules out a correspondence theory of truth, he thought, since it denies the existence of a plurality of facts, or things, in virtue of correspondence to which a judgment could be true. But though he rejects the correspondence theory he insists on the independence of truth from belief, wish and hope. For him the test of truth is coherence, which has two aspects, system and comprehensiveness. However, he does not think that this test yields ‘absolute’ truth. This, he maintains, for at least three different reasons, is unobtainable. Judgments can only be partially true. However, since there are degrees of truth, some judgments are closer to the truth than others, even though none are, or could be, unconditionally true.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号