Similar Literature
20 similar documents found.
1.
Three plausible assumptions of conditional independence in a hierarchical model for responses and response times on test items are identified. For each of the assumptions, a Lagrange multiplier test of the null hypothesis of conditional independence against a parametric alternative is derived. The tests have closed-form statistics that are easy to calculate from the standard estimates of the person parameters in the model. In addition, simple closed-form estimators of the parameters under the alternatives of conditional dependence are presented, which can be used to explore model modification. The tests were applied to a data set from a large-scale computerized exam and showed excellent power to detect even minor violations of conditional independence.

2.
Null hypothesis significance testing (NHST) is the most commonly used statistical methodology in psychology. The probability of achieving a value as extreme or more extreme than the statistic obtained from the data is evaluated, and if it is low enough, the null hypothesis is rejected. However, because common experimental practice often clashes with the assumptions underlying NHST, these calculated probabilities are often incorrect. Most commonly, experimenters use tests that assume that sample sizes are fixed in advance of data collection but then use the data to determine when to stop; in the limit, experimenters can use data monitoring to guarantee that the null hypothesis will be rejected. Bayesian hypothesis testing (BHT) provides a solution to these ills because the stopping rule used is irrelevant to the calculation of a Bayes factor. In addition, there are strong mathematical guarantees on the frequentist properties of BHT that are comforting for researchers concerned that stopping rules could influence the Bayes factors produced. Here, we show that these guaranteed bounds have limited scope and often do not apply in psychological research. Specifically, we quantitatively demonstrate the impact of optional stopping on the resulting Bayes factors in two common situations: (1) when the truth is a combination of the hypotheses, such as in a heterogeneous population, and (2) when a hypothesis is composite (taking multiple parameter values), such as the alternative hypothesis in a t-test. We found that, while the Bayesian interpretation remains correct regardless of the stopping rule used, in these situations the choice of stopping rule can greatly increase the chance of experimenters finding evidence in the direction they desire. We suggest ways to control these frequentist implications of stopping rules on BHT.
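To make the optional-stopping scenario concrete, here is a minimal simulation sketch in Python. It assumes a normal model with known variance and a N(0, τ²) prior on the mean under H1, so the Bayes factor has a closed form; the threshold of 3, the sample cap, and all names are illustrative choices, not the paper's actual simulation design.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def bf10(x, tau=1.0, sigma=1.0):
    """Bayes factor for H1: mu ~ N(0, tau^2) vs H0: mu = 0,
    given data x ~ N(mu, sigma^2) with sigma known."""
    n, xbar = len(x), np.mean(x)
    m1 = stats.norm.pdf(xbar, 0, np.sqrt(tau**2 + sigma**2 / n))  # marginal under H1
    m0 = stats.norm.pdf(xbar, 0, np.sqrt(sigma**2 / n))           # marginal under H0
    return m1 / m0

def run(n_max=200, threshold=3.0):
    """Optional stopping: keep sampling until BF10 > threshold (or give up).
    Data are generated under H0, so any 'success' is a misleading stop."""
    x = []
    for _ in range(n_max):
        x.append(rng.normal(0.0, 1.0))
        if len(x) >= 10 and bf10(np.array(x)) > threshold:
            return True
    return False

hits = np.mean([run() for _ in range(1000)])
print(f"P(BF10 > 3 under H0, with optional stopping): {hits:.3f}")
```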

3.
The standard textbook treatment of conventional statistical tests assumes random sampling from a population and interprets the outcome of the statistical testing as being about a population. Problems with this interpretation include that (1) experimenters rarely make any attempt to randomly sample, (2) if random sampling occurred, conventional statistical tests would not precisely describe the population, and (3) experimenters do not use statistical testing to generalize to a population. The assumption of random sampling can be replaced with the assumption that scores were produced by a process. Rejecting the null hypothesis then leads to a conclusion about process, applying to only the subjects in the experiment (e.g., that some difference in the treatment of two groups caused the difference in average scores). This interpretation avoids the problems noted and fits how statistical testing is used in psychology.

4.
When participants are asked to respond in the same way to stimuli from different sources (e.g., auditory and visual), responses are often observed to be substantially faster when both stimuli are presented simultaneously (redundancy gain). Different models account for this effect, the two most important being race models and coactivation models. Redundancy gains consistent with the race model have an upper limit, however, which is given by the well-known race model inequality (Miller, 1982). A number of statistical tests have been proposed for testing the race model inequality in single participants and groups of participants. All of these tests use the race model as the null hypothesis, and rejection of the null hypothesis is considered evidence in favor of coactivation. We introduce a statistical test in which the race model prediction is the alternative hypothesis. This test controls the Type I error if a theory predicts that the race model prediction holds in a given experimental condition.
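A minimal sketch of how Miller's race model inequality can be checked against data, assuming simulated gamma-distributed reaction times as placeholders; the formal test the abstract proposes (with the race model as the alternative) is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated reaction times (ms); the distributions are illustrative only.
rt_a  = rng.gamma(shape=9, scale=40, size=400) + 150   # auditory only
rt_v  = rng.gamma(shape=9, scale=45, size=400) + 150   # visual only
rt_av = rng.gamma(shape=9, scale=33, size=400) + 150   # redundant (both)

def ecdf(sample, t):
    """Empirical CDF of `sample` evaluated at times `t`."""
    return np.searchsorted(np.sort(sample), t, side="right") / len(sample)

# Miller (1982): F_AV(t) <= F_A(t) + F_V(t) for all t under the race model.
t_grid = np.quantile(rt_av, np.linspace(0.05, 0.95, 19))
violation = ecdf(rt_av, t_grid) - (ecdf(rt_a, t_grid) + ecdf(rt_v, t_grid))
print("max violation of the race model bound:", violation.max())
# Positive values suggest coactivation; a formal statistical test is still
# needed to decide whether the violation exceeds sampling noise.
```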

5.
Most existing tests for the significance of a difference between means require specific assumptions concerning the distribution function in the parent population. The need for a test which can be applied without making any such assumption is stressed. Such a statistical test is derived. The application of the test involves converting scores to rank orders. The exact probabilities may then be calculated for specified differences between samples, by means of which the null hypothesis may be tested. The application of the test is simple and requires a minimum of calculation. The test loses in precision because of the conversion to rank orders but gains in generality since it may be safely used with any kind of distribution. This study was started at the Iowa Child Welfare Research Station of the State University of Iowa. I should like to express my gratitude for the help I received there.
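The logic described (convert to ranks, then enumerate the exact distribution) can be sketched as follows. This is a generic exact rank-sum test assuming no ties, written for illustration; it is not the paper's original derivation.

```python
from itertools import combinations
from math import comb

def exact_rank_sum_p(x, y):
    """Exact two-sided p-value for a two-sample rank-sum test: rank the
    pooled scores, then enumerate every possible assignment of ranks to
    the first group (assumes no tied scores)."""
    pooled = sorted(x + y)
    rank_sum_x = sum(pooled.index(v) + 1 for v in x)
    n, m = len(x), len(y)
    mean_sum = n * (n + m + 1) / 2            # expected rank sum under H0
    observed_dev = abs(rank_sum_x - mean_sum)
    sums = (sum(c) for c in combinations(range(1, n + m + 1), n))
    extreme = sum(abs(s - mean_sum) >= observed_dev for s in sums)
    return extreme / comb(n + m, n)

print(exact_rank_sum_p([1.2, 3.4, 5.6], [7.1, 8.3, 9.0, 10.2]))
```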

6.
Discounting is the process by which outcomes lose value. Much of discounting research has focused on differences in the degree of discounting across various groups. This research has relied heavily on conventional null hypothesis significance tests that are familiar to psychologists, such as t-tests and ANOVAs. As discounting research questions have become more complex by simultaneously focusing on within-subject and between-group differences, conventional statistical testing is often not appropriate for the obtained data. Generalized estimating equations (GEE) are one type of mixed-effects model designed to handle autocorrelated data, such as within-subject repeated-measures data, and are therefore more appropriate for discounting data. To determine whether GEE provides similar results to conventional statistical tests, we compared the techniques across 2,000 simulated data sets. The data sets were created using a Monte Carlo method based on an existing data set. Across the simulated data sets, the GEE and the conventional statistical tests generally provided similar patterns of results. As the GEE and more conventional statistical tests provide the same pattern of results, we suggest researchers use the GEE because it was designed to handle data with the structure that is typical of discounting data.
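A hedged sketch of a GEE analysis of discounting-style data using statsmodels; the toy data, formula, working correlation structure, and all variable names are illustrative assumptions, not the study's actual analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Toy repeated-measures discounting data: each subject provides an
# indifference value at several delays; group is between-subjects.
subjects = np.repeat(np.arange(40), 5)
delays = np.tile([1, 7, 30, 90, 365], 40)
group = (subjects >= 20).astype(int)
subj_effect = rng.normal(0, 5, 40)[subjects]        # induces within-subject correlation
value = 100 - 8 * np.log(delays) - 4 * group + subj_effect + rng.normal(0, 3, 200)
df = pd.DataFrame({"subject": subjects, "group": group,
                   "log_delay": np.log(delays), "value": value})

# An exchangeable working correlation accounts for the autocorrelated
# repeated measures that a plain between-groups ANOVA would ignore.
model = smf.gee("value ~ log_delay * group", groups="subject", data=df,
                cov_struct=sm.cov_struct.Exchangeable(),
                family=sm.families.Gaussian())
print(model.fit().summary())
```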

7.
I begin by asking the meta-epistemological question, 'What is justification?', analogous to the meta-ethical question, 'What is rightness?' I introduce the possibility of non-cognitivist, naturalist, non-naturalist, and eliminativist answers in meta-epistemology, corresponding to those in meta-ethics. I devote special attention to the naturalistic hypothesis that epistemic justification is identical to probability, showing its antecedent plausibility. I argue that despite this plausibility, justification cannot be identical with probability, under the standard interpretation of the probability calculus, for the simple reason that justification can increase indefinitely but probability cannot. I then propose an alternative model for prima facie justification, based on an analogy with Ross's account of prima facie obligation, arguing that this model illuminates the differences between justification and probability and, given the plausible assumption of epistemic pluralism, explains them as well.

8.
Multilevel modeling provides one approach to synthesizing single-case experimental design data. In this study, we present multilevel models (both two-level and three-level) for summarizing single-case results over cases, over studies, or both. In addition to the basic multilevel models, we elaborate on several plausible alternative models. We apply the proposed models to real datasets and investigate to what extent the estimated treatment effect is dependent on the modeling specifications and the underlying assumptions. By considering a range of plausible models and assumptions, researchers can determine the degree to which the effect estimates and conclusions are sensitive to the specific assumptions made. If the same conclusions are reached across a range of plausible assumptions, confidence in the conclusions can be enhanced. We advise researchers not to focus on one model but conduct multiple plausible multilevel analyses and investigate whether the results depend on the modeling options.
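As an illustration, a two-level model for single-case data can be fit with a standard mixed-effects routine. The sketch below assumes toy data, independent level-1 residuals (a simplification, since single-case observations are often autocorrelated), and illustrative variable names.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Toy two-level single-case data: repeated observations within cases,
# with a baseline (phase=0) vs. treatment (phase=1) contrast per case.
rows = []
for case in range(8):
    base = rng.normal(10, 2)       # case-specific baseline level
    effect = rng.normal(-4, 1)     # case-specific treatment effect
    for t in range(20):
        phase = int(t >= 10)
        rows.append({"case": case, "phase": phase,
                     "y": base + effect * phase + rng.normal(0, 1)})
df = pd.DataFrame(rows)

# Level 1: observations within cases; level 2: cases. A random intercept
# and random treatment effect let both vary across cases, one of the
# plausible alternative specifications a sensitivity analysis would compare.
fit = smf.mixedlm("y ~ phase", df, groups="case", re_formula="~phase").fit()
print(fit.summary())
```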

9.
Evolutionary debunking arguments appeal to selective etiologies of human morality in an attempt to undermine moral realism. But is morality actually the product of evolution by natural selection? Although debunking arguments have attracted considerable attention in recent years, little of it has been devoted to whether the underlying evolutionary assumptions are credible. In this paper, we take a closer look at the evolutionary hypotheses put forward by two leading debunkers, namely Sharon Street and Richard Joyce. We raise a battery of considerations, both empirical and theoretical, that combine to cast doubt on the plausibility of both hypotheses. We also suggest that it is unlikely that there is in the vicinity a plausible alternative hypothesis suitable for the debunker's cause.

10.
The informativity of a computational model of language acquisition is directly related to how closely it approximates the actual acquisition task, sometimes referred to as the model's cognitive plausibility. We suggest that though every computational model necessarily idealizes the modeled task, an informative language acquisition model can aim to be cognitively plausible in multiple ways. We discuss these cognitive plausibility checkpoints generally and then apply them to a case study in word segmentation, investigating a promising Bayesian segmentation strategy. We incorporate cognitive plausibility by using an age-appropriate unit of perceptual representation, evaluating the model output in terms of its utility, and incorporating cognitive constraints into the inference process. Our more cognitively plausible model shows a beneficial effect of cognitive constraints on segmentation performance. One interpretation of this effect is as a synergy between the naive theories of language structure that infants may have and the cognitive constraints that limit the fidelity of their inference processes, where less accurate inference approximations are better when the underlying assumptions about how words are generated are less accurate. More generally, these results highlight the utility of incorporating cognitive plausibility more fully into computational models of language acquisition.

11.
In cognitive modeling, data are often categorical observations taken over participants and items. Usually subsets of these observations are pooled and analyzed by a cognitive model assuming the category counts come from a multinomial distribution with the same model parameters underlying all observations. It is well known that if there are individual differences in participants and/or items, a model analysis of the pooled data may be quite misleading, and in such cases it may be appropriate to augment the cognitive model with parametric random effects assumptions. On the other hand, if unneeded random effects are incorporated into a cognitive model, the resulting model may be more flexible than the multinomial model that assumes no heterogeneity, and this may lead to overfitting. This article presents Monte Carlo statistical tests for directly detecting individual participant and/or item heterogeneity that depend only on the data structure itself. These tests are based on the fact that heterogeneity in participants and/or items results in overdispersion of certain category count statistics. It is argued that the methods developed in the article should be applied to any set of participant × item categorical data prior to cognitive model-based analyses.
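One simple way to implement such a Monte Carlo heterogeneity check is sketched below, under the assumption that all participants' counts should follow a single shared multinomial if the data are homogeneous. The dispersion statistic and simulation settings are illustrative, not the article's exact tests.

```python
import numpy as np

rng = np.random.default_rng(11)

def dispersion_stat(counts):
    """Pearson dispersion of per-participant category counts around the
    pooled multinomial expectation (counts: participants x categories)."""
    n_per = counts.sum(axis=1, keepdims=True)
    p_hat = counts.sum(axis=0) / counts.sum()
    expected = n_per * p_hat
    return ((counts - expected) ** 2 / expected).sum()

def monte_carlo_p(counts, n_sim=2000):
    """Parametric-bootstrap p-value: heterogeneity across participants
    shows up as overdispersion relative to a single shared multinomial."""
    n_per = counts.sum(axis=1)
    p_hat = counts.sum(axis=0) / counts.sum()
    observed = dispersion_stat(counts)
    sims = [dispersion_stat(np.array([rng.multinomial(n, p_hat) for n in n_per]))
            for _ in range(n_sim)]
    return np.mean([s >= observed for s in sims])

# Heterogeneous participants: individual category probabilities vary.
probs = rng.dirichlet([5, 5, 5], size=30)           # one row per participant
data = np.array([rng.multinomial(60, p) for p in probs])
print("Monte Carlo p for homogeneity:", monte_carlo_p(data))
```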

12.
Taking as its sample 50 psychology journal articles applying hierarchical linear modeling (HLM) indexed in the China National Knowledge Infrastructure (CNKI) journal database between 2002 and 2011, this study conducted a bibliometric and content analysis from four angles (sample description; model development and specification; data preparation; estimation methods and hypothesis testing) to assess the current state of HLM use in Chinese psychological research. The results show that HLM is used mainly in managerial, developmental, and educational psychology, and that the great majority of applications are two-level models with large level-2 sample sizes. Despite its wide application, HLM use still exhibits problems such as neglecting tests of the model's assumptions and incomplete reporting of important information from the analysis process and of results. Four recommendations are then offered.

13.
Plausibility has been implicated as playing a critical role in many cognitive phenomena from comprehension to problem solving. Yet, across cognitive science, plausibility is usually treated as an operationalized variable or metric rather than being explained or studied in itself. This article describes a new cognitive model of plausibility, the Plausibility Analysis Model (PAM), which is aimed at modeling human plausibility judgment. This model uses commonsense knowledge of concept-coherence to determine the degree of plausibility of a target scenario. In essence, a highly plausible scenario is one that fits prior knowledge well: with many different sources of corroboration, without complexity of explanation, and with minimal conjecture. A detailed simulation of empirical plausibility findings is reported, which shows a close correspondence between the model and human judgments. In addition, a sensitivity analysis demonstrates that PAM is robust in its operations.

14.
L. V. Jones and J. W. Tukey (2000) pointed out that the usual 2-sided, equal-tails null hypothesis test at level alpha can be reinterpreted as simultaneous tests of 2 directional inequality hypotheses, each at level alpha/2, and that the maximum probability of a Type I error is alpha/2 if the truth of the null hypothesis is considered impossible. This article points out that in multiple testing with familywise error rate controlled at alpha, the directional error rate (assuming all null hypotheses are false) is greater than alpha/2 and can be arbitrarily close to alpha. Single-step, step-down, and step-up procedures are analyzed, and other error rates, including the false discovery rate, are discussed. Implications for confidence interval estimation and hypothesis testing practices are considered.
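The Jones and Tukey point in the first sentence is easy to verify by simulation. The sketch below covers only the single-test case with an essentially zero (but nonzero) effect, not the article's multiple-testing analysis; all settings are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# A two-sided z test at level alpha is two directional tests at alpha/2
# each. If the true effect is essentially zero (but not exactly zero, so
# the point null is never "true"), the probability of rejecting in the
# WRONG direction approaches alpha/2.
alpha, mu_true, n_rep = 0.05, 1e-4, 200_000
z = rng.normal(mu_true, 1.0, n_rep)            # one z statistic per replication
reject = np.abs(z) > stats.norm.ppf(1 - alpha / 2)
wrong_direction = reject & (np.sign(z) != np.sign(mu_true))
print("directional error rate:", wrong_direction.mean())   # ~ alpha/2 = 0.025
```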

15.
Bayes factor approaches for testing interval null hypotheses
Psychological theories are statements of constraint. The role of hypothesis testing in psychology is to test whether specific theoretical constraints hold in data. Bayesian statistics is well suited to the task of finding supporting evidence for constraint, because it allows for comparing evidence for 2 hypotheses against each other. One issue in hypothesis testing is that constraints may hold only approximately rather than exactly, and the reason for small deviations may be trivial or uninteresting. In the large-sample limit, these uninteresting, small deviations lead to the rejection of a useful constraint. In this article, we develop several Bayes factor 1-sample tests for the assessment of approximate equality and ordinal constraints. In these tests, the null hypothesis covers a small interval of non-0 but negligible effect sizes around 0. These Bayes factors are alternatives to previously developed Bayes factors, which do not allow for interval null hypotheses, and may especially prove useful to researchers who use statistical equivalence testing. To facilitate adoption of these Bayes factor tests, we provide easy-to-use software.
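A minimal sketch of an interval-null Bayes factor, assuming a conjugate normal model with known variance and an encompassing N(0, τ²) prior truncated to each hypothesis region. This is one standard construction for interval nulls, not necessarily the article's exact tests; the interval width and all parameters are illustrative.

```python
import numpy as np
from scipy import stats

def interval_null_bf(xbar, n, sigma=1.0, tau=1.0, half_width=0.1):
    """Bayes factor for H0: |theta| <= half_width vs H1: |theta| > half_width,
    with an encompassing prior theta ~ N(0, tau^2) and sample mean xbar from
    N(theta, sigma^2/n). Because both hypotheses use the encompassing prior
    truncated to their own region, the BF reduces to a ratio of posterior to
    prior interval probabilities."""
    post_var = 1 / (1 / tau**2 + n / sigma**2)      # conjugate normal update
    post_mean = post_var * (n * xbar / sigma**2)
    post = stats.norm(post_mean, np.sqrt(post_var))
    prior = stats.norm(0, tau)
    post_in = post.cdf(half_width) - post.cdf(-half_width)
    prior_in = prior.cdf(half_width) - prior.cdf(-half_width)
    return (post_in / prior_in) / ((1 - post_in) / (1 - prior_in))

# A tiny observed effect with a large sample: a point null would be
# rejected, but the interval null (a negligible effect) is supported.
print(interval_null_bf(xbar=0.02, n=5000))
```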

16.
This paper presents the Featural and Unitary Semantic Space (FUSS) hypothesis of the meanings of object and action words. The hypothesis, implemented in a statistical model, is based on the following assumptions: First, it is assumed that the meanings of words are grounded in conceptual featural representations, some of which are organized according to modality. Second, it is assumed that conceptual featural representations are bound into lexico-semantic representations that provide an interface between conceptual knowledge and other linguistic information (syntax and phonology). Finally, the FUSS model employs the same principles and tools for objects and actions, modeling both domains in a single semantic space. We assess the plausibility of the model by showing that it can capture generalizations presented in the literature, in particular those related to category-related deficits, and show that it can predict semantic effects in behavioral experiments for object and action words better than other models such as Latent Semantic Analysis (Landauer & Dumais, 1997) and similarity metrics derived from Wordnet (Miller & Fellbaum, 1991).

17.
Relapse is the recovery of a previously suppressed response. Animal models have been useful in examining the mechanisms underlying relapse (e.g., reinstatement, renewal, reacquisition, resurgence). However, there are several challenges to analyzing relapse data using traditional approaches. For example, null hypothesis significance testing is commonly used to determine whether relapse has occurred. However, this method requires several a priori assumptions about the data, as well as a large sample size for between-subjects comparisons or repeated testing for within-subjects comparisons. Monte Carlo methods may represent an improved analytic technique, because these methods require no prior assumptions, permit smaller sample sizes, and can be tailored to account for all of the data from an experiment instead of some limited set. In the present study, we conducted reanalyses of three studies of relapse (Berry, Sweeney, & Odum, 2014; Galizio et al., 2018; Odum & Shahan, 2004) using Monte Carlo techniques to determine if relapse occurred and if there were differences in rate of response based on relevant independent variables (such as group membership or schedule of reinforcement). These reanalyses supported the previous findings. Finally, we provide general recommendations for using Monte Carlo methods in studies of relapse.
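One simple Monte Carlo relapse check might look like the following sketch. The resampling scheme and the hypothetical response rates are illustrative assumptions only, not the reanalyses performed in the cited studies.

```python
import numpy as np

rng = np.random.default_rng(9)

def monte_carlo_relapse_test(extinction_rates, test_rates, n_sim=10_000):
    """Did responding recover at test? Resample the late-extinction response
    rates with replacement to build a Monte Carlo distribution of the mean
    'no-relapse' rate, then locate the observed test-phase mean within it."""
    ext = np.asarray(extinction_rates, dtype=float)
    observed = np.mean(test_rates)
    sims = np.array([rng.choice(ext, size=len(test_rates), replace=True).mean()
                     for _ in range(n_sim)])
    # One-sided Monte Carlo p-value, with the usual +1 correction so the
    # estimate can never be exactly zero.
    return (np.sum(sims >= observed) + 1) / (n_sim + 1)

# Hypothetical responses per minute: low in late extinction, elevated at test.
extinction = [0.4, 0.2, 0.0, 0.6, 0.3, 0.1, 0.5, 0.2]
test = [2.1, 1.8, 2.6, 1.9]
print("Monte Carlo p for relapse:", monte_carlo_relapse_test(extinction, test))
```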

18.
Three experiments investigated the malleability of perceived plausibility and the subjective likelihood of occurrence of plausible and implausible events among participants who had no recollection of experiencing them. In Experiment 1, a plausibility-enhancing manipulation (reading accounts of the occurrence of events) combined with a personalized suggestion increased the perceived plausibility of the implausible event, as well as participants' ratings of the likelihood that they had experienced it. Plausibility and likelihood ratings were uncorrelated. Subsequent studies showed that the plausibility manipulation alone was sufficient to increase likelihood ratings but only if the accounts that participants read were set in a contemporary context. These data suggest that false autobiographical beliefs can be induced in clinical and forensic contexts even for initially implausible events.

19.
We advocate for rank-permutation tests as the best choice for null-hypothesis significance testing of behavioral data, because these tests require neither distributional assumptions about the populations from which our data were drawn nor the measurement assumption that our data are measured on an interval scale. We provide an algorithm that enables exact-probability versions of such tests without recourse to either large-sample approximation or resampling approaches. We particularly consider a rank-permutation test for monotonic trend, and provide an extension of this test that allows unequal numbers of data points, or observations, for each subject. We provide an extended table of critical values of the test statistic for this test, and both a spreadsheet implementation and an Oracle® Java Web Start application to generate other critical values at https://sites.google.com/a/eastbayspecialists.co.nz/rank-permutation/.
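An exact-probability trend test in the spirit described can be sketched by enumerating permutations. The use of Spearman's rho as the trend statistic is an illustrative choice, not necessarily the authors' statistic, and full enumeration is feasible only for small n.

```python
from itertools import permutations

import numpy as np
from scipy import stats

def exact_trend_p(y):
    """Exact one-sided p-value for a monotonic increasing trend in y across
    ordered conditions: use Spearman's rho between condition order and data
    as the statistic, and enumerate all n! permutations of the data."""
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    observed = stats.spearmanr(x, y).correlation
    perms = [stats.spearmanr(x, p).correlation for p in permutations(y)]
    return np.mean([r >= observed for r in perms])

print(exact_trend_p([3.1, 3.4, 4.0, 4.2, 5.1]))   # strong increasing trend
```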

20.
Many writers have implicitly or explicitly stated that nonparametric tests are free from the assumption of homogeneity of variance. Nonparametric tests for differences in central tendency generally involve the assumption of homogeneity of variance. The assumption of homogeneity of variance serves the same purpose for the t test and for nonparametric tests: it allows the user to draw more specific inferences when the null hypothesis is rejected.
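The point is easy to probe by simulation: with equal central tendency but unequal variances and unequal sample sizes, a rank test's rejection rate for the location question typically departs from the nominal level, so rejections need not reflect central tendency at all. A hedged sketch (all settings illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Both groups have the same mean (and median), but different variances and
# unequal sample sizes. Any rejection therefore cannot reflect a difference
# in central tendency; the empirical rejection rate shows how far the test
# drifts from the nominal alpha when homogeneity of variance fails.
alpha, n_rep = 0.05, 5000
rejections = 0
for _ in range(n_rep):
    g1 = rng.normal(0.0, 3.0, size=10)    # small sample, large variance
    g2 = rng.normal(0.0, 1.0, size=40)    # large sample, small variance
    p = stats.mannwhitneyu(g1, g2, alternative="two-sided").pvalue
    rejections += p < alpha
print("rejection rate with equal central tendency:", rejections / n_rep)
```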
