Similar Literature
20 similar articles retrieved.
1.
Informative hypotheses are increasingly being used in psychological sciences because they adequately capture researchers’ theories and expectations. In the Bayesian framework, the evaluation of informative hypotheses often makes use of default Bayes factors such as the fractional Bayes factor. This paper approximates and adjusts the fractional Bayes factor such that it can be used to evaluate informative hypotheses in general statistical models. In the fractional Bayes factor, a fraction parameter must be specified which controls the amount of information in the data used for specifying an implicit prior. The remaining fraction is used for testing the informative hypotheses. We discuss different choices of this parameter and present a scheme for setting it. Furthermore, a software package is described which computes the approximated adjusted fractional Bayes factor. Using this software package, psychological researchers can easily evaluate informative hypotheses by means of Bayes factors. Two empirical examples are used to illustrate the procedure.
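As an illustration of the fraction idea in the abstract above, the sketch below computes a fractional Bayes factor by grid integration for the simplest possible case: a normal mean with known unit variance, a flat prior, and a default minimal fraction b = 1/n. All of these modelling choices are illustrative assumptions, not the article's actual method.

```python
import numpy as np

def fractional_bf(y, b=None):
    """Grid sketch of a fractional Bayes factor for H0: mu = 0 vs H1: mu free,
    with y_i ~ N(mu, 1) and a flat prior on mu. A fraction b of the likelihood
    builds the implicit prior; the remainder tests the hypotheses."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    if b is None:
        b = 1.0 / n                      # minimal fraction: one observation's worth
    mu = np.linspace(-6.0, 6.0, 12001)
    dx = mu[1] - mu[0]
    loglik = -0.5 * np.sum((y[:, None] - mu[None, :]) ** 2, axis=0)
    loglik0 = -0.5 * np.sum(y ** 2)      # log-likelihood at mu = 0

    def log_integral(ll):                # log of  int exp(ll) dmu, computed stably
        m = ll.max()
        return m + np.log(np.sum(np.exp(ll - m)) * dx)

    # fractional marginal likelihood: full likelihood over its b-th power
    log_m1 = log_integral(loglik) - log_integral(b * loglik)
    log_m0 = loglik0 - b * loglik0       # H0 has no free parameter
    return np.exp(log_m1 - log_m0)       # FBF of H1 against H0
```

A sample concentrated away from zero should favour H1, and a small sample at zero should favour H0.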

2.
In comparing characteristics of independent populations, researchers frequently expect a certain structure of the population variances. These expectations can be formulated as hypotheses with equality and/or inequality constraints on the variances. In this article, we consider the Bayes factor for testing such (in)equality-constrained hypotheses on variances. Application of Bayes factors requires specification of a prior under every hypothesis to be tested. However, specifying subjective priors for variances based on prior information is a difficult task. We therefore consider so-called automatic or default Bayes factors. These methods avoid the need for the user to specify priors by using information from the sample data. We present three automatic Bayes factors for testing variances. The first is a Bayes factor with equal priors on all variances, where the priors are specified automatically using a small share of the information in the sample data. The second is the fractional Bayes factor, where a fraction of the likelihood is used for automatic prior specification. The third is an adjustment of the fractional Bayes factor such that the parsimony of inequality-constrained hypotheses is properly taken into account. The Bayes factors are evaluated by investigating different properties such as information consistency and large sample consistency. Based on this evaluation, it is concluded that the adjusted fractional Bayes factor is generally recommendable for testing equality- and inequality-constrained hypotheses on variances.
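The encompassing logic behind inequality-constrained tests like these can be sketched with Monte Carlo draws: the Bayes factor of an order constraint against the unconstrained model is the posterior probability of the constraint divided by its prior probability. The posterior (scaled inverse chi-square from a vague prior) and the prior constraint probability of 1/2 are illustrative assumptions, not the article's exact construction.

```python
import numpy as np

def bf_variance_order(s2_1, n1, s2_2, n2, draws=200_000, seed=0):
    """Monte Carlo sketch of an encompassing-prior Bayes factor for
    H: sigma1^2 < sigma2^2 against the unconstrained model, given
    sample variances s2_* and sample sizes n*."""
    rng = np.random.default_rng(seed)
    # sigma^2 | data ~ SS / chi2(n - 1), with SS = (n - 1) * s^2 (vague prior)
    post1 = (n1 - 1) * s2_1 / rng.chisquare(n1 - 1, draws)
    post2 = (n2 - 1) * s2_2 / rng.chisquare(n2 - 1, draws)
    fit = np.mean(post1 < post2)   # posterior mass satisfying the constraint
    complexity = 0.5               # prior mass of the constraint, by symmetry
    return fit / complexity        # BF of H against the unconstrained model
```

With strongly ordered sample variances the Bayes factor approaches its maximum of 2, the reciprocal of the constraint's prior probability.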

3.
Several issues are discussed when testing inequality constrained hypotheses using a Bayesian approach. First, the complexity (or size) of the inequality constrained parameter spaces can be ignored. This is the case when using the posterior probability that the inequality constraints of a hypothesis hold, Bayes factors based on non‐informative improper priors, and partial Bayes factors based on posterior priors. Second, the Bayes factor may not be invariant for linear one‐to‐one transformations of the data. This can be observed when using balanced priors which are centred on the boundary of the constrained parameter space with a diagonal covariance structure. Third, the information paradox can be observed. When testing inequality constrained hypotheses, the information paradox occurs when the Bayes factor of an inequality constrained hypothesis against its complement converges to a constant as the evidence for the first hypothesis accumulates while keeping the sample size fixed. This paradox occurs when using Zellner's g prior as a result of too much prior shrinkage. Therefore, two new methods are proposed that avoid these issues. First, partial Bayes factors are proposed based on transformed minimal training samples. These training samples result in posterior priors that are centred on the boundary of the constrained parameter space with the same covariance structure as in the sample. Second, a g prior approach is proposed by letting g go to infinity. This is possible because the Jeffreys–Lindley paradox is not an issue when testing inequality constrained hypotheses. A simulation study indicated that the Bayes factor based on this g prior approach converges fastest to the true inequality constrained hypothesis.

4.
Analyses are mostly executed at the population level, whereas in many applications the interest is on the individual level instead of the population level. In this paper, multiple N = 1 experiments are considered, where participants perform multiple trials with a dichotomous outcome in various conditions. Expectations with respect to the performance of participants can be translated into so-called informative hypotheses. These hypotheses can be evaluated for each participant separately using Bayes factors. A Bayes factor expresses the relative evidence for two hypotheses based on the data of one individual. This paper proposes to “average” these individual Bayes factors in the gP-BF, the average relative evidence. The gP-BF can be used to determine whether one hypothesis is preferred over another for all individuals under investigation. This measure provides insight into whether the relative preference of a hypothesis from a pre-defined set is homogeneous over individuals. Two additional measures are proposed to support the interpretation of the gP-BF: the evidence rate (ER), the proportion of individual Bayes factors that support the same hypothesis as the gP-BF, and the stability rate (SR), the proportion of individual Bayes factors that express a stronger support than the gP-BF. These three statistics can be used to determine the relative support in the data for the informative hypotheses entertained. Software is available that can be used to execute the approach proposed in this paper and to determine the sensitivity of the outcomes with respect to the number of participants and within-condition replications.

5.
Recent studies have shown that many physiological and behavioral processes can be characterized by long-range correlations. The Hurst exponent H of fractal analysis and the fractional-differencing parameter d of the ARFIMA methodology are useful for capturing serial correlations. In this study, we report on different estimators of H and d implemented in R, a popular and freely available software package. By means of Monte Carlo simulations, we analyzed the performance of (1) the Geweke–Porter-Hudak estimator, (2) the approximate maximum likelihood algorithm, (3) the smoothed periodogram approach, (4) the Whittle estimator, (5) rescaled range analysis, (6) a modified periodogram, (7) Higuchi’s method, and (8) detrended fluctuation analysis. The findings, confined to ARFIMA(0, d, 0) models and fractional Gaussian noise, identify the best estimators for persistent and antipersistent series. Two examples combining these results with the step-by-step procedure proposed by Delignières et al. (2006) demonstrate how this evaluation can be used as a guideline in a typical research situation.
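Of the estimators listed, detrended fluctuation analysis is the easiest to sketch from scratch. The version below is a minimal first-order DFA (linear detrending, a small fixed set of scales); the scale choices are illustrative, not the settings evaluated in the study.

```python
import numpy as np

def dfa(x, scales=(4, 8, 16, 32, 64)):
    """Minimal first-order detrended fluctuation analysis: estimate the
    scaling exponent alpha from the slope of log F(s) against log s."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                  # integrated profile
    flucts = []
    for s in scales:
        n_seg = len(y) // s
        segments = y[: n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        f2 = []
        for seg in segments:                     # detrend each window with a line
            coef = np.polyfit(t, seg, 1)
            f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        flucts.append(np.sqrt(np.mean(f2)))
    slope, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
    return slope
```

For uncorrelated white noise the exponent should be close to 0.5, the benchmark separating antipersistent from persistent series.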

6.
When analyzing repeated measurements data, researchers often have expectations about the relations between the measurement means. The expectations can often be formalized using equality and inequality constraints between (i) the measurement means over time, (ii) the measurement means between groups, (iii) the means adjusted for time-invariant covariates, and (iv) the means adjusted for time-varying covariates. The result is a set of informative hypotheses. In this paper, the Bayes factor is used to determine which hypothesis receives most support from the data. A pivotal element in the Bayesian framework is the specification of the prior. To avoid subjective prior specification, training data in combination with restrictions on the measurement means are used to obtain so-called constrained posterior priors. A simulation study and an empirical example from developmental psychology show that this prior results in Bayes factors with desirable properties.

7.
Bayes factor approaches for testing interval null hypotheses
Psychological theories are statements of constraint. The role of hypothesis testing in psychology is to test whether specific theoretical constraints hold in data. Bayesian statistics is well suited to the task of finding supporting evidence for constraint, because it allows for comparing evidence for 2 hypotheses against each other. One issue in hypothesis testing is that constraints may hold only approximately rather than exactly, and the reason for small deviations may be trivial or uninteresting. In the large-sample limit, these uninteresting, small deviations lead to the rejection of a useful constraint. In this article, we develop several Bayes factor 1-sample tests for the assessment of approximate equality and ordinal constraints. In these tests, the null hypothesis covers a small interval of non-0 but negligible effect sizes around 0. These Bayes factors are alternatives to previously developed Bayes factors, which do not allow for interval null hypotheses, and may especially prove useful to researchers who use statistical equivalence testing. To facilitate adoption of these Bayes factor tests, we provide easy-to-use software.
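The interval-null idea can be sketched with a simple grid computation: the Bayes factor for H0: |delta| < eps against its complement is the ratio of posterior to prior odds of the interval. The Cauchy prior, the normal likelihood for the standardized mean, and the grid bounds are illustrative assumptions, not the article's specific tests.

```python
import numpy as np

def interval_null_bf(ybar, n, eps=0.1, prior_scale=0.707):
    """Grid sketch of an interval-null Bayes factor BF01 for
    H0: |delta| < eps vs H1: |delta| >= eps, with delta ~ Cauchy(0, prior_scale)
    and ybar ~ N(delta, 1/n) as the standardized sample mean."""
    delta = np.linspace(-6.0, 6.0, 12001)
    prior = 1.0 / (np.pi * prior_scale * (1 + (delta / prior_scale) ** 2))
    lik = np.exp(-0.5 * n * (ybar - delta) ** 2)
    post = prior * lik
    post /= post.sum()                   # discretized posterior of delta
    prior_n = prior / prior.sum()        # discretized prior of delta
    inside = np.abs(delta) < eps
    post_odds = post[inside].sum() / post[~inside].sum()
    prior_odds = prior_n[inside].sum() / prior_n[~inside].sum()
    return post_odds / prior_odds        # BF01 for the interval null
```

Data near zero should support the interval null, and a large standardized mean should support the complement.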

8.
A sizeable literature exists on the use of frequentist power analysis in the null-hypothesis significance testing (NHST) paradigm to facilitate the design of informative experiments. In contrast, there is almost no literature that discusses the design of experiments when Bayes factors (BFs) are used as a measure of evidence. Here we explore Bayes Factor Design Analysis (BFDA) as a useful tool to design studies for maximum efficiency and informativeness. We elaborate on three possible BF designs, (a) a fixed-n design, (b) an open-ended Sequential Bayes Factor (SBF) design, where researchers can test after each participant and can stop data collection whenever there is strong evidence for either \(\mathcal {H}_{1}\) or \(\mathcal {H}_{0}\), and (c) a modified SBF design that defines a maximal sample size where data collection is stopped regardless of the current state of evidence. We demonstrate how the properties of each design (i.e., expected strength of evidence, expected sample size, expected probability of misleading evidence, expected probability of weak evidence) can be evaluated using Monte Carlo simulations and equip researchers with the necessary information to compute their own Bayesian design analyses.
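The Monte Carlo evaluation of a modified SBF design (stop at an evidence bound or at a maximal n) can be sketched with a toy normal-mean Bayes factor. The model, prior scale, bound of 10, and maximal sample size are all illustrative assumptions for demonstration only.

```python
import numpy as np

def bf10(ybar, n, tau=1.0):
    """Analytic BF10 for H1: delta ~ N(0, tau^2) vs H0: delta = 0,
    with y_i ~ N(delta, 1); the sample mean ybar is sufficient."""
    v0, v1 = 1.0 / n, tau ** 2 + 1.0 / n
    return np.sqrt(v0 / v1) * np.exp(0.5 * ybar ** 2 * (1 / v0 - 1 / v1))

def sbf_design(delta, bound=10.0, n_max=500, reps=500, seed=0):
    """Monte Carlo sketch of a modified sequential Bayes factor design:
    test after each observation, stop at BF > bound, BF < 1/bound, or n_max.
    Returns the expected stopping n and the rate of stopping for H1."""
    rng = np.random.default_rng(seed)
    stops, hits_h1 = [], 0
    for _ in range(reps):
        total = 0.0
        for n in range(1, n_max + 1):
            total += rng.normal(delta, 1.0)
            bf = bf10(total / n, n)
            if bf > bound or bf < 1 / bound:
                break
        stops.append(n)
        hits_h1 += bf > bound
    return np.mean(stops), hits_h1 / reps
```

Under a true effect the design should usually stop for H1; under the null, the rate of misleading strong evidence for H1 stays low.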

9.
The Bayes factor is an intuitive and principled model selection tool from Bayesian statistics. The Bayes factor quantifies the relative likelihood of the observed data under two competing models, and as such, it measures the evidence that the data provides for one model versus the other. Unfortunately, computation of the Bayes factor often requires sampling-based procedures that are not trivial to implement. In this tutorial, we explain and illustrate the use of one such procedure, known as the product space method (Carlin & Chib, 1995). This is a transdimensional Markov chain Monte Carlo method requiring the construction of a “supermodel” encompassing the models under consideration. A model index measures the proportion of times that either model is visited to account for the observed data. This proportion can then be transformed to yield a Bayes factor. We discuss the theory behind the product space method and illustrate, by means of applied examples from psychological research, how the method can be implemented in practice.
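A minimal product-space sampler can be written for the toy comparison M0: theta = 0 vs M1: theta ~ N(0, 1) with y_i ~ N(theta, 1) and equal prior model probabilities. The pseudoprior for theta while M0 is active is set to the analytically known M1 posterior, a standard tuning choice; because of that choice, the theta update is the same draw under either model. This is a sketch of the general idea, not the tutorial's own examples.

```python
import numpy as np

def product_space_bf(y, iters=20_000, seed=0):
    """Product space (Carlin & Chib) sketch: sample (model index, theta)
    jointly and estimate BF10 from the model-index visit proportions."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, s = len(y), float(np.sum(y))
    m, v = s / (n + 1), 1.0 / (n + 1)           # conjugate M1 posterior of theta

    def loglik(theta):                          # up to a constant shared by models
        return -0.5 * np.sum((y - theta) ** 2)

    visits = [0, 0]
    for _ in range(iters):
        # Gibbs draw for theta (M1 posterior) = pseudoprior draw (M0)
        theta = rng.normal(m, np.sqrt(v))
        # full conditional of the model index given theta
        log_w1 = loglik(theta) - 0.5 * theta ** 2 - 0.5 * np.log(2 * np.pi)
        log_w0 = (loglik(0.0) - 0.5 * (theta - m) ** 2 / v
                  - 0.5 * np.log(2 * np.pi * v))
        p1 = 1.0 / (1.0 + np.exp(log_w0 - log_w1))
        k = int(rng.random() < p1)
        visits[k] += 1
    return visits[1] / max(visits[0], 1)        # estimated BF10
```

For five observations at zero the analytic BF10 is 1/sqrt(6), about 0.41, which the visit-count estimate should reproduce up to Monte Carlo error.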

10.
The standard univariate and multivariate methods are conventionally used to analyze continuous data from groups-by-trials repeated measures designs, in spite of being extremely sensitive to departures from the multisample sphericity assumption when group sizes are unequal. However, in the last 10 years several authors have offered alternative solutions to these tests that do not rest on this assumption. In an attempt to improve the precision of the Brown–Forsythe (BF) procedure, a new approximate degrees of freedom (df) approach is presented in this article. Unlike the BF test, the new method not only ensures that the df will always be positive but also provides invariant solutions under linear transformations of the data. Monte Carlo methods are used to compare the new solution, in terms of control of Type I error rates, with the modified empirical generalized least squares and BF methods. Our extensive numerical studies show that the modified BF procedure outperformed the other two methods for a wide range of conditions.

11.
In recent years, statisticians and psychologists have provided the critique that p-values do not capture the evidence afforded by data and are, consequently, ill suited for analysis in scientific endeavors. The issue is particularly salient in the assessment of the recent evidence provided for ESP by Bem (2011) in the mainstream Journal of Personality and Social Psychology. Wagenmakers, Wetzels, Borsboom, and van der Maas (Journal of Personality and Social Psychology, 100, 426-432, 2011) have provided an alternative Bayes factor assessment of Bem's data, but their assessment was limited to examining each experiment in isolation. We show here that the variant of the Bayes factor employed by Wagenmakers et al. is inappropriate for making assessments across multiple experiments, and cannot be used to gain an accurate assessment of the total evidence in Bem's data. We develop a meta-analytic Bayes factor that describes how researchers should update their prior beliefs about the odds of hypotheses in light of data across several experiments. We find the evidence that people can feel the future with neutral and erotic stimuli to be slight, with Bayes factors of 3.23 and 1.57, respectively. There is some evidence, however, for the hypothesis that people can feel the future with emotionally valenced nonerotic stimuli, with a Bayes factor of about 40. Although this value is certainly noteworthy, we believe it is orders of magnitude lower than what is required to overcome appropriate skepticism of ESP.
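The core of a meta-analytic Bayes factor, as opposed to multiplying per-experiment Bayes factors, is that a single common effect is assumed to underlie all experiments, and the data are pooled through one joint likelihood. The sketch below uses a normal approximation for each experiment's effect estimate and a Cauchy prior on the common effect; both are illustrative assumptions, not the authors' exact model.

```python
import numpy as np

def meta_analytic_bf(d, se, prior_scale=1.0):
    """Grid sketch of a meta-analytic BF10: one common effect delta across
    experiments, each summarized by an estimate d[i] with standard error se[i]
    (normal approximation), with delta ~ Cauchy(0, prior_scale) under H1."""
    d, se = np.asarray(d, dtype=float), np.asarray(se, dtype=float)
    delta = np.linspace(-5.0, 5.0, 10001)
    dx = delta[1] - delta[0]
    prior = 1.0 / (np.pi * prior_scale * (1 + (delta / prior_scale) ** 2))
    # joint log-likelihood of all estimates for each grid value of delta
    ll = -0.5 * np.sum(((d[:, None] - delta[None, :]) / se[:, None]) ** 2, axis=0)
    ll0 = -0.5 * np.sum((d / se) ** 2)           # joint log-likelihood under delta = 0
    m = ll.max()
    log_m1 = m + np.log(np.sum(np.exp(ll - m) * prior) * dx)
    return np.exp(log_m1 - ll0)                  # BF10: common effect vs null
```

Three individually weak but consistent effects can yield strong pooled evidence, while three null results yield evidence for the null.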

12.
This study examines the hypotheses that (1) 17 domains of general knowledge can be identified; (2) these are positively intercorrelated and form a general factor of general knowledge; (3) there are sex differences in the different domains of general knowledge; and (4) males have more general knowledge in more of these domains than females and in the general factor. The study tests these hypotheses on a sample of 302 German high school students. All the hypotheses were confirmed. All the domains of general knowledge were positively intercorrelated. A general factor was found that explained 31.3% of the variance. Males achieved significantly and substantially higher scores than females in general knowledge (d = 0.60). The only area in which females scored significantly higher than males was Nutrition, with a medium effect size (d = 0.50). The results are highly similar to those among university students in Northern Ireland reported by Lynn, Irwing, and Cammock (2002).

13.
In intervention studies having multiple outcomes, researchers often use a series of univariate tests (e.g., ANOVAs) to assess group mean differences. Previous research found that this approach properly controls Type I error and generally provides greater power compared to MANOVA, especially under realistic effect size and correlation combinations. However, when group differences are assessed for a specific outcome, these procedures are strictly univariate and do not consider the outcome correlations, which may be problematic with missing outcome data. Linear mixed or multivariate multilevel models (MVMMs), implemented with maximum likelihood estimation, present an alternative analysis option where outcome correlations are taken into account when specific group mean differences are estimated. In this study, we use simulation methods to compare the performance of separate independent samples t tests estimated with ordinary least squares and analogous t tests from MVMMs to assess two-group mean differences with multiple outcomes under small sample and missingness conditions. Study results indicated that an MVMM implemented with restricted maximum likelihood estimation combined with the Kenward–Roger correction had the best performance. Therefore, for intervention studies with small N and normally distributed multivariate outcomes, the Kenward–Roger procedure is recommended over traditional methods and conventional MVMM analyses, particularly with incomplete data.

14.
When bivariate normality is violated, the default confidence interval of the Pearson correlation can be inaccurate. Two new methods were developed based on the asymptotic sampling distribution of Fisher's z′ under the general case where bivariate normality need not be assumed. In Monte Carlo simulations, the most successful of these methods relied on the Vale and Maurelli (1983, Psychometrika, 48, 465) family to approximate a distribution via the marginal skewness and kurtosis of the sample data. In Simulation 1, this method provided more accurate confidence intervals of the correlation in non-normal data, at least as compared to no adjustment of the Fisher z′ interval, or to adjustment via the sample joint moments. In Simulation 2, this approximate distribution method performed favourably relative to common non-parametric bootstrap methods, but its performance was mixed relative to an observed imposed bootstrap and two other robust methods (PM1 and HC4). No method was completely satisfactory. An advantage of the approximate distribution method, though, is that it can be implemented even without access to raw data if sample skewness and kurtosis are reported, making the method particularly useful for meta-analysis. Supporting information includes R code.
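For reference, the default interval whose accuracy under non-normality the study examines is the textbook Fisher z′ confidence interval, sketched below (this is the standard construction, not the article's adjusted methods).

```python
import math
from statistics import NormalDist

def pearson_ci(r, n, conf=0.95):
    """Textbook Fisher z' confidence interval for a Pearson correlation:
    transform r, build a normal interval with SE 1/sqrt(n - 3), back-transform."""
    z = math.atanh(r)                            # Fisher z' transform
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf((1 + conf) / 2)  # normal critical value
    return math.tanh(z - crit * se), math.tanh(z + crit * se)
```

For r = 0.5 with n = 100 this gives roughly (0.34, 0.63); the interval is asymmetric around r because of the back-transformation.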

15.
Null hypothesis significance testing (NHST) is the most commonly used statistical methodology in psychology. The probability of achieving a value as extreme or more extreme than the statistic obtained from the data is evaluated, and if it is low enough, the null hypothesis is rejected. However, because common experimental practice often clashes with the assumptions underlying NHST, these calculated probabilities are often incorrect. Most commonly, experimenters use tests that assume that sample sizes are fixed in advance of data collection but then use the data to determine when to stop; in the limit, experimenters can use data monitoring to guarantee that the null hypothesis will be rejected. Bayesian hypothesis testing (BHT) provides a solution to these ills because the stopping rule used is irrelevant to the calculation of a Bayes factor. In addition, there are strong mathematical guarantees on the frequentist properties of BHT that are comforting for researchers concerned that stopping rules could influence the Bayes factors produced. Here, we show that these guaranteed bounds have limited scope and often do not apply in psychological research. Specifically, we quantitatively demonstrate the impact of optional stopping on the resulting Bayes factors in two common situations: (1) when the truth is a combination of the hypotheses, such as in a heterogeneous population, and (2) when a hypothesis is composite—taking multiple parameter values—such as the alternative hypothesis in a t-test. We found that, for these situations, while the Bayesian interpretation remains correct regardless of the stopping rule used, the choice of stopping rule can, in some situations, greatly increase the chance of experimenters finding evidence in the direction they desire. We suggest ways to control these frequentist implications of stopping rules on BHT.

16.
Two expectations of the adjusted Rand index (ARI) are compared. It is shown that the expectation derived by Morey and Agresti (1984, Educational and Psychological Measurement, 44, 33) under the multinomial distribution to approximate the exact expectation from the hypergeometric distribution (Hubert & Arabie, 1985, Journal of Classification, 2, 193) provides a poor approximation, and, in some cases, the difference between the two expectations can increase with the sample size. Proofs concerning the minimum and maximum difference between the two expectations are provided, and it is shown through simulation that the ARI can differ significantly depending on which expectation is used. Furthermore, when compared in a hypothesis testing framework, the multinomial approximation overly favours the null hypothesis.
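For concreteness, the ARI with the exact hypergeometric expectation of Hubert and Arabie (the quantity the multinomial formula only approximates) can be computed directly from the contingency counts of two partitions:

```python
from math import comb
from collections import Counter

def adjusted_rand(u, v):
    """ARI with the exact hypergeometric expectation (Hubert & Arabie):
    (index - expected index) / (max index - expected index)."""
    n = len(u)
    cont = Counter(zip(u, v))                        # contingency cell counts
    a, b = Counter(u), Counter(v)                    # row and column totals
    idx = sum(comb(c, 2) for c in cont.values())
    sa = sum(comb(c, 2) for c in a.values())
    sb = sum(comb(c, 2) for c in b.values())
    expected = sa * sb / comb(n, 2)                  # hypergeometric expectation
    maximum = (sa + sb) / 2
    return (idx - expected) / (maximum - expected)
```

Identical partitions score exactly 1, and partitions that agree less than chance score below 0.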

17.
Psychological theories often produce hypotheses that pertain to individual differences in within-person variability. To empirically test the predictions entailed by such hypotheses with longitudinal data, researchers often use multilevel approaches that allow them to model between-person differences in the mean level of a certain variable and the residual within-person variance. Currently, these approaches can be applied only when the data stem from a single variable. However, it is common practice in psychology to assess not just a single measure but rather several measures of a construct. In this paper we describe a model in which we combine the single-indicator model with confirmatory factor analysis. The new model allows individual differences in latent mean-level factors and latent within-person variability factors to be estimated. Furthermore, we show how the model's parameters can be estimated with a maximum likelihood estimator, and we illustrate the approach using an example that involves intensive longitudinal data.

18.
A composite step‐down procedure, in which a set of step‐down tests are summarized collectively with Fisher's combination statistic, was considered to test for multivariate mean equality in two‐group designs. An approximate degrees of freedom (ADF) composite procedure based on trimmed/Winsorized estimators and a non‐pooled estimate of error variance is proposed, and compared to a composite procedure based on trimmed/Winsorized estimators and a pooled estimate of error variance. The step‐down procedures were also compared to Hotelling's T2 and Johansen's ADF global procedure based on trimmed estimators in a simulation study. Type I error rates of the pooled step‐down procedure were sensitive to covariance heterogeneity in unbalanced designs; error rates were similar to those of Hotelling's T2 across all of the investigated conditions. Type I error rates of the ADF composite step‐down procedure were insensitive to covariance heterogeneity and less sensitive to the number of dependent variables when sample size was small than error rates of Johansen's test. The ADF composite step‐down procedure is recommended for testing hypotheses of mean equality in two‐group designs except when the data are sampled from populations with different degrees of multivariate skewness.

19.
In many situations it is desirable or necessary to administer a set of tests to several different groups, and to ask if the results obtained in the different groups may be regarded as being essentially the same in some sense. In the case of two variables (one dependent and one independent) one may, for instance, ask if the errors of estimate and the regression lines may be regarded as being the same for the populations from which the different groups are drawn. For this case, the present article considers tests for three hypotheses regarding the populations from which the different groups are drawn: (a) H_A, the hypothesis that all standard errors of estimate are equal; (b) H_B, the hypothesis that all regression lines are parallel (assuming H_A); and (c) H_C, the hypothesis that the regression lines are identical (assuming H_B). Test criteria for these three hypotheses and their sampling theory for large samples are presented. The results are extended to the case of several independent variables. An illustrative problem is presented for two groups, two independent and one dependent variable.
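The parallelism hypothesis H_B has a familiar modern analogue: compare a separate-slopes least-squares fit against a common-slope (separate intercepts) fit with an F statistic. The sketch below is that ANCOVA-style analogue, offered as an illustration of the idea rather than the article's own large-sample criteria.

```python
import numpy as np

def parallel_slopes_F(groups):
    """F statistic for H_B-style parallelism: groups is a list of (x, y)
    pairs; compares separate-slopes vs common-slope least-squares fits."""
    sse_full, sxx_tot, sxy_tot, centered = 0.0, 0.0, 0.0, []
    for x, y in groups:
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        xc, yc = x - x.mean(), y - y.mean()
        sxx, sxy = np.sum(xc * xc), np.sum(xc * yc)
        b = sxy / sxx                                  # per-group slope
        sse_full += np.sum((yc - b * xc) ** 2)
        sxx_tot += sxx
        sxy_tot += sxy
        centered.append((xc, yc))
    b_common = sxy_tot / sxx_tot                       # pooled common slope
    sse_red = sum(np.sum((yc - b_common * xc) ** 2) for xc, yc in centered)
    g = len(groups)
    n_tot = sum(len(xc) for xc, _ in centered)
    df1, df2 = g - 1, n_tot - 2 * g
    return ((sse_red - sse_full) / df1) / (sse_full / df2)
```

Groups sharing a slope should give a small F, while clearly different slopes inflate it.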

20.
Longitudinal studies are the gold standard for research on time-dependent phenomena in the social sciences. However, they often entail high costs due to multiple measurement occasions and a long overall study duration. It is therefore useful to optimize these design factors while maintaining a high informativeness of the design. Von Oertzen and Brandmaier (2013, Psychology and Aging, 28, 414) applied power equivalence to show that Latent Growth Curve Models (LGCMs) with different design factors can have the same power for likelihood-ratio tests on the latent structure. In this paper, we show that the notion of power equivalence can be extended to Bayesian hypothesis tests of the latent structure constants. Specifically, we show that the results of a Bayes factor design analysis (BFDA; Schönbrodt & Wagenmakers, 2018, Psychonomic Bulletin and Review, 25, 128) of two power equivalent LGCMs are equivalent. This will be useful for researchers who aim to plan for compelling evidence instead of frequentist power and provides a contribution towards more efficient procedures for BFDA.
