Single-subject and statistical inference are virtually identical. With both techniques change is inferred when variability across conditions is sufficiently large to accommodate variability within conditions, replication is the final arbiter of whether change is likely to occur by chance, a large effect size is preferred to a small consistent difference, there are similar threats to internal validity, and generalizability of results is valued. Knowing how to use statistical inferential procedures would make behavior analysts more methodologically sophisticated. It would also help them to critically evaluate research in other areas of psychology, obtain research grants, and publish their research in diverse outlets, which would help others to see behavior-analytic work.

Statistical inference promises automatic, objective, reliable assessments of data, independent of the skills or biases of the investigator, whereas the single-subject methods favored by behavior analysts often are said to rely too much on the investigator's subjective impressions, particularly in the visual analysis of data. In fact, conventional statistical methods are difficult to apply correctly, even by experts, and the underlying logic of null-hypothesis testing has drawn criticism since its inception. By comparison, single-subject methods foster direct, continuous interaction between investigator and subject and development of strong forms of experimental control that obviate the need for statistical inference. Treatment effects are demonstrated in experimental designs that incorporate replication within and between subjects, and the visual analysis of data is adequate when integrated into such designs. Thus, single-subject methods are ideal for shaping, and maintaining, the kind of experimental practices that will ensure the continued success of behavior analysis.

Significance testing plays a prominent role in behavioral science, but its value is frequently overestimated. It does not estimate the reliability of a finding, it does not yield a probability that results are due to chance, nor does it usually answer an important question. In behavioral science it can limit the reasons for doing experiments, reduce scientific responsibility, and emphasize population parameters at the expense of behavior. It can, and usually does, lead to a poor approach to theory testing, and it can also, in behavior-analytic experiments, discount reliability of data. At best, statistical significance is an ancillary aspect of a set of data, and therefore should play a relatively minor role in advancing a science of behavior.

For any given number of factors, Minimum Rank Factor Analysis (MRFA) yields optimal communalities for an observed covariance matrix in the sense that the unexplained common variance with that number of factors is minimized, subject to the constraint that both the diagonal matrix of unique variances and the observed covariance matrix minus that diagonal matrix are positive semidefinite. As a result, it becomes possible to distinguish the explained common variance from the total common variance. The percentage of explained common variance is similar in meaning to the percentage of explained observed variance in Principal Component Analysis, but typically the former is much closer to 100 than the latter. So far, no statistical theory of MRFA has been developed. The present paper is a first step. It yields closed-form expressions for the asymptotic bias of the explained common variance, or, more precisely, of the unexplained common variance, under the assumption of multivariate normality. The asymptotic variance of this bias is also derived, as is the asymptotic covariance matrix of the unique variances that define an MRFA solution. The presented asymptotic statistical inference is based on a recently developed perturbation theory of semidefinite programming. A numerical example is also offered to demonstrate the accuracy of the expressions. This work was supported, in part, by grant DMS-0073770 from the National Science Foundation.

Statistical significance, by itself, is not a sufficient condition for claiming that a hypothesis has been supported. Constructive replications are considerably more important. Unfortunately, classical (Fisherian) statistics are not easily adapted to sequential research strategies; their focus is the single experiment. For this reason, statistically significant results may be meaningless while a particular sequence of nonsignificant results may be quite important. Advice on how to overcome some limitations of classical statistical procedures is given, along with a compendium of “do's and don't's.”

Statistical inference: learning in artificial neural networks
Artificial neural networks (ANNs) are widely used to model low-level neural activities and high-level cognitive functions. In this article, we review the applications of statistical inference for learning in ANNs. Statistical inference provides an objective way to derive learning algorithms both for training and for evaluation of the performance of trained ANNs. Solutions to the over-fitting problem by model-selection methods, based either on conventional statistical approaches or on a Bayesian approach, are discussed. The use of supervised and unsupervised learning algorithms for ANNs is reviewed. Training a multilayer ANN by supervised learning is equivalent to nonlinear regression. The ensemble methods described here, bagging and arching, can be applied to combine ANNs to form a new predictor with improved performance. Unsupervised learning algorithms that are derived either by the Hebbian law for bottom-up self-organization or by global objective functions for top-down self-organization are also discussed.
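
The bagging idea mentioned in this abstract can be sketched roughly as follows. This is only an illustration, not the article's procedure: a one-variable least-squares line stands in for an ANN as the base learner, and the names `fit_line` and `bagged_predict`, the number of models, and the seed are all hypothetical choices made here.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for training one network."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to the mean of y.
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bagged_predict(xs, ys, x_new, n_models=50, seed=0):
    """Bootstrap aggregation: fit each model on a resample, average predictions."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        # Draw n indices with replacement to form one bootstrap sample.
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    return sum(predictions) / n_models
```

Averaging over resampled fits mainly reduces the variance of an unstable base learner, which is why the technique is usually paired with flexible models such as ANNs rather than with a straight line.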

This paper discusses similarities between the mathematization of operant behavior and the early history of the most mathematical of sciences, physics. Galileo explored the properties of motion without dealing with the causes of motion, focusing on changes in motion. Newton's dynamics were concerned with the action of forces as causes of change. Skinner's rationale for using rate to describe behavior derived from an interest in changes in rate. Reinforcement has played the role of force in the dynamics of behavior. Behavioral momentum and maximization have received mathematical formulations in behavior analysis. Yet to be worked out are the relations between molar and molecular formulations of behavioral theory.

Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes: incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way, in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.

This paper develops a unified approach, based on ranks, to the statistical analysis of data arising from complex experimental designs. In this way we answer a major objection to the use of rank procedures as a major methodology in data analysis. We show that the rank procedures, including testing, estimation and multiple comparisons, are generated in a natural way from a robust measure of scale. The rank methods closely parallel the familiar methods of least squares, so that estimates and tests have natural interpretations. This research was supported in part by grant MCS76-07292 from the National Science Foundation.

Statistical inference and nonrandom samples

This research investigated causal inferences between leader reward behavior (positive and punitive) and subordinate goal attainment, absenteeism, and work satisfaction over a 3-month period in a merchandise distribution center (n = 252). Four groups were studied: (a) male supervisors-male subordinates, (b) male supervisors-female subordinates, (c) female supervisors-female subordinates, and (d) female supervisors-male subordinates. Using the techniques of tests of mean differences and corrected cross-lag correlations, the results revealed that: (a) No significant differences attributed to sex were found between the four groups with the perceptions of leader reward behavior or subordinate outcome measures, and (b) the causal inference analysis suggested that the relationships between leader reward behavior and subordinate attitudes and behavior were independent of the effects of sex of supervisor or subordinate. Implications for research on sex stereotypes and leadership were discussed.

We present a review of statistical inference in generalized linear mixed models (GLMMs). GLMMs are an extension of generalized linear models and are suitable for the analysis of non-normal data with a clustered structure. A GLMM contains parameters common to all clusters (fixed regression effects and variance components) and cluster-specific parameters. The latter parameters are assumed to be randomly drawn from a population distribution. The parameters of this population distribution (the variance components) have to be estimated together with the fixed effects. We focus on the case in which the cluster-specific parameters are normally distributed. The cluster-specific effects are integrated out of the likelihood so that the fixed effects and variance components can be estimated. Unfortunately, the integral over the cluster-specific effects is intractable for most GLMMs with a normal mixing distribution. Within a classical statistical framework, we distinguish between two broad classes of methods to handle this intractable integral: methods that rely on a numerical approximation to the integral and methods that use an analytical approximation to the integrand. Finally, we present an overview of available methods for testing hypotheses about the parameters of GLMMs.
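
To make the "numerical approximation to the integral" concrete, here is a minimal sketch for one cluster of a random-intercept logistic model. It assumes a single covariate and a normal random intercept with standard deviation `sigma`, and it uses a plain midpoint rule rather than the Gauss-Hermite or adaptive quadrature that such reviews typically have in mind. The function names, grid size, and integration width are hypothetical choices, not taken from the paper.

```python
import math


def inv_logit(x):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def cluster_marginal_likelihood(y, x, beta0, beta1, sigma,
                                n_points=2001, width=8.0):
    """Approximate the integral of prod_i p(y_i | b) * N(b; 0, sigma^2) over b.

    Midpoint rule on [-width * sigma, width * sigma]; an assumption made for
    simplicity, not the quadrature scheme of any particular package.
    """
    lo = -width * sigma
    step = (2.0 * width * sigma) / n_points
    norm_const = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(n_points):
        b = lo + (k + 0.5) * step  # midpoint of the k-th sub-interval
        density = math.exp(-0.5 * (b / sigma) ** 2) / norm_const
        lik = 1.0
        for yi, xi in zip(y, x):
            p = inv_logit(beta0 + beta1 * xi + b)
            lik *= p if yi == 1 else 1.0 - p
        total += lik * density * step
    return total
```

The full marginal likelihood would be the product of this quantity over clusters, maximized over `beta0`, `beta1`, and `sigma`; the cost of the inner loop is one reason analytical approximations to the integrand are attractive for models with several random effects.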

Finite sample inference procedures are considered for analyzing the observed scores on a multiple choice test with several items, where, for example, the items are dissimilar, or the item responses are correlated. A discrete p-parameter exponential family model leads to a generalized linear model framework and, in a special case, a convenient regression of true score upon observed score. Techniques based upon the likelihood function, Akaike's information criterion (AIC), an approximate Bayesian marginalization procedure based on conditional maximization (BCM), and simulations for exact posterior densities (importance sampling) are used to facilitate finite sample investigations of the average true score, individual true scores, and various probabilities of interest. A simulation study suggests that, when the examinees come from two different populations, the exponential family can adequately generalize Duncan's beta-binomial model. Extensions to regression models, the classical test theory model, and empirical Bayes estimation problems are mentioned. The Duncan, Keats, and Matsumura data sets are used to illustrate potential advantages and flexibility of the exponential family model, and the BCM technique. The authors wish to thank Ella Mae Matsumura for her data set and helpful comments, Frank Baker for his advice on item response theory, Hirotugu Akaike and Taskin Atilgan for helpful discussions regarding AIC, Graham Wood for his advice concerning the class of all binomial mixture models, Yiu Ming Chiu for providing useful references and information on tetrachoric models, and the Editor and two referees for suggesting several references and alternative approaches.

Recent interest in comparative psychology has stimulated much research and debate concerning cognitive processes in animal behavior. The present paper relates to this general area by treating particular issues in the analysis of comparative cognition: specifically, how cognition is inferred from animal behavior; whether the postulation of intervening cognitive processes furthers our understanding of behavior; and how rival approaches help advance the science of behavior.

The quality of approximations to first and second order moments (e.g., statistics like means, variances, regression coefficients) based on latent ability estimates is discussed. The ability estimates are obtained using either the Rasch or the two-parameter logistic model. Straightforward use of such statistics to make inferences with respect to true latent ability is not recommended, unless we account for the fact that the basic quantities are estimates. In this paper true score theory is used to account for the latter; the counterpart of observed/true score being estimated/true latent ability. It is shown that statistics based on the true score theory are virtually unbiased if the number of items presented to each examinee is larger than fifteen. Three types of estimators are compared: maximum likelihood, weighted maximum likelihood, and Bayes modal. Furthermore, the (dis)advantages of the true score method and direct modeling of latent ability are discussed.

Cross-national comparisons of IQ have become common since the release of a large dataset of international IQ scores. However, these studies have consistently failed to consider the potential lack of independence of these scores based on spatial proximity. To demonstrate the importance of this omission, we present a re-evaluation of several hypotheses put forward to explain variation in mean IQ among nations, namely: (i) distance from central Africa, (ii) temperature, (iii) parasites, (iv) nutrition, (v) education, and (vi) GDP. We quantify the strength of spatial autocorrelation (SAC) in the predictors, response variables and the residuals of multiple regression models explaining national mean IQ. We outline a procedure for the control of SAC in such analyses and highlight the differences in the results before and after control for SAC. We find that incorporating additional terms to control for spatial interdependence increases the fit of models with no loss of parsimony. Support is provided for the finding that a national index of parasite burden and national IQ are strongly linked and temperature also features strongly in the models. However, we tentatively recommend a physiological (via impacts on host-parasite interactions) rather than evolutionary explanation for the effect of temperature. We present this study primarily to highlight the danger of ignoring autocorrelation in spatially extended data, and outline an appropriate approach should a spatially explicit analysis be considered necessary.
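
As a rough illustration of what "quantifying the strength of spatial autocorrelation" can involve, the sketch below computes Moran's I, one widely used SAC statistic. The abstract does not say which statistic the authors used, so this choice is an assumption, as are the function name `morans_i` and the toy neighbour matrix in the usage note.

```python
def morans_i(values, weights):
    """Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (for example 1 for neighbours, 0 otherwise); the diagonal is ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    weight_sum = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    total_ss = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / total_ss)
```

For four locations on a line with adjacent-neighbour weights, a smooth gradient such as `[1, 2, 3, 4]` gives a positive value and an alternating pattern such as `[1, -1, 1, -1]` gives a negative one. Applied to regression residuals, a value far from its expectation under independence would signal that the usual standard errors are unreliable.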

The perfect fit of syntactic derivability and logical consequence in first-order logic is one of the most celebrated facts of modern logic. In the present flurry of attention given to the semantics of natural language, surprisingly little effort has been focused on the problem of logical inference in natural language and the possibility of its completeness. Even the traditional theory of the syllogism does not give a thorough analysis of the restricted syntax it uses. My objective is to show how a theory of inference may be formulated for a fragment of English that includes a good deal more than the classical syllogism. The syntax and semantics are made as formal and as explicit as is customary for artificial formal languages. The fragment chosen is not maximal but is restricted severely in order to provide a clear overview of the method without the cluttering details that seem to be an inevitable part of any grammar covering a substantial fragment of a natural language. (Some readers may feel the details given here are too onerous.) I am especially concerned with quantifier words in both object and subject position, with negation, and with possession. I do not consider propositional attitudes or the modalities of possibility and necessity, although the model-theoretic semantics I use has a standard version to deal with such intensional contexts. An important point of methodology stressed in earlier publications (Suppes, 1976; Suppes & Macken, 1978; Suppes, 1979) is that the semantic representation of the English sentences in the fragment uses neither quantifiers nor variables, but only constants denoting given sets and relations, and operations on sets and relations. In the first section, I rapidly sketch the formal framework of generative syntax and model-theoretic semantics, with special attention to extended relation algebras. The second section states the grammar and semantics of the fragment of English considered. The next section is concerned with developing some of the rules of inference. The results given are quite incomplete. The final section raises problems of extension. Classical logic is a poor guide for dealing with inferences involving high-frequency function words such as of, to, a, in, for, with, as, on, at, and by. Indeed, the line between logical and nonlogical inference in English seems to be nonexistent or, if made, highly arbitrary in character, much more so than has been claimed by those critical of the traditional analytic-synthetic tradition. No theorems on soundness or completeness are considered because of the highly tentative and incomplete character of the rules of inference proposed. However, because of the variable-free semantics used, soundness is easy to establish for the rules given. The research reported here has been supported in part by National Science Foundation Grant No. SED77-09698.

Research on initial conceptual knowledge and research on early statistical learning mechanisms have been, for the most part, two separate enterprises. We report a study with 11-month-old infants investigating whether they are sensitive to sampling conditions and whether they can integrate intentional information in a statistical inference task. Previous studies found that infants were able to make inferences from samples to populations, and vice versa [Xu, F., & Garcia, V. (2008). Intuitive statistics by 8-month-old infants. Proceedings of the National Academy of Sciences of the United States of America, 105, 5012-5015]. We found that when employing this statistical inference mechanism, infants are sensitive to whether a sample was randomly drawn from a population or not, and they take into account intentional information (e.g., explicitly expressed preference, visual access) when computing the relationship between samples and populations. Our results suggest that domain-specific knowledge is integrated with statistical inference mechanisms early in development.

Bem and Allen (1974) have suggested that trait consistency-ratings can function as a moderator variable with regard to the cross-situational validity of trait position-ratings. The present study attempts to consider the relationship between these two types of self-ratings. A model of “man the quasi-statistician” was proposed which yields the predictions that (a) consistency-ratings will vary positively with the polarization-estimate from position-ratings and (b) the extent of correlation will be a function of the degree of polarization associated with a given class of trait. In two experiments, position-ratings and consistency-ratings were obtained over a series of dimensions. The predictions were confirmed in both studies. In addition, Experiment II provided evidence for a correspondence between intuitive and formal estimates of central tendency polarization, although this was not the case for dispersion. The results were discussed as bearing on the model and as relating to the findings of Bem and Allen (1974).

