Similar Articles
20 similar articles retrieved.
1.
Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches.

2.
Theory development in both psychology and neuroscience can benefit by consideration of both behavioral and neural data sets. However, the development of appropriate methods for linking these data sets is a difficult statistical and conceptual problem. Over the past decades, different linking approaches have been employed in the study of perceptual decision-making, beginning with rudimentary linking of the data sets at a qualitative, structural level, culminating in sophisticated statistical approaches with quantitative links. We outline a new approach, in which a single model is developed that jointly addresses neural and behavioral data. This approach allows for specification and testing of quantitative links between neural and behavioral aspects of the model. Estimating the model in a Bayesian framework allows both data sets to equally inform the estimation of all model parameters. A hierarchical model architecture allows the model to account for and measure the variability between neurons. We demonstrate the approach by re-analysis of a classic data set containing behavioral recordings of decision-making with accompanying single-cell neural recordings. The joint model is able to capture most aspects of both data sets, and also supports the analysis of interesting questions about prediction, including predicting the times at which responses are made, and the corresponding neural firing rates.

3.
Several studies aimed at testing the validity of Holland's hexagonal and Roe's circular models of interests showed results on which the null hypothesis of random arrangement can be rejected, and the investigators concluded that the tested models were supported. None of these studies, however, tested each model in its entirety. The present study is based on the assumption that the rejection of the null hypothesis of chance is not rigorous enough. Reanalysis of 13 data sets of published studies, using a more rigorous method, reveals that although the random null hypothesis can in fact be rejected in 11 data sets, the hexagonal-circular model was supported by only 2 data sets and was rejected by 11 data sets. The hierarchical model for the structure of vocational interests (I. Gati, Journal of Vocational Behavior, 1979, 15, 90–106) was submitted to an identical test and was supported by 6 out of 10 data sets, including 4 data sets that rejected the hexagonal-circular model. The predictions of each of the models that tend to be disconfirmed by empirical data were identified. The implications of the findings for the structure of interests and occupational choice are discussed.

4.
An algorithm for generating artificial test clusters
An algorithm for generating artificial data sets which contain distinct nonoverlapping clusters is presented. The algorithm is useful for generating test data sets for Monte Carlo validation research conducted on clustering methods or statistics. The algorithm generates data sets which contain either 1, 2, 3, 4, or 5 clusters. By default, the data are embedded in either a 4, 6, or 8 dimensional space. Three different patterns for assigning the points to the clusters are provided. One pattern assigns the points equally to the clusters while the remaining two schemes produce clusters of unequal sizes. Finally, a number of methods for introducing error in the data have been incorporated in the algorithm.
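The abstract does not give the generator's internals, but the idea it describes can be sketched as follows: place cluster centers far apart relative to the within-cluster spread, then draw points around each center. The function name, the Gaussian shape of the clusters, and all parameter defaults are assumptions for illustration; the unequal `points_per_cluster` sizes mimic the unequal-size assignment patterns mentioned in the abstract.

```python
import numpy as np

def generate_clusters(n_clusters=3, dim=4, points_per_cluster=(30, 20, 10),
                      separation=10.0, noise_sd=1.0, seed=0):
    """Hypothetical test-cluster generator: well-separated Gaussian
    clusters in `dim` dimensions, with unequal cluster sizes."""
    rng = np.random.default_rng(seed)
    # Centers spread over a region large relative to noise_sd, so clusters
    # are distinct and nonoverlapping with high probability.
    centers = rng.uniform(-1, 1, size=(n_clusters, dim)) * separation
    data, labels = [], []
    for k in range(n_clusters):
        pts = centers[k] + rng.normal(0.0, noise_sd,
                                      size=(points_per_cluster[k], dim))
        data.append(pts)
        labels.extend([k] * points_per_cluster[k])
    return np.vstack(data), np.array(labels)

X, y = generate_clusters()  # 60 points, 4 dimensions, 3 clusters
```

Error could then be introduced, as the abstract suggests, by inflating `noise_sd`, adding outlier points, or appending noise dimensions.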

5.
Dorr, M., Vig, E., & Barth, E. (2012). Visual Cognition, 20(4-5), 495-514
Here we study the predictability of eye movements when viewing high-resolution natural videos. We use three recently published gaze data sets that contain a wide range of footage, from scenes of almost still-life character to professionally made, fast-paced advertisements and movie trailers. Inter-subject gaze variability differs significantly between data sets, with variability being lowest for the professional movies. We then evaluate three state-of-the-art saliency models on these data sets. A model that is based on the invariants of the structure tensor and that combines very generic, sparse video representations with machine learning techniques outperforms the two reference models; performance is further improved for two data sets when the model is extended to a perceptually inspired colour space. Finally, a combined analysis of gaze variability and predictability shows that eye movements on the professionally made movies are the most coherent (due to implicit gaze-guidance strategies of the movie directors), yet the least predictable (presumably due to the frequent cuts). Our results highlight the need for standardized benchmarks to comparatively evaluate eye movement prediction algorithms.

6.
We examined the extent to which variations in session duration affected the outcomes of functional analyses. Forty-six individuals, all diagnosed with mental retardation and referred for assessment and treatment of self-injurious or aggressive behavior, participated in functional analyses, consisting of repeated exposure to multiple test conditions during 15-min sessions. For each set of assessment data, new data sets based on session durations of 10 and 5 min were prepared by deleting data from the last 5 and 10 min, respectively, of each session. Each graph (N = 138) was then reviewed individually by graduate students who had previous experience conducting and interpreting functional analyses, but who were blind to both participant identity and session duration. Interpretations of behavioral function based on the 10- and 5-min data sets were then compared with those based on the 15-min data sets. All of the 10-min data sets yielded interpretations identical to those based on 15-min data sets. Interpretations based on the 5-min and 15-min data sets yielded three discrepancies, all of which were the result of increased response rates toward the latter parts of sessions. These results suggest that the efficiency of assessment might be improved with little or no loss in clarity by simply reducing the duration of assessment sessions.

7.
The factor structures of two sets of coping styles, proposed by Lazarus and Plutchik respectively, were investigated, as well as the relationship of each set with extraversion and neuroticism. Results show that sex is a major moderating variable. The Lazarus data yielded three separate factors for men and women, while the Plutchik data yielded four factors, of which only one was common to men and women. A factor analysis of both sets of data yielded seven factors, suggesting that about half of the content is common to both sets of coping styles. Extraversion is positively related to six of the coping styles and negatively to one. Neuroticism is positively related to five coping styles. For the remaining coping styles, both extraversion and neuroticism correlate differently for men and women.

8.
Two independent data sets were selected to examine the interrelations among reaction time (RT), between-subject variability or diversity (SD), and age. In both data sets, a strong correlation between RT and SD was obtained. This strong correlation was not affected when age was controlled in a partial correlation analysis. On the other hand, a weaker but significant correlation was obtained between age and SD. This correlation was eliminated when RT was controlled in a partial correlation analysis. Our analyses of the two data sets also indicated that the relation between RT and SD is identical for both young and elderly groups. Thus, the greater diversity often observed in performances of older groups is a direct consequence of slowing, rather than an independent effect of advancing age.
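The partial correlation analysis the abstract relies on has a simple closed form: the first-order partial correlation removes the linear effect of a control variable from both remaining variables. Below is a minimal sketch with simulated data (the variable names and effect sizes are invented for illustration, not taken from the study) in which diversity depends only on speed, so controlling RT eliminates the age-SD correlation, as the abstract reports.

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation r_xy.z: the correlation between
    x and y after the linear effect of z is removed from both."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical data mirroring the abstract's conclusion: sd is driven by
# rt alone, while rt itself increases with age.
rng = np.random.default_rng(1)
age = rng.uniform(20, 80, 300)
rt = 300 + 4 * age + rng.normal(0, 40, 300)   # slowing with age
sd = 0.3 * rt + rng.normal(0, 15, 300)        # diversity driven by rt only
raw = np.corrcoef(age, sd)[0, 1]              # sizeable age-sd correlation
partial = partial_corr(age, sd, rt)           # near zero once rt is controlled
```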

9.
This paper studies the problem of scaling ordinal categorical data observed over two or more sets of categories measuring a single characteristic. Scaling is obtained by solving a constrained entropy model which finds the most probable values of the scales given the data. A Kullback-Leibler statistic is generated which operationalizes a measure for the strength of consistency among the sets of categories. A variety of data of two and three sets of categories are analyzed using the entropy approach. This research was partially supported by the Air Force Office of Scientific Research under grant AFOSR-83-0234. The comments of the editor and referees have been most helpful in improving the paper and in bringing several additional references to our attention.

10.
IPSAPRO, an ipsative scoring program written for the IBM PC, aids in the detection and transformation of response sets that often contaminate rating scale and reaction time experiments. Response sets such as the tendency to use only extreme points of a rating scale or to work for speed over accuracy in reaction time experiments are removed in IPSAPRO by standardizing each subject’s ratings or times against their own means and standard deviations. Ipsatization can be applied to existing data sets or take place automatically at the data collection stage in a text-stimuli presentation manager that is provided with the program.
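The ipsatization step the abstract describes is just a within-subject z-score: each row of scores is standardized against that subject's own mean and standard deviation. This is a generic sketch of that transformation, not IPSAPRO's actual code; the function name and example ratings are invented.

```python
import numpy as np

def ipsatize(scores):
    """Within-subject standardization: z-score each row (one subject's
    ratings or times) against that subject's own mean and SD, removing
    response sets such as consistent use of scale extremes.
    Note: a row with zero variance would divide by zero; real software
    would need to flag such subjects."""
    scores = np.asarray(scores, dtype=float)
    means = scores.mean(axis=1, keepdims=True)
    sds = scores.std(axis=1, ddof=1, keepdims=True)
    return (scores - means) / sds

ratings = [[1, 1, 2, 7],   # an "extreme responder"
           [3, 3, 4, 5]]   # a midpoint responder
z = ipsatize(ratings)      # both rows now have mean 0 and SD 1
```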

11.
Michael Fuller 《Zygon》2015,50(3):569-582
The advent of extremely large data sets, known as “big data,” has been heralded as the instantiation of a new science, requiring a new kind of practitioner: the “data scientist.” This article explores the concept of big data, drawing attention to a number of new issues—not least ethical concerns, and questions surrounding interpretation—which big data sets present. It is observed that the skills required for data scientists are in some respects closer to those traditionally associated with the arts and humanities than to those associated with the natural sciences; and it is urged that big data presents new opportunities for dialogue, especially concerning hermeneutical issues, for theologians and data scientists.

12.
As a procedure for handling missing data, multiple imputation consists of estimating the missing data multiple times to create several complete versions of an incomplete data set. All these data sets are analyzed by the same statistical procedure, and the results are pooled for interpretation. So far, no explicit rules for pooling F tests of (repeated-measures) analysis of variance have been defined. In this article we outline the appropriate procedure for pooling the results of analysis of variance (ANOVA) for multiply imputed data sets. It involves both reformulating the ANOVA model as a regression model using effect coding of the predictors and applying already existing combination rules for regression models. The proposed procedure is illustrated using three example data sets. The pooled results of these three examples provide plausible F and p values.
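The two building blocks the abstract names can be sketched in a few lines: effect (deviation) coding of a categorical predictor, and Rubin's rules for pooling an estimate across the m imputed data sets. This is only a sketch of those ingredients; the article's actual combination rule for multiparameter F tests is more involved and is not reproduced here, and both function names are invented.

```python
import numpy as np

def effect_code(factor, levels):
    """Effect (deviation) coding: k levels -> k-1 columns, with the last
    level coded -1 in every column, so coefficients are deviations from
    the grand mean (the ANOVA parameterization)."""
    cols = []
    for lev in levels[:-1]:
        col = np.where(factor == lev, 1.0,
                       np.where(factor == levels[-1], -1.0, 0.0))
        cols.append(col)
    return np.column_stack(cols)

def pool_rubin(estimates, variances):
    """Rubin's rules: pool m point estimates and their squared SEs from
    the m completed-data analyses."""
    estimates, variances = np.asarray(estimates), np.asarray(variances)
    m = len(estimates)
    qbar = estimates.mean(axis=0)       # pooled point estimate
    ubar = variances.mean(axis=0)       # within-imputation variance
    b = estimates.var(axis=0, ddof=1)   # between-imputation variance
    t = ubar + (1 + 1 / m) * b          # total variance
    return qbar, t

# A coefficient estimated as 1.0 and 3.0 in m = 2 imputed data sets,
# each with squared SE 0.5:
qbar, t = pool_rubin([1.0, 3.0], [0.5, 0.5])
```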

13.
Visual variability discrimination requires an observer to categorize collections of items on the basis of the variability in the collection; such discriminations may be vital to the adaptive actions of both humans and other animals. We present a theory of visual variability discrimination that aggregates localized differences between nearby items, and we compare this finding differences model with a previously proposed positional entropy model across several data sets involving both people and pigeons. We supplement those previously published data sets with four new experiments, three of which involve arrays comprising items entailing systematic, quantitative differences. Although both theories provide strong and similar fits of the published data sets, only the finding differences model is applicable to investigations involving quantitative item differences, providing excellent fits in these new experiments.

14.
Thirty previously published data sets, from seminal category learning tasks, are reanalyzed using the varying abstraction model (VAM). Unlike a prototype-versus-exemplar analysis, which focuses on extreme levels of abstraction only, a VAM analysis also considers the possibility of partial abstraction. Whereas most data sets support no abstraction when only the extreme possibilities are considered, we show that evidence for abstraction can be provided using the broader view on abstraction provided by the VAM. The present results generalize earlier demonstrations of partial abstraction (Vanpaemel & Storms, 2008), in which only a small number of data sets was analyzed. Following the dominant modus operandi in category learning research, Vanpaemel and Storms evaluated the models on their best fit, a practice known to ignore the complexity of the models under consideration. In the present study, in contrast, model evaluation relies not only on the maximum likelihood, but also on the marginal likelihood, which is sensitive to model complexity. Finally, using a large recovery study, it is demonstrated that, across the 30 data sets, complexity differences between the models in the VAM family are small. This indicates that a (computationally challenging) complexity-sensitive model evaluation method is uncalled for, and that the use of a (computationally straightforward) complexity-insensitive model evaluation method is justified.

15.
Psychophysical studies with infants or with patients often are unable to use pilot data, training, or large numbers of trials. To evaluate threshold estimates under these conditions, computer simulations of experiments with small numbers of trials were performed by using psychometric functions based on a model of two types of noise: stimulus-related noise (affecting slope) and extraneous noise (affecting upper asymptote). Threshold estimates were biased and imprecise when extraneous noise was high, as were the estimates of extraneous noise. Strategies were developed for rejecting data sets as too noisy for unbiased and precise threshold estimation; these strategies were most successful when extraneous noise was low for most of the data sets. An analysis of 1,026 data sets from visual function tests of infants and toddlers showed that extraneous noise is often considerable, that experimental paradigms can be developed that minimize extraneous noise, and that data analysis that does not consider the effects of extraneous noise may underestimate test-retest reliability and overestimate interocular differences.

16.
Discounting is the process by which outcomes lose value. Much of discounting research has focused on differences in the degree of discounting across various groups. This research has relied heavily on conventional null hypothesis significance tests that are familiar to psychologists, such as t‐tests and ANOVAs. As discounting research questions have become more complex by simultaneously focusing on within‐subject and between‐group differences, conventional statistical testing is often not appropriate for the obtained data. Generalized estimating equations (GEE) are one type of mixed‐effects model designed to handle autocorrelated data, such as within‐subject repeated‐measures data, and are therefore more appropriate for discounting data. To determine whether GEE provides similar results as conventional statistical tests, we compared the techniques across 2,000 simulated data sets. The data sets were created using a Monte Carlo method based on an existing data set. Across the simulated data sets, the GEE and the conventional statistical tests generally provided similar patterns of results. As the GEE and more conventional statistical tests provide the same pattern of results, we suggest researchers use the GEE because it was designed to handle data that have the structure typical of discounting data.

17.
Two sources of variability must each be considered when examining change in level between two sets of data obtained by human observers; namely, variance within data sets (phases) and variability attributed to each data point (reliability). Birkimer and Brown (1979a, 1979b) have suggested that both chance levels and disagreement bands be considered in examining observer reliability and have made both methods more accessible to researchers. By clarifying and extending Birkimer and Brown's papers, a system is developed using observer agreement to determine the data point variability and thus to check the adequacy of obtained data within the experimental context.

18.
When large numbers of statistical tests are computed, such as in broad investigations of personality and behavior, the number of significant findings required before the total can be confidently considered beyond chance is typically unknown. Employing modern software, specially written code, and new procedures, the present article uses three sets of personality data to demonstrate how approximate randomization tests can evaluate (a) the number of significant correlations between a single variable and a large number of other variables, (b) the number of significant correlations between two large sets of variables, and (c) the average size of a large number of effects. Randomization tests can free researchers to fully explore large data sets and potentially have even wider applicability.
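Case (a) above can be sketched directly: shuffle the single variable to break all true associations, recount "significant" correlations under each shuffle, and compare the observed count against that null distribution. This is a generic sketch of the approximate randomization idea, not the article's own code; the function names, the |r| threshold standing in for a significance criterion, and the simulated data are all assumptions.

```python
import numpy as np

def count_significant(x, Y, crit):
    """Number of columns of Y whose |correlation| with x exceeds crit."""
    rs = [np.corrcoef(x, Y[:, j])[0, 1] for j in range(Y.shape[1])]
    return int(np.sum(np.abs(rs) > crit))

def randomization_p(x, Y, crit=0.2, n_perm=1000, seed=0):
    """Approximate randomization test for the COUNT of significant
    correlations: permuting x preserves both marginal distributions while
    destroying every true x-Y association, giving a chance baseline."""
    rng = np.random.default_rng(seed)
    observed = count_significant(x, Y, crit)
    null_counts = [count_significant(rng.permutation(x), Y, crit)
                   for _ in range(n_perm)]
    p = np.mean([c >= observed for c in null_counts])
    return observed, p

# Hypothetical battery: 20 outcome variables, 5 of them genuinely
# related to the predictor x.
rng = np.random.default_rng(0)
n, k = 100, 20
Y = rng.normal(size=(n, k))
x = rng.normal(size=n)
Y[:, :5] += x[:, None]            # build in 5 real associations
obs, p = randomization_p(x, Y, crit=0.2, n_perm=200)
```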

19.
The application of item response theory (IRT) models requires the identification of the data's dimensionality. A popular method for determining the number of latent dimensions is the factor analysis of a correlation matrix. Unlike factor analysis, which is based on a linear model, IRT assumes a nonlinear relationship between item performance and ability. Because multidimensional scaling (MDS) assumes a monotonic relationship, this method may be useful for the assessment of a data set's dimensionality for use with IRT models. This study compared MDS, exploratory and confirmatory factor analysis (EFA and CFA, respectively) in the assessment of the dimensionality of data sets which had been generated to be either one- or two-dimensional. In addition, the data sets differed in the degree of interdimensional correlation and in the number of items defining a dimension. Results showed that MDS and CFA were able to correctly identify the number of latent dimensions for all data sets. In general, EFA was able to correctly identify the data's dimensionality, except for data whose interdimensional correlation was high.

20.
Similarity measures have been studied extensively in many domains, but usually with well‐structured data sets. In many psychological applications, however, such data sets are not available. It often cannot even be predicted how many items will be observed, or what exactly they will entail. This paper introduces a similarity measure, called the metric‐frequency (MF) measure, that can be applied to such data sets. If it is not known beforehand how many items will be observed, then the number of items actually observed in itself carries information. A typical feature of the MF is that it incorporates such information. The primary purpose of our measure is that it should be pragmatic, widely applicable, and tractable, even if data are complex. The MF generalizes Tversky's set‐theoretic measure of similarity to cases where items may be present or absent and at the same time can be numerical as with Shepard's metric measure, but need not be so. As an illustration, we apply the MF to family therapy where it cannot be predicted what issues the clients will raise in therapy sessions. The MF is flexible enough to be applicable to idiographic data.

