Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
Lloyd (2009) contends that climate models are confirmed by various instances of fit between their output and observational data. The present paper argues that what these instances of fit might confirm are not climate models themselves, but rather hypotheses about the adequacy of climate models for particular purposes. This required shift in thinking, from confirming climate models to confirming their adequacy-for-purpose, may sound trivial, but it is shown to complicate the evaluation of climate models considerably, both in principle and in practice.

2.
With increasing popularity, growth curve modeling is more and more often considered as the first choice for analyzing longitudinal data. Although the growth curve approach is often a good choice, other modeling strategies may more directly answer questions of interest. It is common to see researchers fit growth curve models without considering alternative modeling strategies. In this article we compare three approaches for analyzing longitudinal data: repeated measures analysis of variance, covariance pattern models, and growth curve models. As all are members of the general linear mixed model family, they represent somewhat different assumptions about the way individuals change. These assumptions result in different patterns of covariation among the residuals around the fixed effects. In this article, we first indicate the kinds of data that are appropriately modeled by each and use real data examples to demonstrate possible problems associated with the blanket selection of the growth curve model. We then present a simulation that indicates the utility of the Akaike information criterion and the Bayesian information criterion in the selection of a proper residual covariance structure. The results cast doubt on the popular practice of automatically using growth curve modeling for longitudinal data without comparing the fit of different models. Finally, we provide some practical advice for assessing mean changes in the presence of correlated data.
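
As a rough illustration of the information-criterion comparison this abstract mentions, the sketch below computes AIC and BIC from a log-likelihood and picks the candidate with the smallest value. It is not the authors' simulation: the function names, the three candidate structures, and their log-likelihoods and parameter counts are all hypothetical, and using the number of observations as `n_obs` in BIC is an assumption (some mixed-model software uses the number of subjects instead).

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def select_structure(candidates: dict, n_obs: int) -> dict:
    """Return the best candidate name under each criterion.

    candidates maps a structure name to (log_likelihood, n_params).
    """
    best_aic = min(candidates, key=lambda name: aic(*candidates[name]))
    best_bic = min(candidates, key=lambda name: bic(*candidates[name], n_obs))
    return {"aic": best_aic, "bic": best_bic}


# Hypothetical fits of three residual covariance structures to the same data.
fits = {
    "compound_symmetry": (-520.0, 2),
    "ar1": (-512.0, 2),
    "unstructured": (-508.0, 10),
}
choice = select_structure(fits, n_obs=200)
```

With these made-up numbers the unstructured matrix has the highest likelihood, but its eight extra parameters are penalized enough that both criteria prefer the first-order autoregressive structure.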

3.
When ratings of judged similarity or frequencies of stimulus identification are averaged across subjects, the psychological structure of the data is fundamentally changed. Regardless of the structure of the individual-subject data, the averaged similarity data will likely be well fit by a standard multidimensional scaling model, and the averaged identification data will likely be well fit by the similarity-choice model. In fact, both models often provide excellent fits to averaged data, even if they fail to fit the data of each individual subject. Thus, a good fit of either model to averaged data cannot be taken as evidence that the model describes the psychological structure that characterizes individual subjects. We hypothesize that these effects are due to the increased symmetry that is a mathematical consequence of the averaging operation.

4.
Remember-know judgments provide additional information in recognition memory tests, but the nature of this information and the attendant decision process are in dispute. Competing models have proposed that remember judgments reflect a sum of familiarity and recollective information (the one-dimensional model), are based on a difference between these strengths (STREAK), or are purely recollective (the dual-process model). A choice among these accounts is sometimes made by comparing the precision of their fits to data, but this strategy may be muddied by differences in model complexity: Some models that appear to provide good fits may simply be better able to mimic the data produced by other models. To evaluate this possibility, we simulated data with each of the models in each of three popular remember-know paradigms, then fit each of the models to those data. We found that the one-dimensional model is generally less complex than the others, but despite this handicap, it dominates the others as the best-fitting model. For both reasons, the one-dimensional model should be preferred. In addition, we found that some empirical paradigms are ill-suited for distinguishing among models. For example, data collected by soliciting remember/know/new judgments (that is, the trinary task) provide a particularly weak ground for distinguishing models. Additional tables and figures may be downloaded from the Psychonomic Society's Archive of Norms, Stimuli, and Data, at www.psychonomic.org/archive.

5.
6.
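温涵 (Wen Han), 梁韵斯 (Liang Yunsi). 《心理科学》 (Psychological Science), 2015, (4): 987-994
Fit-index testing is an important step in evaluating structural equation models (SEM). Comparing SEM with traditional regression models from the perspective of covariance structure analysis makes it easy to see why SEM needs fit indices. The paper reveals the essence of several currently popular fit-index tests: a test based on a chi-square-based absolute fit index (e.g., RMSEA) in essence resets the significance level of the chi-square test (to a value other than the usual .05), whereas a test based on a relative fit index (e.g., NNFI and CFI) in essence specifies, relative to the null model, the proportion to which the mean square (the ratio of chi-square to degrees of freedom) must be reduced. Once NNFI exceeds its critical value, reporting and testing CFI is unnecessary. Based on these results, some convenient and practical recommendations for fit testing are offered.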
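
The three indices named in this abstract can all be computed from the chi-square statistics of the fitted model and of the null (independence) model. The sketch below uses the commonly cited textbook formulas, which may differ in detail from those used in the paper (for example, some software divides by N rather than N - 1 in RMSEA). The function names and the numbers in the example are hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation (form using N - 1)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def nnfi(chi2_model: float, df_model: int, chi2_null: float, df_null: int) -> float:
    """Non-normed fit index, written in terms of the mean squares chi2 / df."""
    ms_model = chi2_model / df_model
    ms_null = chi2_null / df_null
    return (ms_null - ms_model) / (ms_null - 1.0)


def cfi(chi2_model: float, df_model: int, chi2_null: float, df_null: int) -> float:
    """Comparative fit index based on the noncentrality estimates chi2 - df."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 - d_model / d_null if d_null > 0 else 1.0


# Hypothetical results: fitted model and null model on a sample of 401 cases.
example = {
    "rmsea": rmsea(80.0, 40, 401),
    "nnfi": nnfi(80.0, 40, 1050.0, 50),
    "cfi": cfi(80.0, 40, 1050.0, 50),
}
```

Written this way, NNFI depends on the data only through the two mean squares, which is the quantity the abstract refers to when it describes relative fit indices as setting a target reduction in the mean square relative to the null model.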

7.
Three hundred and twenty-two college-bound high school students described aspects of their college decision-making processes. Students listed their criteria as well as the alternatives (i.e., schools) under consideration, rated the importance of each criterion, and rated each alternative with respect to each criterion. They also gave their overall impressions of each alternative. Finally, students rated their comfort with the decision-making process and, at the conclusion of the study, reported on how many schools they had applied to, had or had not been accepted at, were waiting to hear from, or were waitlisted at. Students consider four or five alternatives, and use eight to ten criteria in evaluating them. These figures do not change appreciably over the course of the process, although only about half the criteria and slightly more than half of the schools considered at one time are considered again 6 months later, and there are several changes in the kinds of criteria considered at different points in time. There was a marginally significant trend for higher ability and average ability students to consider more criteria, more distinct types of criteria, and more alternatives than do lower ability students. There were no gender differences in this regard. Gender differences and academic ability group differences were apparent, however, in the types of criteria students reported. Participation in multiple sessions in this study had few reliable effects on decision-making performance. Students were given a list of 34 standard criteria at each session, and incorporated some of these into their own lists of criteria during subsequent sessions. However, there was no indication that repeated participation led students to adopt a more analytical strategy than they would have otherwise. Data were compared with three linear models of information integration. Models using data with multiple criteria better fit the students' data than did a model using only the most important criterion. Higher ability students were particularly better able to integrate information according to linear models.
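
The abstract does not spell out its three linear models, so the sketch below only illustrates the general contrast it draws: a weighted additive rule that combines all criteria versus a rule that looks only at the most important criterion. The function names, the student, and all ratings are hypothetical.

```python
def weighted_additive_score(importance, ratings):
    """Sum of importance-weighted ratings across all criteria."""
    return sum(w * r for w, r in zip(importance, ratings))


def single_criterion_score(importance, ratings):
    """Rating on the most important criterion only."""
    top = max(range(len(importance)), key=lambda i: importance[i])
    return ratings[top]


# Hypothetical student: three criteria rated for importance (1-10),
# and two schools rated on each of those criteria (1-10).
importance = [9, 6, 3]
schools = {"school_a": [6, 9, 9], "school_b": [7, 4, 5]}

additive_choice = max(
    schools, key=lambda s: weighted_additive_score(importance, schools[s])
)
single_choice = max(
    schools, key=lambda s: single_criterion_score(importance, schools[s])
)
```

In this made-up case the two rules disagree: school_b is slightly better on the top criterion, but school_a wins once the other criteria are weighed in. Comparing each rule's predicted ranking with a student's stated overall impressions is one way such models could be evaluated.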

8.
Two models are proposed for responding under fixed-interval schedules of reinforcement. The first model is a Poisson model and seems suitable for situations in which responding produces a classical "FI scallop". A second model is then developed to describe "break and run" performance, which is also known to occur under some fixed-interval schedules. The models do not, however, give any indication of the circumstances under which a particular mode of responding should arise. A comparison of the models to a small set of data collected from rats performing under an FI 60-s schedule indicates that, for the data considered, the second model (a State model) produced by far the best fit.

9.
The diffusion model (Ratcliff, 1978) and the leaky competing accumulator model (LCA, Usher & McClelland, 2001) were tested against two-choice data collected from the same subjects with the standard response time procedure and the response signal procedure. In the response signal procedure, a stimulus is presented and then, at one of a number of experimenter-determined times, a signal to respond is presented. The models were fit to the data from the two procedures simultaneously under the assumption that responses in the response signal procedure were based on a mixture of decision processes that had already terminated at response boundaries before the signal and decision processes that had not yet terminated. In the latter case, decisions were based on partial information in one variant of each model or on guessing in a second variant. Both variants of the diffusion model fit the data well and both fit better than either variant of the LCA model, although the differences in numerical goodness-of-fit measures were not large enough to allow decisive selection between the models.

10.
For decades sequential sampling models have successfully accounted for human and monkey decision-making, relying on the standard assumption that decision makers maintain a pre-set decision standard throughout the decision process. Based on the theoretical argument of reward rate maximization, some authors have recently suggested that decision makers become increasingly impatient as time passes and therefore lower their decision standard. Indeed, a number of studies show that computational models with an impatience component provide a good fit to human and monkey decision behavior. However, many of these studies lack quantitative model comparisons and systematic manipulations of rewards. Moreover, the often-cited evidence from single-cell recordings is not unequivocal and complementary data from human subjects are largely missing. We conclude that, despite some enthusiastic calls for the abandonment of the standard model, the idea of an impatience component has yet to be fully established; we suggest a number of recently developed tools that will help bring the debate to a conclusive settlement.

11.
Previous studies aimed at testing the structure of occupations have been based on analysis of aggregate data. In studies comparing the hierarchical and the hexagonal-circular models for the structure of interests, the former fit the data at least as well as the latter. The present study compared separately for each subject the hierarchical and the hexagonal models as the hypothesized structures for occupations. Twenty-six students judged the similarity between all possible pairs of 24 high-level occupations, 3 occupations for each of Roe's eight fields. The findings demonstrated that (a) the within-subject structures resemble the structure found in the aggregate data; (b) the structure of occupations based on similarity judgments resembles the structure based on preference data; (c) diagnostic properties suggest that for most subjects the perceived structure of occupations can be better described as clustering than as two-dimensional; (d) a tree representation of the perceived structure of occupations is more adequate than is the two-dimensional representation; and (e) the within-subject structure fit the hierarchical model better than it fit the hexagonal-circular model for all but one subject. A detailed analysis revealed those predictions of each model which were disconfirmed by most subjects' judgments. These results provide additional support for the relative advantage of the hierarchical over the circular model. The implications of these findings for the structure of vocational interests and occupational choice are discussed.

12.
This paper extends Lumsden's fluctuation model to the graded response case and, from the resulting basic scaling model, develops a one‐dimensional item response theory graded response model (GRM). Under some additional assumptions, it follows that the item category response functions (ICRFs) can be closely approximated by the ICRFs of the standard GRM with equal item discrimination. For fixed item locations, the item responses depend on two individual differences parameters: the person central location and the person reliability. Procedures for estimating the person parameters and for addressing the goodness of fit of the proposed model as compared to the standard GRM are discussed. The accuracy of the person estimates is assessed by means of simulation studies. Finally, all the developments are illustrated using three empirical examples in personality measurement.

13.
This paper investigates the consequences of extending the assumptions of pure insertion and selective influence (popular in RT theorizing) to the level of the distribution. In the case of pure insertion and under the additional assumption that the additive random variable is exponentially distributed, a solution is obtained which not only allows estimation of the exponential-rate parameter but also provides a test of the assumptions. The result is shown to be applicable not only when processing is serial but also for certain parallel models. In addition, discrimination between self-terminating and exhaustive search strategies is provided, and in the case of either, both parameter estimation and tests of the model are possible. Extensions to nonexponential models are investigated and a general method of moments solution is outlined. In the case of selective influence, a general nonparametric alternative to Sternberg's additive factor method is developed. The problem of empirical estimation and application is then considered. Simulations that place bounds on the Type I and Type II errors are reported. Finally, the first theorem is given an illustrative application with data from a memory scanning experiment. The results provide some support for the double assumption of pure insertion and an exponentially distributed additive random variable.

14.
The aim of this study was to examine the factor structure and composite reliability of the Rosenberg Self-Esteem Scale (RSES) using a sample of 669 ex-prisoners identified in the National Survey of American Life. Six distinct factor models, with uncorrelated measurement error terms, were specified and tested using confirmatory factor analysis (CFA). Results indicated that the two-factor model consisting of positive and negative latent variables provided a better fit to the data than the alternative models. Moreover, only positive self-esteem was a significant predictor of recidivism. Composite reliability indicated that the two factors were measured with very good reliability. The results consequently provide additional support for a two-dimensional model of the RSES within offender populations.
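
The abstract does not state how composite reliability was computed. A widely used formula for a factor with uncorrelated error terms is the squared sum of the loadings divided by that quantity plus the sum of the error variances, and the sketch below assumes it. The function name and the loadings in the example are hypothetical and are not taken from the study.

```python
def composite_reliability(loadings, error_variances=None):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    if error_variances is None:
        # Assumes standardized loadings, so each error variance is 1 - loading^2.
        error_variances = [1.0 - loading ** 2 for loading in loadings]
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(error_variances))


# Hypothetical standardized loadings for a three-item factor.
reliability = composite_reliability([0.7, 0.8, 0.6])
```

For these loadings the value is about 0.74. The formula applies only when the error terms are uncorrelated, which matches the specification described in the abstract.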

15.
Determining the knowledge that guides human judgments is fundamental to understanding how people reason, make decisions, and form predictions. We use an experimental procedure called 'iterated learning,' in which the responses that people give on one trial are used to generate the data they see on the next, to pinpoint the knowledge that informs people's predictions about everyday events (e.g., predicting the total box office gross of a movie from its current take). In particular, we use this method to discriminate between two models of human judgments: a simple Bayesian model (Griffiths & Tenenbaum, 2006) and a recently proposed alternative model that assumes people store only a few instances of each type of event in memory (Min K; Mozer, Pashler, & Homaei, 2008). Although testing these models using standard experimental procedures is difficult due to differences in the number of free parameters and the need to make assumptions about the knowledge of individual learners, we show that the two models make very different predictions about the outcome of iterated learning. The results of an experiment using this methodology provide a rich picture of how much people know about the distributions of everyday quantities, and they are inconsistent with the predictions of the Min K model. The results suggest that accurate predictions about everyday events reflect relatively sophisticated knowledge on the part of individuals.

16.
Consideration will be given to a model developed by Rasch that assumes scores observed on some types of attainment tests can be regarded as realizations of a Poisson process. The parameter of the Poisson distribution is assumed to be a product of two other parameters, one pertaining to the ability of the subject and a second pertaining to the difficulty of the test. Rasch's model is expanded by assuming a prior distribution, with fixed but unknown parameters, for the subject parameters. The test parameters are considered fixed. Secondly, it will be shown how additional between- and within-subjects factors can be incorporated. Methods for testing the fit and estimating the parameters of the model will be discussed, and illustrated by empirical examples.
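
The multiplicative structure described in this abstract can be written down directly: each score is a Poisson count whose rate is the product of a subject parameter and a test parameter. The sketch below covers only that basic likelihood, not the prior distribution or the additional factors the paper adds. The function and variable names are hypothetical.

```python
import math


def poisson_pmf(count: int, rate: float) -> float:
    """Probability of observing `count` events under a Poisson(rate) model."""
    return math.exp(-rate) * rate ** count / math.factorial(count)


def poisson_rate(subject_parameter: float, test_parameter: float) -> float:
    """Poisson rate as the product of a subject and a test parameter."""
    return subject_parameter * test_parameter


def log_likelihood(scores, subject_parameters, test_parameters) -> float:
    """Log-likelihood of a score matrix; scores[i][j] is subject i on test j."""
    total = 0.0
    for i, theta in enumerate(subject_parameters):
        for j, delta in enumerate(test_parameters):
            total += math.log(poisson_pmf(scores[i][j], poisson_rate(theta, delta)))
    return total


# Hypothetical data: two subjects, two tests.
scores = [[3, 5], [1, 2]]
ll = log_likelihood(scores, subject_parameters=[1.5, 0.5], test_parameters=[2.0, 4.0])
```

Because only the product of the two parameters enters the rate, multiplying every subject parameter by a constant and dividing every test parameter by the same constant leaves the likelihood unchanged, so some scaling constraint is needed before the parameters can be estimated.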

17.
Model selection should be based not solely on goodness-of-fit, but must also consider model complexity. While the goal of mathematical modeling in cognitive psychology is to select one model from a set of competing models that best captures the underlying mental process, choosing the model that best fits a particular set of data will not achieve this goal. This is because a highly complex model can provide a good fit without necessarily bearing any interpretable relationship with the underlying process. It is shown that model selection based solely on the fit to observed data will result in the choice of an unnecessarily complex model that overfits the data, and thus generalizes poorly. The effect of over-fitting must be properly offset by model selection methods. An application example of selection methods using artificial data is also presented. Copyright 2000 Academic Press.

18.
This paper presents two experiments where participants had to approximate function values at various generalization points of a square, using given function values at a small set of data points. A representative set of standard function approximation models was trained to exactly fit the function values at data points, and models' responses at generalization points were compared to those of humans. A large class of possible models (including the best two identified predictors) was then defined, and the maximum prediction accuracy achievable within that class was evaluated. A new model of quick multivariate function approximation belonging to this class was proposed. Its prediction accuracy was close to the maximum possible, and significantly better than that of all other models tested. The new model also provided a significant account of human response variability. Finally, it was shown that this model is particularly suitable for problems in which the visual system can perform some specific structuring of the data space. This model is therefore considered as a suitable starting point for further investigations into quick multivariate function approximation, which is to date an inadequately explored question in cognitive psychology.

19.
A Bayesian random effects model for testlets (total citations: 4; self-citations: 0; citations by others: 4)
Standard item response theory (IRT) models fit to dichotomous examination responses ignore the fact that sets of items (testlets) often come from a single common stimulus (e.g., a reading comprehension passage). In this setting, all items given to an examinee are unlikely to be conditionally independent (given examinee proficiency). Models that assume conditional independence will overestimate the precision with which examinee proficiency is measured. Overstatement of precision may lead to inaccurate inferences such as prematurely ending an examination in which the stopping rule is based on the estimated standard error of examinee proficiency (e.g., an adaptive test). To model examinations that may be a mixture of independent items and testlets, we modified one standard IRT model to include an additional random effect for items nested within the same testlet. We use a Bayesian framework to facilitate posterior inference via a Data Augmented Gibbs Sampler (DAGS; Tanner & Wong, 1987). The modified and standard IRT models are both applied to a data set from a disclosed form of the SAT. We also provide simulation results that indicate that the degree of precision bias is a function of the variability of the testlet effects, as well as the testlet design. The authors wish to thank Robert Mislevy, Andrew Gelman and Donald B. Rubin for their helpful suggestions and comments, Ida Lawrence and Miriam Feigenbaum for providing us with the SAT data analyzed in section 5, and the two anonymous referees for their careful reading and thoughtful suggestions on an earlier draft. We are also grateful to the Educational Testing Service for providing the resources to do this research.

20.
This paper studies three models for cognitive diagnosis, each illustrated with an application to fraction subtraction data. The objective of each of these models is to classify examinees according to their mastery of skills assumed to be required for fraction subtraction. We consider the DINA model, the NIDA model, and a new model that extends the DINA model to allow for multiple strategies of problem solving. For each of these models the joint distribution of the indicators of skill mastery is modeled using a single continuous higher-order latent trait, to explain the dependence in the mastery of distinct skills. This approach stems from viewing the skills as the specific states of knowledge required for exam performance, and viewing these skills as arising from a broadly defined latent trait resembling the θ of item response models. We discuss several techniques for comparing models and assessing goodness of fit. We then implement these methods using the fraction subtraction data with the aim of selecting the best of the three models for this application. We employ Markov chain Monte Carlo algorithms to fit the models, and we present simulation results to examine the performance of these algorithms. The work reported here was performed under the auspices of the External Diagnostic Research Team funded by Educational Testing Service. Views expressed in this paper do not necessarily represent the views of Educational Testing Service.
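
For readers unfamiliar with the two standard models named here, the sketch below gives their item response functions in the usual textbook form: DINA applies one slip and one guess parameter per item, while NIDA applies slip and guess parameters per skill. It does not cover the paper's higher-order latent trait or its multiple-strategy extension. The function names and all parameter values are hypothetical.

```python
def dina_probability(alpha, q_row, slip, guess):
    """P(correct) under DINA for one examinee and one item.

    alpha : 0/1 list of skill mastery indicators for the examinee
    q_row : 0/1 list marking which skills the item requires
    slip, guess : item-level slip and guess probabilities
    """
    has_all_required = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if has_all_required else guess


def nida_probability(alpha, q_row, slips, guesses):
    """P(correct) under NIDA, with slip and guess defined per skill."""
    probability = 1.0
    for a, q, s, g in zip(alpha, q_row, slips, guesses):
        if q == 1:
            # A mastered skill is applied correctly unless the examinee slips;
            # an unmastered skill is applied correctly only by guessing.
            probability *= (1.0 - s) if a == 1 else g
    return probability


# Hypothetical item requiring skills 1 and 2 out of three.
q_row = [1, 1, 0]
master = dina_probability([1, 1, 0], q_row, slip=0.1, guess=0.2)
non_master = dina_probability([1, 0, 1], q_row, slip=0.1, guess=0.2)
nida_partial = nida_probability([1, 0, 1], q_row, [0.1, 0.1, 0.1], [0.2, 0.2, 0.2])
```

Under DINA, an examinee missing any required skill falls back to the guessing probability regardless of how many skills are missing, whereas under NIDA each required skill contributes its own factor to the probability of a correct response.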
