Similar literature
20 similar documents found
1.
Model building or model selection with linear mixed models (LMMs) is complicated by the presence of both fixed effects and random effects. The fixed effects structure and random effects structure are codependent, so selection of one influences the other. Most presentations of LMM in psychology and education are based on a multilevel or hierarchical approach in which the variance-covariance matrix of the random effects is assumed to be positive definite with nonzero values for the variances. When the number of fixed effects and random effects is unknown, the predominant approach to model building is a step-up method in which one starts with a limited model (e.g., few fixed and random intercepts) and then additional fixed effects and random effects are added based on statistical tests. A model building approach that has received less attention in psychology and education is a top-down method. In the top-down method, the initial model has a single random intercept but is loaded with fixed effects (also known as an “overelaborate” model). Based on the overelaborate fixed effects model, the need for additional random effects is determined. There has been little if any examination of the ability of these methods to identify a true population model (i.e., identifying the model that generated the data). The purpose of this article is to examine the performance of the step-up and top-down model building approaches for exploratory longitudinal data analysis. Student achievement data sets from the Chicago longitudinal study serve as the populations in the simulations.

2.
This study addresses the issue of data combination in personnel selection. In a pilot study for the selection of trainee pilots for the German Luftwaffe, 99 applicants were assessed using a comprehensive battery of tests that measured inductive thinking, spatial thinking, attentiveness, visual and verbal short-term memory, sensorimotor coordination, and reactive stress tolerance. The global evaluation of the applicants' performance in a flight simulator served as an external criterion. The predictive validity of this test battery was checked by carrying out a discriminant analysis as well as by calculating a neural network. The 2 methods were compared with regard to their classification rate, stability, and separation of correct and incorrect classifications. The results show that artificial neural networks are useful tools for improving the quality of selection procedures for trainee pilots.

3.
The concept of a psychophysical threshold is foundational in perceptual psychology. In practice, thresholds are operationalized as stimulus values that lead to a fairly high level of performance such as .75 or .707 in two-choice tasks. These operationalizations are not useful for assessing subliminality, the state in which a stimulus is so weak that performance is at chance. We present a hierarchical Bayesian model of performance that incorporates a threshold that divides subliminal from near-liminal performance. The model provides a convenient means to measure at-chance thresholds and therefore is useful for testing theories of subliminal priming. The hierarchical nature of the model is critical for efficient analysis as strength is pooled across people and stimulus values. A comparison to Rasch psychometric models is provided.

4.
Multivariate count data are commonly analysed by using Poisson distributions with varying intensity parameters, resulting in a random-effects model. In the analysis of a data set on the frequency of different emotion experiences we find that a Poisson model with a single random effect does not yield an adequate fit. An alternative model that requires as many random effects as emotion categories requires high-dimensional integration and the estimation of a large number of parameters. As a solution to these computational problems, we propose a factor-analytic Poisson model and show that a two-dimensional factor model fits the reported data very well. Moreover, it yields a substantively satisfactory solution: one factor describing the degree of pleasantness and unpleasantness of emotions and the other factor describing the activation levels of the emotions. We discuss the incorporation of covariates to facilitate rigorous tests of the random-effects structure. Marginal maximum likelihood methods lead to straightforward estimation of the model, for which goodness-of-fit tests are also presented.

5.
Model selection is a central issue in mathematical psychology. One useful criterion for model selection is generalizability; that is, the chosen model should yield the best predictions for future data. Some researchers in psychology have proposed that the Bayes factor can be used for assessing model generalizability. An alternative method, known as the generalization criterion, has also been proposed for the same purpose. We argue that these two methods address different levels of model generalizability (local and global), and will often produce divergent conclusions. We illustrate this divergence by applying the Bayes factor and the generalization criterion to a comparison of retention functions. The application of alternative model selection criteria will also be demonstrated within the framework of model generalizability.

6.
Three methods for fitting the diffusion model (Ratcliff, 1978) to experimental data are examined. Sets of simulated data were generated with known parameter values, and from fits of the model, we found that the maximum likelihood method was better than the chi-square and weighted least squares methods by criteria of bias in the parameters relative to the parameter values used to generate the data and standard deviations in the parameter estimates. The standard deviations in the parameter values can be used as measures of the variability in parameter estimates from fits to experimental data. We introduced contaminant reaction times and variability into the other components of processing besides the decision process and found that the maximum likelihood and chi-square methods failed, sometimes dramatically. But the weighted least squares method was robust to these two factors. We then present results from modifications of the maximum likelihood and chi-square methods, in which these factors are explicitly modeled, and show that the parameter values of the diffusion model are recovered well. We argue that explicit modeling is an important method for addressing contaminants and variability in nondecision processes and that it can be applied in any theoretical approach to modeling reaction time.

7.
Inference of variance components in linear mixed modeling (LMM) provides evidence of heterogeneity between individuals or clusters. When only nonnegative variances are allowed, there is a boundary (i.e., 0) in the variances’ parameter space, and regular inference statistical procedures for such a parameter could be problematic. The goal of this article is to introduce a practically feasible permutation method to make inferences about variance components while considering the boundary issue in LMM. The permutation tests with different settings (i.e., constrained vs. unconstrained estimation, specific vs. generalized test, different ways of calculating p values, and different ways of permutation) were examined with both normal data and non-normal data. In addition, the permutation tests were compared to likelihood ratio (LR) tests with a mixture of chi-squared distributions as the reference distribution. We found that the unconstrained permutation test with the one-sided p-value approach performed better than the other permutation tests and is a useful alternative when the LR tests are not applicable. An R function is provided to facilitate the implementation of the permutation tests, and a real data example is used to illustrate the application. We hope our results will help researchers choose appropriate tests when testing variance components in LMM.
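The general idea of a permutation test for between-cluster heterogeneity can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors' procedure or their R function: it uses a plain between-cluster sum of squares as the test statistic rather than an LMM variance estimate, and the names `between_cluster_stat` and `permutation_p_value`, as well as the toy data, are hypothetical.

```python
import random
from statistics import mean


def between_cluster_stat(values, labels):
    """Between-cluster sum of squares: sum of n_k * (cluster mean - grand mean)^2."""
    grand = mean(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return sum(len(vs) * (mean(vs) - grand) ** 2 for vs in groups.values())


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """One-sided permutation p-value for the null of no between-cluster variance."""
    rng = random.Random(seed)
    observed = between_cluster_stat(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        # Shuffling cluster labels breaks any link between cluster and outcome.
        rng.shuffle(shuffled)
        if between_cluster_stat(values, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: three clusters with clearly different means.
values = [1.0, 1.2, 0.9, 5.1, 4.8, 5.3, 9.9, 10.2, 10.0]
labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
p = permutation_p_value(values, labels)
```

Only large values of the statistic count as evidence against the null, which mirrors the one-sided p-value approach mentioned in the abstract.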

8.
In this paper, we apply Vuong’s general approach of model selection to the comparison of nested and non-nested unidimensional and multidimensional item response theory (IRT) models. Vuong’s approach of model selection is useful because it allows for formal statistical tests of both nested and non-nested models. However, only the test of non-nested models has been applied in the context of IRT models to date. After summarizing the statistical theory underlying the tests, we investigate the performance of all three distinct Vuong tests in the context of IRT models using simulation studies and real data. In the non-nested case we observed that the tests can reliably distinguish between the graded response model and the generalized partial credit model. In the nested case, we observed that the tests typically perform as well as or sometimes better than the traditional likelihood ratio test. Based on these results, we argue that Vuong’s approach provides a useful set of tools for researchers and practitioners to effectively compare competing nested and non-nested IRT models.
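For orientation, the non-nested Vuong statistic can be sketched from per-observation log-likelihoods. This is a minimal sketch, assuming the casewise log-likelihoods of two already fitted models are available as plain lists; it covers only the non-nested z statistic, not the other Vuong tests discussed in the article, and the function names and numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def vuong_z(loglik_a, loglik_b):
    """Vuong z statistic from casewise log-likelihoods of models A and B."""
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    omega = pstdev(diffs)  # standard deviation of the casewise differences
    if omega == 0:
        raise ValueError("models are indistinguishable on these data")
    # Positive values favour model A, negative values favour model B.
    return math.sqrt(n) * mean(diffs) / omega


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical casewise log-likelihoods for two fitted models.
loglik_a = [-1.0, -1.2, -0.8, -1.1]
loglik_b = [-1.5, -1.4, -1.6, -1.3]
z = vuong_z(loglik_a, loglik_b)
p = two_sided_p(z)
```

Swapping the two models flips the sign of the statistic and leaves the two-sided p-value unchanged.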

9.
A model is described to account for the data of Durso, Cooke, Breen, and Schvaneveldt (1987). On the basis of the relative frequency of an item's presentation as a target, the item develops an automatic tendency to attract attention. When stimuli are then displayed, each calls the attention system to a degree determined by its present strength. We assume that attention eventually drifts to the strongest stimulus (which is then given as a response), but in a time determined inversely by the difference in strength between the two strongest stimuli. A version of this model in which the strengths were freely estimated parameters predicted the various elements of the data with good accuracy. In other versions of the model, strength values were derived from assumptions concerning the learning of automatism. Two of these models, quite different in character, captured the major qualitative features of the data. Further empirical tests of the models are suggested.

10.
Interpretation is the process whereby a hearer reasons to an interpretation of a speaker's discourse. The hearer normally adopts a credulous attitude to the discourse, at least for the purposes of interpreting it. That is to say the hearer tries to accommodate the truth of all the speaker's utterances in deriving an intended model. We present a nonmonotonic logical model of this process which defines unique minimal preferred models and efficiently simulates a kind of closed-world reasoning of particular interest for human cognition. Byrne's "suppression" data (Byrne, 1989) are used to illustrate how variants on this logic can capture and motivate subtly different interpretative stances which different subjects adopt, thus indicating where more fine-grained empirical data are required to understand what subjects are doing in this task. We then show that this logical competence model can be implemented in spreading activation network models. A one pass process interprets the textual input by constructing a network which then computes minimal preferred models for (3-valued) valuations of the set of propositions of the text. The neural implementation distinguishes easy forward reasoning from more complex backward reasoning in a way that may be useful in explaining directionality in human reasoning.

11.
Social networks are increasingly becoming recognized as a source of influence on political attitudes and behavior. In this study, we examine the moderating impact of social networks on the relationship among several attitudes. We argue that those who regularly interact with individuals with different views from their own will be more likely to think of themselves in nonpartisan terms. It is therefore hypothesized that an individual's discussion network influences the relationship between one's support for various core values and one's partisanship. As a corollary, we argue that disagreement in discussion networks reduces individuals' reliance on partisanship when forming subsequent attitudes. To test these propositions, we employ data asking respondents to list individuals with whom they discuss politics on a regular basis and who such individuals supported in a recent election to create a measure of network disagreement. Empirical tests provide strong support for our hypotheses.

12.
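This study examines how bandwidth selection method, sample size, number of items, equating design, and data simulation method affect item response theory observed-score kernel equating. Study data were generated with two data simulation methods, and local and global evaluation indices were computed. In the random groups design, the bandwidth selection methods performed similarly, and examinee sample size and number of items had little effect. In the nonequivalent groups design, the penalty method and Silverman's rule of thumb performed best; increasing the number of items reduced both percent relative error and random error, whereas increasing the sample size increased percent relative error and reduced random error. The data simulation method can affect the evaluation of equating. Future work should focus on the systematic evaluation of equating.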

13.
Sensitivity of MRQAP Tests to Collinearity and Autocorrelation Conditions (cited 3 times: 0 self-citations, 3 by others)
Multiple regression quadratic assignment procedures (MRQAP) tests are permutation tests for multiple linear regression model coefficients for data organized in square matrices of relatedness among n objects. Such a data structure is typical in social network studies, where variables indicate some type of relation between a given set of actors. We present a new permutation method (called “double semi-partialing”, or DSP) that complements the family of extant approaches to MRQAP tests. We assess the statistical bias (type I error rate) and statistical power of the set of five methods, including DSP, across a variety of conditions of network autocorrelation, of spuriousness (size of confounder effect), and of skewness in the data. These conditions are explored across three assumed data distributions: normal, gamma, and negative binomial. We find that the Freedman–Lane method and the DSP method are the most robust against a wide array of these conditions. We also find that all five methods perform better if the test statistic is pivotal. Finally, we find limitations of usefulness for MRQAP tests: All tests degrade under simultaneous conditions of extreme skewness and high spuriousness for gamma and negative binomial distributions. Special thanks go to Cajo Ter Braak, Philip Hans Franses, Patrick Houweling, Pierre Legendre, three anonymous reviewers, the associate editor, and the editor for comments.
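The permutation step shared by QAP-style tests can be sketched for the simplest two-matrix case. This is a minimal sketch of plain QAP correlation, in which the rows and columns of one matrix are relabelled with the same node permutation; it is not the DSP or Freedman–Lane procedure, and all names and the toy matrix are hypothetical.

```python
import math
import random
from statistics import mean


def offdiag(m):
    """Off-diagonal entries of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]


def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permute_nodes(m, perm):
    """Relabel nodes: rows and columns are permuted together."""
    n = len(m)
    return [[m[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def qap_p_value(y, x, n_perm=1000, seed=0):
    """Observed correlation and two-sided QAP permutation p-value."""
    rng = random.Random(seed)
    observed = corr(offdiag(y), offdiag(x))
    perm = list(range(len(y)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        r = corr(offdiag(permute_nodes(y, perm)), offdiag(x))
        if abs(r) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical 4-node relation matrix, compared with itself for illustration.
x = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
r, p = qap_p_value(x, x, n_perm=200)
```

Permuting rows and columns jointly keeps each actor's ties together, so the dependence structure within the matrix is preserved under the null.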

14.
Laenen, Alonso, and Molenberghs (2007) and Laenen, Alonso, Molenberghs, and Vangeneugden (2009) proposed a method to assess the reliability of rating scales in a longitudinal context. The methodology is based on hierarchical linear models, and reliability coefficients are derived from the corresponding covariance matrices. However, finding a good parsimonious model to describe complex longitudinal data is a challenging task. Frequently, several models fit the data equally well, raising the problem of model selection uncertainty. When model uncertainty is high one may resort to model averaging, where inferences are based not on one but on an entire set of models. We explored the use of different model building strategies, including model averaging, in reliability estimation. We found that the approach introduced by Laenen et al. (2007, 2009) combined with some of these strategies may yield meaningful results in the presence of high model selection uncertainty and when all models are misspecified, in so far as some of them manage to capture the most salient features of the data. Nonetheless, when all models omit prominent regularities in the data, misleading results may be obtained. The main ideas are further illustrated on a case study in which the reliability of the Hamilton Anxiety Rating Scale is estimated. Importantly, the ambit of model selection uncertainty and model averaging transcends the specific setting studied in the paper and may be of interest in other areas of psychometrics.

15.
A new algorithm for obtaining exact person fit indexes for the Rasch model is introduced which realizes most powerful tests for a very general family of alternative hypotheses, including tests concerning DIF as well as model-deviating item correlations. The method is also used as a goodness-of-fit test for whole data sets where the item parameters are assumed to be known. For tests with at most 30 items, exact values are obtained; for longer tests, a Monte Carlo algorithm is proposed. Simulated examples and an empirical investigation demonstrate test power and applicability to item elimination. The author wishes to thank Elisabeth Ponocny-Seliger and the reviewers for many helpful comments. All exact goodness-of-fit tests proposed in this article are implemented in the menu-driven program T-Rasch 1.0 by Ponocny and Ponocny-Seliger (1999) which can be obtained from ProGAMMA (WWW: http://www.gamma.rug.nl) and also performs nonparametric tests.

16.
Pratiques Psychologiques, 2007, 13(2): 255-265
The functional method is a new method for constructing subjective evaluation tests, such as tests of personality, values, or competences. Its approach differs from the commonly used methods inherited directly from Spearman's work on tests of maximal performance. Those tests were developed almost one hundred years ago for selection purposes, within a differential perspective that posits certain “underlying” aptitudes whose level is obtained simply by summing the correct answers. Subjective evaluation tests, by contrast, have no unique underlying aptitude and do not yield a sum of correct answers; instead they give a global, multidimensional picture of oneself. The functional method takes this multidimensional specificity into account and offers a model of the subject's answers that reveals his or her strategy, which is projected into a hyper-spherical measurement space defined by the characteristics of the items. This measurement space allows the calculation of scores that are the projections (scalar products) of vectors onto one another. The main advantages of this method are better reliability, a clinical intra-personal evaluation of subjects (much more efficient than the simple differential approach, which is also described in the article) and, finally, more informative control over the test-taking conditions (test biases, suitability of the test for a person, etc.).

17.
A need is identified for a theoretical model to help make sense of conflicting reports on appropriate intervention in the lives of parents of handicapped children. Crisis theory offers a useful way forward, particularly when incorporated into the concept of psycho-social transitions. This concept is used to construct a model in which various negative parental emotions are viewed as entirely natural, but which may reflect different kinds of transition and therefore require different forms of intervention. It is postulated that appropriate forms of help to meet specific transitional needs will be most likely to lead to positive crisis resolution and adjustment to the reality of the child's handicapping condition. Preliminary findings of a study are reported, which employs the model as a means of evaluating the satisfied and unmet needs of three groups of mothers of handicapped children of similar ages. Although living in the same geographical area, each group had received a different kind of service for themselves and their children. The psycho-social transitions model proved useful in discriminating between the resolved and unresolved problems of each group and in identifying potentially helpful ways forward.

18.
We discuss measuring and detecting influential observations and outliers in the context of exponential family random graph (ERG) models for social networks. We focus on the level of the nodes of the network and consider those nodes whose removal would result in changes to the model as extreme or “central” with respect to the structural features that “matter”. We construe removal in terms of two case-deletion strategies: the tie-variables of an actor are assumed to be unobserved, or the node is removed resulting in the induced subgraph. We define the difference in inferred model resulting from case deletion from the perspective of information theory and difference in estimates, in both the natural and mean-value parameterisation, representing varying degrees of approximation. We arrive at several measures of influence and propose the use of two that do not require refitting of the model and lend themselves to routine application in the ERGM fitting procedure. MCMC p values are obtained for testing how extreme each node is with respect to the network structure. The influence measures are applied to two well-known data sets to illustrate the information they provide. From a network perspective, the proposed statistics offer an indication of which actors are most distinctive in the network structure, in terms of not abiding by the structural norms present across other actors.

19.
We advocate for rank-permutation tests as the best choice for null-hypothesis significance testing of behavioral data, because these tests require neither distributional assumptions about the populations from which our data were drawn nor the measurement assumption that our data are measured on an interval scale. We provide an algorithm that enables exact-probability versions of such tests without recourse to either large-sample approximation or resampling approaches. We particularly consider a rank-permutation test for monotonic trend, and provide an extension of this test that allows unequal number of data points, or observations, for each subject. We provide an extended table of critical values of the test statistic for this test, and both a spreadsheet implementation and an Oracle® Java Web Start application to generate other critical values at https://sites.google.com/a/eastbayspecialists.co.nz/rank-permutation/ .
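What an exact-probability rank-permutation test for trend computes can be shown by brute-force enumeration on a very small case. This is a sketch under strong assumptions (a single series of untied ranks, with a position-weighted rank sum as the trend statistic); it is not the authors' algorithm, which handles several subjects and unequal numbers of observations, and the names `trend_stat` and `exact_trend_p` are hypothetical.

```python
from itertools import permutations


def trend_stat(ranks):
    """Position-weighted rank sum; largest when ranks increase monotonically."""
    return sum(position * r for position, r in enumerate(ranks, start=1))


def exact_trend_p(ranks):
    """Exact one-sided p-value: share of orderings at least as trend-like."""
    observed = trend_stat(ranks)
    total = 0
    hits = 0
    # Enumerate every ordering of the observed ranks (feasible only for small n).
    for ordering in permutations(sorted(ranks)):
        total += 1
        if trend_stat(ordering) >= observed:
            hits += 1
    return hits / total


p_increasing = exact_trend_p((1, 2, 3, 4))
p_decreasing = exact_trend_p((4, 3, 2, 1))
```

Because every ordering is enumerated, the p-value is exact rather than simulated, but the cost grows factorially with the number of observations, which is why a dedicated algorithm is needed for realistic sample sizes.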

20.
A novel method for the identification of differential item functioning (DIF) by means of recursive partitioning techniques is proposed. We assume an extension of the Rasch model that allows for DIF being induced by an arbitrary number of covariates for each item. Recursive partitioning on the item level results in one tree for each item and leads to simultaneous selection of items and variables that induce DIF. For each item, it is possible to detect groups of subjects with different item difficulties, defined by combinations of characteristics that are not pre-specified. The way a DIF item is determined by covariates is visualized in a small tree and therefore easily accessible. An algorithm is proposed that is based on permutation tests. Various simulation studies, including the comparison with traditional approaches to identify items with DIF, show the applicability and the competitive performance of the method. Two applications illustrate the usefulness and the advantages of the new method.
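As background for what DIF means under the Rasch model, the following sketch compares the probability of a correct response for two persons of equal ability when the item difficulty differs by group. The group-specific shift `dif_shift` and all numeric values are hypothetical illustrations, not part of the proposed tree-based method.

```python
import math


def rasch_prob(theta, beta):
    """Rasch model: probability of a correct response given ability and difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Hypothetical values: same ability, item is harder for the focal group.
theta = 0.5
beta_reference = 0.0
dif_shift = 0.8  # assumed extra difficulty for the focal group
p_reference = rasch_prob(theta, beta_reference)
p_focal = rasch_prob(theta, beta_reference + dif_shift)
```

An item without DIF would give both persons the same probability, because the response would depend on group membership only through ability.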
