Similar literature (20 results found)
1.
Growth mixture models (GMMs) with nonignorable missing data have drawn increasing attention in research communities but have not been fully studied. The goal of this article is to propose and to evaluate a Bayesian method to estimate the GMMs with latent class dependent missing data. An extended GMM is first presented in which class probabilities depend on some observed explanatory variables and data missingness depends on both the explanatory variables and a latent class variable. A full Bayesian method is then proposed to estimate the model. Through the data augmentation method, conditional posterior distributions for all model parameters and missing data are obtained. A Gibbs sampling procedure is then used to generate Markov chains of model parameters for statistical inference. The application of the model and the method is first demonstrated through the analysis of mathematical ability growth data from the National Longitudinal Survey of Youth 1997 (Bureau of Labor Statistics, U.S. Department of Labor, 1997). A simulation study considering 3 main factors (the sample size, the class probability, and the missing data mechanism) is then conducted and the results show that the proposed Bayesian estimation approach performs very well under the studied conditions. Finally, some implications of this study, including the misspecified missingness mechanism, the sample size, the sensitivity of the model, the number of latent classes, the model comparison, and the future directions of the approach, are discussed.

2.
This article uses a general latent variable framework to study a series of models for nonignorable missingness due to dropout. Nonignorable missing data modeling acknowledges that missingness may depend not only on covariates and observed outcomes at previous time points as with the standard missing at random assumption, but also on latent variables such as values that would have been observed (missing outcomes), developmental trends (growth factors), and qualitatively different types of development (latent trajectory classes). These alternative predictors of missing data can be explored in a general latent variable framework with the Mplus program. A flexible new model uses an extended pattern-mixture approach where missingness is a function of latent dropout classes in combination with growth mixture modeling. A new selection model not only allows an influence of the outcomes on missingness but allows this influence to vary across classes. Model selection is discussed. The missing data models are applied to longitudinal data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study, the largest antidepressant clinical trial in the United States to date. Despite the importance of this trial, STAR*D growth model analyses using nonignorable missing data techniques have not been explored until now. The STAR*D data are shown to feature distinct trajectory classes, including a low class corresponding to substantial improvement in depression, a minority class with a U-shaped curve corresponding to transient improvement, and a high class corresponding to no improvement. The analyses provide a new way to assess drug efficiency in the presence of dropout.

3.
When estimating multiple regression models with incomplete predictor variables, it is necessary to specify a joint distribution for the predictor variables. A convenient assumption is that this distribution is a multivariate normal distribution, which is also the default in many statistical software packages. This distribution will in general be misspecified if predictors with missing data have nonlinear effects (e.g., x²) or are included in interaction terms (e.g., x·z). In the present article, we introduce a factored regression modeling approach for estimating regression models with missing data that is based on maximum likelihood estimation. In this approach, the model likelihood is factorized into a part that is due to the model of interest and a part that is due to the model for the incomplete predictors. In three simulation studies, we showed that the factored regression modeling approach produced valid estimates of interaction and nonlinear effects in regression models with missing values on categorical or continuous predictor variables under a broad range of conditions. We developed the R package mdmb, which facilitates a user-friendly application of the factored regression modeling approach, and present a real-data example that illustrates the flexibility of the software.

4.
A maximum likelihood approach is described for estimating the validity of a test (x) as a predictor of a criterion variable (y) when there are both missing and censored y scores present in the data set. The missing data are due to selection on a latent variable (y_s) which may be conditionally related to y given x. Thus, the missing data may not be missing at random. The censoring process is due to the presence of a floor or ceiling effect. The maximum likelihood estimates are constructed using the EM algorithm. The entire analysis is demonstrated in terms of hypothetical data sets.

5.
We present a framework for estimating average and conditional effects of a discrete treatment variable on a continuous outcome variable, conditioning on categorical and continuous covariates. Using the new approach, termed the EffectLiteR approach, researchers can consider conditional treatment effects given values of all covariates in the analysis and various aggregates of these conditional treatment effects such as average effects, effects on the treated, or aggregated conditional effects given values of a subset of covariates. Building on structural equation modeling, the key advantages of the new approach are that (1) it allows for latent covariates and outcome variables; (2) it permits (higher order) interactions between the treatment variable and categorical and (latent) continuous covariates; and (3) covariates can be treated as stochastic or fixed. The approach is illustrated by an example, and open source software EffectLiteR is provided, which makes a detailed analysis of effects conveniently accessible for applied researchers.

6.
The treatment of missing data in the social sciences has changed tremendously during the last decade. Modern missing data techniques such as multiple imputation and full-information maximum likelihood are used much more frequently. These methods assume that data are missing at random. One very common approach to increase the likelihood that missing at random is achieved consists of including many covariates as so-called auxiliary variables. These variables are either included based on data considerations or in an inclusive fashion; that is, taking all available auxiliary variables. In this article, we point out that there are some instances in which auxiliary variables exhibit the surprising property of increasing bias in missing data problems. In a series of focused simulation studies, we highlight some situations in which this type of biasing behavior can occur. We briefly discuss possible ways to avoid selecting bias-inducing covariates as auxiliary variables.

7.
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood estimation methods (conditional, marginal, and joint). Three information criteria fit indices (Akaike information criterion, Bayesian information criterion, and sample size adjusted BIC) were used in a simulation study and an empirical study. Findings of this study showed that the spurious latent class problem was observed with marginal maximum likelihood and joint maximum likelihood estimations. However, conditional maximum likelihood estimation showed no over-extraction problem with non-normal ability distributions.
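
For readers unfamiliar with the three indices named in this abstract, the following is a minimal sketch of how they are commonly computed from a fitted model's log-likelihood. The function name `information_criteria`, the log-likelihoods, the parameter counts, and the sample size are all hypothetical illustrations and are not taken from the study; the sample-size adjusted BIC uses the common (n + 2) / 24 adjustment.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC, and sample-size adjusted BIC for one fitted model."""
    deviance = -2.0 * log_likelihood
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n_obs)
    # Sample-size adjusted BIC replaces n with (n + 2) / 24 in the penalty.
    sabic = deviance + n_params * math.log((n_obs + 2.0) / 24.0)
    return {"AIC": aic, "BIC": bic, "SABIC": sabic}


# Hypothetical (log-likelihood, number of parameters) pairs for
# 1-, 2-, and 3-class solutions; these values are made up for illustration.
candidates = {1: (-5230.4, 20), 2: (-5101.7, 41), 3: (-5092.3, 62)}
n_obs = 1000  # hypothetical sample size

table = {
    n_classes: information_criteria(ll, k, n_obs)
    for n_classes, (ll, k) in candidates.items()
}

# Smaller values indicate a better trade-off between fit and complexity.
best_by_bic = min(table, key=lambda n_classes: table[n_classes]["BIC"])

for n_classes, row in table.items():
    print(n_classes, {name: round(value, 1) for name, value in row.items()})
print("classes preferred by BIC:", best_by_bic)
```

In a class-enumeration setting, the solution with the smallest value of a given index is usually retained, so over-extraction shows up as an index preferring more classes than the data-generating model contains.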

8.
A common form of missing data is caused by selection on an observed variable (e.g., Z). If the selection variable was measured and is available, the data are regarded as missing at random (MAR). Selection biases correlation, reliability, and effect size estimates when these estimates are computed on listwise deleted (LD) data sets. On the other hand, maximum likelihood (ML) estimates are generally unbiased and outperform LD in most situations, at least when the data are MAR. The exception is when we estimate the partial correlation. In this situation, LD estimates are unbiased when the cause of missingness is partialled out. In other words, there is no advantage of ML estimates over LD estimates in this situation. We demonstrate that under a MAR condition, even ML estimates may become biased, depending on how partial correlations are computed. Finally, we conclude with recommendations about how future researchers might estimate partial correlations even when the cause of missingness is unknown and, perhaps, unknowable.
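
The contrast this abstract draws between the plain correlation and the partial correlation under listwise deletion can be illustrated with a small simulation. This is only a sketch under assumed conditions: the data-generating model, its coefficients, the selection rule (keep cases with Z > 0), and all function names are hypothetical and are not taken from the article.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=20000, seed=1):
    """Draw (x, y, z) rows from a hypothetical linear model with normal errors."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        e1 = rng.gauss(0, math.sqrt(0.51))
        e2 = rng.gauss(0, 0.8)
        x = 0.7 * z + e1
        y = 0.5 * z + 0.3 * e1 + e2
        rows.append((x, y, z))
    return rows


rows = simulate()
# Selection on the observed variable z: cases with z <= 0 are listwise deleted.
kept = [row for row in rows if row[2] > 0]

x_full, y_full, z_full = zip(*rows)
x_ld, y_ld, z_ld = zip(*kept)

results = {
    "r_full": pearson(x_full, y_full),
    "r_ld": pearson(x_ld, y_ld),
    "partial_full": partial_corr(x_full, y_full, z_full),
    "partial_ld": partial_corr(x_ld, y_ld, z_ld),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the listwise-deleted correlation should come out noticeably smaller than the full-sample correlation, while the two partial correlations should agree closely, because selecting on Z leaves the conditional distribution of X and Y given Z unchanged.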

9.
Statisticians typically estimate the parameters of latent class and latent profile models using the Expectation-Maximization algorithm. This paper proposes an alternative two-stage approach to model fitting. The first stage uses the modified k-means and hierarchical clustering algorithms to identify the latent classes that best satisfy the conditional independence assumption underlying the latent variable model. The second stage then uses mixture modeling treating the class membership as known. The proposed approach is theoretically justifiable, directly checks the conditional independence assumption, and converges much faster than the full likelihood approach when analyzing high-dimensional data. This paper also develops a new classification rule based on latent variable models. The proposed classification procedure reduces the dimensionality of measured data and explicitly recognizes the heterogeneous nature of the complex disease, which makes it perfect for analyzing high-throughput genomic data. Simulation studies and real data analysis demonstrate the advantages of the proposed method.

10.
The effects of a treatment or an intervention on a count outcome are often of interest in applied research. When controlling for additional covariates, a negative binomial regression model is usually applied to estimate conditional expectations of the count outcome. The difference in conditional expectations under treatment and under control is then defined as the (conditional) treatment effect. While traditionally aggregates of these conditional treatment effects (e.g., average treatment effects) are computed by averaging over the empirical distribution, a recently proposed moment-based approach allows for computing aggregate effects as a function of distribution parameters. The moment-based approach makes it possible to control for (latent) multivariate normally distributed covariates and provides more reliable inferences under certain conditions. In this paper we propose three different ways to account for non-normally distributed continuous covariates in this approach: an alternative, known non-normal distribution; a plausible factorization of the joint distribution; and an approximation using finite Gaussian mixtures. A saturated model is used for categorical covariates, making a distributional assumption obsolete. We further extend the moment-based approach to allow for multiple treatment conditions and the computation of conditional effects for categorical covariates. An illustrative example highlighting the key features of our extension is provided.

11.
In this article, a maximum likelihood approach is developed to analyze structural equation models with dichotomous variables that are common in behavioral, psychological and social research. To assess nonlinear causal effects among the latent variables, the structural equation in the model is defined by a nonlinear function. The basic idea of the development is to augment the observed dichotomous data with the hypothetical missing data that involve the latent underlying continuous measurements and the latent variables in the model. An EM algorithm is implemented. The conditional expectation in the E-step is approximated via observations simulated from the appropriate conditional distributions by a Metropolis-Hastings algorithm within the Gibbs sampler, whilst the M-step is completed by conditional maximization. Convergence is monitored by bridge sampling. Standard errors are also obtained. Results from a simulation study and a real example are presented to illustrate the methodology.

12.
The standard tobit or censored regression model is typically utilized for regression analysis when the dependent variable is censored. This model is generalized by developing a conditional mixture, maximum likelihood method for latent class censored regression. The proposed method simultaneously estimates separate regression functions and subject membership in K latent classes or groups given a censored dependent variable for a cross-section of subjects. Maximum likelihood estimates are obtained using an EM algorithm. The proposed method is illustrated via a consumer psychology application.

13.
Recent advances have allowed for modeling mixture components within latent growth modeling using robust, skewed mixture distributions rather than normal distributions. This feature adds flexibility in handling non-normality in longitudinal data, through manifest or latent variables, by directly modeling skewed or heavy-tailed latent classes rather than assuming a mixture of normal distributions. The aim of this study was to assess through simulation the potential under- or over-extraction of latent classes in a growth mixture model when underlying data follow either normal, skewed-normal, or skewed-t distributions. In order to assess this, we implement skewed-t, skewed-normal, and conventional normal (i.e., not skewed) forms of the growth mixture model. The skewed-t and skewed-normal versions of this model have only recently been implemented, and relatively little is known about their performance. Model comparison, fit, and classification of correctly specified and mis-specified models were assessed through various indices. Findings suggest that the accuracy of model comparison and fit measures depends on the type of (mis)specification, as well as the amount of class separation between the latent classes. A secondary simulation exposed computation and accuracy difficulties under some skewed modeling contexts. Implications of findings, recommendations for applied researchers, and future directions are discussed; a motivating example is presented using education data.

14.
Multilevel analyses are often used to estimate the effects of group-level constructs. However, when using aggregated individual data (e.g., student ratings) to assess a group-level construct (e.g., classroom climate), the observed group mean might not provide a reliable measure of the unobserved latent group mean. In the present article, we propose a Bayesian approach that can be used to estimate a multilevel latent covariate model, which corrects for the unreliable assessment of the latent group mean when estimating the group-level effect. A simulation study was conducted to evaluate the choice of different priors for the group-level variance of the predictor variable and to compare the Bayesian approach with the maximum likelihood approach implemented in the software Mplus. Results showed that, under problematic conditions (i.e., small number of groups, predictor variable with a small ICC), the Bayesian approach produced more accurate estimates of the group-level effect than the maximum likelihood approach did.

15.
Latent class regression models relate covariates and latent constructs such as psychiatric disorders. Though full maximum likelihood estimation is available, estimation is often in three steps: (i) a latent class model is fitted without covariates; (ii) latent class scores are predicted; and (iii) the scores are regressed on covariates. We propose a new method for predicting class scores that, in contrast to posterior probability-based methods, yields consistent estimators of the parameters in the third step. Additionally, in simulation studies the new methodology exhibited only a minor loss of efficiency. Finally, the new and the posterior probability-based methods are compared in an analysis of mobility/exercise.

16.
Longitudinal data sets typically suffer from attrition and other forms of missing data. When this common problem occurs, several researchers have demonstrated that correct maximum likelihood estimation with missing data can be obtained under mild assumptions concerning the missing data mechanism. With reasonable substantive theory, a mixture of cross-sectional and longitudinal methods developed within multiple-group structural equation modeling can provide a strong basis for inference about developmental change. Using an approach to the analysis of missing data, the present study investigated developmental trends in adolescent (N = 759) alcohol, marijuana, and cigarette use across a 5-year period using multiple-group latent growth modeling. An associative model revealed that common developmental trends existed for all three substances. Age and gender were included in the model as predictors of initial status and developmental change. The findings are discussed with respect to the utility of latent variable structural equation modeling techniques and missing data approaches in the study of developmental change.

17.
In the Italian academic system, a student can enroll for an exam immediately after the end of the teaching period or can postpone it; in this second case the exam result is missing. We propose an approach for the evaluation of a student's performance throughout the course of study, accounting also for nonattempted exams. The approach is based on an item response theory model that includes two discrete latent variables representing student performance and priority in selecting the exams to take. We explicitly account for nonignorable missing observations as the indicators of attempted exams also contribute to measuring the performance (within-item multidimensionality). The model also allows for individual covariates in its structural part.

18.
The posterior distribution of the bivariate correlation is analytically derived given a data set where x is completely observed but y is missing at random for a portion of the sample. Interval estimates of the correlation are then constructed from the posterior distribution in terms of highest density regions (HDRs). Various choices for the form of the prior distribution are explored. For each of these priors, the resulting Bayesian HDRs are compared with each other and with intervals derived from maximum likelihood theory.

19.
A latent variable modelling approach is discussed, which can be used to evaluate indices of linear relationship between latent constructs in incomplete data sets. The method is based on an application of maximum-likelihood estimation and inclusion of covariates predictive of missing values. The approach can be employed for point and interval estimation of latent correlations in the presence of missing data, and capitalizes on enhanced plausibility of the assumption of data missing at random through introduction of informative covariates. The method is illustrated on empirical data.

20.
The primary aim of the present 1-year longitudinal study among university employees (N = 1314) was to investigate individual development of perceived employability (PE) by utilizing a person-centred approach. Thus, we identified latent classes of PE across 1 year based on growth mixture modelling. In addition, the latent classes were characterized by perceived job insecurity and the type of employment contract and its changes over the 1-year time period. The results showed four latent classes of PE that differed in the level and the direction of mean-level changes over time. These latent classes were: (1) stable relatively high PE (n = 641); (2) unstable decreasing PE (n = 45); (3) unstable increasing PE (n = 24); and (4) stable relatively low PE (n = 603). Perceived job insecurity was associated with the latent class membership of PE. That is, low levels of perceived job insecurity were associated with favourable PE classes (i.e., “stable relatively high” and “unstable increasing employability”), whereas high levels of job insecurity were associated with unfavourable PE classes (i.e., “stable relatively low” and “unstable decreasing employability”). Furthermore, transitions from a temporary to a permanent job contract occurred more often in the favourable than in the unfavourable PE classes, but transitions from a permanent to a temporary contract were more likely in the unfavourable classes. Thus, our study indicated a substantial amount of heterogeneity in the development of PE across 1 year.
