Similar Documents
20 similar documents found (search time: 31 ms)
1.
This work compares the sensitivity of five modern analytical techniques for detecting effects in a design with partially repeated measures when the assumptions of the traditional ANOVA approach are not met, namely: the mixed-model approach fitted by means of the SAS Proc Mixed module, the Bootstrap-F approach, the Brown-Forsythe multivariate approach, the Welch-James multivariate approach, and the Welch-James multivariate approach with robust estimators. Previously, Livacic-Rojas, Vallejo and Fernández found that these methods are comparable in terms of their Type I error rates. The results obtained suggest that the mixed-model approach, as well as the Brown-Forsythe and Welch-James approaches, satisfactorily controlled the Type II error rates corresponding to the main effects of the measurement occasions under most of the conditions assessed.

2.
The impact of baseline trend control on visual analyses of AB intervention graphs was examined with simulated data at various values of baseline trend, autocorrelation, and effect size. Participants included 202 undergraduate students with minimal training in visual analysis and 10 graduate students and faculty with more training and experience in visual analysis. In general, results were similar across both groups of participants. Without statistical adjustments to correct for baseline trend, Type I errors greatly increased as baseline trend increased. With corrections for baseline trend, fewer Type I errors were made. As trend increased, participants made fewer Type II errors on the unadjusted graphs as compared to the graphs with baseline trend control. The greater Type II error rate on adjusted graphs could be an artifact of study design (i.e., participants did not know if baseline trend control had been applied), and the impact of MASAJ on Type II errors needs to be explored in detail prior to more widespread use of the method. Implications for future use of baseline trend control techniques by educational professionals are discussed.

3.
Repeated measures analyses of variance are the method of choice in many studies from experimental psychology and the neurosciences. Data from these fields are often characterized by small sample sizes, high numbers of factor levels of the within-subjects factor(s), and nonnormally distributed response variables such as response times. For a design with a single within-subjects factor, we investigated Type I error control in univariate tests with corrected degrees of freedom, the multivariate approach, and a mixed-model (multilevel) approach (SAS PROC MIXED) with Kenward–Roger's adjusted degrees of freedom. We simulated multivariate normal and nonnormal distributions with varied population variance–covariance structures (spherical and nonspherical), sample sizes (N), and numbers of factor levels (K). For normally distributed data, as expected, the univariate approach with Huynh–Feldt correction controlled the Type I error rate with only very few exceptions, even if sample sizes as low as three were combined with high numbers of factor levels. The multivariate approach also controlled the Type I error rate, but it requires N ≥ K. PROC MIXED often showed acceptable control of the Type I error rate for normal data, but it also produced several liberal or conservative results. For nonnormal data, all of the procedures showed clear deviations from the nominal Type I error rate in many conditions, even for sample sizes greater than 50. Thus, none of these approaches can be considered robust if the response variable is nonnormally distributed. The results indicate that both the variance heterogeneity and covariance heterogeneity of the population covariance matrices affect the error rates.
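For orientation, here is a minimal Python sketch of the sphericity-correction idea behind the Huynh–Feldt adjusted univariate test mentioned above. The double-centering form of the Greenhouse–Geisser estimate and the single-group Huynh–Feldt adjustment shown here are one common formulation, assumed rather than taken from the paper, and the example data are simulated.

```python
import numpy as np

def gg_hf_epsilon(data):
    """Sphericity corrections for a one-way repeated measures design.

    data: (N, K) array, N subjects by K within-subject factor levels.
    Returns the Greenhouse-Geisser and Huynh-Feldt epsilon estimates
    used to shrink the univariate F test's degrees of freedom.
    """
    n, k = data.shape
    s = np.cov(data, rowvar=False)          # K x K sample covariance matrix
    p = np.eye(k) - np.ones((k, k)) / k     # centering projector
    s_dc = p @ s @ p                        # double-centered covariance

    # Greenhouse-Geisser epsilon
    eps_gg = np.trace(s_dc) ** 2 / ((k - 1) * np.trace(s_dc @ s_dc))

    # Huynh-Feldt epsilon (one common single-group formulation), capped at 1
    eps_hf = (n * (k - 1) * eps_gg - 2) / ((k - 1) * (n - 1 - (k - 1) * eps_gg))
    return eps_gg, min(eps_hf, 1.0)

# The corrected F test then uses df1 = eps*(K-1) and df2 = eps*(K-1)*(N-1).
rng = np.random.default_rng(1)
y = rng.normal(size=(10, 4))
print(gg_hf_epsilon(y))
```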

4.
Empirical Type I error and power rates were estimated for (a) the doubly multivariate model, (b) the Welch-James multivariate solution developed by Keselman, Carriere and Lix (1993) using Johansen's results (1980), and (c) the multivariate version of the modified Brown-Forsythe (1974) procedure. The performance of these procedures was investigated by testing within-blocks sources of variation in a multivariate split-plot design containing unequal covariance matrices. The results indicate that the doubly multivariate model did not provide effective Type I error control, while the Welch-James procedure provided robust and powerful tests of the within-subjects main effect; however, this approach provided liberal tests of the interaction effect. The results also indicate that the modified Brown-Forsythe procedure provided robust tests of within-subjects main and interaction effects, especially when the design was balanced or when group sizes and covariance matrices were positively paired.

5.
Personality moderating variables act to qualify the relationship between a personality trait measure and a relevant behavioral criterion. Two data analytic techniques that can be used to test for significant moderating effects are the "median split" (MS) approach and the "moderated multiple regression" (MMR) approach. The goals of the present research were (a) to apply the MS approach to computer-simulated data in which the moderator and trait extremity are confounded, to determine the extent of artifact, and (b) to compare the performance (Type I and Type II error rates) of the two approaches when applied to confounded and nonconfounded data. It was found that when the MS approach was applied to confounded data in which no real moderating effect existed, this approach produced an alarming rate of apparent, but spurious, moderating effects. When the MMR approach was applied to the same data, the rate of spurious effects was reduced to that expected by chance. When both approaches were applied to simulated data which contained genuine moderating effects, the MMR approach consistently resulted in more correct detections of these effects than the MS approach. We conclude that researchers should always employ the MMR rather than the MS approach when testing for personality moderator variable effects.
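As a concrete illustration of the moderated multiple regression (MMR) test favoured above, here is a minimal sketch in plain numpy/scipy: the moderation test is the t test on the (centered) product term. The variable names, the mean-centering step, and the simulated data are illustrative assumptions, not taken from the original study.

```python
import numpy as np
from scipy import stats

def mmr_interaction_test(x, z, y):
    """Moderated multiple regression: test the X x Z interaction term.

    Regresses y on x, z, and their (centered) product; the t test on the
    product coefficient is the moderation test, in contrast to splitting
    the sample at the median of z and comparing subgroup correlations.
    """
    xc, zc = x - x.mean(), z - z.mean()          # centering reduces collinearity
    design = np.column_stack([np.ones_like(xc), xc, zc, xc * zc])
    beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    df = len(y) - design.shape[1]
    sigma2 = resid @ resid / df
    cov_beta = sigma2 * np.linalg.inv(design.T @ design)
    t = beta[3] / np.sqrt(cov_beta[3, 3])        # interaction coefficient
    p = 2 * stats.t.sf(abs(t), df)
    return beta[3], t, p

rng = np.random.default_rng(0)
n = 200
x, z = rng.normal(size=n), rng.normal(size=n)
y = 0.4 * x + 0.3 * z + 0.25 * x * z + rng.normal(size=n)
print(mmr_interaction_test(x, z, y))
```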

6.
Reaction time (RT) to correct identification (ID) was measured for pairs of different letters presented on a memory drum. There were two types of lists. In Type I, visual and name similarity varied orthogonally at two levels. In Type II, one feature was constant at one level, while the other varied at two levels of similarity. For both types of lists, RT is a function of the feature that is more easily extracted from the stimulus. Relative visual and name modality biases were estimated, and name bias is relatively more salient than visual bias under these experimental conditions. Specific letters differ in the amount of feature processing required for correct ID and in the relative contribution of visual and name feature effects on this processing.

7.
Researchers can adopt one of many different measures of central tendency to examine the effect of a treatment variable across groups. These include least squares means, trimmed means, M-estimators and medians. In addition, some methods begin with a preliminary test to determine the shapes of distributions before adopting a particular estimator of the typical score. We compared a number of recently developed adaptive robust methods with respect to their ability to control Type I error and their sensitivity to detect differences between the groups when data were non-normal and heterogeneous, and the design was unbalanced. In particular, two new approaches to comparing the typical score across treatment groups, due to Babu, Padmanabhan, and Puri, were compared to two new methods presented by Wilcox and by Keselman, Wilcox, Othman, and Fradette. The procedures examined generally resulted in good Type I error control and therefore, on the basis of this criterion, it would be difficult to recommend one method over the other. However, the power results clearly favour one of the methods presented by Wilcox and Keselman; indeed, in the vast majority of the cases investigated, this most favoured approach had substantially larger power values than the other procedures, particularly when there were more than two treatment groups.
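As a small aside for readers unfamiliar with the robust "typical score" estimators compared above, the snippet below computes a 20% trimmed mean with scipy; the data are invented, and the comparison methods studied in the paper (e.g., Yuen-type tests and the adaptive procedures of Babu, Padmanabhan, and Puri) are not implemented here.

```python
import numpy as np
from scipy import stats

# A 20% trimmed mean discards the lowest and highest 20% of observations
# before averaging, which limits the influence of heavy tails and outliers.
group_a = np.array([2.1, 2.4, 2.5, 2.7, 2.9, 3.0, 3.1, 3.3, 9.8])
group_b = np.array([1.9, 2.0, 2.2, 2.3, 2.5, 2.6, 2.8, 2.9, 3.0])

print(stats.trim_mean(group_a, proportiontocut=0.20))  # barely moved by the 9.8 outlier
print(stats.trim_mean(group_b, proportiontocut=0.20))
# Comparing such estimates across groups is typically done with Yuen's method,
# i.e., trimmed means paired with Winsorized variances.
```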

8.
Marsh HW, Wen Z, Hau KT. Psychological Methods, 2004, 9(3): 275-300
Interactions between (multiple indicator) latent variables are rarely used because of implementation complexity and competing strategies. Based on 4 simulation studies, the traditional constrained approach performed more poorly than did 3 new approaches--unconstrained, generalized appended product indicator, and quasi-maximum-likelihood (QML). The authors' new unconstrained approach was easiest to apply. All 4 approaches were relatively unbiased for normally distributed indicators, but the constrained and QML approaches were more biased for nonnormal data; the size and direction of the bias varied with the distribution but not with the sample size. QML had more power, but this advantage was qualified by consistently higher Type I error rates. The authors also compared general strategies for defining product indicators to represent the latent interaction factor.

9.
There is a growing use of noncognitive assessments around the world, and recent research has posited an ideal point response process underlying such measures. A critical issue is whether the typical use of dominance approaches (e.g., average scores, factor analysis, and Samejima's graded response model) in scoring such measures is adequate. This study examined the performance of an ideal point scoring approach (e.g., the generalized graded unfolding model) as compared to the typical dominance scoring approaches in detecting curvilinear relationships between the scored trait and an external variable. Simulation results showed that when data followed the ideal point model, the ideal point approach generally exhibited more power and provided more accurate estimates of curvilinear effects than the dominance approaches. No substantial difference was found between the ideal point and dominance scoring approaches in terms of Type I error rate and bias across different sample sizes and scale lengths, although skewness in the distributions of the trait and the external variable can potentially reduce statistical power. For dominance data, the ideal point scoring approach exhibited convergence problems in most conditions and failed to perform as well as the dominance scoring approaches. Practical implications for scoring responses to Likert-type surveys to examine curvilinear effects are discussed.

10.
One approach to the analysis of repeated measures data allows researchers to model the covariance structure of the data rather than presume a certain structure, as is the case with conventional univariate and multivariate test statistics. This mixed-model approach was evaluated for testing all possible pairwise differences among repeated measures marginal means in a Between-Subjects x Within-Subjects design. Specifically, the authors investigated Type I error and power rates for a number of simultaneous and stepwise multiple comparison procedures using SAS (1999) PROC MIXED in unbalanced designs when normality and covariance homogeneity assumptions did not hold. J. P. Shaffer's (1986) sequentially rejective step-down and Y. Hochberg's (1988) sequentially acceptive step-up Bonferroni procedures, based on an unstructured covariance structure, had superior Type I error control and power to detect true pairwise differences across the investigated conditions.
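Hochberg's (1988) step-up procedure named above can be stated compactly; the sketch below is a standalone version that operates on any set of pairwise p-values and is not tied to the PROC MIXED machinery used in the study. The example p-values are invented.

```python
import numpy as np

def hochberg_stepup(pvals, alpha=0.05):
    """Hochberg's (1988) sequentially acceptive step-up Bonferroni procedure.

    Work through the p-values from largest to smallest, comparing the i-th
    largest with alpha / i; as soon as one is significant, it and all
    hypotheses with smaller p-values are rejected.
    """
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)[::-1]                  # largest p first
    reject = np.zeros(m, dtype=bool)
    for i, idx in enumerate(order, start=1):
        if p[idx] <= alpha / i:
            reject[p <= p[idx]] = True           # reject this and all smaller p-values
            break
    return reject

# Example: six pairwise comparisons among repeated measures marginal means.
print(hochberg_stepup([0.001, 0.010, 0.012, 0.020, 0.150, 0.430]))
```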

11.
12.
Process factor analysis (PFA) is a latent variable model for intensive longitudinal data. It combines P-technique factor analysis and time series analysis. A goodness-of-fit test for PFA is currently unavailable. In this paper, we propose a parametric bootstrap method for assessing model fit in PFA. We illustrate the test with an empirical data set in which 22 participants rated their affect every day over a period of 90 days. We also explore the Type I error and power of the parametric bootstrap test with simulated data.
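The proposed test is specific to PFA, but the parametric bootstrap logic it relies on is generic: fit the model, simulate data from the fitted model, refit, and compare the observed fit statistic with its simulated distribution. The sketch below illustrates that logic with a deliberately simple stand-in model (a normal distribution with a Kolmogorov-Smirnov fit statistic), which is an assumption for illustration only, not the PFA model itself.

```python
import numpy as np
from scipy import stats

def parametric_bootstrap_fit_test(x, n_boot=1000, seed=0):
    """Generic parametric bootstrap goodness-of-fit test.

    The working model here is a normal distribution and the fit statistic is
    the Kolmogorov-Smirnov distance; for PFA the same four steps apply with
    the PFA model and its own fit statistic.
    """
    rng = np.random.default_rng(seed)
    mu, sigma = x.mean(), x.std(ddof=1)                  # 1. fit the model
    t_obs = stats.kstest(x, "norm", args=(mu, sigma)).statistic

    t_boot = np.empty(n_boot)
    for b in range(n_boot):                              # 2. simulate from the fitted model
        xb = rng.normal(mu, sigma, size=len(x))
        mub, sigmab = xb.mean(), xb.std(ddof=1)          # 3. refit to each simulated sample
        t_boot[b] = stats.kstest(xb, "norm", args=(mub, sigmab)).statistic

    # 4. bootstrap p-value: proportion of simulated fit statistics at least
    #    as extreme as the observed one
    return (t_boot >= t_obs).mean()

x = np.random.default_rng(1).standard_t(df=3, size=80)  # heavy-tailed data
print(parametric_bootstrap_fit_test(x))
```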

13.
Traditionally, multinomial processing tree (MPT) models are applied to groups of homogeneous participants, where all participants within a group are assumed to have identical MPT model parameter values. This assumption is unreasonable when MPT models are used for clinical assessment, and it often may be suspect for applications to ordinary psychological experiments. One method for dealing with parameter variability is to incorporate random effects assumptions into a model. This is achieved by assuming that participants' parameters are drawn independently from some specified multivariate hyperdistribution. In this paper we explore the assumption that the hyperdistribution consists of independent beta distributions, one for each MPT model parameter. These beta-MPT models are 'hierarchical models', and their statistical inference is different from the usual approaches based on data aggregated over participants. The paper provides both classical (frequentist) and hierarchical Bayesian approaches to statistical inference for beta-MPT models. In simple cases the likelihood function can be obtained analytically; however, for more complex cases, Markov Chain Monte Carlo algorithms are constructed to assist both approaches to inference. Examples based on clinical assessment studies are provided to demonstrate the advantages of hierarchical MPT models over aggregate analysis in the presence of individual differences.

14.
Experiments often produce a hit rate and a false alarm rate in each of two conditions. These response rates are summarized into a single-point sensitivity measure such as d', and t tests are conducted to test for experimental effects. Using large-scale Monte Carlo simulations, we evaluate the Type I error rates and power that result from four commonly used single-point measures: d', A', percent correct, and gamma. We also test a newly proposed measure called gammaC. For all measures, we consider several ways of handling cases in which false alarm rate = 0 or hit rate = 1. The results of our simulations indicate that power is similar for these measures but that the Type I error rates are often unacceptably high. Type I errors are minimized when the selected sensitivity measure is theoretically appropriate for the data.
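For readers unfamiliar with single-point sensitivity measures, here is a minimal d' computation in Python that applies one common way of handling hit rate = 1 or false alarm rate = 0 (the 1/(2N) adjustment); this is only one of the conventions such simulations typically compare, and the counts in the example are made up.

```python
import numpy as np
from scipy import stats

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' from raw counts, with a common correction for extreme rates.

    Rates of 0 are replaced with 1/(2N) and rates of 1 with 1 - 1/(2N),
    where N is the number of trials of that type, so the z-transform
    stays finite.
    """
    n_signal = hits + misses
    n_noise = false_alarms + correct_rejections
    hr = np.clip(hits / n_signal, 1 / (2 * n_signal), 1 - 1 / (2 * n_signal))
    far = np.clip(false_alarms / n_noise, 1 / (2 * n_noise), 1 - 1 / (2 * n_noise))
    return stats.norm.ppf(hr) - stats.norm.ppf(far)

# A hit rate of 1.0 would make d' infinite without the correction.
print(d_prime(hits=50, misses=0, false_alarms=5, correct_rejections=45))
```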

15.
When more than one significance test is carried out on data from a single experiment, researchers are often concerned with the probability of one or more Type I errors over the entire set of tests. This article considers several methods of exercising control over that probability (the so-called family-wise Type I error rate), provides a schematic that can be used by a researcher to choose among the methods, and discusses applications to contingency tables.
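As one concrete example of family-wise control (not necessarily among the specific methods the article reviews), the sketch below implements Holm's step-down Bonferroni adjustment; the p-values in the example are invented.

```python
import numpy as np

def holm_bonferroni(pvals, alpha=0.05):
    """Holm's step-down Bonferroni procedure for familywise error control.

    The smallest p-value is tested at alpha/m, the next at alpha/(m-1),
    and so on; testing stops at the first non-rejection.
    """
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)                      # smallest p first
    reject = np.zeros(m, dtype=bool)
    for i, idx in enumerate(order):
        if p[idx] <= alpha / (m - i):
            reject[idx] = True
        else:
            break                              # all remaining hypotheses are retained
    return reject

# Example: four significance tests from a single experiment.
print(holm_bonferroni([0.004, 0.011, 0.030, 0.041]))
```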

16.
Serlin RC. Psychological Methods, 2000, 5(2): 230-240
Monte Carlo studies provide the information needed to help researchers select appropriate analytical procedures under design conditions in which the underlying assumptions of the procedures are not met. In Monte Carlo studies, the 2 errors that one could commit involve (a) concluding that a statistical procedure is robust when it is not or (b) concluding that it is not robust when it is. In previous attempts to apply standard statistical design principles to Monte Carlo studies, the less severe of these errors has been wrongly designated the Type I error. In this article, a method is presented for controlling the appropriate Type I error rate; the determination of the number of iterations required in a Monte Carlo study to achieve desired power is described; and a confidence interval for a test's true Type I error rate is derived. A robustness criterion is also proposed that is a compromise between W. G. Cochran's (1952) and J. V. Bradley's (1978) criteria.
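The article derives these quantities in detail; the sketch below shows only the standard normal-approximation versions of a confidence interval for a test's true Type I error rate and of the iteration count needed for a target precision, which may differ from the article's exact formulations. The counts in the example are hypothetical.

```python
import math
from scipy import stats

def type1_rate_ci(n_rejections, n_iterations, conf=0.95):
    """Normal-approximation confidence interval for a procedure's true
    Type I error rate, based on the rejection count in a Monte Carlo study."""
    p_hat = n_rejections / n_iterations
    z = stats.norm.ppf(0.5 + conf / 2)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n_iterations)
    return p_hat - half, p_hat + half

def iterations_needed(alpha=0.05, margin=0.005, conf=0.95):
    """Number of Monte Carlo iterations so that the estimated Type I error
    rate falls within `margin` of the truth with the stated confidence."""
    z = stats.norm.ppf(0.5 + conf / 2)
    return math.ceil(z ** 2 * alpha * (1 - alpha) / margin ** 2)

print(type1_rate_ci(n_rejections=620, n_iterations=10_000))  # hypothetical counts
print(iterations_needed())   # iterations for +/- 0.005 precision at alpha = .05
```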

17.
Building on an improved Wald statistic, the differential item functioning (DIF) detection method for two groups is extended to DIF testing across multiple groups; the improved Wald statistics are obtained by computing the observed information matrix (Obs) and the empirical cross-product information matrix (XPD), respectively. A simulation study compared these two approaches with the traditional computation method for DIF detection with multiple groups. The results show that: (1) the Type I error rates of Obs and XPD are clearly lower than those of the traditional method, and under the DINA model the Type I error rates of Obs and XPD are close to the nominal level; (2) when the sample size and the amount of DIF are large, Obs and XPD have roughly the same statistical power as the traditional Wald statistic.

18.
Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.
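To make the quantities concrete, here is a minimal percentile bootstrap for the mediated effect a*b (the plain percentile interval, without the bias correction whose Type I error behavior the paper examines). The data are simulated and the function name is an illustrative choice.

```python
import numpy as np

def percentile_bootstrap_indirect(x, m, y, n_boot=5000, seed=0):
    """Percentile bootstrap confidence interval for the mediated effect a*b.

    a is the slope of M on X, and b is the slope of Y on M controlling for X.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    ab = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)               # resample cases with replacement
        xb, mb, yb = x[idx], m[idx], y[idx]
        a = np.polyfit(xb, mb, 1)[0]                   # M = i1 + a*X
        design = np.column_stack([np.ones(n), xb, mb]) # Y = i2 + c'*X + b*M
        b = np.linalg.lstsq(design, yb, rcond=None)[0][2]
        ab[i] = a * b
    return np.percentile(ab, [2.5, 97.5])

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
m = 0.4 * x + rng.normal(size=n)
y = 0.4 * m + rng.normal(size=n)
print(percentile_bootstrap_indirect(x, m, y))   # an interval excluding 0 suggests mediation
```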

19.
Many robust regression estimators have been proposed that have a high, finite-sample breakdown point, roughly meaning that a large proportion of points must be altered to drive the value of an estimator to infinity. But despite this, many of them can be inordinately influenced by two properly placed outliers. With one predictor, an estimator that appears to correct this problem to a fair degree, and simultaneously maintain good efficiency when standard assumptions are met, consists of checking for outliers using a projection-type method, removing any that are found, and applying the Theil-Sen estimator to the data that remain. When dealing with multiple predictors, there are two generalizations of the Theil-Sen estimator that might be used, but nothing is known about how their small-sample properties compare. Also, there are no results on testing the hypothesis of zero slopes, and there is no information about the effect on efficiency when outliers are removed. In terms of hypothesis testing, using the more obvious percentile bootstrap method in conjunction with a slight modification of Mahalanobis distance was found to avoid Type I error probabilities above the nominal level, but in some situations the actual Type I error probabilities can be substantially smaller than intended when the sample size is small. An alternative method is found to be more satisfactory.
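A minimal single-predictor Theil-Sen sketch is shown below for orientation; the paper's actual proposals (multi-predictor generalizations, projection-type outlier screening, and percentile-bootstrap tests of zero slopes) are not implemented here, and the example data are simulated.

```python
import numpy as np
from itertools import combinations

def theil_sen(x, y):
    """Theil-Sen estimator for simple regression.

    The slope is the median of the slopes over all pairs of points with
    distinct x values; the intercept is the median of y - slope * x.
    In the two-stage estimator discussed above, outlier screening would be
    applied to (x, y) before this step.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)
    return slope, intercept

rng = np.random.default_rng(0)
x = rng.normal(size=40)
y = 2.0 * x + 1.0 + rng.normal(size=40)
y[:2] += 15                       # two gross outliers barely move the estimate
print(theil_sen(x, y))
```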

20.
To simplify the problem of studying how people learn natural language, researchers use the artificial grammar learning (AGL) task. In this task, participants study letter strings constructed according to the rules of an artificial grammar and subsequently attempt to discriminate grammatical from ungrammatical test strings. Although the data from these experiments are usually analyzed by comparing the mean discrimination performance between experimental conditions, this practice discards information about the individual items and participants that could otherwise help uncover the particular features of strings associated with grammaticality judgments. However, feature analysis is tedious to compute, often complicated, and ill-defined in the literature. Moreover, the data violate the assumption of independence underlying standard linear regression models, leading to Type I error inflation. To solve these problems, we present AGSuite, a free Shiny application for researchers studying AGL. The suite's intuitive Web-based user interface allows researchers to generate strings from a database of published grammars, compute feature measures (e.g., Levenshtein distance) for each letter string, and conduct a feature analysis on the strings using linear mixed effects (LME) analyses. The LME analysis solves the inflation of Type I errors that afflicts more common methods of repeated measures regression analysis. Finally, the software can generate a number of graphical representations of the data to support an accurate interpretation of results. We hope the ease and availability of these tools will encourage researchers to take full advantage of item-level variance in their datasets in the study of AGL. We moreover discuss the broader applicability of the tools for researchers looking to conduct feature analysis in any field.
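One of the feature measures the suite computes is the Levenshtein distance between letter strings; below is a minimal standalone Python version for readers who want the definition in code. AGSuite itself is a Shiny application, and the example strings here are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two letter strings: the minimum number of
    insertions, deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + (ca != cb),         # substitution (0 if letters match)
            ))
        prev = curr
    return prev[-1]

# e.g., distance of a test string from a grammatical training string
print(levenshtein("MTVRXR", "MTTVRX"))   # hypothetical artificial-grammar strings
```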

