Similar Literature
20 similar documents found.
1.
Process factor analysis (PFA) is a latent variable model for intensive longitudinal data. It combines P-technique factor analysis and time series analysis. No goodness-of-fit test is currently available for PFA. In this paper, we propose a parametric bootstrap method for assessing model fit in PFA. We illustrate the test with an empirical data set in which 22 participants rated their affect every day over a period of 90 days. We also explore the Type I error and power of the parametric bootstrap test with simulated data.
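The abstract describes a generic parametric bootstrap for assessing model fit; a minimal sketch of that logic, assuming hypothetical `fit_model` and `simulate_from` routines standing in for an actual PFA implementation:

```python
import numpy as np

def parametric_bootstrap_pvalue(data, fit_model, simulate_from,
                                n_boot=500, rng=None):
    """Parametric bootstrap p-value for a model-fit statistic.

    fit_model(data) -> (params, fit_stat) and simulate_from(params, rng)
    -> data are hypothetical stand-ins for a PFA estimation routine.
    """
    rng = np.random.default_rng(rng)
    params, observed_stat = fit_model(data)
    boot_stats = []
    for _ in range(n_boot):
        # Simulate a dataset from the fitted model (i.e., under the null
        # hypothesis that the model is correctly specified), then refit.
        boot_data = simulate_from(params, rng)
        _, stat = fit_model(boot_data)
        boot_stats.append(stat)
    # Proportion of bootstrap statistics at least as extreme as observed.
    return (1 + np.sum(np.array(boot_stats) >= observed_stat)) / (n_boot + 1)
```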

2.
Researchers often want to demonstrate a lack of interaction between two categorical predictors on an outcome. To justify a lack of interaction, researchers typically accept the null hypothesis of no interaction from a conventional analysis of variance (ANOVA). This method is inappropriate as failure to reject the null hypothesis does not provide statistical evidence to support a lack of interaction. This study proposes a bootstrap-based intersection–union test for negligible interaction that provides coherent decisions between the omnibus test and post hoc interaction contrast tests and is robust to violations of the normality and variance homogeneity assumptions. Further, a multiple comparison strategy for testing interaction contrasts following a non-significant omnibus test is proposed. Our simulation study compared the Type I error control, omnibus power and per-contrast power of the proposed approach to the non-centrality-based negligible interaction test of Cheng and Shao (2007, Statistica Sinica, 17, 1441). For 2 × 2 designs, the empirical Type I error rates of the Cheng and Shao test were very close to the nominal α level when the normality and variance homogeneity assumptions were satisfied; however, only our proposed bootstrapping approach was satisfactory under non-normality and/or variance heterogeneity. In general a × b designs, although the omnibus Cheng and Shao test, as expected, is the most powerful, it is not robust to assumption violation and results in incoherent omnibus and interaction contrast decisions that are not possible with the intersection–union approach.
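A rough sketch of the intersection-union logic, under illustrative assumptions (an analyst-chosen equivalence bound `delta`, adjacent tetrad contrasts, and percentile-bootstrap intervals; the authors' exact contrast set and bootstrap construction may differ). For a 2 × 2 design this reduces to a single tetrad contrast:

```python
import numpy as np

def negligible_interaction_iu(cell_data, delta, n_boot=2000, alpha=0.05,
                              rng=None):
    """Intersection-union test that all adjacent tetrad (interaction)
    contrasts in an a x b design lie within (-delta, delta).

    cell_data: dict mapping (i, j) -> 1-D array of observations.
    The interaction is declared negligible only if EVERY contrast's
    bootstrap interval lies inside the equivalence interval.
    """
    rng = np.random.default_rng(rng)
    levels_a = sorted({ij[0] for ij in cell_data})
    levels_b = sorted({ij[1] for ij in cell_data})
    for i, i2 in zip(levels_a, levels_a[1:]):
        for j, j2 in zip(levels_b, levels_b[1:]):
            cells = [(i, j), (i, j2), (i2, j), (i2, j2)]
            boots = np.empty(n_boot)
            for b in range(n_boot):
                m = [np.mean(rng.choice(cell_data[c], len(cell_data[c])))
                     for c in cells]
                boots[b] = (m[0] - m[1]) - (m[2] - m[3])  # tetrad contrast
            # A 1 - 2*alpha percentile interval, per the usual
            # equivalence-testing duality with two one-sided tests.
            lo, hi = np.quantile(boots, [alpha, 1 - alpha])
            if not (-delta < lo and hi < delta):
                return False  # this contrast not shown to be negligible
    return True
```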

3.
Psychological experiments often yield data that are hierarchically structured. A number of popular shortcut strategies in cognitive modeling do not properly accommodate this structure and can result in biased conclusions. To gauge the severity of these biases, we conducted a simulation study for a two-group experiment. We first considered a modeling strategy that ignores the hierarchical data structure. In line with theoretical results, our simulations showed that Bayesian and frequentist methods that rely on this strategy are biased towards the null hypothesis. Second, we considered a modeling strategy that takes a two-step approach, first obtaining participant-level estimates from a hierarchical cognitive model and then using these estimates in a follow-up statistical test. Methods that rely on this strategy are biased towards the alternative hypothesis. Only hierarchical models of the multilevel data lead to correct conclusions. Our results are particularly relevant for the use of hierarchical Bayesian parameter estimates in cognitive modeling.

4.
In the first 20 years of the new century, 11 professional journals in Chinese psychology published a total of 213 research papers on statistical methods. The research mainly falls into the following 10 categories (ordered by number of papers): structural equation modeling, test reliability, mediation effects, effect size and statistical power, longitudinal research, moderation effects, exploratory factor analysis, latent class models, common method bias, and multilevel linear models. Each category is briefly reviewed. The results show that both the breadth and the depth of domestic research on statistical methods in psychology have kept increasing, and research hotspots have developed together while converging; however, review papers account for a large share, the proportion of original research papers needs to rise, and the research workforce also needs strengthening.

5.
Ayala Cohen, Psychometrika, 1986, 51(3): 379-391
A test is proposed for the equality of the variances of k ≥ 2 correlated variables. Pitman's test for k = 2 reduces the null hypothesis to zero correlation between their sum and their difference. Its extension, eliminating nuisance parameters by a bootstrap procedure, is valid for any correlation structure between the k normally distributed variables. A Monte Carlo study for several combinations of sample sizes and numbers of variables is presented, comparing the level and power of the new method with previously published tests. Some nonnormal data are included, for which the empirical level tends to be slightly higher than the nominal one. The results show that our method is close in power to the asymptotic tests, which are extremely sensitive to nonnormality, yet it is robust and much more powerful than other robust tests. This research was supported by the fund for the promotion of research at the Technion.
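For the k = 2 case, Pitman's reduction is simple to sketch; the bootstrap wrapper below is an illustrative analogue of the nuisance-elimination idea, not the paper's exact procedure:

```python
import numpy as np

def pitman_bootstrap(x, y, n_boot=2000, rng=None):
    """Test H0: var(x) == var(y) for paired (correlated) samples.

    Pitman's device: var(x) == var(y) iff corr(x + y, x - y) == 0.
    Bootstrapping over subject pairs gives a p-value that does not
    lean on normality.
    """
    rng = np.random.default_rng(rng)
    s, d = x + y, x - y
    r_obs = np.corrcoef(s, d)[0, 1]
    n = len(x)
    count = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        # Recentre each bootstrap correlation at the observed value to
        # approximate the null distribution of the statistic.
        r_b = np.corrcoef(s[idx], d[idx])[0, 1]
        count += abs(r_b - r_obs) >= abs(r_obs)
    return (1 + count) / (n_boot + 1)
```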

6.
For comparing nested covariance structure models, the standard procedure is the likelihood ratio test of the difference in fit, where the null hypothesis is that the models fit identically in the population. A procedure for determining statistical power of this test is presented, where effect size is based on a specified difference in overall fit of the models. A modification of the standard null hypothesis of zero difference in fit is proposed, allowing for testing an interval hypothesis that the difference in fit between models is small, rather than zero. These developments are combined, yielding a procedure for estimating power of a test of a null hypothesis of small difference in fit versus an alternative hypothesis of larger difference.
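A sketch of the noncentral chi-square power computation such a procedure implies when the difference in fit is expressed through RMSEA (the parameterization F = df · RMSEA² is one common choice; the numbers in the example are illustrative):

```python
from scipy.stats import chi2, ncx2

def power_nested_lrt(n, df_a, df_b, rmsea_a, rmsea_b, alpha=0.05):
    """Power of the likelihood ratio test comparing nested models A ⊂ B.

    Effect size is a difference in overall fit expressed via RMSEA:
    F_i = df_i * rmsea_i**2. The LR statistic is approximately
    noncentral chi-square with df = df_a - df_b and
    ncp = (n - 1) * (F_a - F_b).
    """
    df_diff = df_a - df_b
    ncp = (n - 1) * (df_a * rmsea_a**2 - df_b * rmsea_b**2)
    crit = chi2.ppf(1 - alpha, df_diff)      # critical value under H0
    return 1 - ncx2.cdf(crit, df_diff, ncp)  # P(reject | alternative)

# Example: n = 200, nested models with df 22 vs. 20, RMSEA .06 vs. .04.
print(power_nested_lrt(200, 22, 20, 0.06, 0.04))
```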

7.
Three plausible assumptions of conditional independence in a hierarchical model for responses and response times on test items are identified. For each of the assumptions, a Lagrange multiplier test of the null hypothesis of conditional independence against a parametric alternative is derived. The tests have closed-form statistics that are easy to calculate from the standard estimates of the person parameters in the model. In addition, simple closed-form estimators of the parameters under the alternatives of conditional dependence are presented, which can be used to explore model modification. The tests were applied to a data set from a large-scale computerized exam and showed excellent power to detect even minor violations of conditional independence.

8.
A function, written in R, for testing whether the distribution of responses in one condition can be considered a combination of the distributions from two other conditions is described. The important aspect of this function is that it does not make any assumptions about the shape of the distributions. It is based on the Kolmogorov-Smirnov D statistic. The function also allows the user to test more specific and, hence, more statistically powerful hypotheses. One hypothesis, that the mixture does not capture the middle third of the distribution, is included as a built-in option, and code is provided so that other alternatives can easily be run. A power analysis reveals that the function is most likely to detect a difference between the combined conditions' distribution and the other distribution when the center of the other distribution is near the midpoint of the two original distributions. Critical p values are estimated for each set of distributions, using bootstrap methods. An example from human memory research, exploring the blending hypothesis of the misinformation effect, is used for illustrative purposes.
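The original function is written in R; a rough Python analogue of the core comparison — the third condition against a mixture of the other two via the KS D statistic, with bootstrap-estimated critical values — might look like this. Equal-weight pooling of the two conditions is an assumption here:

```python
import numpy as np
from scipy.stats import ks_2samp

def mixture_ks_test(a, b, c, n_boot=2000, rng=None):
    """Test whether condition c is consistent with a mixture of a and b.

    The observed D statistic compares c against the pooled (a, b)
    sample; the bootstrap null resamples c-sized mixtures from the
    pool to gauge how large D would be if c really were a mixture.
    """
    rng = np.random.default_rng(rng)
    pool = np.concatenate([a, b])
    d_obs = ks_2samp(c, pool).statistic
    count = 0
    for _ in range(n_boot):
        fake_c = rng.choice(pool, size=len(c), replace=True)
        count += ks_2samp(fake_c, pool).statistic >= d_obs
    return d_obs, (1 + count) / (n_boot + 1)
```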

9.
Solving theoretical or empirical issues sometimes involves establishing the equality of two variables with repeated measures. This defies the logic of null hypothesis significance testing, which aims at assessing evidence against the null hypothesis of equality, not for it. In some contexts, equivalence is assessed through regression analysis by testing for a zero intercept and unit slope (or simply for a unit slope when regression is forced through the origin). This paper shows that this approach yields highly inflated Type I error rates under the most common sampling models implied in studies of equivalence. We propose an alternative approach based on omnibus tests of equality of means and variances and on subject-by-subject analyses (where applicable), and we show that these tests have adequate Type I error rates and power. The approach is illustrated with a re-analysis of published data from a signal detection theory experiment in which several hypotheses of equivalence had been tested using only regression analysis. Some further errors and inadequacies of the original analyses are described, and further scrutiny of the data contradicts the conclusions reached through inadequate application of regression analyses.
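A small simulation sketch of why the regression approach misleads: when both variables carry measurement error (a common sampling model for repeated measures), the fitted slope is attenuated below 1, so even the unit-slope test alone rejects far above the nominal rate despite true equality. The setup below is illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_subj, n_sims, rejections = 30, 2000, 0
for _ in range(n_sims):
    true = rng.normal(0, 1, n_subj)          # common true scores
    x = true + rng.normal(0, 0.5, n_subj)    # two error-laden measures
    y = true + rng.normal(0, 0.5, n_subj)    # of the SAME quantity
    res = stats.linregress(x, y)
    # t test of H0: slope == 1; attenuation biases the slope below 1.
    t = (res.slope - 1) / res.stderr
    rejections += 2 * stats.t.sf(abs(t), n_subj - 2) < 0.05
print("empirical Type I error:", rejections / n_sims)  # well above .05
```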

10.

Purpose

This research advances understanding of empirical time modeling techniques in self-regulated learning research. We intuitively explain several such methods by situating their use in the extant literature. Further, we note key statistical and inferential assumptions of each method while making clear the inferential consequences of inattention to such assumptions.

Design/Methodology/Approach

Using a population model derived from a recent large-scale review of the training and work learning literature, we employ a Monte Carlo simulation fitting six variations of linear mixed models, seven variations of latent common factor models, and a single latent change score model to 1500 simulated datasets.

Findings

The latent change score model outperformed all six of the linear mixed models and all seven of the latent common factor models with respect to (1) estimation precision of the average learner improvement, (2) correctly rejecting a false null hypothesis about such average improvement, and (3) correctly failing to reject a true null hypothesis about between-learner differences (i.e., random slopes) in average improvement.

Implications

The latent change score model is a more flexible method of modeling time in self-regulated learning research, particularly for learner processes consistent with twenty-first-century workplaces. Consequently, defaulting to linear mixed or latent common factor modeling methods may have adverse inferential consequences for better understanding self-regulated learning in twenty-first-century work.

Originality/Value

Ours is the first study to critically, rigorously, and empirically evaluate self-regulated learning modeling methods and to provide a more flexible alternative consistent with modern self-regulated learning knowledge.
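As a concrete point of reference for the comparison, one of the simpler baselines — a random-intercept, random-slope linear mixed growth model — can be fit in a few lines with statsmodels. The data below are synthetic and the column names illustrative:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
# Synthetic long-format learning data: 100 learners, 5 occasions each.
learners = np.repeat(np.arange(100), 5)
times = np.tile(np.arange(5), 100)
slopes = rng.normal(0.5, 0.2, 100)  # learner-specific improvement rates
score = slopes[learners] * times + rng.normal(0, 1, 500)
df = pd.DataFrame({"learner": learners, "time": times, "score": score})

# Growth model: the fixed effect of time is the average improvement;
# random slopes capture between-learner differences in improvement.
fit = smf.mixedlm("score ~ time", df, groups=df["learner"],
                  re_formula="~time").fit()
print(fit.summary())
```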

11.
Valid use of the traditional independent samples ANOVA procedure requires that the population variances are equal. Previous research has investigated whether variance homogeneity tests, such as Levene's test, are satisfactory as gatekeepers for identifying when to use or not to use the ANOVA procedure. This research focuses on a novel homogeneity of variance test that incorporates an equivalence testing approach. Instead of testing the null hypothesis that the variances are equal against an alternative hypothesis that the variances are not equal, the equivalence-based test evaluates the null hypothesis that the difference in the variances falls outside or on the border of a predetermined interval against an alternative hypothesis that the difference in the variances falls within the predetermined interval. Thus, with the equivalence-based procedure, the alternative hypothesis is aligned with the research hypothesis (variance equality). A simulation study demonstrated that the equivalence-based test of population variance homogeneity is a better gatekeeper for the ANOVA than traditional homogeneity of variance tests.
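A sketch of the equivalence-testing idea for the two-group case, as a ratio-based variant: a TOST-style pair of one-sided F tests on the variance ratio. The equivalence bounds and the normal-theory F reference distribution are illustrative assumptions, not the paper's exact procedure:

```python
from scipy.stats import f

def variance_equivalence_tost(var1, n1, var2, n2,
                              lower=1/1.5, upper=1.5, alpha=0.05):
    """TOST for H0: var1/var2 outside [lower, upper] vs. H1: inside.

    Rejecting BOTH one-sided F tests supports variance equivalence,
    aligning the alternative hypothesis with the research hypothesis.
    """
    ratio = var1 / var2
    df1, df2 = n1 - 1, n2 - 1
    # One-sided test against each equivalence bound.
    p_upper = f.cdf(ratio / upper, df1, df2)  # H0: ratio >= upper
    p_lower = f.sf(ratio / lower, df1, df2)   # H0: ratio <= lower
    return max(p_upper, p_lower) < alpha

# Example: sample variances 4.0 and 3.6, 200 observations per group.
print(variance_equivalence_tost(4.0, 200, 3.6, 200))
```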

12.
The recent surge of interest in cognitive assessment has led to the development of cognitive diagnosis models. Central to many such models is the specification of the Q-matrix, which relates items to latent attributes that have natural interpretations. In practice, the Q-matrix is usually constructed subjectively by the test designers. This could lead to misspecification, which could result in lack of fit of the underlying statistical model. To test possible misspecification of the Q-matrix, traditional goodness-of-fit tests, such as the chi-square test and the likelihood ratio test, may not be applied straightforwardly due to the large number of possible response patterns. To address this problem, this paper proposes a new statistical method to test the goodness of fit of the Q-matrix, by constructing test statistics that measure the consistency between a provisional Q-matrix and the observed data for a general family of cognitive diagnosis models. Limiting distributions of the test statistics are derived under the null hypothesis and can be used to obtain p-values. Simulation studies as well as a real data example are presented to demonstrate the usefulness of the proposed method.

13.
Structural equation modeling (SEM) is one of the most important statistical tools in psychology, management, sociology, and related disciplines. However, a large number of studies using SEM neglect to analyze and report the statistical power of the method, which to some extent weakens the evidential force of their results. There are three main approaches to power analysis for SEM: the Satorra-Saris method, the MacCallum method, and the Monte Carlo method. The Satorra-Saris method suits situations where the alternative model is explicit, the object of the test is relatively simple, and the test is based on the χ² distribution; the MacCallum method suits χ²-based tests of model fit when no explicit alternative model is available; the Monte Carlo method suits relatively complex test objects and tests based on simulation or resampling. In practice, researchers should first determine the purpose of the test, the testing method, and whether an explicit alternative model exists, and choose the appropriate analysis accordingly.
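As a concrete instance of the MacCallum approach (power computed from hypothesized null and alternative RMSEA values, with no explicit alternative model), a sketch under the usual noncentral χ² approximation; the RMSEA values are the conventional illustrative choices:

```python
from scipy.stats import ncx2

def rmsea_power(n, df, rmsea0=0.05, rmsea1=0.08, alpha=0.05):
    """MacCallum-style power for the test of close fit.

    H0: RMSEA <= rmsea0 against the alternative RMSEA = rmsea1, using
    noncentral chi-square distributions with
    ncp_i = (n - 1) * df * rmsea_i**2.
    """
    ncp0 = (n - 1) * df * rmsea0**2
    ncp1 = (n - 1) * df * rmsea1**2
    crit = ncx2.ppf(1 - alpha, df, ncp0)  # critical value under H0
    return ncx2.sf(crit, df, ncp1)        # rejection prob. under H1

print(rmsea_power(n=200, df=50))
```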

14.
A new test is proposed for the problem of comparing two independent groups in terms of some measure of location. The proposed test uses a one-step M-estimator and a bootstrap-t method with the procedure proposed by Özdemir and Kurt (2006). Eight methods were compared in terms of actual Type I error and power when the underlying distributions differ in skewness and kurtosis under heterogeneity of variances. For the 21 theoretical distributions, the Yuen test with the bootstrap-t method was the most favourable, followed by the proposed test. For the five real data sets, the proposed test and the percentile bootstrap method with the one-step M-estimator performed best.
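A sketch of two of the ingredients the abstract names — a Huber-type one-step M-estimator of location and a percentile-bootstrap comparison of two groups. The tuning constant and the bootstrap construction are illustrative, not the authors' exact specification:

```python
import numpy as np

def one_step_m(x, k=1.28):
    """Huber-type one-step M-estimator of location."""
    med = np.median(x)
    madn = np.median(np.abs(x - med)) / 0.6745  # normalized MAD
    low = np.sum(x < med - k * madn)            # flagged low outliers
    high = np.sum(x > med + k * madn)           # flagged high outliers
    middle = x[(x >= med - k * madn) & (x <= med + k * madn)]
    return (k * madn * (high - low) + middle.sum()) / len(middle)

def percentile_boot_test(x, y, n_boot=2000, alpha=0.05, rng=None):
    """Percentile-bootstrap test of H0: equal location, based on the
    difference of one-step M-estimators; reject if 0 is outside the CI."""
    rng = np.random.default_rng(rng)
    boot = np.array([one_step_m(rng.choice(x, len(x), replace=True))
                     - one_step_m(rng.choice(y, len(y), replace=True))
                     for _ in range(n_boot)])
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return not (lo <= 0 <= hi)
```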

15.
In this study, the delta method was applied to estimate the standard errors of the true score equating when using the characteristic curve methods with the generalized partial credit model in test equating under the context of the common-item nonequivalent groups equating design. Simulation studies were further conducted to compare the performance of the delta method with that of the bootstrap method and the multiple imputation method. The results indicated that the standard errors produced by the delta method were very close to the criterion empirical standard errors as well as those yielded by the bootstrap method and the multiple imputation method under all the manipulated conditions.

16.
Several studies aimed at testing the validity of Holland's hexagonal and Roe's circular models of interests showed results for which the null hypothesis of random arrangement could be rejected, and the investigators concluded that the tested models were supported. None of these studies, however, tested each model in its entirety. The present study is based on the assumption that rejecting the null hypothesis of chance is not rigorous enough. Reanalysis of 13 data sets from published studies, using a more rigorous method, reveals that although the random null hypothesis can in fact be rejected in 11 data sets, the hexagonal-circular model was supported by only 2 data sets and was rejected by 11 data sets. The hierarchical model for the structure of vocational interests (I. Gati, Journal of Vocational Behavior, 1979, 15, 90–106) was submitted to an identical test and was supported by 6 out of 10 data sets, including 4 data sets that rejected the hexagonal-circular model. The predictions of each model that tend to be disconfirmed by empirical data were identified. The implications of the findings for the structure of interests and occupational choice are discussed.

17.
When participants are asked to respond in the same way to stimuli from different sources (e.g., auditory and visual), responses are often observed to be substantially faster when both stimuli are presented simultaneously (redundancy gain). Different models account for this effect, the two most important being race models and coactivation models. Redundancy gains consistent with the race model have an upper limit, however, which is given by the well-known race model inequality (Miller, 1982). A number of statistical tests have been proposed for testing the race model inequality in single participants and groups of participants. All of these tests use the race model as the null hypothesis, and rejection of the null hypothesis is considered evidence in favor of coactivation. We introduce a statistical test in which the race model prediction is the alternative hypothesis. This test controls the Type I error if a theory predicts that the race model prediction holds in a given experimental condition.
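A sketch of the quantity under test: Miller's (1982) race model inequality bounds the redundant-target CDF by the sum of the single-target CDFs, F_red(t) ≤ F_a(t) + F_v(t). Evaluating it on a grid of quantiles is the common first step; the percentile grid below is an illustrative choice:

```python
import numpy as np

def race_model_violation(rt_red, rt_a, rt_v,
                         probs=np.arange(0.05, 1.0, 0.05)):
    """Evaluate the race model inequality at a grid of quantiles of the
    redundant-target RT distribution.

    Returns the maximum violation: positive values mean the redundant
    CDF exceeds the race model bound, suggesting coactivation.
    """
    t = np.quantile(rt_red, probs)
    f_red = np.array([np.mean(rt_red <= ti) for ti in t])
    bound = np.array([min(1.0, np.mean(rt_a <= ti) + np.mean(rt_v <= ti))
                      for ti in t])
    return np.max(f_red - bound)
```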

18.
Passage-based reading tests are a typical example of testlet-based tests, so testing them for differential item functioning (DIF) requires methods matched to the testlet structure. DIF methods based on testlet response models are the ones that genuinely handle testlet effects: they provide a measure of the DIF effect for each item within a testlet, which gives them a theoretical advantage among testlet DIF methods, and the Rasch testlet DIF method is the one in most common use. This study applied the Rasch testlet DIF method to a passage-reading achievement test. The results showed items with significant DIF effects on some subdimensions of the test's content and ability dimensions, and suggestions are offered, from the perspective of test fairness, for further revision and construction of the test. The study further compared the Rasch testlet DIF results with those of a DIF method based on the traditional Rasch model and of an ad hoc testlet DIF method; the comparison demonstrates the necessity and advantages of testlet-based DIF analysis. The results indicate that, for passage-based reading tests, testlet DIF methods that genuinely handle testlet effects hold greater theoretical advantages and matter more for test construction and quality improvement.

19.
王阳, 温忠麟, 付媛姝, 《心理科学进展》, 2020, 28(11): 1961-1969
Commonly used fit indices for structural equation models have certain limitations: the χ² test takes the traditional null hypothesis as its target hypothesis and therefore cannot confirm a model, while descriptive fit indices such as RMSEA and CFI lack inferential properties. Equivalence testing effectively remedies these problems. We first explain how equivalence testing evaluates the fit of a single model and how it differs from null hypothesis testing, then describe how equivalence testing is used to analyze measurement invariance, and finally use empirical data to demonstrate its performance in single-model evaluation and measurement invariance testing, comparing it with traditional model-evaluation methods.
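A sketch of the single-model equivalence test: instead of asking whether misfit is exactly zero, ask whether it is demonstrably below an acceptable RMSEA bound, by comparing the observed χ² to a lower-tail critical value from the noncentral χ² implied by that bound. The bound of .05 and the example numbers are illustrative:

```python
from scipy.stats import ncx2

def fit_equivalence_test(chi2_obs, df, n, rmsea_bound=0.05, alpha=0.05):
    """Equivalence test of model fit.

    H0: misfit is unacceptable (RMSEA >= rmsea_bound);
    H1: misfit is acceptable (RMSEA < rmsea_bound).
    Reject H0 when the observed chi-square is small enough relative to
    the noncentral chi-square implied by the bound.
    """
    ncp = (n - 1) * df * rmsea_bound**2
    crit = ncx2.ppf(alpha, df, ncp)  # lower-tail critical value
    return chi2_obs < crit           # True -> model shown to fit well

# Example: chi-square of 62.1 on df = 50 with n = 400.
print(fit_equivalence_test(62.1, 50, 400))
```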

20.
The concept of a maximum-typical performance dimension has received theoretical and empirical support in research on the construct of job performance. The critical distinction between maximum and typical performance resides in the postulate that under maximum test conditions motivational factors will be constant and maximal. The present study challenges the notion of the maximum performance paradigm by testing the effects of proximal (self-efficacy) and distal (need for achievement) motivation on performance under maximum test conditions. The authors used a walk-through performance test to evaluate the performance of 90 employees. The structural model demonstrates significant pathways between latent measures of motivation and performance ratings. The findings confirm the explanatory power of the motivation construct under maximum test conditions.
