Similar articles
1.
The authors demonstrated that the most common statistical significance test used with r(WG)-type interrater agreement indexes in applied psychology, based on the chi-square distribution, is flawed and inaccurate. The chi-square test is shown to be extremely conservative even for modest, standard significance levels (e.g., .05). The authors present an alternative statistical significance test, based on Monte Carlo procedures, that produces the equivalent of an approximate randomization test for the null hypothesis that the actual distribution of responding is rectangular and demonstrate its superiority to the chi-square test. Finally, the authors provide tables of critical values and offer downloadable software to implement the approximate randomization test for r(WG)-type and for average deviation (AD)-type interrater agreement indexes. The implications of these results for studying a broad range of interrater agreement problems in applied psychology are discussed.
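
The Monte Carlo idea in this abstract can be illustrated with a minimal sketch. It is not the authors' software: it assumes the single-item form r(WG) = 1 − S²/σ²(EU), with σ²(EU) = (A² − 1)/12 for A response options, and the function names `rwg` and `monte_carlo_critical_value` are hypothetical.

```python
import random
import statistics


def rwg(ratings, n_options):
    """Single-item r(WG): 1 - observed variance / variance of a rectangular null."""
    null_variance = (n_options ** 2 - 1) / 12.0  # variance of a discrete uniform on 1..A
    return 1.0 - statistics.variance(ratings) / null_variance


def monte_carlo_critical_value(n_raters, n_options, alpha=0.05, n_sims=5000, seed=0):
    """Approximate upper critical value of r(WG) when raters respond uniformly at random."""
    rng = random.Random(seed)
    simulated = sorted(
        rwg([rng.randint(1, n_options) for _ in range(n_raters)], n_options)
        for _ in range(n_sims)
    )
    # Value exceeded by roughly a proportion alpha of the simulated null statistics.
    index = min(n_sims - 1, int((1.0 - alpha) * n_sims))
    return simulated[index]


# Illustrative data (made up): 10 raters on a 5-point scale.
observed = rwg([4, 4, 5, 4, 3, 4, 4, 5, 4, 4], n_options=5)
critical = monte_carlo_critical_value(n_raters=10, n_options=5)
print(observed, critical, observed > critical)
```

Agreement is declared significant when the observed index exceeds the simulated critical value; the number of simulations and the seed are arbitrary choices here.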

2.
J. O. Ramsay 《Psychometrika》1980,45(1):139-144
Some aspects of the small sample behavior of maximum likelihood estimates in multidimensional scaling are investigated by Monte Carlo. An investigation of Model M2 in the MULTISCALE program package shows that the chi-square test of dimensionality requires a correction of tabled chi-square values to be unbiased. A formula for this correction in the case of two dimensions is estimated. The power of the test of dimensionality is acceptable with as few as two replications for 15 stimuli and as few as five replications for 10 stimuli. The biases in the exponent and standard error estimates in this model are also investigated. The research reported here was supported by grant number APA 320 to the author by the National Science and Engineering Research Council of Canada.

3.
It is demonstrated in this paper that two major tests for 2 × 2 tables are highly related from a Bayesian perspective. Although it is well-known that Fisher's exact and Pearson's chi-square tests are asymptotically equivalent, the present analysis shows that a formal similarity also exists in small samples. The key assumption that leads to the resemblance is the presence of a continuous parameter measuring association. In particular, it is shown that Pearson's probability can be obtained by integrating a two-moment approximation to the posterior distribution of the log-odds ratio. Furthermore, Pearson's chi-square test gave an excellent approximation to the actual Bayes probability in all 2 × 2 tables examined, except for those with extremely disproportionate marginal frequencies.

4.
D Frazier  R R DeBlassie 《Adolescence》1984,19(74):385-390
The tendency of non-Indian teachers to rate American Indian early adolescents (11-year-olds) as behaviorally disordered more frequently than similarly behaving non-Indian children is examined. To test this hypothesis, the observed number of students rated as behavior-disordered by non-Indian teachers was compared with the expected number. A chi-square test revealed no significant tendency for this pattern to occur (χ² = 2.04, df = 1, not significant at the .05 level).

5.
In this paper, we show that for some structural equation models (SEM), the classical chi-square goodness-of-fit test is unable to detect the presence of nonlinear terms in the model. As an example, we consider a regression model with latent variables and interaction terms. Not only does the model test have zero power against that type of misspecification, but even the theoretical (chi-square) distribution of the test is not distorted when severe interaction term misspecification is present in the postulated model. We explain this phenomenon by exploiting results on asymptotic robustness in structural equation models. The aim of this paper is to warn against the conclusion that if a proposed linear model fits the data well according to the chi-square goodness-of-fit test, then the underlying model is indeed linear; it will be shown that the underlying model may, in fact, be severely nonlinear. In addition, the present paper shows that such insensitivity to nonlinear terms is only a particular instance of a more general problem, namely, the incapacity of the classical chi-square goodness-of-fit test to detect deviations from zero correlation among exogenous regressors (whether observable or latent) when the structural part of the model is just saturated.

6.
Hoben Thomas 《Psychometrika》1977,42(2):199-206
Individuals are classified in a cross-classification table where two behavioral observations on each individual determine the classification. The problem is to test certain structural models assumed to underlie the cross-classified observations. A minimum chi-square test procedure is proposed.

7.
A 2 × 2 chi-square can be computed from a phi coefficient, which is the Pearson correlation between two binomial variables. Similarly, chi-square for larger contingency tables can be computed from canonical correlation coefficients. The authors address the following series of issues involving this relationship: (a) how to represent a contingency table in terms of a correlation matrix involving r - 1 row and c - 1 column dummy predictors; (b) how to compute chi-square from canonical correlations solved from this matrix; (c) how to compute loadings for the omitted row and column variables; and (d) the possible interpretive advantage of describing canonical relationships that comprise chi-square, together with some examples. The proposed procedures integrate chi-square analysis of contingency tables with general correlational theory and serve as an introduction to some recent methods of analysis more widely known by sociologists.
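
The first sentence of this abstract rests on the standard identity χ² = N·φ² for a 2 × 2 table. The following sketch checks it numerically; the cell counts and the function names `phi_coefficient` and `pearson_chi_square` are illustrative assumptions, not taken from the article.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for the 2 x 2 table [[a, b], [c, d]] (Pearson r between two 0/1 variables)."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


def pearson_chi_square(a, b, c, d):
    """Uncorrected Pearson chi-square computed from observed and expected counts."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Made-up cell counts, used only to compare the two routes to chi-square.
a, b, c, d = 20, 10, 5, 25
n = a + b + c + d
chi2_from_phi = n * phi_coefficient(a, b, c, d) ** 2
chi2_direct = pearson_chi_square(a, b, c, d)
print(chi2_from_phi, chi2_direct)
```

Both routes give the same value up to floating-point error; the identity holds for the uncorrected statistic, not for the continuity-corrected one.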

8.
A 2 × 2 chi-square can be computed from a phi coefficient, which is the Pearson correlation between two binomial variables. Similarly, chi-square for larger contingency tables can be computed from canonical correlation coefficients. The authors address the following series of issues involving this relationship: (a) how to represent a contingency table in terms of a correlation matrix involving r - 1 row and c - 1 column dummy predictors; (b) how to compute chi-square from canonical correlations solved from this matrix; (c) how to compute loadings for the omitted row and column variables; and (d) the possible interpretive advantage of describing canonical relationships that comprise chi-square, together with some examples. The proposed procedures integrate chi-square analysis of contingency tables with general correlational theory and serve as an introduction to some recent methods of analysis more widely known by sociologists.

9.
Fit indices are widely used in order to test the model fit for structural equation models. In a highly influential study, Hu and Bentler (1999) showed that certain cutoff values for these indices could be derived, which, over time, has led to the reification of these suggested thresholds as "golden rules" for establishing the fit or other aspects of structural equation models. The current study shows how differences in unique variances influence the value of the global chi-square model test and the most commonly used fit indices: Root-mean-square error of approximation, standardized root-mean-square residual, and the comparative fit index. Using data simulation, the authors illustrate how the value of the chi-square test, the root-mean-square error of approximation, and the standardized root-mean-square residual are decreased when unique variances are increased although model misspecification is present. For a broader understanding of the phenomenon, the authors used different sample sizes, number of observed variables per factor, and types of misspecification. A theoretical explanation is provided, and implications for the application of structural equation modeling are discussed.

10.
Bowker's test for marginal equality in contingency tables provides a familiar chi-square test to determine whether the marginal distributions are the same across two or more factors or occasions. In this note it is shown how latent trait theory provides a theoretical framework for the development and application of this test. The research reported here was supported by a grant to the senior author from the National Institute on Aging (AG03164).

11.
The continuous strength model of recognition memory was evaluated in a task where Ss were tested for recognition of 10-number lists using a rating procedure. Maximum likelihood estimates of the parameters of the model were obtained by an iterative method on a high-speed computer, and a chi-square goodness-of-fit test was performed for individual Ss. For 15 of 20 Ss, the chi-square values were nonsignificant, p > .05, indicating that the model provided a good fit to the data. Although the model gave a good fit to the data, the Δm measure of sensitivity was highly correlated with a true recognition score computed by subtracting false alarms from correct recognitions.

12.
A problem arises in analyzing the existence of interdependence between the behavioral sequences of two individuals: tests involving a statistic such as chi-square assume independent observations within each behavioral sequence, a condition which may not exist in actual practice. Using Monte Carlo simulations of binomial data sequences, we found that the use of a chi-square test frequently results in unacceptable Type I error rates when the data sequences are autocorrelated. We compared these results to those from two other methods designed specifically for testing for intersequence independence in the presence of intrasequence autocorrelation. The first method directly tests the intersequence correlation using an approximation of the variance of the intersequence correlation estimated from the sample autocorrelations. The second method uses tables of critical values of the intersequence correlation computed by Nakamura et al. (J. Am. Stat. Assoc., 1976, 71, 214–222). Although these methods were originally designed for normally distributed data, we found that both methods produced much better results than the uncorrected chi-square test when applied to binomial autocorrelated sequences. The superior method appears to be the variance approximation method, which resulted in Type I error rates that were generally less than or equal to 5% when the level of significance was set at .05.

13.
A family of scaling corrections aimed at improving the chi-square approximation of goodness-of-fit test statistics in small samples, large models, and nonnormal data was proposed in Satorra and Bentler (1994). For structural equation models, Satorra-Bentler's (SB) scaling corrections are available in standard computer software. Often, however, the interest is not in the overall fit of a model, but in a test of the restrictions that a null model, say $M_0$, implies on a less restricted one, $M_1$. If $T_0$ and $T_1$ denote the goodness-of-fit test statistics associated with $M_0$ and $M_1$, respectively, then typically the difference $T_d = T_0 - T_1$ is used as a chi-square test statistic with degrees of freedom equal to the difference in the number of independent parameters estimated under the models $M_0$ and $M_1$. As in the case of the goodness-of-fit test, it is of interest to scale the statistic $T_d$ in order to improve its chi-square approximation in realistic, that is, nonasymptotic and nonnormal, applications. In a recent paper, Satorra (2000) shows that the difference between two SB scaled test statistics for overall model fit does not yield the correct SB scaled difference test statistic. Satorra developed an expression that permits scaling the difference test statistic, but his formula has some practical limitations, since it requires heavy computations that are not available in standard computer software. The purpose of the present paper is to provide an easy way to compute the scaled difference chi-square statistic from the scaled goodness-of-fit test statistics of models $M_0$ and $M_1$. A Monte Carlo study is provided to illustrate the performance of the competing statistics. This research was supported by the Spanish grants PB96-0300 and BEC2000-0983, and USPHS grants DA00017 and DA01070.
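
The abstract does not state the formula, so the following is a hedged sketch of the commonly cited form of the scaled difference test: with scaling corrections c0 = T0/T̄0 and c1 = T1/T̄1 (unscaled over scaled statistic) and degrees of freedom d0 > d1, the difference correction is cd = (d0·c0 − d1·c1)/(d0 − d1) and the scaled difference is (T0 − T1)/cd. The function name and the numbers below are illustrative assumptions.

```python
def scaled_difference_chi_square(t0, t0_scaled, df0, t1, t1_scaled, df1):
    """Scaled difference chi-square from unscaled and SB-scaled fit statistics.

    t0, t0_scaled, df0 belong to the more restricted model M0;
    t1, t1_scaled, df1 belong to the less restricted model M1 (df0 > df1).
    """
    if df0 <= df1:
        raise ValueError("M0 must have more degrees of freedom than M1")
    c0 = t0 / t0_scaled  # scaling correction recovered for M0
    c1 = t1 / t1_scaled  # scaling correction recovered for M1
    df_diff = df0 - df1
    cd = (df0 * c0 - df1 * c1) / df_diff  # scaling correction for the difference test
    if cd <= 0:
        # Assumption of this sketch: refuse to report a statistic when the
        # correction is not positive, which can happen in small samples.
        raise ValueError("non-positive difference scaling correction")
    return (t0 - t1) / cd, df_diff


# Made-up fit statistics for two nested models.
scaled_td, df_diff = scaled_difference_chi_square(
    t0=120.0, t0_scaled=100.0, df0=50, t1=88.0, t1_scaled=80.0, df1=45
)
naive_td = 100.0 - 80.0  # simple difference of the two scaled statistics
print(scaled_td, df_diff, naive_td)
```

With these inputs the scaled difference is about 15.24 on 5 degrees of freedom, whereas simply subtracting the two scaled statistics gives 20, which illustrates why the naive difference is not a substitute.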

14.
L Shilts 《Adolescence》1991,26(103):613-617
Two hundred and thirty-seven seventh- and eighth-grade students were assessed for levels of drug/alcohol use, involvement in extracurricular activities, peer influence, and personal attitudes. Cross-tabulations and the chi-square test of independence were used to statistically compare the three groups (non-users, users, and abusers). Several trends emerged from the data.

15.
NORMUL is a FORTRAN program that provides a test of whether data conform to a multivariate normal distribution. The method involves correlating Mahalanobis distances for observed data with expected chi-square percentile values. This obtained correlation is then tested for significance by empirically evaluating the probability of its belonging to a distribution generated from multivariate normal data.

16.
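This study examined how decision task type affects risky decision making under gain and loss frames in life and money problems. A 2 (task domain: life, money) × 2 (decision task type: experience, description) × 2 (outcome frame: gain, loss) between-subjects design was used. Chi-square tests and logistic regression analyses showed that, in both life and money problems, individuals making description-based decisions, in which the possible outcomes of the options are given directly, showed risk seeking only under the loss frame; no outcome framing effect was found in experience-based decisions, in which participants viewed the possible outcomes of the options on their own. The description-experience gap was consistently present in both life and money problems.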

17.
McNemar's problem concerns the hypothesis of equal probabilities for the unlike pairs of correlated binary variables. We consider four different extensions to this problem, each for testing simultaneous equality of proportions of unlike pairs in c independent populations of correlated binary variables, but each under different assumptions and/or additional hypotheses. For each extension both the likelihood ratio test and the goodness-of-fit chi-square test are given. When c = 1, all cases reduce to McNemar's problem. For c ≥ 2, however, the tests are quite different, depending on exactly how the hypothesis and alternatives of McNemar are extended. An example illustrates how widely the results may differ, depending on which extended framework is appropriate.
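
For orientation, the base case (a single population) can be sketched as the classical McNemar chi-square on the two unlike-pair counts, (b − c)²/(b + c) with 1 degree of freedom. This covers only the standard test, not the four extensions in the article; the counts are made up and the function name is hypothetical. Here `c` is a cell count, not the number of populations.

```python
import math


def mcnemar_chi_square(b, c):
    """McNemar statistic (no continuity correction) and its p-value on 1 df.

    b and c are the counts of the two kinds of unlike (discordant) pairs.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    statistic = (b - c) ** 2 / (b + c)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Made-up discordant counts: 15 pairs of one kind, 5 of the other.
statistic, p_value = mcnemar_chi_square(15, 5)
print(statistic, p_value)
```

Only the discordant pairs enter the statistic; the concordant pairs carry no information about the hypothesis of equal unlike-pair probabilities.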

18.
This study assessed differences in response rates to a series of three-wave mail surveys when amiable or insistently worded postcards were the third wave of the mailing. Three studies were conducted; one with a sample of 600 health commissioners, one with a sample of 680 vascular nurses, and one with 600 elementary school secretaries. The combined response rates for the first and second wave mailings were 65.8%, 67.6%, and 62.4%, respectively. A total of 308 amiable and 308 insistent postcards were sent randomly to nonrespondents as the third wave mailing. Overall, there were 41 amiable and 52 insistent postcards returned, not significantly different by chi-square test. However, a separate chi-square test for one of the three studies, the nurses' study, did find a significant difference in favor of the insistently worded postcards.

19.
A method for analyzing test item responses is proposed to examine differential item functioning (DIF) in multiple-choice items by combining the usual notion of DIF for correct/incorrect responses with information about DIF contained in each of the alternatives. The proposed method uses incomplete latent class models to examine whether DIF is caused by the attractiveness of the alternatives, difficulty of the item, or both. DIF with respect to either known or unknown subgroups can be tested by a likelihood ratio test that is asymptotically distributed as a chi-square random variable.

20.
A simulation study investigated the effects of skewness and kurtosis on level-specific maximum likelihood (ML) test statistics based on normal theory in multilevel structural equation models. The levels of skewness and kurtosis at each level were manipulated in multilevel data, and the effects of skewness and kurtosis on level-specific ML test statistics were examined. When the assumption of multivariate normality was violated, the level-specific ML test statistics were inflated, resulting in Type I error rates that were higher than the nominal level for the correctly specified model. Q-Q plots of the test statistics against a theoretical chi-square distribution showed that skewness led to a thicker upper tail and kurtosis led to a longer upper tail of the observed distribution of the level-specific ML test statistic for the correctly specified model.
