
1.

Evaluating interobserver reliability of interval data

Hopkins BL Hermann JA 《Journal of applied behavior analysis》1977,10(1):121-126

Previous recommendations to employ occurrence, nonoccurrence, and overall estimates of interobserver reliability for interval data are reviewed. A rationale for comparing obtained reliability to reliability that would result from a random-chance model is explained. Formulae and graphic functions are presented to allow for the determination of chance agreement for each of the three indices, given any obtained per cent of intervals in which a response is recorded to occur. All indices are interpretable throughout the range of possible obtained values for the per cent of intervals in which a response is recorded. The level of chance agreement simply changes with changing values. Statistical procedures that could be used to determine whether obtained reliability is significantly superior to chance reliability are reviewed. These procedures are rejected because they yield significance levels that are partly a function of sample sizes and because there are no general rules to govern acceptable significance levels depending on the sizes of samples employed.
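As an illustration of the random-chance model mentioned in this abstract, the sketch below computes the agreement two observers would reach by chance alone. It assumes the observers score intervals independently, recording a response in proportions `p1` and `p2` of the intervals. The function name and the exact form of the three indices are assumptions made for this example and are not taken from the article itself.

```python
def chance_agreement(p1: float, p2: float) -> dict:
    """Chance-level agreement for two observers who score independently.

    p1, p2: proportion of intervals in which each observer records a response.
    Assumption (not from the abstract): scores are statistically independent.
    """
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("p1 and p2 must be proportions between 0 and 1")

    both_occ = p1 * p2                   # both record an occurrence
    both_non = (1.0 - p1) * (1.0 - p2)   # both record a nonoccurrence
    any_occ = 1.0 - both_non             # at least one records an occurrence
    any_non = 1.0 - both_occ             # at least one records a nonoccurrence

    return {
        # agreements on either score, over all intervals
        "overall": both_occ + both_non,
        # agreements on occurrence, over intervals where either scored one
        "occurrence": both_occ / any_occ if any_occ > 0 else float("nan"),
        # agreements on nonoccurrence, over intervals where either scored one
        "nonoccurrence": both_non / any_non if any_non > 0 else float("nan"),
    }
```

With `p1 = p2 = 0.9`, chance overall agreement is 0.82 while chance nonoccurrence agreement is only about 0.05, which shows how the chance level shifts with the recorded rate of responding.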

2.

Combining standardized mean differences using the method of maximum likelihood

Ke-Hai Yuan Brad J. Bushman 《Psychometrika》2002,67(4):589-607

A maximum likelihood procedure for combining standardized mean differences based on a noncentral t-distribution is proposed. With a proper data augmentation technique, an EM-algorithm is developed. Information and likelihood ratio statistics are discussed in detail for reliable inference. Simulation results favor the proposed procedure over both the existing normal theory maximum likelihood procedure and the commonly used generalized least squares procedure.

3.

The effect of covariate mean differences on the standard error and confidence interval for the comparison of treatment means

Xiaofeng Steven Liu 《The British journal of mathematical and statistical psychology》2011,64(2):310-319

The use of covariates is commonly believed to reduce the unexplained error variance and the standard error for the comparison of treatment means, but the reduction in the standard error is neither guaranteed nor uniform over different sample sizes. The covariate mean differences between the treatment conditions can inflate the standard error of the covariate-adjusted mean difference and can actually produce a larger standard error for the adjusted mean difference than that for the unadjusted mean difference. When the covariate observations are conceived of as randomly varying from one study to another, the covariate mean differences can be related to a Hotelling's T². Using this Hotelling's T² statistic, one can always find a minimum sample size to achieve a high probability of reducing the standard error and confidence interval width for the adjusted mean difference.
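The trade-off described in this abstract can be sketched with the textbook one-covariate, two-group analysis of covariance. The notation is an assumption made for illustration and is not taken from the article: $\sigma_y^2$ is the within-group outcome variance, $\rho$ the within-group correlation between covariate and outcome, $SS_x$ the pooled within-group sum of squares of the covariate, and the variances are conditional on the observed covariate values.

```latex
\begin{align}
\operatorname{Var}(\bar{y}_1 - \bar{y}_2)
  &= \sigma_y^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right), \\
\operatorname{Var}(\bar{y}_{1,\mathrm{adj}} - \bar{y}_{2,\mathrm{adj}})
  &= \sigma_y^2 (1 - \rho^2)
     \left[ \frac{1}{n_1} + \frac{1}{n_2}
            + \frac{(\bar{x}_1 - \bar{x}_2)^2}{SS_x} \right].
\end{align}
% With a single covariate, T^2 = (\bar{x}_1 - \bar{x}_2)^2 / [ s_x^2 (1/n_1 + 1/n_2) ]
% and s_x^2 = SS_x / (n_1 + n_2 - 2), so the ratio of the two variances is
\begin{equation}
\frac{\operatorname{Var}(\bar{y}_{1,\mathrm{adj}} - \bar{y}_{2,\mathrm{adj}})}
     {\operatorname{Var}(\bar{y}_1 - \bar{y}_2)}
  = (1 - \rho^2) \left( 1 + \frac{T^2}{n_1 + n_2 - 2} \right).
\end{equation}
```

Under these assumptions the adjustment shrinks the standard error only when this ratio is below 1, so a large covariate mean difference (large $T^2$) in a small sample can outweigh the gain from $1 - \rho^2$.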

4.

Examining the reliability of ADAS-Cog change scores

Joseph H. Grochowalski Ying Liu Karen L. Siedlecki 《Neuropsychology, development, and cognition. Section B, Aging, neuropsychology and cognition》2016,23(5):513-529

The purpose of this study was to estimate and examine ways to improve the reliability of change scores on the Alzheimer’s Disease Assessment Scale, Cognitive Subtest (ADAS-Cog). The sample, provided by the Alzheimer’s Disease Neuroimaging Initiative, included individuals with Alzheimer’s disease (AD) (n = 153) and individuals with mild cognitive impairment (MCI) (n = 352). All participants were administered the ADAS-Cog at baseline and 1 year, and change scores were calculated as the difference in scores over the 1-year period. Three types of change score reliabilities were estimated using multivariate generalizability. Two methods to increase change score reliability were evaluated: reweighting the subtests of the scale and adding more subtests. Reliability of ADAS-Cog change scores over 1 year was low for both the AD sample (ranging from .53 to .64) and the MCI sample (.39 to .61). Reweighting the change scores from the AD sample improved reliability (.68 to .76), but lengthening provided no useful improvement for either sample. The MCI change scores had low reliability, even with reweighting and adding additional subtests. The ADAS-Cog scores had low reliability for measuring change. Researchers using the ADAS-Cog should estimate and report reliability for their use of the change scores. The ADAS-Cog change scores are not recommended for assessment of meaningful clinical change.

5.

Computing and comparing correlation and regression coefficients using a pocket calculator

D. V. M. Bishop 《Behavior research methods》1980,12(3):367-371

Programs suitable for pocket calculators using reverse Polish notation are described. Program 1 computes regression coefficients, correlation coefficient, and standard error of estimate for paired data. Program 2 performs a t test to compare the slopes of two regression lines. Program 3 computes F ratios to test the departure of a regression slope from zero and to test linearity of the regression. Programs 4 and 5 test the significance of differences between independent and correlated correlation coefficients, respectively.
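The quantities that Program 1 is said to compute can be sketched in a few lines of Python. This is a modern illustration using the standard least-squares formulas, not a transcription of the calculator program; the function name `paired_regression` is hypothetical.

```python
import math


def paired_regression(x, y):
    """Slope, intercept, Pearson r, and standard error of estimate for paired data.

    Standard least-squares formulas; the residual sum of squares uses n - 2
    degrees of freedom, so at least three pairs are required.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)  # sum of squares of x
    syy = sum((b - mean_y) ** 2 for b in y)  # sum of squares of y
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # cross products
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    ss_residual = max(syy - slope * sxy, 0.0)  # guard against tiny negative rounding
    see = math.sqrt(ss_residual / (n - 2))     # standard error of estimate
    return slope, intercept, r, see
```

For example, `paired_regression([1, 2, 3], [1, 3, 2])` gives a slope of 0.5, an intercept of 1.0, and r = 0.5.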

6.

Examining reliability and multicollinearity of scale items

William M. Holmes 《Behavior research methods》1979,11(1):86-86


7.

The relationship between mean square differences and standard error of measurement: comment on Barchard (2012)

Pan T Yin Y 《Psychological Methods》2012,17(2):309-311

In the discussion of mean square difference (MSD) and standard error of measurement (SEM), Barchard (2012) concluded that the MSD between 2 sets of test scores is greater than 2(SEM)² and SEM underestimates the score difference between 2 tests when the 2 tests are not parallel. This conclusion has limitations for 2 reasons. First, strictly speaking, MSD should not be compared to SEM because they measure different things, have different assumptions, and capture different sources of errors. Second, the related proof and conclusions in Barchard hold only under the assumptions of equal reliabilities, homogeneous variances, and independent measurement errors. To address the limitations, we propose that MSD should be compared to the standard error of measurement of difference scores (SEM_{x-y}) so that the comparison can be extended to the conditions when 2 tests have unequal reliabilities and score variances.
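The quantities compared in this abstract can be related through a standard classical test theory decomposition. This is a sketch and not the derivation given in the comment itself. It assumes observed scores $X = T_X + E_X$ and $Y = T_Y + E_Y$, with mean-zero errors that are uncorrelated with each other and with the true scores; the symbols $s$ (score standard deviation) and $r$ (reliability) are notation introduced here.

```latex
\begin{align}
\mathrm{SEM}_x &= s_x \sqrt{1 - r_{xx}}, \qquad
\mathrm{SEM}_y = s_y \sqrt{1 - r_{yy}}, \\
\mathrm{SEM}_{x-y} &= \sqrt{\mathrm{SEM}_x^2 + \mathrm{SEM}_y^2}, \\
\mathrm{MSD} = E\!\left[(X - Y)^2\right]
  &= E\!\left[(T_X - T_Y)^2\right] + \mathrm{SEM}_{x-y}^2 .
\end{align}
```

When the two tests are parallel ($T_X = T_Y$ and $\mathrm{SEM}_x = \mathrm{SEM}_y = \mathrm{SEM}$), the last line reduces to $\mathrm{MSD} = 2\,\mathrm{SEM}^2$. With unequal reliabilities or variances, $\mathrm{SEM}_{x-y}^2$ still gives the error-only part of the MSD under the stated assumptions.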

8.

Interpreting the magnitudes of correlation coefficients

Hemphill JF 《The American psychologist》2003,58(1):78-79


9.

Examining differences in the levels of false memories in children and adults using child-normed lists

Anastasi JS Rhodes MG 《Developmental psychology》2008,44(3):889-894

Several previous studies have demonstrated that children, when compared with adults, exhibit both lower levels of veridical memory and fewer intrusions when given semantically associated lists. However, researchers have drawn these conclusions using semantically associated word lists that were normed with adults, which may not lead to the same level of activation or gist generation in children. In the current study, the authors used similar associative word lists normed with children and then evaluated the memory of children and adults using these newly normed lists as well as the typical adult-normed lists. Results indicate that children showed lower true and false memories with both the child-normed and adult-normed lists. Thus, these data suggest that the negative relationship between age and false memories in the Deese-Roediger-McDermott (DRM; J. Deese, 1959; H. L. Roediger & K. B. McDermott, 1995) paradigm is not an artifact of the age group used to construct the lists.

10.

Use of the estimated intraclass correlation for correcting differences in effect size by level

Ahn S Myers ND Jin Y 《Behavior research methods》2012,44(2):490-502

In a meta-analysis of intervention or group comparison studies, researchers often encounter the circumstance in which the standardized mean differences (d-effect sizes) are computed at multiple levels (e.g., individual vs. cluster). Cluster-level d-effect sizes may be inflated and, thus, may need to be corrected using the intraclass correlation (ICC) before being combined with individual-level d-effect sizes. The ICC value, however, is seldom reported in primary studies and, thus, may need to be computed from other sources. This article proposes a method for estimating the ICC value from the reported standard deviations within a particular meta-analysis (i.e., estimated ICC) when an appropriate default ICC value (Hedges, 2009b) is unavailable. A series of simulations provided evidence that the proposed method yields an accurate and precise estimated ICC value, which can then be used for correct estimation of a d-effect size. The effects of other pertinent factors (e.g., number of studies) were also examined, followed by discussion of related limitations and future research in this area.

11.

A breakdown of reliability coefficients by test type and reliability method, and the clinical implications of low reliability

Charter RA 《The Journal of general psychology》2003,130(3):290-304


12.

On the computation of zero-order correlation coefficients

Carl F. Kossack 《Psychometrika》1948,13(2):91-93

There appears to be a gap in published computational techniques inasmuch as nowhere in the literature nor in textbooks can one find a model to be followed in computing the numerous zero-order correlation coefficients for a correlation matrix. The purpose of this paper is to present, by means of an illustration, such a model. The model consists of two computational matrices, matrix one being the Summation Matrix and matrix two the Computational Matrix. The entries on these matrices are arranged so as to facilitate the future computations.

13.

Deriving coefficients of reliability and agreement for ratings (total citations: 1; self-citations: 0; citations by others: 1)

A E Maxwell A E Pilliner 《The British journal of mathematical and statistical psychology》1968,21(1):105-116


14.

On the mean and variance of the tetrachoric correlation coefficient (total citations: 1; self-citations: 0; citations by others: 1)

Morton B. Brown Dr. Jacqueline K. Benedetti 《Psychometrika》1977,42(3):347-355

Estimates of the mean and standard deviation of the tetrachoric correlation are compared with their expected values in several 2 × 2 tables. Significant bias in the mean is found when the minimum cell frequency is less than 5. Three formulas for the standard deviation are compared and guidelines given for their use. This research was performed when the first author was on leave at the University of California at Los Angeles and was supported in part by NIH Special Research Resources Grant RR-3. The second author was also supported by NIH Fellowship 5 F22 GM00328-02.

15.

Note on the computation of product-moment correlation coefficients

ADKINS DC 《Psychometrika》1949,14(1):69-73

This paper describes a systematic plan for computing all of the product-moment correlation coefficients among a number of variables that has been taught by Professor Toops for many years. It offers several advantages over a scheme presented by Kossack in a recent issue of this journal.

16.

A reexamination of black-white mean differences in work performance: more data, more moderators

McKay PF McDaniel MA 《The Journal of applied psychology》2006,91(3):538-554

This study is the largest meta-analysis to date of Black-White mean differences in work performance. The authors examined several moderators not addressed in previous research. Findings indicate that mean racial differences in performance favor Whites (d = 0.27). Effect sizes were most strongly moderated by criterion type and the cognitive loading of criteria, whereas data source and measurement level were influential moderators to a lesser extent. Greater mean differences were found for highly cognitively loaded criteria, data reported in unpublished sources, and for performance measures consisting of multiple item scales. On the basis of these findings, the authors hypothesize several potential determinants of mean racial differences in job performance.

17.

Gender differences in the mean level, variability, and profile shape of student achievement: Results from 41 countries

Martin Brunner Katarzyna Marta Gogol Philipp Sonnleitner Ulrich Keller Stefan Krauss Franzis Preckel 《Intelligence》2013

A domain-specific hierarchical conceptualization of mathematics achievement can be represented by the standard psychometric model in which a single latent dimension accounts for observed individual differences in scores on the respective subdomains (e.g., quantity). Alternatively, a fully hierarchical conceptualization of achievement can be represented by a nested-factor model in which individual differences in subdomain-specific scores are explained by both general student achievement and specific mathematics achievement. The authors applied both models to study the gender similarity hypothesis, the greater male variability hypothesis, and the masking hypothesis, which predicts that gender differences in general student achievement mask gender differences in both the means and the variability of specific mathematics achievement. Representative data were obtained from 275,369 15-year-old students in 41 countries. The results supported these hypotheses in most countries, demonstrating that a fully hierarchical conceptualization of achievement in terms of the nested-factor model significantly contributes to a better understanding of gender differences in the mean level, variability, and shape of students' achievement profiles.

18.

On the use of the symmetrical square root in G analysis

JASPER W. HOLLEY PAUL KLINE 《Scandinavian journal of psychology》1976,17(1):246-250

Describes a method for computing the symmetrical square root (SSR) for two-dimensional models in G analysis. Demonstrates the application of the SSR in the rotation of factors, using delegate scores as marker variables. Provides illustrative examples, using hypothetical data.

19.

Magnitude of the random sample and level of reliability

D Wendt 《Zeitschrift für Psychologie mit Zeitschrift für angewandte Psychologie》1969,176(3):247-258


20.

Estimating the reliability of interview data

Joseph L. Fleiss 《Psychometrika》1970,35(2):143-162

A model for a score based on an interview is presented which identifies the effect due to the subject, to the manner in which the interviewer tends to conduct his interviews, to the criteria he tends to use in scoring subjects' responses, to the compromises he tends to adopt between the demands of interviewing and those of scoring, and to chance errors. A suggested experimental design calls for each of K investigators to interview a different sample of N subjects, but for all investigators to score each subject. The drawing of inferences when interest is only in the K participants in the reliability study is considered, and a numerical example is given. This work was supported in part by grant DE R01 00793 from the National Institute of Dental Research, and in part by grants MH 08534 and MH 09191 from the National Institute of Mental Health, and forms part of the author's Ph.D. dissertation at Columbia University. The guidance provided by Professor T. W. Anderson is gratefully acknowledged.