1.

Considerations in the choice of interobserver reliability estimates

Hartmann DP 《Journal of applied behavior analysis》1977,10(1):103-116

Two types of interobserver reliability values may be needed in treatment studies in which observers constitute the primary data-acquisition system: trial reliability and the reliability of the composite unit or score which is subsequently analyzed, e.g., daily or weekly session totals. Two approaches to determining interobserver reliability are described: percentage agreement and "correlational" measures of reliability. The interpretation of these estimates, factors affecting their magnitude, and the advantages and limitations of each approach are presented.
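
As a rough illustration of the two approaches named in this abstract, the sketch below computes percentage agreement on trial-level records and a Pearson correlation on session totals. The function names and the data are hypothetical, and Pearson's r is only one of several possible "correlational" measures; the abstract does not say which ones the paper uses.

```python
from math import sqrt

def percent_agreement(obs_a, obs_b):
    """Percentage of trials on which two observers made the same record."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("records must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(obs_a, obs_b))
    return 100.0 * agreements / len(obs_a)

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical trial-by-trial records (1 = behavior scored, 0 = not scored).
trials_a = [1, 0, 1, 1, 0, 0, 1, 0]
trials_b = [1, 0, 0, 1, 0, 1, 1, 0]
print(percent_agreement(trials_a, trials_b))  # trial reliability

# Hypothetical session totals from the same two observers.
totals_a = [12, 15, 9, 20, 17]
totals_b = [11, 16, 9, 19, 18]
print(pearson_r(totals_a, totals_b))  # reliability of the composite score
```

The two numbers answer different questions: the first describes agreement on individual trials, the second describes how consistently the observers rank the session totals that are later analyzed.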

2.

A note on reliability estimation for a test with components of unknown functional lengths

Michelle Liou 《Psychometrika》1989,54(1):153-163

This research note proposes two reliability coefficients for tests with components of unknown functional lengths. The derived coefficients are extensions of the techniques devised by Kristof and Feldt and do not require a reduction of test components into parts. Simulation study indicates that the new coefficients yield reasonably stable reliability estimates when the number of test components is small.

3.

On robustness of the normal-theory based asymptotic distributions of three reliability coefficient estimates

Ke-Hai Yuan Peter M. Bentler 《Psychometrika》2002,67(2):251-259

This paper studies the asymptotic distributions of three reliability coefficient estimates: Sample coefficient alpha, the reliability estimate of a composite score following a factor analysis, and the estimate of the maximal reliability of a linear combination of item scores following a factor analysis. Results indicate that the asymptotic distribution for each of the coefficient estimates, obtained based on a normal sampling distribution, is still valid within a large class of nonnormal distributions. Therefore, a formula for calculating the standard error of the sample coefficient alpha, recently obtained by van Zyl, Neudecker and Nel, applies to other reliability coefficients and can still be used even with skewed and kurtotic data such as are typical in the social and behavioral sciences. This research was supported by grants DA01070 and DA00017 from the National Institute on Drug Abuse and a University of North Texas faculty research grant. We would like to thank the Associate Editor and two reviewers for suggestions that helped to improve the paper.
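
For readers unfamiliar with the first of the three estimates, here is a minimal sketch of sample coefficient alpha using the textbook formula alpha = k/(k − 1) · (1 − sum of item variances / variance of total score). The function name and the data are hypothetical, and the standard-error formula mentioned in the abstract is not reproduced here.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Sample coefficient alpha.

    scores: list of rows, one row per person, one entry per item.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two persons")
    # Sample variance of each item across persons.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of the total (sum) score.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses of five persons to three items.
data = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
]
print(cronbach_alpha(data))
```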

4.

A method of bias correction for maximal reliability with dichotomous measures

Spiridon Penev Tenko Raykov 《The British journal of mathematical and statistical psychology》2010,63(1):163-175

This paper is concerned with the reliability of weighted combinations of a given set of dichotomous measures. Maximal reliability for such measures has been discussed in the past, but the pertinent estimator exhibits a considerable bias and mean squared error for moderate sample sizes. We examine this bias, propose a procedure for bias correction, and develop a more accurate asymptotic confidence interval for the resulting estimator. In most empirically relevant cases, the bias correction and mean squared error correction can be performed simultaneously. We propose an approximate (asymptotic) confidence interval for the maximal reliability coefficient, discuss the implementation of this estimator, and investigate the mean squared error of the associated asymptotic approximation. We illustrate the proposed methods using a numerical example.

5.

A generalized expression for the reliability of measures

HORST P 《Psychometrika》1949,14(1):21-31

In certain situations it is important to obtain as many measures as possible, all presumably measuring the same function, for each of a group of persons. In general the number and source of the measures may vary from one member of the group to another. We take the mean of the measures for each person as the best estimate of the function for that person. The conventional formulas can not be used to determine the reliability of a set of means so obtained. A formula is developed which provides a unique estimate of the reliability of such a set of means. The formula is more general than some of the well-known reliability formulas, so that these formulas are shown to be special cases of the more general formula.

6.

On the reliability of a weighted composite

Charles I. Mosier 《Psychometrika》1943,8(3):161-168

A general formula for the reliability of a weighted composite has been derived by which that reliability can be estimated from a knowledge of the weights whatever their source, reliabilities, dispersions, and intercorrelations of the components. The Spearman-Brown formula has been shown to be a special case of the general statement. The effect of the internal consistency or intercorrelation of the components has been investigated and the conditions defining the set of weights yielding maximum reliability shown to be that the weight of a component is proportional to the sum of its intercorrelations with the remaining components and inversely proportional to its error variance.
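
The abstract does not print the formula, so the sketch below uses the standard classical-test-theory expression for a weighted composite with uncorrelated errors: reliability = 1 − (sum of w_j² s_j² (1 − r_jj)) / (composite variance). Treat that form, the function names, and the numbers as assumptions for illustration. The check at the end shows the Spearman-Brown special case mentioned in the abstract: equal weights, equal dispersions, and intercorrelations equal to the common reliability.

```python
def composite_reliability(weights, sds, reliabilities, corr):
    """Reliability of a weighted composite (uncorrelated errors assumed).

    weights, sds, reliabilities: one value per component.
    corr: full correlation matrix of the components (1.0 on the diagonal).
    """
    k = len(weights)
    # Variance of the weighted composite.
    composite_var = sum(
        weights[i] * weights[j] * sds[i] * sds[j] * corr[i][j]
        for i in range(k)
        for j in range(k)
    )
    # Error variance contributed by each weighted component.
    error_var = sum(
        (weights[j] * sds[j]) ** 2 * (1 - reliabilities[j]) for j in range(k)
    )
    return 1 - error_var / composite_var

def spearman_brown(r, k):
    """Reliability of a test lengthened k times from parts of reliability r."""
    return k * r / (1 + (k - 1) * r)

# Hypothetical case: three parallel parts, each with reliability 0.6.
r = 0.6
corr = [[1.0, r, r], [r, 1.0, r], [r, r, 1.0]]
general = composite_reliability([1, 1, 1], [1, 1, 1], [r, r, r], corr)
print(general, spearman_brown(r, 3))  # the two values should coincide
```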

7.

Beyond the Spearman-Brown: a structural approach to maximal reliability

Drewes DW 《Psychological Methods》2000,5(2):214-227

The requirement of parallel parts has long been the cornerstone of classic reliability theory. By recasting reliability in a structural equation framework, items, raters, or judges no longer need to be treated as equivalent entities. Instead, unique reliability estimates can be determined for each and collectively used to assess the maximal reliability of a weighted composite, with the composite reliability submitted to inferential test. Procedures are shown to generalize from single to multifactor applications. Ramifications of a structural approach to reliability determination are probed, and the dilemma posed by possible falsification of the true score hypothesis presented for individual researcher consideration.

8.

Reliability and expected loss: A unifying principle

Bruce Cooil Roland T. Rust 《Psychometrika》1994,59(2):203-216

We provide a unified, theoretical basis on which measures of data reliability may be derived or evaluated, for both quantitative and qualitative data. This approach evaluates reliability as the proportional reduction in loss (PRL) that is attained in a sample by an optimal estimator. The resulting measure is between 0 and 1, linearly related to expected loss, and provides a direct way of contrasting the measured reliability in the sample with the least reliable and most reliable data-generating cases. The PRL measure is a generalization of many of the commonly-used reliability measures. We show how the quantitative measures from generalizability theory can be derived as PRL measures (including Cronbach's alpha and measures proposed by Winer). For categorical data, we develop a new measure for the general case in which each of N judges assigns a subject to one of K categories and show that it is equivalent to a measure proposed by Perreault and Leigh for the case where N is 2. Bruce Cooil is an Associate Professor of Statistics, and Roland T. Rust is a Professor and area head for Marketing. The authors thank three anonymous reviewers and an Associate Editor for their helpful comments and suggestions. This work was supported in part by the Dean's Fund for Faculty Research of the Owen Graduate School of Management, Vanderbilt University.

9.

Estimation of maximal reliability: A note on a covariance structure modelling approach

《The British journal of mathematical and statistical psychology》2004,57(1):21-27

A one‐step covariance structure analysis procedure for estimation of maximal reliability of linear composites with congeneric measures is outlined. The approach is readily employed within a single modelling session using popular covariance structure analysis software, and permits simultaneous estimation of the optimal measure weights with standard errors. The method is illustrated by a numerical example.

10.

On the reliability of the extreme score

Huynh Huynh 《Psychometrika》1986,51(3):475-478

Under the assumption of normality, a formula is derived for the reliability of the maximum score. It is shown that the maximum score is more reliable than each of the single observations, but less reliable than their composite score.

11.

A critical review of the S/L reliability index

Hoi K. Suen Patrick S. C. Lee Jane E. Prochnow-LaGrow 《Journal of psychopathology and behavioral assessment》1985,7(3):277-287

The meaning and properties of a commonly used index of reliability, S/L, were examined critically. It was found that the index does not reflect any conventional concept of reliability. When used for an identical behavioral observation session, it is not statistically correlated with other reliability indices. Within an observation session, the standardizing measure of L is beyond the control of the investigator. Furthermore, the reason for the choice of L as the standard is unclear. The role of chance agreement in S/L is not known. The exact interpretation of the index depends on which observer reports L. Overall the conceptual and mathematical meaning of S/L is dubious. It is suggested that the S/L index should not be used until its nature is shown to be a measure of reliability. Other approaches such as the intraclass correlations and generalizability coefficients should be used instead. The authors are indebted to Johnny Matson for his critique of an earlier version of this paper.

12.

A note on the estimation of the level of predictive precision of a fitted linear equation

Jorge L. Mendoza 《Psychometrika》1977,42(1):145-147

A procedure that utilizes the sample multiple correlation to form a lower bound for the level of predictive precision of a fitted regression equation is suggested. The procedure is shown to yield probability statements which are true at least 100(1 − α)% of the time.

13.

A short note on the maximal point‐biserial correlation under non‐normality

Ying Cheng Haiyan Liu 《The British journal of mathematical and statistical psychology》2016,69(3):344-351

The aim of this paper is to derive the maximal point‐biserial correlation under non‐normality. Several widely used non‐normal distributions are considered, namely the uniform distribution, t‐distribution, exponential distribution, and a mixture of two normal distributions. Results show that the maximal point‐biserial correlation, depending on the non‐normal continuous variable underlying the binary manifest variable, may not be a function of p (the probability that the dichotomous variable takes the value 1), can be symmetric or non‐symmetric around p = .5, and may still lie in the range from −1.0 to 1.0. Therefore researchers should exercise caution when they interpret their sample point‐biserial correlation coefficients based on popular beliefs that the maximal point‐biserial correlation is always smaller than 1, and that the size of the correlation is always further restricted as p deviates from .5.
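
To make the "popular beliefs" in this abstract concrete, the sketch below evaluates one common reading of the maximal point-biserial correlation: the correlation between a continuous variable and the binary variable obtained by cutting that same variable at a threshold that leaves probability p above it. The normal-case expression φ(z)/√(p(1 − p)) is the textbook result; the uniform-case expression √(3p(1 − p)) is derived here directly from that definition and is not quoted from the paper. Function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def max_pbis_normal(p):
    """Correlation between a standard normal X and the indicator of X > z,
    where z is chosen so that P(X > z) = p (0 < p < 1)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - p)
    return std_normal.pdf(z) / sqrt(p * (1 - p))

def max_pbis_uniform(p):
    """Same quantity when X is uniform on [0, 1] and the cut is at 1 - p."""
    return sqrt(3 * p * (1 - p))

for p in (0.1, 0.3, 0.5):
    print(p, round(max_pbis_normal(p), 3), round(max_pbis_uniform(p), 3))
```

Under both of these two distributions the value stays below 1 and shrinks as p moves away from .5, which is the pattern the abstract warns does not hold in general.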

14.

General estimators for the reliability of qualitative data

Bruce Cooil Roland T. Rust 《Psychometrika》1995,60(2):199-220

We study a proportional reduction in loss (PRL) measure for the reliability of categorical data and consider the general case in which each of N judges assigns a subject to one of K categories. This measure has been shown to be equivalent to a measure proposed by Perreault and Leigh for a special case when there are two equally competent judges, and the correct category has a uniform prior distribution. We consider a general framework where the correct category is assumed to have an arbitrary prior distribution, and where classification probabilities vary by correct category, judge, and category of classification. In this setting, we consider PRL reliability measures based on two estimators of the correct category: the empirical Bayes estimator and an estimator based on the judges' consensus choice. We also discuss four important special cases of the general model and study several types of lower bounds for PRL reliability. Bruce Cooil is Associate Professor of Statistics, and Roland T. Rust is Professor and area head for Marketing, Owen Graduate School of Management, Vanderbilt University. The authors thank three anonymous reviewers and an Associate Editor for their helpful comments and suggestions. This work was supported in part by the Dean's Fund for Faculty Research of the Owen Graduate School of Management, Vanderbilt University.

15.

A latent-trait based reliability estimate and upper bound

W. Alan Nicewander 《Psychometrika》1990,55(1):65-74

An estimate and an upper-bound estimate for the reliability of a test composed of binary items is derived from the multidimensional latent trait theory proposed by Bock and Aitkin (1981). The estimate derived here is similar to internal consistency estimates (such as coefficient alpha) in that it is a function of the correlations among test items; however, it is not a lower-bound estimate as are all other similar methods. An upper bound to reliability that is less than unity does not exist in the context of classical test theory. The richer theoretical background provided by Bock and Aitkin's latent trait model has allowed the development of an index (called here) that is always greater-than or equal-to the reliability coefficient for a test (and is less-than or equal-to one). The upper bound estimate of reliability has practical uses, one of which makes use of the greatest lower bound.

16.

A unifying model for the structure of intellectual abilities

Jan-Eric Gustafsson 《Intelligence》1984,8(3):179-203

Models of the structure of cognitive abilities suggested by Spearman, Thurstone, Guilford, Vernon and Cattell-Horn are reviewed. It is noted that some of the models include a general intellectual factor (g) while others do not. It is also noted that some models are nonhierarchical, while in others more narrow abilities are subsumed under broader abilities in a hierarchical pattern. An empirical study in which a test battery of 16 tests was administered to some 1000 subjects in the 6th grade is reported. Using the LISREL technique to test different models, good support is obtained for oblique primary factors in the Thurstone tradition as well as for the second-order factors fluid intelligence, crystallized intelligence, and general visualization hypothesized by Cattell and Horn. It is also found, however, that the second-order factor of fluid intelligence is identical with a third-order g-factor. On the basis of these results a three-level model (the HILI-model) is suggested, with the g-factor at the top, two broad factors reflecting the ability to deal with verbal and figural information, respectively, at the second-order level, and the primary factors in the Thurstone and Guilford tradition at the lowest level. It is argued that most previously suggested models are special cases of the HILI-model.

17.

A test of the hypothesis that Cronbach's alpha reliability coefficient is the same for two tests administered to the same sample

Leonard S. Feldt 《Psychometrika》1980,45(1):99-105

In measurement studies the researcher may wish to test the hypothesis that Cronbach's alpha reliability coefficient is the same for two measurement procedures. A statistical test exists for independent samples of subjects. In this paper three procedures are developed for the situation in which the coefficients are determined from the same sample. All three procedures are computationally simple and give tight control of Type I error when the sample size is 50 or greater. The author is indebted to Jerry S. Gilmer for development of the computer programs used in this study.

18.

A decision tree approach to selecting an appropriate observation reliability index

Hoi K. Suen Donald Ary Wesley C. Covalt 《Journal of psychopathology and behavioral assessment》1990,12(4):359-363

Based on the conceptual framework outlined by Cone (1986) and Suen (1988), a practical decision tree is developed as an aid for the selection of observational reliability indices.

19.

Lower bounds for the reliability of the total score on a test composed of non-homogeneous items: I: Algebraic lower bounds

Paul H. Jackson Christian C. Agunwamba 《Psychometrika》1977,42(4):567-578

Let Σ_x be the (population) dispersion matrix, assumed well-estimated, of a set of non-homogeneous item scores. Finding the greatest lower bound for the reliability of the total of these scores is shown to be equivalent to minimizing the trace of Σ_x by reducing the diagonal elements while keeping the matrix non-negative definite. Using this approach, Guttman's bounds are reviewed, a method is established to determine whether his λ₄ (maximum split-half coefficient alpha) is the greatest lower bound in any instance, and three new bounds are discussed. A geometric representation, which sheds light on many of the bounds, is described. Present affiliation of the second author: Department of Statistics, University of Nigeria (Nsukka Campus). Work on this paper was carried out while on study leave in Aberystwyth.

20.

Evaluating interobserver reliability of interval data

Hopkins BL Hermann JA 《Journal of applied behavior analysis》1977,10(1):121-126

Previous recommendations to employ occurrence, nonoccurrence, and overall estimates of interobserver reliability for interval data are reviewed. A rationale for comparing obtained reliability to reliability that would result from a random-chance model is explained. Formulae and graphic functions are presented to allow for the determination of chance agreement for each of the three indices, given any obtained per cent of intervals in which a response is recorded to occur. All indices are interpretable throughout the range of possible obtained values for the per cent of intervals in which a response is recorded. The level of chance agreement simply changes with changing values. Statistical procedures that could be used to determine whether obtained reliability is significantly superior to chance reliability are reviewed. These procedures are rejected because they yield significance levels that are partly a function of sample sizes and because there are no general rules to govern acceptable significance levels depending on the sizes of samples employed.
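
The paper's own formulae are not reproduced in the abstract. The sketch below shows one plausible random-chance model, assuming the two observers score occurrences independently in proportions p_a and p_b of the intervals; the function name and the handling of undefined cases are hypothetical choices. Occurrence agreement is taken over intervals in which at least one observer scored an occurrence, and nonoccurrence agreement over intervals in which at least one observer scored a nonoccurrence.

```python
def chance_agreement(p_a, p_b):
    """Expected agreement under independent scoring.

    p_a, p_b: proportion of intervals in which each observer records
    the response (values between 0 and 1).
    Returns proportions for the overall, occurrence, and nonoccurrence
    indices; an index is None when its denominator is zero.
    """
    both = p_a * p_b                 # both record an occurrence
    neither = (1 - p_a) * (1 - p_b)  # both record a nonoccurrence
    occurrence_base = 1 - neither    # at least one records an occurrence
    nonoccurrence_base = 1 - both    # at least one records a nonoccurrence
    return {
        "overall": both + neither,
        "occurrence": both / occurrence_base if occurrence_base > 0 else None,
        "nonoccurrence": (
            neither / nonoccurrence_base if nonoccurrence_base > 0 else None
        ),
    }

# Chance levels shift as the recorded rate of responding changes.
for rate in (0.1, 0.5, 0.9):
    print(rate, chance_agreement(rate, rate))
```

Comparing an obtained index with the matching chance value, rather than with a fixed cutoff, is the kind of comparison the abstract argues for.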