
1.

Coherence, striking agreement, and reliability

Michael Schippers 《Synthese》2014,191(15):3661-3684

Striving for a probabilistic explication of coherence, scholars have proposed a distinction between agreement and striking agreement. In this paper I argue that only the former should be considered a genuine concept of coherence. In a second step the relation between coherence and reliability is assessed. I show that it is possible to concur with common intuitions regarding the impact of coherence on reliability in various types of witness scenarios by means of an agreement measure of coherence. Highlighting the need to separate the impact of coherence and specificity on reliability, it is finally shown that a recently proposed vindication of the Shogenji measure qua measure of coherence vanishes.

2.

An information theoretic measure for the evaluation of ordinal scale data

Tastle WJ Wierman MJ 《Behavior research methods》2006,38(3):487-494

This article describes a new measure of dispersion as an indication of consensus and dissention. Building on the generally accepted Shannon entropy, this measure utilizes a probability distribution and the ordered ranking of categories in an ordinal scale distribution to yield a value confined to the unit interval. Unlike other measures that need to be normalized, this measure is always in the interval 0 to 1. The measure is typically applied to the Likert scale to determine degrees of agreement among ordinal-ranked categories when one is dealing with data collection and analysis, although other scales are possible. Using this measure, investigators can easily determine the proximity of ordinal data to consensus (agreement) or dissention. Consensus and dissention are defined relative to the degree of proximity of values constituting a frequency distribution on the ordinal scale measure. The authors identify a set of criteria that a measure must satisfy in order to be an acceptable indicator of consensus and show how the consensus measure satisfies all the criteria.
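As a rough illustration of the kind of measure this abstract describes, the sketch below scores Likert-style frequency data on the unit interval. It assumes the commonly cited form Cns(X) = 1 + Σ p_i log2(1 − |X_i − μ_X| / d_X), where d_X is the width of the scale; the abstract itself does not state the formula, and the function name `consensus` and the sample counts are hypothetical.

```python
import math

def consensus(values, counts):
    """Consensus score in [0, 1] for ordinal data (assumed Tastle-Wierman form)."""
    total = sum(counts)
    probs = [c / total for c in counts]
    mean = sum(p * x for p, x in zip(probs, values))
    width = max(values) - min(values)  # d_X: width of the ordinal scale
    score = 1.0
    for p, x in zip(probs, values):
        if p > 0:  # empty categories contribute nothing to the sum
            score += p * math.log2(1 - abs(x - mean) / width)
    return score

likert = [1, 2, 3, 4, 5]
print(consensus(likert, [0, 0, 10, 0, 0]))  # all respondents in one category
print(consensus(likert, [5, 0, 0, 0, 5]))   # respondents split between the extremes
print(consensus(likert, [1, 2, 4, 2, 1]))   # a mixed, centred distribution
```

Under this assumed form, full agreement gives the top of the scale and an even split between the two extreme categories gives the bottom, with no separate normalization step.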

3.

An ordinal coefficient of relational agreement for multiple judges

Robert F. Fagot 《Psychometrika》1994,59(2):241-251

In a recent article, Fagot proposed a generalized family of coefficients of relational agreement for multiple judges, focusing on the concept of empirically meaningful relationships. In this paper an ordinal coefficient of relational agreement, based on ranking data, is presented as a special case of the generalized family. It is shown that the proposed ordinal coefficient encompasses other ordinal coefficients, such as the Kendall coefficient of concordance, the average Spearman rank-order coefficient, and intraclass correlation based on ranks. It is also shown that the Kendall coefficient of concordance, corrected for chance agreement, is equivalent to the ordinal coefficient proposed in this paper.
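The link between the Kendall coefficient of concordance and the average Spearman coefficient mentioned in this abstract can be sketched numerically. The code below is not Fagot's coefficient; it only uses the textbook definitions for untied ranks, W = 12S / (m²(n³ − n)) and mean r_s = (mW − 1) / (m − 1), and the function names and sample rankings are hypothetical.

```python
def kendall_w(rankings):
    """Kendall's coefficient of concordance for m judges ranking n items (no ties)."""
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)  # spread of the rank sums
    return 12 * s / (m ** 2 * (n ** 3 - n))

def mean_spearman_from_w(w, m):
    """Average pairwise Spearman correlation implied by W for m judges."""
    return (m * w - 1) / (m - 1)

def spearman(a, b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

judges = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]  # made-up rankings
w = kendall_w(judges)
pairwise = [spearman(judges[i], judges[j])
            for i in range(3) for j in range(i + 1, 3)]
print(w, mean_spearman_from_w(w, 3), sum(pairwise) / len(pairwise))
```

The last two printed values should coincide, which is the sense in which a rescaled W and the average Spearman coefficient carry the same information.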

4.

The Struggle of Behavioral Therapists With Exposure: Self-Reported Practicability, Negative Beliefs, and Therapist Distress About Exposure-Based Interventions

Andre Pittig Roxana Kotter Jürgen Hoyer 《Behavior Therapy》2019,50(2):353-366

Exposure-based interventions are a core ingredient of evidence-based cognitive-behavioral treatment (CBT) for anxiety disorders, posttraumatic stress disorder (PTSD), and obsessive-compulsive disorder (OCD). However, previous research has documented that exposure is rarely utilized in routine care, highlighting an ongoing lack of dissemination. The present study examined barriers for the dissemination of exposure from the perspective of behavioral psychotherapists working in outpatient routine care (N = 684). A postal survey assessed three categories of barriers: (a) practicability of exposure-based intervention in an outpatient private practice setting, (b) negative beliefs about exposure, and (c) therapist distress related to the use of exposure. In addition, self-reported competence to conduct exposure for different anxiety disorders, PTSD, and OCD was assessed. High rates of agreement were found for single barriers within each of the three categories (e.g., unpredictable time management, risk of uncompensated absence of the patient, risk of decompensation of the patient, superficial effectiveness, or exposure being very strenuous for the therapist). Separately, average agreement to each category negatively correlated with self-reported utilization of exposure to a moderate degree (-.35 ≤ r ≤ -.27). In a multiple regression model, only average agreement to barriers of practicability and negative beliefs were significantly associated with utilization rates. Findings illustrate that a multilevel approach targeting individual, practical, and systemic barriers is necessary to optimize the dissemination of exposure-based interventions. Dissemination efforts may therefore benefit from incorporating strategies such as modifying negative beliefs, adaptive stress management for therapists, or increasing practicability of exposure-based interventions.

5.

Testing intergroup concordance in ranking experiments with two groups of judges

Dekle DJ Leung DH Zhu M 《Psychological Methods》2008,13(1):58-71

Across many areas of psychology, concordance is commonly used to measure the (intragroup) agreement in ranking a number of items by a group of judges. Sometimes, however, the judges come from multiple groups, and in those situations, the interest is to measure the concordance between groups, under the assumption that there is some within-group concordance. In this investigation, existing methods are compared under a variety of scenarios. Permutation theory is used to calculate the error rates and the power of the methods. Missing data situations are also studied. The results indicate that the performance of the methods depends on (a) the number of items to be ranked, (b) the level of within-group agreement, and (c) the level of between-group agreement. Overall, using the actual ranks of the items gives better results than using the pairwise comparison of rankings. Missing data lead to loss in statistical power, and in some cases, the loss is substantial. The degree of power loss depends on the missing mechanism and the method of imputing the missing data, among other factors.

6.

Exploring Child and Parent Factors in the Diagnostic Agreement on the Anxiety Disorders Interview Schedule

Lena Reuterskiöld Lars-Göran Öst Thomas Ollendick 《Journal of psychopathology and behavioral assessment》2008,30(4):279-290

Worryingly low levels of parent–child agreement on child psychiatric diagnosis are reported. This study examined parent–child agreement on diagnostic categories and severity ratings with the Anxiety Disorders Interview Schedule, Child and Parent versions (ADIS-C/P). Children’s age, gender, motivation and self-concept and parents’ general psychopathology and diagnoses were examined. Participants were 110 children (aged 8–14 years) with a principal specific phobia diagnosis, and their parents. Findings revealed excellent parent–child agreement on principal specific phobia diagnosis (97.3%), and fair levels of concordance on most co-occurring secondary diagnoses. As expected, children with high motivation had generally stronger parent–child agreement on diagnoses and severity ratings (for ADHD, p …

7.

How to Choose a Measure of Resilience: An Organizing Framework for Resilience Measurement

David M. Fisher Rebekah D. Law 《Psychologie appliquee》2021,70(2):643-673

Existing measures of resilience include disparate content that is based on qualitatively different conceptualizations of the construct. Consequently, there is confusion and inconsistency regarding the measurement and application of resilience. To promote clarity, the authors conduct an examination of resilience measurement approaches. This was accomplished by elaborating on three fundamentally distinct conceptualizations of resilience that can serve as an organizing framework for measurement: (1) attribute/resource-focused; (2) process-focused; and (3) outcome-focused. To verify the utility of this framework, qualified and trained subject matter experts (SMEs) completed a content analysis categorization task by sorting 227 items from 11 scales into these categories. Frame-of-reference (FOR) training was used to prepare the SMEs. Results were largely supportive of the three-category framework, as overall SME agreement was 86.76 percent and only 10 of 227 items (4.41%) were categorized as “unclear” with regard to the categories. At the same time, SME agreement varied across the scales and sub-dimensions examined, suggesting that some scales/sub-dimensions are more conceptually clear than others in terms of the three categories. Based on the results, the authors provide guidance for how to choose a measure of resilience and discuss different workplace applications that are aligned with the aforementioned categories.

8.

Two Theories of Visual Speed

W. Köhler H. Wallach D. Cartwright 《The Journal of general psychology》2013,140(1):93-109

A measure of clustering in free recall based upon the parameters of the original stimulus list (R/Opt. R) was proposed and compared with five other measures for eight recall protocols. The R/Opt. R measure was shown to be relatively independent of the length of the recall protocol, but yielded higher scores where fewer categories were utilized. Optimum clustering was defined not simply in terms of perfect ordering of elements within the protocol but also in terms of the number of categories and items within those categories in the original stimulus list.

9.

Monotonic measures of agreement for ranked data

Gordon R. Stavig 《The British journal of mathematical and statistical psychology》1984,37(2):283-287

The τ_b and γ statistics are interpreted as rank-monotonic coefficients of partial agreement. Using a method of transposition employed by Pearson's r_i intraclass correlation coefficient, the τ_bi and γ_i intraclass coefficients of total monotonic agreement are created. Transpositional measures of agreement like τ_bi and γ_i measure the combined effects of cell and marginal disagreement, which makes them particularly suitable for reliability studies. The coefficients are also made applicable to K > 2 sets of ranks.

10.

On the existence of prelinguistic categories: A case study

Carolyn B. Mervis 《Infant behavior & development》1985,8(3):293-300

The purpose of this study was to investigate whether infants spontaneously form categories during the prelinguistic period and whether these categories are based on the same principles as adult basic-level categories. A new methodology, using a functional use measure as the determinant of category composition, was employed in a case study of one infant's horn category. Results indicated that this category met Schlesinger's (1982) criteria for a prelinguistic category and that the category was based on similarity relationships, as adult categories are.

11.

Index to Volume 112

Donald W. Zimmerman Bruno D. Zumbo 《The Journal of general psychology》2013,140(4):409-412

For various nonnormal distributions, the power of the Student t test can be increased if continuous measures are transformed to ranks before the test is performed. The power of the test can also be increased almost as much, and for some distributions even more, if measures are replaced by dichotomous variables with the values 0 and 1, instead of ranks. Similarly, the power of a significance test of correlation can be increased if scores are transformed to ranks, that is, with the use of the Spearman rank correlation method. Power can also be increased almost as much, and in some cases even more, if dichotomous variables are introduced, that is, if the phi coefficient is used as a measure of correlation.

12.

An implicit enumeration method for an exact test of weighted kappa

Michael J. Brusco Stephanie Stahl Douglas Steinley 《The British journal of mathematical and statistical psychology》2008,61(2):439-452

The kappa coefficient is one of the most widely used measures for evaluating the agreement between two raters asked to assign N objects to one of K nominal categories. Weighted versions of kappa enable partial credit to be awarded for near agreement, most notably in the case of ordinal categories. An exact significance test for weighted kappa can be conducted by enumerating all rater agreement tables with the same fixed marginal frequencies as the observed table, and accumulating the probabilities for all tables that produce a weighted kappa index that is greater than or equal to the observed measure. Unfortunately, complete enumeration of all tables is computationally unwieldy for modest values of N and K. We present an implicit enumeration algorithm for conducting an exact test of weighted kappa, which can be applied to tables of non‐trivial size. The algorithm is particularly efficient for ‘good’ to ‘excellent’ values of weighted kappa that typically have very small p‐values. Therefore, our method is beneficial for situations where resampling tests are of limited value because the number of trials needed to estimate the p‐value tends to be large.
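To make the idea of an exact test concrete, the sketch below uses plain brute force rather than the authors' implicit enumeration algorithm. It assumes that permuting one rater's labels over the objects is an acceptable stand-in for enumerating tables with fixed margins, and it is only workable for very small N because it visits all N! arrangements. The function names and the toy ratings are hypothetical.

```python
from itertools import permutations

def weighted_kappa(r1, r2, k, power=1):
    """Weighted kappa with disagreement weights |i - j| ** power (1 linear, 2 quadratic)."""
    n = len(r1)
    observed = sum(abs(a - b) ** power for a, b in zip(r1, r2)) / n
    expected = sum(
        r1.count(i) * r2.count(j) * abs(i - j) ** power
        for i in range(k) for j in range(k)
    ) / n ** 2
    return 1 - observed / expected

def exact_p_value(r1, r2, k, power=1):
    """Share of all arrangements of r2 whose kappa is at least the observed value."""
    observed = weighted_kappa(r1, r2, k, power)
    hits = total = 0
    for perm in permutations(r2):  # N! arrangements; margins stay fixed throughout
        total += 1
        if weighted_kappa(r1, list(perm), k, power) >= observed - 1e-12:
            hits += 1
    return hits / total

rater1 = [0, 0, 1, 1, 2, 2]  # made-up ratings of six objects on a 3-point scale
rater2 = [0, 0, 1, 1, 2, 2]
print(weighted_kappa(rater1, rater2, 3))
print(exact_p_value(rater1, rater2, 3))
```

Even at N = 6 the loop already covers 720 arrangements, which hints at why complete enumeration becomes unwieldy and why a smarter search is attractive for high observed kappa values.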

13.

A comparison of the Bem Sex-Role Inventory and the Heilbrun Masculinity and Femininity Scales

Small AC Erdwins C Gross RB 《Journal of personality assessment》1979,43(4):393-395

This study compares two instruments which have recently been devised to measure sex-role identification, Heilbrun's Masculinity and Femininity Scales and the Bem Sex-Role Inventory. Correlations between the masculine and feminine scales of these instruments were significant for male but not female subjects; intrascale comparisons found no relationship between the Bem scales but moderate correlations between the Heilbrun scales for male subjects. There was agreement between the two measures in classifying approximately 47% of the subjects into one of the four sex-role categories. Misclassification occurred primarily on categories which have been found to show considerable overlap in personality characteristics.

14.

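Fuzzy valuation of the meanings of emotion adjectives (total citations: 5; self-citations: 0; citations by others: 5)

Feng Sihai (凤四海) Huang Xiting (黄希庭) 《Acta Psychologica Sinica》2004,36(6):704-711

A total of 353 undergraduate participants used a fuzzy valuation method to assign experience-based semantic values for the intensity and complexity of 48 adjectives expressing four basic emotions: joy, anger, sadness, and fear. The results showed that: (1) intensity and complexity are two distinct dimensions of emotion rating, and the fuzzy semantic values of the emotion adjectives on the two dimensions are not necessarily related; (2) on both intensity and complexity, the values assigned to each word were largely consistent with people's everyday understanding of the emotional experience the word expresses, and the scale values of the two genders were largely consistent; (3) differences between male and female students appeared mainly in the rank order of the words and in the degree of certainty, and these differences may be related to gender differences in emotional experience and to individual differences; (4) the students' certainty in their fuzzy semantic valuations was generally high, although the meaning of each word showed some fuzziness on both dimensions; (5) the fuzzy distance measure and the Phi-square association measure, both computed from the membership-degree data of the intensity valuations, yielded similar cluster-analysis results for the four classes of adjectives.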

15.

Categorical invariance and structural complexity in human concept learning

Ronaldo Vigo 《Journal of mathematical psychology》2009,53(4):203-4859

An alternative account of human concept learning based on an invariance measure of the categorical stimulus is proposed. The categorical invariance model (CIM) characterizes the degree of structural complexity of a Boolean category as a function of its inherent degree of invariance and its cardinality or size. To do this we introduce a mathematical framework based on the notion of a Boolean differential operator on Boolean categories that generates the degrees of invariance (i.e., logical manifold) of the category in respect to its dimensions. Using this framework, we propose that the structural complexity of a Boolean category is indirectly proportional to its degree of categorical invariance and directly proportional to its cardinality or size. Consequently, complexity and invariance notions are formally unified to account for concept learning difficulty. Beyond developing the above unifying mathematical framework, the CIM is significant in that: (1) it precisely predicts the key learning difficulty ordering of the SHJ [Shepard, R. N., Hovland, C. L., & Jenkins, H. M. (1961). Learning and memorization of classifications. Psychological Monographs: General and Applied, 75(13), 1-42] Boolean category types consisting of three binary dimensions and four positive examples; (2) it is, in general, a good quantitative predictor of the degree of learning difficulty of a large class of categories (in particular, the 41 category types studied by Feldman [Feldman, J. (2000). Minimization of Boolean complexity in human concept learning. Nature, 407, 630-633]); (3) it is, in general, a good quantitative predictor of parity effects for this large class of categories; (4) it does all of the above without free parameters; and (5) it is cognitively plausible (e.g., cognitively tractable).

16.

Some Paradoxical Results for the Quadratically Weighted Kappa (total citations: 1; self-citations: 0; citations by others: 1)

Matthijs J. Warrens 《Psychometrika》2012,77(2):315-323

The quadratically weighted kappa is the most commonly used weighted kappa statistic for summarizing interrater agreement on an ordinal scale. The paper presents several properties of the quadratically weighted kappa that are paradoxical. For agreement tables with an odd number of categories n it is shown that if one of the raters uses the same base rates for categories 1 and n, categories 2 and n−1, and so on, then the value of quadratically weighted kappa does not depend on the value of the center cell of the agreement table. Since the center cell reflects the exact agreement of the two raters on the middle category, this result questions the applicability of the quadratically weighted kappa to agreement studies. If one wants to report a single index of agreement for an ordinal scale, it is recommended that the linearly weighted kappa be used instead of the quadratically weighted kappa.
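The center-cell result described in this abstract can be checked numerically on a small example. The sketch below uses a made-up 3 × 3 table of counts in which the row rater uses categories 1 and 3 equally often (15 objects each), and compares quadratic and linear weights as the center cell grows; the helper names and all counts are hypothetical.

```python
def weighted_kappa_from_table(table, power):
    """Weighted kappa from a square table of counts, disagreement weights |i - j| ** power."""
    k = len(table)
    n = sum(sum(row) for row in table)
    rows = [sum(row) for row in table]
    cols = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = sum(table[i][j] * abs(i - j) ** power
                   for i in range(k) for j in range(k))
    expected = sum(rows[i] * cols[j] * abs(i - j) ** power
                   for i in range(k) for j in range(k)) / n
    return 1 - observed / expected

def table_with_center(c):
    """Hypothetical counts; only the center cell (both raters choose category 2) varies."""
    return [[10, 3, 2],
            [4, c, 5],
            [1, 6, 8]]

for center in (0, 10, 50):
    t = table_with_center(center)
    print(center,
          weighted_kappa_from_table(t, 2),  # quadratic weights
          weighted_kappa_from_table(t, 1))  # linear weights
```

In this example the quadratic value stays fixed while the linear value rises as exact agreement on the middle category increases, which is the behaviour the abstract calls paradoxical.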

17.

Asking Adolescents to Explain Discrepancies in Self-Reported Suicidality

Drew M. Velting Jill H. Rathus Gregory M. Asnis 《Suicide & life-threatening behavior》1998,28(2):187-196

We present a typology of adolescents' most common explanations for discrepant reporting of suicidal behavior. Forty-eight adolescents provided attempt histories by completing a self-report measure of suicidality. A select number of items were subsequently readministered (average interval = 5 days) using a semistructured interview format. Discrepancies in reporting were found among 50% of the sample. Adolescents were also asked to clarify, using an open-ended format, what might have accounted for a particular discrepancy. Based on these responses, seven mutually exclusive and exhaustive categories were derived. High rates of interrater agreement indicated that these categories were reliable.

18.

Modifications of the Poggendorff effect as a function of random dot textures between the verticals

Roberto Masini Tommaso Costa Mario Ferraro Angelo De Marco 《Attention, perception & psychophysics》1994,55(5):505-512

In the present research, we investigated the modification of the strength of the Poggendorff illusion as a function of different densities of random dot textures filling the space between the verticals. The results of Experiment 1 show that the illusory effect is a nonlinear function of the texture parameter r, the ratio of black pixels to white and black pixels, with a minimum for r = 0.5, approximately, and a maximum for r = 0 and r = 1. The results may be interpreted by an analytical model of perceptual space dynamics, in which the effect depends on the amount of interaction between points of different light intensity. A computer simulation performed by applying the analytical model to different values of r shows a good agreement between the predictions and the experimental data. To test the hypothesis underlying the model, a second experiment was carried out to measure the magnitude of the expansion of the space between the verticals as a function of the parameter r. The results are consistent with the hypothesis of the model. The overall data are discussed in terms of their implications on various theories proposed for the Poggendorff illusion.

19.

Daily Functioning, Health Status, and Happiness in Older Adults

Erik Angner Jennifer Ghandhi Kristen Williams Purvis Daniel Amante Jeroan Allison 《Journal of Happiness Studies》2013,14(5):1563-1574

The hypothesis that the degree to which disease disrupts daily functioning is inversely associated with happiness is widely accepted, yet existing literature offers little direct evidence in its support. This paper explores the hypothesized association in a community-based sample of 383 older adults. To assess the degree to which disease disrupts daily functioning we developed a measure, called the freedom-from-debility score, based on four Short Form-12 (SF-12) Health Survey questions explicitly designed to represent “limitations in physical activities because of health problems” and “limitations in usual role activities because of physical health problems.” The results were consistent with the hypothesis. When participants were divided into categories based on their freedom-from-debility score, median happiness scores were monotonically increasing across categories. Controlling for demographic and socio-economic factors as well as health status (measured both subjectively and objectively), a one-point increase in freedom-from-debility score (on a scale from 0 to 100) was associated with a three-percent reduction in the odds of lower-quartile happiness. The results support the contention that health status is one of the most influential predictors of happiness, that the association between health status and happiness depends greatly on the manner in which health status is measured, and that the degree to which disease disrupts daily functioning is inversely associated with happiness.

20.

Test-retest reliability of the Mirowsky-Ross 2 x 2 Index of the Sense of Control

Wolinsky FD Wyrwich KW Metz SM Babu AN Tierney WM Kroenke K 《Psychological reports》2004,94(2):725-732

This study investigated the short-term stability of the 1991 Mirowsky-Ross 2 x 2 Index of the Sense of Control. From an ongoing longitudinal study, 304 subjects were randomly selected for test-retest interviews occurring 1 to 4 days after their regularly scheduled first follow-up interview. Test-retest reliability was assessed at the item level using percent agreement and weighted kappa. At the scale score level, reliability was assessed with the intraclass correlation coefficient (ICC). ICCs were also calculated within categories of demographic, socioeconomic, psychosocial, and functional status characteristics. There was moderate to substantial item-level agreement (mean weighted kappa = .51; weighted kappa range = .38 to .66). At the scale score level there was substantial agreement (ICC = .71). No appreciable differences in ICC values were found in the demographic, socioeconomic, psychosocial, and functional comparisons of status characteristics. Thus, this sense of control measure has acceptable test-retest reliability and is appropriate for use in longitudinal research.
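For readers unfamiliar with the statistics named in this abstract, the sketch below computes percent agreement for item responses and an intraclass correlation for scale scores. The abstract does not say which ICC form was used, so the one-way random-effects ICC(1,1) here is an assumption, and the function names and toy data are hypothetical.

```python
def percent_agreement(test, retest):
    """Share of items answered identically on both occasions."""
    return sum(a == b for a, b in zip(test, retest)) / len(test)

def icc_oneway(scores):
    """One-way random-effects ICC(1,1); scores[i] holds the k measurements of subject i."""
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    means = [sum(row) / k for row in scores]
    # Mean squares from a one-way ANOVA with subjects as the grouping factor
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(scores, means) for x in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

print(percent_agreement([1, 2, 3, 4], [1, 2, 0, 4]))  # made-up item responses
print(icc_oneway([[1, 2], [3, 4], [5, 6]]))           # made-up test and retest scores
```

Other ICC forms (for example two-way models) can give different values on the same data, so the choice of form matters when comparing against a reported coefficient.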