1.

Analysis of unreplicated three-way classifications, with applications to rater bias and trait independence

Julian C. Stanley 《Psychometrika》1961,26(2):205-219

The seven analysis-of-variance mean squares for an unreplicated three-way classification may be written as linear combinations of a mean variance and three mean covariances. Formulas are presented for computing the mean variances and mean covariances from linear combinations of mean squares. The relevance of these formulas for assessing rater biases and trait independence is discussed, a numerical example is provided, and proposed extensions are briefly noted. The research reported herein was performed pursuant to a contract with the United States Office of Education, Department of Health, Education, and Welfare. The assistance of Sister M. Jacinta Mann, S. C., at one stage of this investigation is gratefully acknowledged.

2.

A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF

Robin Shealy William Stout 《Psychometrika》1993,58(2):159-194

A model-based modification (SIBTEST) of the standardization index based upon a multidimensional IRT bias modeling approach is presented that detects and estimates DIF or item bias simultaneously for several items. A distinction between DIF and bias is proposed. SIBTEST detects bias/DIF without the usual Type 1 error inflation due to group target ability differences. In simulations, SIBTEST performs comparably to Mantel-Haenszel for the one item case. SIBTEST investigates bias/DIF for several items at the test score level (multiple item DIF called differential test functioning: DTF), thereby allowing the study of test bias/DIF, in particular bias/DIF amplification or cancellation and the cognitive bases for bias/DIF. This research was partially supported by Office of Naval Research Cognitive and Neural Sciences Grant N0014-90-J-1940, 4421-548 and National Science Foundation Mathematics Grant NSF-DMS-91-01436. The research reported here is collaborative in every respect and the order of authorship is alphabetical. The assistance of Hsin-hung Li and Louis Roussos in conducting the simulation studies was of great help. Discussions with Terry Ackerman, Paul Holland, and Louis Roussos were very helpful.

3.

Capturing rater policies for processing evaluation data

Zedeck S Kafry D 《Organizational behavior and human performance》1977,18(2):269-294

4.

Why convene rater teams: An investigation of the benefits of anticipated discussion, consensus, and rater motivation

Sylvia G. Roch 《Organizational behavior and human decision processes》2007

This study explores the importance of anticipated group discussion, the consensus decision rule, and rater motivation in determining how well rater teams identify ratee behaviors, i.e., behavioral accuracy. Results, based on 382 raters in 111 teams, suggest that the anticipation of group discussion can improve behavioral accuracy, but it appears that the benefits of discussion-only teams are limited to this anticipation effect. Furthermore, it also appears that rater motivation plays an important role in this type of team. Rater teams required to reach consensus, however, appear to show improved behavioral accuracy, regardless of whether raters can anticipate the consensus discussion and regardless of rater motivation levels. Implications, especially for assessment centers, are discussed.

5.

A system for assessing personal responsibility: validity, reliability and rater trainability

Genthner RW Jones DE 《Journal of personality assessment》1976,40(3):269-275

Summary: Sixty-two subjects completed the California Psychological Inventory, the Rotter External-Internal locus of control scale and an audio-taped discussion of their personal problems. The audio-taped problems were rated on a five-point level of personal responsibility scale and were compared with the scores on the California Personality Inventory and the Internal-External locus of control scale in a correlation matrix which was subjected to a factor analysis. The results from these analyses supported the hypothesis that the Personal Responsibility Rating System has construct validity as a measure of psychological health. Study II assessed the trainability of the Personal Responsibility System. With a four-hour training program it was found that graduate students could be taught to rate personal responsibility in a reliable manner.

6.

A person-fit index for polytomous Rasch models, latent class models, and their mixture generalizations

Matthias von Davier Ivo W. Molenaar 《Psychometrika》2003,68(2):213-228

A normally distributed person-fit index is proposed for detecting aberrant response patterns in latent class models and mixture distribution IRT models for dichotomous and polytomous data. This article extends previous work on the null distribution of person-fit indices for the dichotomous Rasch model to a number of models for categorical data. A comparison of two different approaches to handle the skewness of the person-fit index distribution is included. Major parts of this paper were written while the first author worked at the Institute for Science Education, Kiel, Germany. Any opinions expressed in this paper are those of the authors and not necessarily of Educational Testing Service. The results presented in this paper were improved by valuable comments from J. Rost, K. Yamamoto, N.D. Verhelst, E. Bedrick and two anonymous reviewers.

7.

Proactive interference in aging: A model-based study

Archambeau Kim Forstmann Birte Van Maanen Leendert Gevers Wim 《Psychonomic bulletin & review》2020,27(1):130-138

Proactive interference occurs when previously learned information interrupts the storage or retrieval of new information. Congruent with previous reports,...

8.

A simple computational algorithm of model-based choice preference

Asako Toyama Kentaro Katahira Hideki Ohira 《Cognitive, affective & behavioral neuroscience》2017,17(4):764-783

A broadly used computational framework posits that two learning systems operate in parallel during the learning of choice preferences, namely the model-free and model-based reinforcement-learning systems. In this study, we examined another possibility, through which model-free learning is the basic system and model-based information is its modulator. Accordingly, we proposed several modified versions of a temporal-difference learning model to explain the choice-learning process. Using the two-stage decision task developed by Daw, Gershman, Seymour, Dayan, and Dolan (2011), we compared their original computational model, which assumes a parallel learning process, and our proposed models, which assume a sequential learning process. Choice data from 23 participants showed a better fit with the proposed models. More specifically, the proposed eligibility adjustment model, which assumes that the environmental model can weight the degree of the eligibility trace, can explain choices better under both model-free and model-based controls and has a simpler computational algorithm than the original model. In addition, the forgetting learning model and its variation, which assume changes in the values of unchosen actions, substantially improved the fits to the data. Overall, we show that a hybrid computational model best fits the data. The parameters used in this model succeed in capturing individual tendencies with respect to both model use in learning and exploration behavior. This computational model provides novel insights into learning with interacting model-free and model-based components.

9.

Indexing systematic rater agreement with a latent-class model

Schuster C Smith DA 《Psychological Methods》2002,7(3):384-395

A latent-class model of rater agreement is presented for which one of the model parameters can be interpreted as the proportion of systematic agreement. The latent classes of the model emerge from the factorial combination of the "true" category in which a target belongs and the ease with which raters are able to classify targets into the true category. Several constrained cases of the model are described, and the relations to other well-known agreement models and kappa-type summary coefficients are explained. The differential quality of the rating categories can be assessed on the basis of the model fit. The model is illustrated using data from diagnoses of psychiatric disorders and classifications of individuals in a persuasive communication study.

10.

Testing the Rasch model by means of the mixture fit index

《The British journal of mathematical and statistical psychology》2006,59(1):89-95

Rudas, Clogg, and Lindsay (RCL) proposed a new index of fit for contingency table analysis. Using the overparametrized two‐component mixture, where the first component with weight 1 − w represents the model to be tested and the second component with weight w is unstructured, the mixture index of fit was defined to be the smallest w compatible with the saturated two‐component mixture. This index of fit, which is insensitive to sample size, is applied to the problem of assessing the fit of the Rasch model. In this application, use is made of the equivalence of the semi‐parametric version of the Rasch model to specifically restricted latent class models. Therefore, the Rasch model can be represented by the structured component of the RCL mixture, with this component itself consisting of two or more subcomponents corresponding to the classes, and the unstructured component capturing the discrepancies between the data and the model. An empirical example demonstrates the application of this approach. Based on four‐item data, the one‐ and two‐class unrestricted latent class models and the one‐ to three‐class models restricted according to the Rasch model are considered, with respect to both their chi‐squared statistics and their mixture fit indices.

11.

Optimal choice of rater teams II: Applications

Janet Dixon Elashoff Donald E. Spiegel 《Psychometrika》1969,34(1):33-44

In a previous paper [Elashoff 1969], we derived optimal rater teams for a particular formulation of the dichotomous rater problem. Here, we describe a computer-based procedure for selecting good rater teams in practice; we apply the procedure to the selection of items for a psychological inventory. This research was supported in part by the author's predoctoral fellowship from the National Institutes of Health and by National Science Foundation Grant GS-341, and National Institutes of Health Grants FR-3 and FR-122.

12.

Intraclass correlations: uses in assessing rater reliability (Total citations: 52; self-citations: 0; citations by others: 52)

Shrout PE Fleiss JL 《Psychological bulletin》1979,86(2):420-428

Reliability coefficients often take the form of intraclass correlation coefficients. In this article, guidelines are given for choosing among six different forms of the intraclass correlation for reliability studies in which n targets are rated by k judges. Relevant to the choice of the coefficient are the appropriate statistical model for the reliability and the application to be made of the reliability results. Confidence intervals for each of the forms are reviewed.

13.

Comparative analysis of three approaches for rater agreement

Ato M Benavente A López JJ 《Psicothema》2006,18(3):638-645

14.

A frequentist interpretation of probability for model-based inductive inference

Aris Spanos 《Synthese》2013,190(9):1555-1585

The main objective of the paper is to propose a frequentist interpretation of probability in the context of model-based induction, anchored on the Strong Law of Large Numbers (SLLN) and justifiable on empirical grounds. It is argued that the prevailing views in philosophy of science concerning induction and the frequentist interpretation of probability are unduly influenced by enumerative induction, and the von Mises rendering, both of which are at odds with frequentist model-based induction that dominates current practice. The differences between the two perspectives are brought out with a view to defend the model-based frequentist interpretation of probability against certain well-known charges, including [i] the circularity of its definition, [ii] its inability to assign ‘single event’ probabilities, and [iii] its reliance on ‘random samples’. It is argued that charges [i]–[ii] stem from misidentifying the frequentist ‘long-run’ with the von Mises collective. In contrast, the defining characteristic of the long-run metaphor associated with model-based induction is neither its temporal nor its physical dimension, but its repeatability (in principle); an attribute that renders it operational in practice. It is also argued that the notion of a statistical model can easily accommodate non-IID samples, rendering charge [iii] simply misinformed.

15.

Optimal choice of rater teams I: Theory

Janet Dixon Elashoff 《Psychometrika》1969,34(1):21-32

How can an investigator choose a good team of raters to use for measuring a continuous variable when each available rater produces only dichotomous responses? We formulate an underlying model, define an index of goodness for rater teams in terms of average mean square error of the estimate, develop a new estimator and derive the optimal rater teams. The optimal raters have characteristic curves which are linear in form and satisfy the requirements for a Guttman scale.

16.

A positivity bias in person memory

Klaus Fiedler Ulrike Fladung Uli Hemmeter 《European journal of social psychology》1987,17(2):243-246

Two experiments demonstrate a positivity bias in person memory. Recall is superior for statements endorsed by a target person than for denied statements. This effect of informational positivity is independent of affective positivity (Experiment 1) and only holds for statements associated with one individual as an organizing category (Experiment 2).

17.

A negativity bias in interpersonal evaluation

Teresa M. Amabile Ann H. Glazebrook 《Journal of experimental social psychology》1982,18(1):1-22

Two studies were conducted to demonstrate a bias toward negativity in evaluations of persons or their work in particular social circumstances. In Study 1, subjects evaluated materials written by peers. Those working under conditions that placed them in low status relative to the audience for their evaluations, or conditions that made their intellectual position within a group insecure, showed a clear bias toward negativity in those evaluations. Only individuals who believed their audience to be of relatively low status and at the same time believed their intellectual position to be secure did not show this bias. In Study 2, subjects viewed a videotape of a stimulus person and rated him on several intellectual and social dimensions. Again, subjects believed their audience to be of either relatively high or relatively low status. As a cross dimension, they were given instructions to focus on either the intellectual or the social abilities of the stimulus person while viewing the videotape. A strong main effect of audience status was demonstrated, but only in ratings of intellectual traits; subjects who believed their audience to be of relatively high status rated the stimulus person's intellectual qualities significantly more negatively. Moreover, this effect was independent of the instructional focus subjects had been given. The negativity bias is discussed in the context of previous demonstrations of biases toward weighting negative information more heavily than positive information, as well as previous demonstrations of seemingly pervasive positivity biases in memory and judgment.

18.

A biased view of liberal bias

Campbell RS Gibbs BN Guinn JS Josephs RA Newman ML Rentfrow PJ Stone LD 《The American psychologist》2002,57(4):297-298

19.

Sample size determinations for the two rater kappa statistic

V. F. Flack A. A. Afifi P. A. Lachenbruch H. J. A. Schouten 《Psychometrika》1988,53(3):321-325

This paper gives a method for determining a sample size that will achieve a prespecified bound on confidence interval width for the interrater agreement measure, kappa. The same results can be used when a prespecified power is desired for testing hypotheses about the value of kappa. An example from the literature is used to illustrate the methods proposed here.

20.

A vocational interest test minus sex bias

Charles F Elton Harriett A Rose 《Journal of Vocational Behavior》1975,7(2):207-214

The Vocational Preference Inventory responses from 290 subjects (110 males and 180 females) were subjected to a Rasch item analysis, one of a class of latent trait models. After elimination of 22 items which did not fit the model, a sex-free form of the VPI was obtained. Group interest scale scores are presented for each of the Holland scales and data are produced which indicate that no violence was done to the Holland coding system.