Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
The practicality of three appraisal instruments, namely behavioral observation scales (BOS), behavioral expectation scales (BES), and trait scales, was measured in terms of user preference. A questionnaire containing items pertaining to differentiating good from poor performers, objectivity, providing feedback, suggesting training needs, and ease of use was administered to managers and their subordinates. In all instances, BOS were preferred to BES, and in all but two instances, BOS were viewed as superior to trait scales. Trait scales were felt to be as good as, if not better than, BES. A second questionnaire administered to attorneys indicated that BOS would be easier to defend in the courtroom than either BES or trait scales.

2.
EFFECTS OF TRAINING AND RATING SCALES ON RATING ERRORS   Cited by: 1 (self-citations: 0; citations by others: 1)
Ninety business students were randomly assigned to one of three conditions where they used behavioral observation scales (BOS), behavioral expectation scales (BES), or trait scales in observing people on videotape. Half the individuals received four hours of training to minimize rating errors. Rating errors were reduced significantly regardless of the rating scale that was used. However, behavioral criteria were more resistant to rating errors than trait scales. There was no significant difference between BOS and BES on this dimension. With regard to practicality, BOS were evaluated as significantly better than BES and trait scales. BES and trait scales did not differ significantly on this measure.

3.
Behavioral items (N = 78) critical to the job success of logging supervisors were developed from 1204 critical incidents. The frequency with which a supervisor (N = 300) engaged in each behavior was rated on a 5-point Likert-type scale by two sets of observers. A factor analysis reduced the items to 38 and 33, respectively, for the two sets of observers, which in turn constituted 10 and 11 factors or criteria for performance evaluation purposes. Multiple regression equations based on composite scores were used to predict cost-related measures of logging crew effectiveness. The shrinkage in Rs after double cross-validation was moderately small. Moreover, the behavioral observation scales (BOS) that were developed by factor analyzing the observation ratings had moderately high reliability and accounted for more variance in the cost-related measures than did the BOS developed by traditional judgmental clustering techniques. The similarities and differences between BOS and BES procedures are discussed.

4.
The item and scale factor structure of the Basic Personality Inventory (BPI) was examined in a sample of 486 offenders incarcerated for violent and sexual crimes. Separate principal-component analyses of the items for each of the 11 clinical scales, critical item scale, and social desirability scale indicated a one-dimensional factor solution for all scales except Depression and Persecutory Ideation. The Depression scale's two factors were Hopelessness and Depressive Affect, and the Persecutory Ideation scale's two factors were General Paranoia and Perception of External Control. Although the factors for these two scales may assist in interpretation, the correlations between the factors and the total score of their respective scale were high. Confirmatory factor analysis of the 220 items from the 11 clinical scales supported the factorial logic of the scoring key. Analysis of the 11 clinical scales resulted in two factors: General Psychopathology/Adjustment and Antisocial Orientation. The results suggest that all but two scales can be viewed as unidimensional, thereby allowing for a straightforward clinical interpretation. These analyses support the internal structure of the BPI and lend credence to external validity work with forensic populations.

5.
The Body Esteem Scale (BES; Franzoi and Shields 1984) has been a primary research tool for over 30 years, yet its factor structure has not been fully assessed since its creation, so a two-study design examined whether the BES needed revision. In Study 1, a series of principal components analyses (PCAs) was conducted using the BES responses of 798 undergraduate students, with results indicating that changes were necessary to improve the scale’s accuracy. In Study 2, 1237 undergraduate students evaluated each BES item, along with a select set of new body items, while also rating each item’s importance to their own body esteem. Body items meeting minimum importance criteria were then utilized in a series of PCAs to develop a revised scale that has strong internal consistency and good convergent and discriminant validity. As with the original BES, the revised BES (BES-R) conceives of body esteem as both gender-specific and multidimensional. Given that the accurate assessment of body esteem is essential in better understanding the link between this construct and mental health, the BES-R can now be used in research to illuminate this link, as well as in prevention and treatment programs for body-image issues. Further implications are discussed.

6.
In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often ignore item misfit in score scale calibrations. We also seek to obtain improved model-data fit estimates when calibrating international score scales. To this end, we examine the use of two alternative score scale calibration procedures: (a) a language-based score scale and (b) a more parsimonious international scale wherein a large proportion of international parameters are used with a subset of country-based parameters for items that misfit in the international scale. In our analyses, we used data from all 40 countries participating in the Progress in International Reading Literacy Study. Our findings revealed that current score scale calibration procedures yield large numbers of misfitting items (higher than 25% for some countries). Our proposed approach diminished the effects of proportion of item misfit on score scale calibrations and also yielded enhanced model-data fit estimates. These results lead to enhancing confidence in measurements obtained from international large-scale assessments.

7.
This research examines the processes respondents use to answer personality test items. A total of 158 true/false items from four scales of the Personality Research Form and the California Psychological Inventory were used as stimuli. University students (N = 120) responded to each item and indicated one of nine strategies used in deciding on a response. Obtained response strategy ratings for items were reliable and their frequencies corresponded closely to previous findings with other items. Subsequently, the associations between item response strategy frequencies and item-total correlations were computed. Congruent with previous research, better items avoided behaviours or experiences and evoked responding based on traits and on referring to the statements of others. The associations between item response strategies and other indices of item quality are discussed and implications regarding scale development are offered.

8.
We studied the effects of faking biodata test items by randomly warning 214 of 429 applicants for a nurse's assistant position against faking. While the warning mitigated the propensity to fake, the specific warning effects depended on item transparency. For transparent items, warning reduced the extremeness of item means and increased item variances. For nontransparent items, warning did not have an effect on item means and reduced item variances. These faking effects were best predicted when transparency was operationalized in terms of item-specific job desirability in addition to the item-general social desirability. We also demonstrated a psychometric principle: The effect of warning on means at the item level is preserved in scales constructed from those items, but the effect on variances at the item level is masked at the scale level. These results raise new questions regarding the attenuating effects of faking on validity, and regarding the benefit of warning applicants against faking.

9.
We propose three latent scales within the framework of nonparametric item response theory for polytomously scored items. Latent scales are models that imply an invariant item ordering, meaning that the order of the items is the same for each measurement value on the latent scale. This ordering property may be important in, for example, intelligence testing and person-fit analysis. We derive observable properties of the three latent scales that can each be used to investigate in real data whether the particular model adequately describes the data. We also propose a methodology for analyzing test data in an effort to find support for a latent scale, and we use two real-data examples to illustrate the practical use of this methodology.

10.
ISSUES AND STRATEGIES FOR REDUCING THE LENGTH OF SELF-REPORT SCALES   Cited by: 3 (self-citations: 0; citations by others: 3)
Greater understanding of the complex interrelationships among work-relevant constructs has increased the number of constructs on organizational surveys. Good psychometric practice also dictates the use of multiple items per construct. The net result has been longer surveys. Longer surveys take more time to complete, tend to have more missing data, and have higher refusal rates than short surveys. Arguably, then, techniques for reducing the length of scales while maintaining psychometric quality are worthwhile. Little guidance exists on how to reduce the length of a multi-item scale, and we argue that the most common technique, maximizing internal consistency, is problematic and should be avoided. We present a set of item "quality indices" to help conceptualize the competing issues that influence item retention decisions. Statistical analysis of an example case using these indices suggested that there are 3 key aspects of item quality to consider when reducing a scale. We describe strategies that can assist scale developers in using these 3 aspects of item quality when making scale reduction decisions.

11.
A method of analysis specifically designed for binary data was applied to 100 MMPI items. Sixty items were carefully chosen to represent the nine major clinical scales with respect to direction of keying, social desirability scale value, and endorsement frequency. The remaining 40 items were randomly chosen from items not appearing on any of these scales. Although a complete solution was obtained in five dimensions, only three were retained. The three dimensions were related to scale membership, gender of the respondent, and various item characteristics. The results clearly support the two major MMPI factors obtained on a scale level and additionally show a strong gender dimension.

12.
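Bifactor model: A new perspective on measuring multidimensional constructs   Cited by: 1 (self-citations: 0; citations by others: 1)
顾红磊, 温忠粦, 方杰. 《心理科学》, 2014, 37(4): 973-979
The bifactor model is a model that contains both a general factor and specific factors, and it has seen many applications in recent years. This article discusses the relationship between the bifactor model and the higher-order factor model in terms of their mathematical models and parameters, as well as their conceptual and applied differences. It also outlines applications of the bifactor model in reliability research, balanced scales, exploratory factor analysis, and item response theory. As an example, in a study of the structure of the Rosenberg Self-Esteem Scale, a bifactor model is used to analyze the self-esteem trait effect and the item-wording method effect.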

13.
The problem of how many performance outcomes to use and how specific they should be in predicting satisfaction and behavioral intentions was addressed. A total of 323 soldiers responded to a desirability and instrumentality scale for each of 16 potential outcomes obtainable from outstanding performance. Scores were factor analysed and composites were formed to reflect each dimension. Three criteria (satisfaction, perceived effort, and intention to reenlist) were predicted using (a) all 16 outcome items, (b) only 11 items defining four outcome dimensions, and (c) 4 items only, each item reflecting an outcome dimension. In all cases, the 11-item set was a better predictor than the 16-item set, and the 4-item set was nearly as effective as the 16-item set. Instrumentalities were found to be significantly better predictors of satisfaction than of effort, while the reverse was true of valences. It was suggested that adequacy of coverage of the outcome domain, rather than list length or outcome specificity, was the critical issue in improving predictability.

14.
Responses to the Minnesota Multiphasic Personality Inventory (MMPI) were assessed with respect to their relevance to schema theory. The relation between scores on self-reported personality dimensions and the speed of processing test items associated with each dimension was examined. With previously derived factor analytic content scales, negative correlations were obtained between scale scores and mean latencies for endorsing relevant items, and positive correlations were found between scale scores and mean latencies for rejecting relevant items. A similar analysis completed on the traditional clinical scales revealed no such pattern. Results were interpreted as supporting the conceptualization of item responding as a content-based, schema-relevant process.

15.
Responses to the Minnesota Multiphasic Personality Inventory (MMPI) were assessed with respect to their relevance to schema theory. The relation between scores on self-reported personality dimensions and the speed of processing test items associated with each dimension was examined. With previously derived factor analytic content scales, negative correlations were obtained between scale scores and mean latencies for endorsing relevant items, and positive correlations were found between scale scores and mean latencies for rejecting relevant items. A similar analysis completed on the traditional clinical scales revealed no such pattern. Results were interpreted as supporting the conceptualization of item responding as a content-based, schema-relevant process.

16.
In assessments of attitudes, personality, and psychopathology, unidimensional scale scores are commonly obtained from Likert scale items to make inferences about individuals' trait levels. This study approached the issue of how best to combine Likert scale items to estimate test scores from the practitioner's perspective: Does it really matter which method is used to estimate a trait? Analyses of 3 data sets indicated that commonly used methods could be classified into 2 groups: methods that explicitly take account of the ordered categorical item distributions (i.e., partial credit and graded response models of item response theory, factor analysis using an asymptotically distribution-free estimator) and methods that do not distinguish Likert-type items from continuously distributed items (i.e., total score, principal component analysis, maximum-likelihood factor analysis). Differences in trait estimates were found to be trivial within each group. Yet the results suggested that inferences about individuals' trait levels differ considerably between the 2 groups. One should therefore choose a method that explicitly takes account of item distributions in estimating unidimensional traits from ordered categorical response formats. Consequences of violating distributional assumptions were discussed.

17.
A scale to measure defensiveness about the marital relationship and another to measure defensiveness about the sexual relationship of couples were developed for each sex. Defensiveness was defined as the tendency to endorse socially desirable items which are unlikely to occur and deny socially undesirable items which characterize most honest responders. The social desirability scale value of the items was empirically determined, and a traditional cross-validation design with two independent groups was used in the item analyses. Cronbach's alpha reliability coefficient ranged between .75 and .93 for the male and female versions of the scales. The scales correlated higher with another defensiveness scale than with a social desirability scale. Clinicians' ratings of the items in the scales suggested that the scales were not diagnostic of sexual or marital psychopathology. Evidence is presented that these content-specific scales surpass a global defensiveness scale as a measure of defensiveness regarding the sexual or marital relationship of couples.

18.
Latham, Fay, and Saari (1979) discussed the development of behavioral observation scales (BOS) for the appraisal of foremen and managers. They presented arguments and data in support of BOS and drew several conclusions regarding the relative superiority of BOS versus other appraisal formats, particularly behavioral expectation scales. This critique of their article suggests that most of their conclusions are beset by conceptual problems, questionable statistical analyses, and/or a disregard of previous research.

19.
Little research has been conducted on the psychometrics of the very short scale (36 items) of the Children's Behavior Questionnaire, and no one-item temperament scale has been tested for use in applied work. In this study, 237 United States caregivers completed a survey to define their child's behavioral patterns (i.e., Surgency, Negative Affectivity, Effortful Control) using both scales. Psychometrics of the 36-item Children's Behavior Questionnaire were examined using classical test theory, principal factor analysis, and item response modeling. Classical test theory analysis demonstrated adequate internal consistency and factor analysis confirmed a three-factor structure. Potential improvements to the measure were identified using item response modeling. A one-item (three response categories) temperament scale was validated against the three temperament factors of the 36-item scale. The temperament response categories correlated with the temperament factors of the 36-item scale, as expected. The one-item temperament scale may be applicable for clinical use.

20.
Drawing on Lang, Bradley, and Cuthbert's (1997) defense cascade, two cognitive processes, rather than passive versus active behavioral coping, were proposed as having differential effects on the provocation of vascular- versus cardiac-dominant reaction patterns during mental stress: attention (Attent) and unpleasant affect (UnplAff). Based on this notion, the Attention-Affect Check List (AACL) was developed as a self-report measure. In addition, items on uncontrollability (Uncontr) were prepared to check whether heightened Attent and UnplAff are accompanied by alterations in Uncontr. Two hundred and eighty-four students underwent two kinds of mental stress, which seemed to specifically heighten Attent and UnplAff. Four factors with four items each were extracted from the AACL item pool: concentrated and allocated Attent, UnplAff, and pleasant affect. Also, one factor with four items was extracted from the Uncontr item pool. For both mental stresses, each scale, although very brief, had quite reasonable alpha reliability. The proportion of the total variance accounted for by each scale was reasonably high. Some problems are discussed in relation to the validity of the AACL.
