1.

Correcting Differences in Answer Sheets for the 1980 Armed Services Vocational Aptitude Battery Reference Population

《Military psychology》2013,25(3):157-169

In the late 1970s, the Department of Defense (DoD) requested that the reference population for the Armed Services Vocational Aptitude Battery (ASVAB), at that time based on a 1944 sample, be changed and updated to reflect the current youth population. In 1980, the data for the new reference group were collected. Data analyses indicate that the new sample's speeded test scores are atypically low; therefore, the sample might be inappropriate as a reference. The problem was traced to the format of an answer sheet used in the 1980 youth population data collection. In our study, we tested the differences between performance on the 1980 youth study answer sheet and that used operationally for the ASVAB. An adjustment was developed to resolve those differences. Data for this study were scores on two ASVAB speeded tests of about 9,500 service applicants at Military Entrance Processing Stations (MEPS). Half of these applicants were tested on the operational ASVAB answer sheet; half were tested with the answer sheet used in the 1980 youth study. The speeded test scores derived from the 1980 youth study answer sheets were then equated to those derived from the operational ASVAB answer sheets. Adjustments based on these equatings resolved the speeded test anomalies observed in the 1980 youth study.

2.

Scoring and interpreting the MMPI with a desk-top calculator

Rolf R. Engel Gabriele Kunze 《Behavior research methods》1979,11(3):317-320

The development of a hospital-based routine system for off-line scoring and interpreting the MMPI with a small desk-top computer (HP 9830) is described. Main features of the system include the use of optical mark reader (OMR) cards instead of ordinary answer sheets and a software monitoring system allowing operator-independent processing. In terms of flexibility, speed, and economy, the system compares favorably with other computer-based test systems.

3.

The effect of variations in answer-sheet format on aptitude test performance

SEAN BOYLE 《Journal of Occupational & Organizational Psychology》1984,57(4):323-326

Three different answer sheets were used in the administration of the General Aptitude Test Battery to groups of young adults (n = 302). The answer sheets, designed for use with different brands of Optical Mark Reader, varied in the shape of the response space. A two-factor, repeated measures analysis of variance revealed a significant interaction between subtests and type of answer sheet. Further analyses showed that performance on ‘speed’ tests was affected by the format of answer sheet used, whereas performance on mainly ‘power’ tests was not. The implications of the results for testing practice are discussed.

4.

Anger proneness in women: Development and validation of the anger situation questionnaire

Stephanie H. M. van Goozen Nico H. Frijda Merel Kindt Nanne E. De Van Poll 《Aggressive behavior》1994,20(2):79-100

In the present series of studies existing hostility and anger trait questionnaires were examined. We found them to be unsuitable for our subjects and purpose. A new questionnaire based on novel principles, the Anger Situation Questionnaire (ASQ), was developed to measure anger proneness in women. The ASQ consists of 33 vignettes or scenarios. Each vignette has three dimensions: emotional experience, intensity of emotional experience, and action readiness mode. The ASQ was administered to 146 female students. Out of this sample two extreme groups were selected of 30 subjects each: one group scoring low on self-reported anger and angry readiness, the other group scoring high on both aspects. An anger-induction paradigm was developed consisting of essentially three elements: a physically aversive situation, the performance of some frustrative tasks, as well as an unpleasantly acting female experimenter. The ASQ was validated in this paradigm: subjects scoring high on the ASQ became more angry due to these manipulations; moreover, in most of the subjects at one moment or another a state of anger was induced. © 1994 Wiley-Liss, Inc.

5.

Data-driven type checking in open domain question answering

Stefan Schlobach David Ahn Maarten de Rijke Valentin Jijkoun 《Journal of Applied Logic》2007,5(1):121-143

Many open domain question answering systems answer questions by first harvesting a large number of candidate answers, and then picking the most promising one from the list. One criterion for this answer selection is type checking: deciding whether the candidate answer is of the semantic type expected by the question. We define a general strategy for building redundancy-based type checkers, built around the notions of comparison set and scoring method, where the former provide a set of potential answer types and the latter are meant to capture the relation between a candidate answer and an answer type. Our focus is on scoring methods. We discuss nine such methods, provide a detailed experimental comparison and analysis of these methods, and find that the best performing scoring method performs at the same level as knowledge-intensive methods, although our experiments do not reveal a clear-cut answer on the question whether any of the scoring methods we consider should be preferred over the others.

6.

Social comparison after success and failure: Biased search for information consistent with a self-serving conclusion

Tom Pyszczynski Jeff Greenberg John LaPrelle 《Journal of experimental social psychology》1985,21(2):195-211

Based on the traditional and attributional perspectives on social comparison, it was hypothesized that the search for social comparison information after performance outcomes is biased so as to provide evidence consistent with a favorable self-evaluation. In Experiment 1, subjects were led to believe that they obtained 16 or 8 out of 20 items correct on a bogus social sensitivity test and were then led to expect that most other students performed either well or poorly on the test. They were then given the opportunity to inspect up to 50 scored answer sheets from previous subjects. Consistent with the hypothesis, failure subjects requested more information when they expected it to reveal that most students performed poorly than when they expected it to reveal that most students performed well; success subjects showed little interest in this additional information, regardless of their expectancies as to what it would reveal. Experiment 2 employed a different approach to manipulating performance outcomes and led subjects to expect that most other subjects performed better, the same, or worse than themselves. Regardless of their own performance, subjects showed the least interest in additional information in the higher score expectancy condition and the most interest in additional information in the lower score expectancy condition. The role that this information search bias may play in producing self-serving attributions for success and failure and maintaining positive self-evaluations was discussed.

7.

Psychometric Evaluation of an Alternate Scoring for the Remote Associates Test

Marie Beisemann Boris Forthmann Paul-Christian Bürkner Heinz Holling 《The Journal of Creative Behavior》2020,54(4):751-766

The Remote Associates Test (RAT; Mednick, 1962; Mednick & Mednick, 1967) is a commonly employed test of creative convergent thinking. The RAT is scored with a dichotomous scoring, scoring correct answers as 1 and all other answers as 0. Based on recent research into the information processing underlying RAT performance, we argued that the dichotomous scoring may lead to a loss of potentially relevant information. Thus, we proposed an alternate scoring based on semantic similarity between the answer given by the participant and the correct solution using Latent Semantic Analysis (LSA; Landauer & Dumais, 1997). We evaluated the psychometric properties of the alternate LSA scoring and found evidence of construct validity for the LSA scoring which was comparable to findings for the standard scoring, but not better as we would have expected. Thus, our expectations that LSA-based scoring of the RAT counteracts potential information loss were not met. However, LSA-based scorings appear to be a promising alternative for hardly solvable RAT items. We conducted additional analyses comparing different RAT item types with regard to their validity as well as evaluating the information uniquely contained in the LSA scoring. Implications of all findings for existing research using RAT items are discussed.
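
The contrast between the two scorings described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and the three-dimensional `toy_vectors` are hypothetical placeholders, not real LSA vectors, and cosine similarity is assumed here as the similarity measure because it is the usual choice for LSA spaces; the abstract itself does not specify the computation.

```python
import math

def dichotomous_score(answer, solution):
    """Standard scoring: 1 for the correct solution word, 0 otherwise."""
    return 1 if answer.strip().lower() == solution.strip().lower() else 0

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_u * norm_v)

def similarity_score(answer, solution, vectors):
    """Alternate scoring: semantic similarity between answer and solution."""
    return cosine_similarity(vectors[answer], vectors[solution])

# Hypothetical toy vectors standing in for rows of an LSA semantic space.
toy_vectors = {
    "cheese": [0.9, 0.1, 0.3],
    "milk": [0.8, 0.2, 0.4],
    "rock": [0.1, 0.9, 0.0],
}

# Both wrong answers score 0 dichotomously, but differ in similarity.
print(dichotomous_score("milk", "cheese"), dichotomous_score("rock", "cheese"))
print(similarity_score("milk", "cheese", toy_vectors))
print(similarity_score("rock", "cheese", toy_vectors))
```

In this sketch a near miss keeps partial credit under the similarity scoring, which is the kind of information the dichotomous scoring discards.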

8.

A subset selection technique for scoring items on a multiple choice test

Jean D. Gibbons Ingram Olkin Milton Sobel 《Psychometrika》1979,44(3):259-270

On a multiple-choice test in which each item has k alternative responses, the test taker is permitted to choose any subset which he believes contains the one correct answer. A scoring system is devised that depends on the size of the subset and on whether or not the correct answer is eliminated. The mean and variance of the score per item are obtained. Methods are derived for determining the total number of items that should be included on the test so that the average score on all items can be regarded as a good measure of the subject's knowledge. Efficiency comparisons between conventional and the subset selection scoring procedures are made. The analogous problem of r > 1 correct answers for each item (with r fixed and known) is also considered. The authors are grateful to M. Aitkin, C. Coombs, F. Lord, and the reviewers for their comments and suggestions.

9.

Replication of the Self-Concept and Identity Measure (SCIM) Among a Treatment-Seeking Sample

Erin A. Kaufman Megan E. Puzia Sheila E. Crowell Cynthia J. Price 《Identity: An International Journal of Theory and Research》2019,19(1):18-28

Identity distress occurs within a variety of psychiatric conditions. Reliable tools for assessing identity-related functioning among clinical populations are greatly needed. The Self-Concept and Identity Measure (SCIM) is a brief self-report scale designed to assess healthy and disturbed identity dimensions. This measure has been validated within normative but not treatment-seeking samples. The present study used an a priori confirmatory approach to replicate the SCIM’s factor structure among disadvantaged women enrolled in treatment for chemical dependence (N = 216). The original three-factor structure and item loadings generally replicated within this diagnostically diverse, significantly impaired sample. Higher SCIM scores were also associated with other problems, such as emotion dysregulation and depression. Results support the SCIM’s use and scoring with clinical populations.

10.

Response selection strategies and realism of confidence judgments

Carl Martin Allwood Henry Montgomery 《Organizational behavior and human decision processes》1987,39(3)

Two studies were conducted to examine how response selection strategy is related to confidence ratings and to performance on general knowledge questions. In both studies subjects were asked to answer 80 general knowledge questions and to rate their confidence in the correctness of the answer selected. A pilot study, in which subjects thought aloud while answering general knowledge questions, was carried out to identify different response selection strategies. In the first study, 40 subjects were asked to indicate which of four strategies (immediate recognition, inference, intuition, or guessing) they used for selecting an answer. In Study 2, think aloud reports from 20 subjects were coded into the same four strategies. The distribution of strategies differed between the studies, but there were very similar relations among strategy, confidence, and correctness of answer in the two studies. Response selection strategy was related to correctness of answer when confidence was partialed out. More specifically, immediate recognition was associated with a higher proportion correct than the other strategies were. It was also found that ratings of how difficult the knowledge questions were to fellow students of the subjects were on a much more realistic level than the confidence ratings were. It is concluded that people could improve their confidence judgments by taking into account (a) how difficult a question is to other people, and (b) the response selection strategy used for answering the question.

11.

The creative abilities of children with social and emotional problems

Kathleen D. Paget 《Journal of abnormal child psychology》1982,10(1):107-111

The specific creative abilities of children with social and emotional problems were at issue. The children's responses on the Torrance Tests of Creative Thinking were compared to those from the standardization sample. Thirty-eight emotionally disturbed children constituted the initial sample, and 40 emotionally disturbed children made up a cross-validation sample. The children in both samples were close to the average standardization score in their ability to arrive at a number of different ideas, experienced some difficulty in coming up with original ideas, and were substantially below the average in the other areas of creativity. Presenting particular difficulty for the disturbed children was the area of elaboration, that is, the addition of details to ideas. The discussion focused on comparisons with learning-disabled children.

12.

Toward an objective evaluation procedure of the Kinetic Family Drawings (KFD).

D V Myers 《Journal of personality assessment》1978,42(4):358-365

The feasibility of employing a quantitative scoring procedure for evaluating the Kinetic Family Drawings (KFD) was examined. A quantitative scoring procedure was developed from the clinical hypotheses of Burns and Kaufman (1970, 1972) to score 21 measurable KFD styles, actions, and characteristics. The scoring procedure was employed to evaluate 116 KFDs obtained from four groups of boys to determine the effectiveness of the procedure to differentiate among two levels of emotional adjustment and the two levels of age. The results indicated that four of seven sets of extracted component scores significantly differed between the emotionally well-adjusted and the emotionally disturbed groups. One set of component scores significantly differed between the younger and the older groups, while two sets of component scores did not differ among any of the four groups. The KFD total score was found to differ significantly only between the young emotionally disturbed and the young emotionally well-adjusted groups. It was concluded that a quantitative scoring procedure for the KFD is feasible.

13.

A Proposed Number Correct Scoring Procedure Based on Classical True-Score Theory and Multidimensional Item Response Theory

《International Journal of Testing》2013,13(2):131-141

A hybrid procedure for number correct scoring is proposed. The proposed scoring procedure is based on both classical true-score theory (CTT) and multidimensional item response theory (MIRT). Specifically, the hybrid scoring procedure uses test item weights based on MIRT and the total test scores are computed based on CTT. Thus, what makes the hybrid scoring method attractive is that this method accounts for the dimensionality of the test items while test scores remain easy to compute. Further, the hybrid scoring does not require large sample sizes once the item parameters are known. Monte Carlo techniques were used to compare and contrast the proposed hybrid scoring method with three other scoring procedures. Results indicated that all scoring methods in this study generated estimated and true scores that were highly correlated. However, the hybrid scoring procedure had significantly smaller error variances between the estimated and true scores relative to the other procedures.
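
One way to picture a weighted number-correct score of the kind this abstract describes is the minimal sketch below. It assumes the total is a simple weighted sum of 0/1 item scores; the abstract does not give the formula, so this form is an assumption. The function name and the weights are hypothetical placeholders and are not derived from any MIRT model.

```python
def weighted_number_correct(item_scores, item_weights):
    """Sum of 0/1 item scores, each multiplied by its item weight."""
    if len(item_scores) != len(item_weights):
        raise ValueError("item_scores and item_weights must have the same length")
    return sum(w * x for w, x in zip(item_weights, item_scores))

# Hypothetical responses (1 = correct, 0 = incorrect) and placeholder weights.
responses = [1, 0, 1, 1]
weights = [1.0, 2.0, 0.5, 1.5]

print(weighted_number_correct(responses, weights))          # 3.0
print(weighted_number_correct(responses, [1.0] * 4))        # 3.0, plain number correct
```

With all weights equal to 1 the function reduces to ordinary number-correct scoring, which is why the total stays easy to compute once the weights are fixed.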

14.

Technique for weighting of choices and items on I.B.M. scoring machines

Grossman Sergeant David 《Psychometrika》1944,9(2):101-105

A technique has been developed which permits the weighting of responses of test items on the I. B. M. scoring machine on the initial scoring, heretofore impossible. This is done by making the length of the response lines on the answer sheet longer or shorter as weights are needed. It is anticipated that this method will prove useful wherever differential weighting serves to increase the validity of tests.

15.

A thematic coding system for the intimacy motive

Dan P. McAdams 《Journal of research in personality》1980,14(4):413-432

A new thematic (TAT) measure of intimacy motivation was developed and cross-validated in four separate arousal studies using three different college populations. A brief sketch of the derived thematic scoring system for intimacy motivation was presented. The goal state of the intimacy motive was defined as experiencing a warm, close, and communicative exchange with another person. In a college sample, subjects scoring high on the intimacy motive were rated by friends and acquaintances as significantly more “warm,” “natural,” “sincere,” “loving,” and “appreciative” and less “dominant,” “outspoken,” and “self-centered” than subjects scoring lower. The results were discussed in terms of the theories of Sullivan on the need for interpersonal intimacy, Maslow on growth motivation and “B-love,” Bakan on communion, and Buber on the I-Thou relation. Differences between the new coding system and the need for Affiliation (n Aff) system for scoring imaginative productions were also suggested.

16.

A field experiment in interpersonal persuasion using authoritative influence

Richard Centers Robert William Shomer Aroldo Rodrigues 《Journal of personality》1970,38(3):392-403

Thinking that they were simply being interviewed as participants in a public opinion survey, a portion of a cross-sectional adult sample of 1,275 persons were unwitting subjects of an experiment in interpersonal persuasion. After committing themselves on the question how the legal machinery should deal with a specific case of a juvenile lawbreaker, subjects were given an argument presented as the view of an expert and contrary to their own. It was hypothesized that subjects scoring higher on a scale measuring authoritarianism would more commonly change their opinion in the advocated direction than would those scoring lower. The hypothesis received substantial confirmation. Two secondary hypotheses also were sustained. These were (a) that persons assigning the locus of causality of juvenile delinquency to the individual himself rather than to circumstances beyond his control would have higher scores on authoritarianism than persons assigning causality to these latter conditions, and (b) that persons scoring higher in authoritarianism would initially more commonly recommend a harsher treatment of the delinquent than would those scoring lower in this trait.

17.

The use of configural analysis for the evaluation of test scoring methods

H. G. Osburn Ardie Lubin 《Psychometrika》1957,22(4):359-371

A method based on configural analysis has been given whereby test scoring techniques can be evaluated to see if they have optimal validity. Configural analysis has also been used to show how three well known item scoring techniques, multiple regression, total score, and multiple cut-off, imply (for optimal validity) certain conditions on the answer pattern means. The method is illustrated by a worked example. We are indebted to Professor James G. Taylor for his helpful suggestions.

18.

Comparing continuous and dichotomous scoring of the balanced inventory of desirable responding

Stöber J Dette DE Musch J 《Journal of personality assessment》2002,78(2):370-389

The Balanced Inventory of Desirable Responding (BIDR; Paulhus, 1994) is a widely used instrument to measure the 2 components of social desirability: self-deceptive enhancement and impression management. With respect to scoring of the BIDR, Paulhus (1994) authorized 2 methods, namely continuous scoring (all answers on the continuous answer scale are counted) and dichotomous scoring (only extreme answers are counted). In this article, we report 3 studies with student samples, and continuous and dichotomous scoring of BIDR subscales are compared with respect to reliability, convergent validity, sensitivity to instructional variations, and correlations with personality. Across studies, the scores from continuous scoring (continuous scores) showed higher Cronbach's alphas than those from dichotomous scoring (dichotomous scores). Moreover, continuous scores showed higher convergent correlations with other measures of social desirability and more consistent effects with self-presentation instructions (fake-good vs. fake-bad instructions). Finally, continuous self-deceptive enhancement scores showed higher correlations with those traits of the Five-factor model for which substantial correlations were expected (i.e., Neuroticism, Extraversion, and Conscientiousness). Consequently, these findings indicate that continuous scoring may be preferable to dichotomous scoring when assessing socially desirable responding with the BIDR.
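
The two scoring methods named in this abstract can be illustrated with a small sketch. It assumes a 7-point answer scale, treats ratings of 6 or 7 as "extreme", and assumes items are already keyed so that higher ratings mean more desirable responding; none of these details is stated in the abstract, and the function names and the `extreme_cutoff` parameter are hypothetical.

```python
from typing import Sequence

def continuous_score(responses: Sequence[int]) -> int:
    """Continuous scoring: every rating on the answer scale is counted."""
    return sum(responses)

def dichotomous_score(responses: Sequence[int], extreme_cutoff: int = 6) -> int:
    """Dichotomous scoring: one point per rating at or above the cutoff."""
    return sum(1 for r in responses if r >= extreme_cutoff)

# Hypothetical ratings from one respondent on a six-item subscale.
ratings = [7, 6, 5, 4, 2, 6]

print(continuous_score(ratings))   # 30
print(dichotomous_score(ratings))  # 3
```

The ratings of 5 and 4 contribute to the continuous score but vanish under dichotomous scoring, which shows how the dichotomous method compresses the available variation.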

19.

What one intelligence test measures: a theoretical account of the processing in the Raven Progressive Matrices Test

P A Carpenter M A Just P Shell 《Psychological review》1990,97(3):404-431

The cognitive processes in a widely used, nonverbal test of analytic intelligence, the Raven Progressive Matrices Test (Raven, 1962), are analyzed in terms of which processes distinguish between higher scoring and lower scoring subjects and which processes are common to all subjects and all items on the test. The analysis is based on detailed performance characteristics, such as verbal protocols, eye-fixation patterns, and errors. The theory is expressed as a pair of computer simulation models that perform like the median or best college students in the sample. The processing characteristic common to all subjects is an incremental, reiterative strategy for encoding and inducing the regularities in each problem. The processes that distinguish among individuals are primarily the ability to induce abstract relations and the ability to dynamically manage a large set of problem-solving goals in working memory.

20.

Estimating the reliability of interview data

Joseph L. Fleiss 《Psychometrika》1970,35(2):143-162

A model for a score based on an interview is presented which identifies the effect due to the subject, to the manner in which the interviewer tends to conduct his interviews, to the criteria he tends to use in scoring subjects' responses, to the compromises he tends to adopt between the demands of interviewing and those of scoring, and to chance errors. A suggested experimental design calls for each of K investigators to interview a different sample of N subjects, but for all investigators to score each subject. The drawing of inferences when interest is only in the K participants in the reliability study is considered, and a numerical example is given. This work was supported in part by grant DE R01 00793 from the National Institute of Dental Research, and in part by grants MH 08534 and MH 09191 from the National Institute of Mental Health, and forms part of the author's Ph.D. dissertation at Columbia University. The guidance provided by Professor T. W. Anderson is gratefully acknowledged.