Maximum validity of a test with equivalent items
It is assumed that a scale of true scores on a function exists and that the probability of answering an item correctly is a curve of the type of the integral of the normal curve. The product moment correlation between the test score and true score is derived for a normal distribution of subjects and a test composed of equivalent items. Numerical examples demonstrate that the maximum correlation between test scores and true scores occurs for a one hundred item test when the point correlation between items is less than three tenths.

When subjects attempt to fake psychopathology on the MMPI, scores on subtle subscales tend to be lower than those of nonfaking subjects. Our study hypothesized that this paradox comes about because the subtle subscales have no predictive validity, but their face validity for psychopathology is the opposite of the keyed direction for psychopathology. Subjects who attempt to fake psychopathology do so on the basis of item content and thus achieve lower rather than higher scores. Three groups of 80 undergraduates took the MMPI under regular, faking-good, or faking-bad instructions. As expected, faking-bad subjects scored significantly lower than regular subjects on the 100 most subtle items, and this was due to their responses to those 73 of the items whose face validity was misleading. The results are consistent with other work showing valid uses of subtle items in detecting deception.

We investigated the differential responding of 100 male inmates to subtle and obvious MCMI scale items; subtlety was determined by judgments by college students. It had been predicted that item subtlety would be positively correlated with item endorsement. This prediction was supported across all 175 MCMI items, as well as across items on 7 of 8 personality and 8 of 12 clinical scales. It had also been predicted that education and intelligence would moderate the relationship between subtlety and endorsement, with inmates higher in education and intelligence demonstrating a greater tendency than other inmates to avoid obvious items. Modest support was obtained for this prediction, with statistically significant results found for 4 personality and 5 clinical scales. The significance of the subtle-obvious distinction is discussed, especially when employing the MCMI with an inmate population.

Individual Psychological Assessment (IPA) is a very widely offered service in Organizational Psychology. It generally consists of a psychologist or HR practitioner using a combination of interview and psychometrics to arrive at a detailed assessment of an individual's capabilities in relation to a job they are being considered for. Although much used, this practice has limited supporting evidence of its validity (not least because of the methodological difficulties in conducting research on this subject) and has been criticized accordingly. The current study examines the use of IPAs with 115 middle and senior management level candidates in a civil service context. All candidates completed a set of psychometric measures and had an in-depth interview with a psychologist as part of a standardized process. The ratings made by the assessors were correlated with a criterion measure of potential for promotion derived from multisource feedback ratings obtained on these candidates some months later. Analysis of the results indicated that three of the four attributes rated by assessors correlated significantly with the criterion measure. Further, assessors' ratings were found to show incremental validity over that provided by psychometric test scores alone. These findings are discussed in terms of the use of IPAs in senior level assessment.

The therapeutic relationship is the source of major concepts in psychoanalytic clinical theory. Such concepts as resistance, transference, countertransference, and the alliance are fundamental, even though there may be shifts in meaning between theoretical schools and clinical contexts. In the clinical psychoanalytic literature, disagreement exists over the nature of the alliance and its essential components. Empirical studies using reliable patient, therapist, and observer scales to assess the alliance demonstrate a correlation with psychotherapeutic gains. In the study reported here, thirteen patients were followed for 6 to 33 months of psychodynamic psychotherapy, during which time their views of the therapeutic relationship were assessed, and several experiential measures taken, all on a weekly basis. Statistical analyses reveal that the therapeutic relationship, as reflected in the patients' weekly responses to the St. Louis Therapeutic Relationship Rating Scale, has four distinct components: therapeutic alliance, resistance, transference love, and negative transference. On a week-by-week basis, the therapeutic alliance was the strongest predictor of improvement in patient-reported general adjustment, as reflected in such areas as self-esteem, positive affect, social relations, work productivity, satisfaction, and optimism. Time plots of the variables show the typical time course for the components of the therapeutic relationship, as well as for improvement on the experiential variables. Results indicate that the therapeutic alliance, transference, and resistance are central components of the psychotherapeutic relationship, which in turn predict the ongoing life experience of the patient.

Typical selection or classification testing programs should provide for improvement of the predictive efficiency of the test battery. Such provision calls for the administration of experimental tests along with the operational battery administration and follow-up analysis to determine the value of the experimental material. It is possible to determine without waiting for criterion data what the validity of the experimental test must be in order to improve the battery validity. The method together with the proof is presented.

Extensive research has been conducted demonstrating the predictive validity and reliability of the Implicit Association Test (IAT) for a broad array of behaviors and contexts. However, less work has been done examining its underlying construct validity. This contribution focuses on examining whether a core theoretical foundation of the IAT paradigm is valid, specifically, whether the IAT effect draws on the social knowledge structure (SKS). We present four studies within different domains that show that the IAT does indeed appear to draw on the SKS. The data show that activation of the self before the categorization task enhances the predictive validity of the IAT, as one would expect if the IAT reflects the SKS. We discuss theoretical reasons for these findings, with emphasis also on underlying statistical/psychometric issues.

In selection research and practice, there have been many attempts to correct scores on noncognitive measures for applicants who may have faked their responses somehow. A related approach with more impact would be identifying and removing faking applicants from consideration for employment entirely, replacing them with high-scoring alternatives. The current study demonstrates that under typical conditions found in selection, even this latter approach has minimal impact on mean performance levels. Results indicate about .1 SD change in mean performance across a range of typical correlations between a faking measure and the criterion. Where trait scores were corrected only for suspected faking, and applicants not removed or replaced, the minimal impact the authors found on mean performance was reduced even further. By comparison, the impact of selection ratio and test validity is much larger across a range of realistic levels of selection ratios and validities. If selection researchers are interested only in maximizing predicted performance or validity, the use of faking measures to correct scores or remove applicants from further employment consideration will produce minimal effects.

Repeated measures on multivariate responses can be analyzed according to either of two models: a doubly multivariate model (DMM) or a multivariate mixed model (MMM). This paper reviews both models and gives three new results concerning the MMM. The first result is, primarily, of theoretical interest; the second and third have implications for practice. First, it is shown that, given multivariate normality, a condition called multivariate sphericity of the covariance matrix is both necessary and sufficient for the validity of the MMM analysis. To test for departure from multivariate sphericity, the likelihood ratio test can be employed. The second result is an approximation to the null distribution of the likelihood ratio test statistic, useful for moderate sample sizes. Third, for situations satisfying multivariate normality, but not multivariate sphericity, a multivariate correction factor is derived. The correction factor generalizes Box's and can be used to construct an adjusted MMM test. I am grateful to an anonymous referee for carefully attending to the mathematical details of this paper.

Prospection is associated, in varying degrees, with a sense that imagined events will (or will not) happen in the future, referred to as belief in future occurrence. The present research investigated to what extent this belief is justified and predicts the actual occurrence of events in the future. In two studies, participants rated their belief in the future occurrence of events imagined to happen in the coming month (Study 1) or week (Study 2), and the actual occurrence of events was then assessed. Results showed that the odds of event occurrence were about 2 times higher with an increase of 1 unit on the belief scale. Belief was particularly pronounced for temporally close events and was largely determined by the congruence of events with autobiographical knowledge. These results suggest that belief in future occurrence has some truth value and may inform decisions and actions.
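
To make the reported effect size concrete, the following is a minimal sketch of what "odds about 2 times higher per 1-unit increase" implies for probabilities. It assumes a logistic-type model in which each scale unit multiplies the odds by a constant ratio; the baseline probabilities and the function names are hypothetical illustrations, not values or methods taken from the study.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds (p / (1 - p)) back to a probability."""
    return odds / (1.0 + odds)


def shifted_probability(baseline_p: float, odds_ratio: float, units: float) -> float:
    """Probability after moving `units` steps up a scale whose every
    step multiplies the odds by `odds_ratio` (assumed constant)."""
    if not 0.0 < baseline_p < 1.0:
        raise ValueError("baseline_p must lie strictly between 0 and 1")
    baseline_odds = baseline_p / (1.0 - baseline_p)
    return odds_to_probability(baseline_odds * odds_ratio ** units)


if __name__ == "__main__":
    # Hypothetical baseline: an event judged 50% likely at some belief level.
    for units in range(0, 4):
        p = shifted_probability(0.5, 2.0, units)
        print(f"+{units} belief units: probability = {p:.3f}")
```

Note that a doubling of the odds is not a doubling of the probability: from a hypothetical baseline of 0.5, one extra unit moves the probability to 2/3, not to 1.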

This study demonstrates the use of two web-based programs, one to identify video preferences and the other to assess their reinforcing effects. We used the Multiple-Stimulus-Without-Replacement Preference Assessment Tool (MSWO PAT) to identify the video preference hierarchies of seven participants, ages 4–11 years old. We then used a customized reinforcer assessment program that arranged a concurrent-chains preparation with programmed conjugate schedules of reinforcement. Button presses emitted by participants modulated the quality (volume and opacity) of selected videos on a moment-to-moment basis, allowing us to identify the reinforcing effects of the videos in little time. The results showed that the preference assessment had predictive value for five of seven participants. We discuss the MSWO PAT, parameters that may affect the identification of preferences and the use of conjugate schedules to identify reinforcers.

This pilot study was conducted to determine if clinically-oriented test items are judged to be more offensive than job-related test items. Clinical tests typically ask more personal questions while employment tests usually ask job-relevant questions. A random selection of items from three employment tests was analyzed. Two of the tests evolved from clinical-personality tests, while the third test was designed specifically for employment settings. The results suggest that if companies are interested in using employment tests that are perceived as being job-relevant, inoffensive, and non-invasive, then they should consider selecting tests that include job-relevant items as opposed to tests that are derivatives of clinical assessment instruments.

Recently, concern has arisen that meta-analyses overestimate the effects of psychological therapies and that those therapies may not work under clinically representative conditions. This meta-analysis of 90 studies found that therapies are effective over a range of clinical representativeness. The projected effects of an ideal study of clinically representative therapy are similar to effect sizes in past meta-analyses. Effects increase with larger dose and when outcome measures are specific to treatment. Some clinically representative studies used self-selected treatment clients who were more distressed than available controls, and these quasi-experiments underestimated therapy effects. This study illustrates the joint use of fixed and random effects models, use of pretest effect sizes to study selection bias in quasi-experiments, and use of regression analysis to project results to an ideal study in the spirit of response surface modeling.

In order to raise the predictive efficiency of its college entrance test battery, the Educational Testing Service is working on the development of non-academic measures to supplement the standard aptitude and achievement examinations. A test of difficult number series problems was set up to measure persistence by tempting the students to give up early; the students were informed that some of the problems had no solution, and that full credit would be received by so marking them. This test was tried out and found to have some correlation with grades, while having no correlation with the other tests. Adding this test to the battery showed an appreciable rise in the battery's multiple correlation with grades.
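
The reasoning in this abstract (a new test that correlates with grades but not with the existing tests raises the multiple correlation) can be illustrated with the standard two-predictor formula for the multiple correlation. The sketch below treats the existing battery as a single composite predictor; the correlation values are hypothetical and are not taken from the study.

```python
import math


def multiple_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion y with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    Uses R^2 = (r_y1^2 + r_y2^2 - 2*r_y1*r_y2*r_12) / (1 - r_12^2).
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return math.sqrt(r_squared)


if __name__ == "__main__":
    battery_validity = 0.50   # hypothetical: existing battery vs. grades
    new_test_validity = 0.30  # hypothetical: persistence test vs. grades
    for r_12 in (0.0, 0.3, 0.6):
        combined = multiple_r(battery_validity, new_test_validity, r_12)
        print(f"predictor intercorrelation {r_12:.1f}: R = {combined:.3f}")
```

When the predictors are uncorrelated the squared validities simply add, which is why a modestly valid but independent test can give an appreciable gain; with the hypothetical values above, the gain shrinks as the intercorrelation grows and vanishes at 0.6, where the new test's validity equals the product of the battery validity and the intercorrelation.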

For continuous distributions associated with dichotomous item scores, the proportion of common-factor variance in the test, H², may be expressed as a function of intercorrelations among items. H² is somewhat larger than the coefficient α except when the items have only one common factor and its loadings are restricted in value. The dichotomous item scores themselves are shown not to have a factor structure, precluding direct interpretation of the Kuder-Richardson coefficient, r_K-R, in terms of factorial properties. The value of r_K-R is equal to that of a coefficient of equivalence, H², when the mean item variance associated with common factors equals the mean interitem covariance. An empirical study with synthetic test data from populations of varying factorial structure showed that the four parameters mentioned may be adequately estimated from dichotomous data. This study was supported in part by an Air Force project (Contract Number AF18(600-170), monitored by the Crew Research Laboratory, Air Force Personnel and Training Research Center, Randolph Air Force Base, Randolph Field, Texas. Permission is granted for reproduction, translation, publication, use and disposal in whole and in part by or for the United States Government. Further support was given by the Northwestern University Graduate School. The computational assistance of Mr. Norman Miller is acknowledged. Professor Meyer Dwass provided mathematical advice both directly and indirectly relevant to the paper.
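
For readers unfamiliar with the Kuder-Richardson coefficient, the following is a minimal sketch of how it is commonly computed from dichotomous (0/1) item scores. It assumes the coefficient meant is Kuder-Richardson formula 20 and uses the population variance of total scores; the small response matrix is invented for illustration and does not come from the paper.

```python
from statistics import pvariance


def kr20(responses: list[list[int]]) -> float:
    """Kuder-Richardson formula 20 for a persons-by-items 0/1 matrix.

    KR-20 = k / (k - 1) * (1 - sum(p_j * q_j) / var(total scores)),
    where p_j is the proportion passing item j and q_j = 1 - p_j.
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")
    totals = [sum(person) for person in responses]
    total_variance = pvariance(totals)  # population variance of test scores
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(person[j] for person in responses) / n_persons
        sum_pq += p * (1.0 - p)
    return n_items / (n_items - 1) * (1.0 - sum_pq / total_variance)


if __name__ == "__main__":
    # Hypothetical data: 4 persons (rows) answering 3 items (columns).
    data = [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
        [0, 0, 0],
    ]
    print(f"KR-20 = {kr20(data):.2f}")
```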

The subjects (60 boys) were drawn from the sample of a longitudinal study of social development and represented extremely aggressive, anxious, constructive, and submissive behaviour at the age of 8. They were presented with three question series concerning (1) their responses to aggressive attacks; (2) reactions in frustration situations presented in short stories; and (3) their aggressive initiatives. In each series the type of aggressive behaviour, attacker, victim, and other situational factors were systematically varied. In series 2 the type of response, open-ended or forced-choice, was also varied. The results showed that the most valid way of studying boys' self-observations on their aggressive behaviour was to ask if they attack somebody without a specific reason (series 3). This correlated with contemporaneous overt aggression at the age of 8 and predicted aggressiveness and various characteristics of antisocial aggressive development at the ages of 14 and 19. Self-observations on one's physical aggression were more valid for ratings of overt aggressiveness than on verbal aggression. The open-ended or forced-choice type of response did not affect the validity of aggressive responses. Of the categories of nonaggression, ‘conciliatory responses’ had the highest concurrent and predictive validity for constructiveness and other indicators of strong self-control.

