Similar literature
20 similar documents found
1.
An equation is derived for predicting the effect of chance success, relative to item difficulty, on item-test correlation. The values predicted by this equation and by equations derived by Guilford and Carroll for predicting the effect of chance success on item difficulty and test reliability are compared with empirical values in an experiment which used identical test items in multiple-choice and answer-only form. Condensation of a dissertation presented in partial fulfillment of the requirements for the Ph.D. degree to the University of Chicago. Grateful acknowledgment is made to Professor Harold Gulliksen for his guidance as thesis advisor and to Professor L. L. Thurstone and Dr. D. W. Fiske of the University of Chicago who served as members of the thesis committee. The author is also indebted to Professor S. S. Wilks for review of the derivations and development of statistical tests used in the thesis, to Dr. L. R Tucker for technical advice, and to Dr. W. G. Mollenkopf for critical comments on the derivations and interpretations. The writer expresses appreciation to the Educational Testing Service for making available its technical facilities, and to the University of Chicago for the flexible administrative arrangement which made this thesis possible.

2.
TAYLOR CW. Psychometrika, 1950, 15(4): 391-406
For any fixed total time of testing it is possible, through proper item-and-time allotment, to combine tests into a battery so that the multiple correlation with a pre-assigned criterion will be maximized. By holding constant the ratio of the length in number of items to the time length for each test, a set of general equations has been derived which will yield this maximum value of the multiple R and will enable one to determine, in any given case, the optimal fraction of total testing time that should be devoted to each type of test under consideration. The set of general equations is applied to a two-test-battery problem to obtain the optimal length of each type of test for one hour total testing time. If two other tests had been selected for the two-test sample problem, different subdivisions of the total time would generally occur. The manner in which the results would change when using other tests with different initial reliability, validity, and intercorrelation values is briefly presented. Some general implications of this method of battery development are also discussed. The writer is indebted to Max Woodbury for his assistance and especially to Dr. N. J. F. Van Steenberg and Dr. Anna S. Henriques, who provided valuable guidance and aid in the development of the solution to this problem. This paper is a revision of a thesis submitted in 1939 at the University of Utah in partial fulfillment of the requirements for the master's degree.

3.
HORST P. Psychometrika, 1949, 14(2): 79-88
If the lengths of the tests in a battery are altered, their intercorrelations and their validities or correlations with a criterion are also altered. Consequently, the multiple correlation of the battery with the criterion will also be altered. These changes are a function of the reliabilities of the tests. Suppose we have given from a set of experimental data (1) the time allowed for each test in the battery, (2) the reliability of each test, (3) the intercorrelations, and (4) the validities of all the tests. If we specify the over-all testing time we are willing to allow for the test in the future, we can determine the amount by which each test must be altered in order to give the maximum multiple correlation with the criterion. The method is presented, together with numerical examples and the mathematical proof.
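
The dependence of reliability and validity on test length mentioned in this abstract is commonly expressed with the Spearman-Brown relations. The block below is a minimal sketch of those standard formulas only, not of Horst's optimization procedure; the function names and the numbers in the usage lines are illustrative assumptions.

```python
import math


def lengthened_reliability(r_xx, k):
    """Spearman-Brown: reliability of a test lengthened by a factor k."""
    return k * r_xx / (1.0 + (k - 1.0) * r_xx)


def lengthened_validity(r_xy, r_xx, k):
    """Validity of a test lengthened by a factor k, criterion left unchanged."""
    return r_xy * math.sqrt(k / (1.0 + (k - 1.0) * r_xx))


# Hypothetical test: reliability 0.50, validity 0.40, doubled in length.
new_reliability = lengthened_reliability(0.50, 2)
new_validity = lengthened_validity(0.40, 0.50, 2)
```

Doubling the hypothetical test raises both values, but with diminishing returns as k grows, which is why the allotment of a fixed total testing time across tests becomes an optimization problem.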

4.
When K tests are given to N individuals, and for each individual there are two criterion measures, then (1) the multiple regression weight to be applied to the standard score for each test to predict the criterion-difference score equals the difference of the weights for predicting each criterion separately; (2) the difference between the predicted scores equals the predicted difference (each test being assigned the appropriate multiple regression weight); (3) the square of the multiple correlation between predicted and actual criterion-difference scores equals the sum of squares of the multiple correlations of the battery with each criterion less the product of these correlations and the correlation between predicted scores all divided by twice the quantity one minus the criterion intercorrelation; and (4) the variance of errors of estimating the criterion-difference score equals the sum of the variances of errors of estimating each criterion score minus twice the criterion intercorrelation, plus twice the correlation between predicted scores multiplied by the product of the square root of one minus the variance of errors of estimating one criterion and the corresponding square root for the second criterion. The author wishes to express his appreciation for the suggestions and guidance given by Dr. Harold Gulliksen in the preparation of this article. He also wishes to acknowledge the helpful comments of Dr. Paul Horst and Dr. Ledyard Tucker on certain phases of the development.
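
Result (1) rests on the fact that least-squares weights are linear in the criterion. The sketch below checks this numerically for K = 2 tests; it is an illustration under assumptions, not the author's derivation: the helper `ols_weights` and all data values are hypothetical, and raw (unstandardized) criterion scores are used.

```python
def ols_weights(x1, x2, y):
    """Least-squares weights for predicting y from two tests (intercept fitted)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Sums of squares and cross-products of the centered scores.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    # Solve the 2x2 normal equations by Cramer's rule.
    return ((s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det)


# Hypothetical scores for six individuals on two tests and two criteria.
test1 = [1, 2, 3, 4, 5, 6]
test2 = [2, 1, 4, 3, 6, 5]
crit_a = [3, 4, 8, 7, 12, 10]
crit_b = [1, 3, 2, 5, 4, 7]
crit_diff = [a - b for a, b in zip(crit_a, crit_b)]

w_a = ols_weights(test1, test2, crit_a)
w_b = ols_weights(test1, test2, crit_b)
w_diff = ols_weights(test1, test2, crit_diff)
weights_agree = all(abs(d - (a - b)) < 1e-9 for d, a, b in zip(w_diff, w_a, w_b))
```

Because the weights for the difference equal the difference of the weights, the predicted difference equals the difference of the two predicted scores, which is result (2).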

5.
Several theorems concerning properties of the communality of a test in the Thurstone multiple factor theory are established. The following theorems are applicable to a battery of n tests which are describable in terms of r common factors, with orthogonal reference vectors. 1. The communality of a test j is equal to the square of the multiple correlation of test j with the r reference vectors. 2. The communality of a test j is equal to the square of the multiple correlation of test j with the r reference vectors and the n − 1 remaining tests. Corollary: The square of the multiple correlation of a test j with the n − 1 remaining tests is equal to or less than the communality of test j. It cannot exceed the communality. 3. The square of the multiple correlation of a test j with the n − 1 remaining tests equals the communality of test j if the group of tests contains r statistically independent tests each with a communality of unity. 4. With correlation coefficients corrected for attenuation, when the number of tests increases indefinitely while the rank of the correlational matrix remains unchanged, the communality of a test j equals the square of the multiple correlation of test j with the n − 1 remaining tests. 5. With raw correlation coefficients, it is shown in a special case that the square of the multiple correlation of a test j with the n − 1 remaining tests approaches the communality of test j as a limit when the number of tests increases indefinitely while the rank of the correlational matrix remains the same. This has not yet been proved for the general case. The author wishes to express his appreciation of the encouragement and assistance given him by Dr. L. L. Thurstone.
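
The corollary to theorem 2 can be checked numerically in a small case. The sketch below assumes a single common factor (r = 1) and three tests with hypothetical loadings, and uses the standard two-predictor formula for the squared multiple correlation; it is an illustration, not part of the original proofs.

```python
def smc_two_predictors(r_ja, r_jb, r_ab):
    """Squared multiple correlation of test j with two other tests a and b."""
    return (r_ja ** 2 + r_jb ** 2 - 2.0 * r_ja * r_jb * r_ab) / (1.0 - r_ab ** 2)


# Hypothetical loadings of three tests on one common factor.
loadings = [0.8, 0.7, 0.6]


def model_correlation(i, j):
    """Correlation between tests i and j implied by the one-factor model."""
    return loadings[i] * loadings[j]


results = []
for j in range(3):
    a, b = [i for i in range(3) if i != j]
    smc = smc_two_predictors(
        model_correlation(j, a), model_correlation(j, b), model_correlation(a, b)
    )
    communality = loadings[j] ** 2
    results.append((smc, communality))

corollary_holds = all(smc <= communality for smc, communality in results)
```

With only two remaining tests the squared multiple correlations fall well below the communalities; theorems 4 and 5 concern how that gap behaves as the number of tests grows.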

6.
Subjects were 30 fourth grade children with average intellectual ability but reading achievement at least 1.5 years below grade level. Each child was given two word-recognition lists, the first one as a pretest and the second list under one of three different experimental conditions: control, positive reinforcement (1 nickel for each word read correctly), and response cost (1 of 40 nickels taken back for each word read incorrectly). Relative to the control condition, positive reinforcement led to a significant increase in response latency but no change in errors, while response cost led to both a significant increase in latency and a significant decrease in reading errors. The entire group was found to be impulsive on the Matching Familiar Figures test. The successful reduction in impulsive reading errors was interpreted as support for Kagan's hypothesis that the impulsive child evidences low concern about errors on such academic tasks. This report is based on a senior honors thesis by D. E. B., which was the 1977 winner of the Dashiell-Thurstone Prize for the best undergraduate honors thesis in psychology at the University of North Carolina at Chapel Hill. Appreciation is expressed to the following persons for their assistance or comments: Dr. W. Anderson, Ms. D. Crew, Ms. C. Earp, Ms. N. Hardy, Dr. K. Jens, Dr. K. Fleishman, Dr. B. Martin, Dr. G. Mesibov, Mr. S. Muller, Ms. E. Pritchett, Mr. Wall, Ms. Wall, and Ms. M. Walton. The research was supported in part by U. S. Public Health Service, Maternal and Child Health Project No. 916, and by Grants HD-03110 and ES-01104 from the National Institutes of Health.

7.
For an amount-limit test homogeneous as to content and varied as to difficulty it is established that an individual's number-right score and his limen score as estimated by the constant process are mathematically related. The experimental and the theoretic relationship between normal deviate and limen score are shown to be in good agreement. It is also found that the two methods of evaluating individual test performance yield equally reliable sets of scores for the procedures used. Accordingly where the assumptions basic to the relationship obtain, the more conveniently computed raw score may be considered to be as valid and reliable an index of individual test performance as the limen score. The concept of the dispersion parameter of the individual as a measure of change or error in test score found no experimental verification. Estimates of individual variability are unrelated to differences in score on equivalent forms. The writer gratefully acknowledges Lt. Colonel M. W. Richardson's invaluable counsel, Dr. H. Gulliksen's helpful suggestions, and Dr. H. H. Long's aid in administering the tests.

8.
Jöreskog K. G. Psychometrika, 1962, 27(4): 335-354
A method for estimation in factor analysis is presented. The method is based on the assumption that the residual (specific and error) variances are proportional to the reciprocal values of the diagonal elements of the inverted covariance (correlation) matrix. The estimation is performed by a modification of Whittle's least squares technique. The method is independent of the unit of scoring in the tests. Applications are given in the form of nine reanalyses of data of various kinds found in earlier literature. The writer wishes to thank Prof. H. Wold, Dr. E. Lyttkens, and Dr. P. Whittle for valuable comments and suggestions.

9.
Music ability     
Two batteries of music tests were factored by the centroid method. From each battery three oblique factors were extracted and in each case were tentatively identified as tonal sensitivity, retentivity (memory for elements), and memory for form. The correlations of the music tests of one battery with subtests of Cattell's intelligence test and with tests of a literary nature are also reported. Karlin, J. E. A multiple factor analysis of musicality. M. A. thesis, University of Cape Town, 1939. Drake, R. M. A factorial analysis of music tests by the Spearman tetrad-difference technique. J. Musicology, 1939, 1, 1.

10.
Speeded and unspeeded tests of vocabulary, spatial relations, and arithmetic reasoning were factorially analyzed, together with certain reference tests and academic grades. Lawley's maximum likelihood method was used, the computations being carried out on the Whirlwind electronic computer. Four different speed factors were isolated, together with a second-order general speed factor. Consistent small positive correlations between the academic grades and the speed factors were found. The writer is indebted to Dr. John French, to Dr. David Saunders, and especially to Dr. Ledyard R Tucker for helpful suggestions and theoretical advice throughout the course of this study. The active cooperation of Dr. William Shields, Educational Advisor, and of many others at the United States Naval Academy at Annapolis has been invaluable. The author is very grateful to Dr. P. Youtz and Dr. C. W. Adams for the opportunity to use Whirlwind I, a high-speed computer sponsored by the Office of Naval Research, and to Dr. H. Denman for help in programming and in putting the program on the computer. He also wishes to express his deep appreciation to Dr. Hubert Brogden and Miss Bertha Harper of The Adjutant General's Office for the opportunity to use their matrix rotator and for helpful guidance in its operation.

11.
Epistemic actions are physical actions people take to simplify internal problem solving rather than to move closer to an external goal. When playing the video game Tetris, for instance, experts routinely rotate falling shapes more than is strictly needed to place the shapes. Maglio and Kirsh [Kirsh, D., & Maglio, P. (1994). On distinguishing epistemic from pragmatic action. Cognitive Science, 18, 513-549; Maglio, P. P. (1995). The computational basis of interactive skill. PhD thesis, University of California, San Diego] proposed that such actions might serve the purpose of priming memory by external means, reducing the need for internal computation (e.g., mental rotation), and resulting in performance improvements that exceed the cost of taking additional actions. The present study tests this priming hypothesis in a set of four experiments. The first three explored precisely the conditions under which priming produces benefits. Results showed that presentation of multiple orientations of a shape led to faster responses than did presentation of a single orientation, and that this effect depended on the interval between preview and test. The fourth explored whether the benefit of seeing shapes in multiple orientations outweighs the cost of taking the extra actions to rotate shapes physically. Benefits were measured using a novel statistical method for mapping reaction-time data onto an estimate of the increase in processing capacity afforded by seeing multiple orientations. Cost was measured using an empirical estimate of time needed to take action in Tetris. Results showed that indeed the increase in internal processing capacity obtained from seeing shapes in multiple orientations outweighed the time to take extra actions.

12.
A method is presented for converting the scores on one form of a test to those on another form of the same test. The method is particularly applicable to the case where each form has been administered to a different group and the only link between the two forms is a subset of items common to both. The proposed method, called the item method of conversion, has been applied to several tests for which other methods of conversion are available for comparison. The necessary data are limited to tests for which the total score is the criterion for item analyses. The method gives highly satisfactory results for all the tests to which it has been applied, particularly when the two groups are rather different, in which case the delta method (a different item method) is inappropriate. The authors are only two of a group, including W. H. Angoff, F. M. Lord, and M. K. Schultz, all of whom have made important contributions to this paper.

13.
As usually interpreted, the standard error of measurement is assumed to be constant throughout the test-score range. In this investigation the standard error of measurement was assumed to be not higher than a second-degree function of the test score. By conceiving a test score to be made up of the scores on two parallel tests, an equation was derived for predicting the standard error of measurement from the test score. In the derivation the corresponding first four moments of the score distributions for the parallel tests were assumed to be identical, and certain errors of estimate involved in predicting the second test score from the first were assumed to be uncorrelated with powers of the score on the first test. An empirical verification was carried out, using nine synthetic tests and a 1000-case sample, and showed good agreement between predicted and observed results. The findings indicated that the standard error of measurement was constant only for a symmetrical, mesokurtic distribution of scores. This study was carried out while the author was a National Research Council Predoctoral Fellow in psychology at Princeton University. The author wishes to express his appreciation for the guidance given by his thesis adviser, Professor Harold Gulliksen. He wishes also to acknowledge his gratitude to the Educational Testing Service for extensive assistance in the empirical phase of the study, and to Dr. Ledyard Tucker for suggesting efficient methods of handling special computational problems.

14.
This paper explores certain problems which arise within the context of the theory of generalizability put forward by Cronbach, Rajaratnam, and Gleser. In particular, a formal explication of their theory for the single observation is given, and the various coefficients of generalizability which they define are related to the estimation of universe scores. This work is based on a Master's thesis submitted to the University of Illinois in 1963. The author is particularly indebted to Dr. Lee J. Cronbach and Dr. Ledyard R Tucker for their unstinted advice and help.

15.
There is disagreement among researchers about whether IQ tests or divergent thinking (DT) tests are better predictors of creative achievement. Resolving this dispute is complicated by the fact that some research has shown a relationship between IQ and DT test scores (e.g., Runco & Albert, 1986; Wallach, 1970). The present study conducted meta‐analyses of the relationships between creative achievement and both IQ and DT test scores. The analyses included 17 studies (with 5,544 participants) that established the correlation coefficients between IQ and creative achievement and 27 studies (with 47,197 participants) that established the correlation coefficients between DT test scores and creative achievement. Marginal, but statistically significant, Fisher's Z‐transformed correlation coefficients were revealed. The analysis found a significantly higher relationship between DT test scores and creative achievement (r = .216) than between IQ test scores and creative achievement (r = .167). The differences in the correlation coefficients were explained by differences in DT tests, creative achievement types, predicted time periods, and creativity subscales. The significant independent moderator effect for different DT tests indicates that the Torrance Tests of Creative Thinking (TTCT) predict creative achievement better than any other DT test included in this study. Among the creative achievement types, music is predicted the best by IQ and all others are predicted best by DT tests. Among the time periods evaluated, the relationship between DT test scores and creative achievement had the highest correlation at the period of 11–15 years.
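
Fisher's Z transformation, mentioned in this abstract, maps a correlation r to z = atanh(r) so that study results can be averaged on an approximately normal scale. The sketch below shows one common way to pool correlations; the fixed-effect weighting by n − 3 and the three study values are assumptions made for illustration, since the abstract does not describe the weighting scheme or list the individual studies.

```python
import math


def fisher_z(r):
    """Fisher's Z transformation of a correlation coefficient."""
    return math.atanh(r)


def pooled_correlation(studies):
    """Pool (r, n) pairs: average z with weights n - 3, then transform back."""
    total_weight = sum(n - 3 for _, n in studies)
    mean_z = sum((n - 3) * fisher_z(r) for r, n in studies) / total_weight
    return math.tanh(mean_z)


# Hypothetical studies as (correlation, sample size); not data from the paper.
hypothetical_studies = [(0.10, 53), (0.30, 103), (0.20, 203)]
pooled_r = pooled_correlation(hypothetical_studies)
```

The weight n − 3 is the reciprocal of the approximate sampling variance of z, so larger studies pull the pooled estimate toward their own correlation.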

16.
Tests designed to measure what was conceived to be attention were found by factor analysis to involve a factor which is independent of the factors of rote memory, visual-space, number, or perception. To a large extent, at least, this attention factor is independent of content and of mode of presentation of test material. The tests in which the variance is mainly dependent upon this factor are those involving a high degree of sustained or relatively continuous mental effort. An abstract of a thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Psychology in the Graduate School of the University of Illinois, 1942. The writer wishes to express his gratitude to Professor Herbert Woodrow who suggested the present problem and under whose guidance the investigation was conducted. A further debt of gratitude is acknowledged to the enlisted men at the Air Corps Technical School at Chanute Field, Illinois, who served as subjects and to Capt. T. W. Harrell and Capt. Richard Faubion.

17.
Ayala Cohen. Psychometrika, 1986, 51(3): 379-391
A test is proposed for the equality of the variances of k ≥ 2 correlated variables. Pitman's test for k = 2 reduces the null hypothesis to zero correlation between their sum and their difference. Its extension, eliminating nuisance parameters by a bootstrap procedure, is valid for any correlation structure between the k normally distributed variables. A Monte Carlo study for several combinations of sample sizes and number of variables is presented, comparing the level and power of the new method with previously published tests. Some nonnormal data are included, for which the empirical level tends to be slightly higher than the nominal one. The results show that our method is close in power to the asymptotic tests which are extremely sensitive to nonnormality, yet it is robust and much more powerful than other robust tests. This research was supported by the fund for the promotion of research at the Technion.
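
The reduction used by Pitman's test follows from the identity Cov(X + Y, X − Y) = Var(X) − Var(Y), so the sum and the difference are uncorrelated exactly when the two variances are equal. The sketch below checks the identity on sample moments; the helper `sample_cov` and the paired data are hypothetical, and the bootstrap extension to k variables is not shown.

```python
def sample_cov(u, v):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


# Hypothetical paired observations on two correlated variables.
x = [2.0, 4.0, 4.5, 7.0, 9.0]
y = [1.0, 3.5, 3.0, 4.0, 6.5]

total = [a + b for a, b in zip(x, y)]
diff = [a - b for a, b in zip(x, y)]

cov_sum_diff = sample_cov(total, diff)
variance_gap = sample_cov(x, x) - sample_cov(y, y)
identity_holds = abs(cov_sum_diff - variance_gap) < 1e-9
```

The identity holds for sample moments as well as population moments, which is what lets a test of zero correlation between `total` and `diff` stand in for a test of equal variances.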

18.
From the original Stanford-Binet scale, those items passed by between 10 and 90 per cent of a group of ten-year-old children were analyzed by the centroid method. Upon rotation, there appeared a common factor, for which two explanatory hypotheses are offered, the more tenable being that it is an effect of maturation. Primary factors tentatively identified are Number, Space, Imagery, Verbal Relations and Induction. A sixth factor apparently involves a reasoning ability and a seventh cannot be interpreted. The writer is indebted to Dr. L. L. Thurstone for his interest and assistance throughout this study and to Dr. Andrew W. Brown, who made possible the collection of data at the Institute for Juvenile Research, Chicago, Illinois.

19.
Hypermnesia is an increase in recall over repeated tests. A core issue is the role of repeated testing, per se, versus total retrieval time. Prior research implies an equivalence between multiple recall tests and a single test of equal total duration, but theoretical analyses indicate otherwise. Three experiments investigated this issue using various study materials (unrelated word lists, related word lists, and a short story). In the first experimental session, the study phase was followed by a series of short recall tests or by a single, long test of equal total duration. Two days later, participants took a final recall test. The multiple and single test conditions produced equivalent performance in the first session, but the multiple test group exhibited less forgetting and fewer item losses in the final test. In a fourth experiment, using a brief delay (15 min) between the recall sessions, the multiple recall condition produced greater hypermnesia as well as fewer item losses. In addition, final recall was significantly higher in the multiple than in the single test condition in three of the four experiments. Thus, single and repeated recall tests of equal total duration are not functionally equivalent, but rather produce differences observable in subsequent recall tests.

20.
There are a number of methods of factoring the correlation matrix which require the calculation of a table of residual correlations after each factor has been extracted. This is perhaps the most laborious part of factoring. The method to be described here avoids the computation of residuals after each factor has been computed. Since the method turns on the selection of a set of constellations or clusters of test vectors, it will be called a multiple group method of factoring. The method can be used for extracting one factor at a time if that is desired but it will be considered here for the more interesting case in which a number of constellations are selected from the correlation matrix at the start. The result of this method of factoring is a factor matrix F which satisfies the fundamental relation FF' = R. This study is one of a series of investigations in the development of multiple factor analysis and application to the study of primary mental abilities. We wish to acknowledge the financial assistance from the Social Science Research Committee of The University of Chicago which has made possible the work of the Psychometric Laboratory.
