Similar literature
 Found 20 similar articles (search time: 38 ms)
1.
Pan T, Yin Y. Psychological Methods, 2012, 17(2): 309-311
In the discussion of mean square difference (MSD) and standard error of measurement (SEM), Barchard (2012) concluded that the MSD between 2 sets of test scores is greater than 2(SEM)^2 and that SEM underestimates the score difference between 2 tests when the 2 tests are not parallel. This conclusion has limitations for 2 reasons. First, strictly speaking, MSD should not be compared to SEM because they measure different things, have different assumptions, and capture different sources of errors. Second, the related proof and conclusions in Barchard hold only under the assumptions of equal reliabilities, homogeneous variances, and independent measurement errors. To address the limitations, we propose that MSD should be compared to the standard error of measurement of difference scores (SEM_{x-y}) so that the comparison can be extended to the conditions when 2 tests have unequal reliabilities and score variances.
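
The comparison proposed in this abstract can be sketched as follows. This is a minimal illustration, not the authors' derivation: it assumes independent measurement errors and uses the textbook form of the SEM of difference scores, sqrt(SEM_x^2 + SEM_y^2). The function names and all numeric values are hypothetical.

```python
import math

def msd(x, y):
    """Mean square difference between two paired lists of scores."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def sem(sd, reliability):
    """Classical standard error of measurement: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def sem_diff(sd_x, rel_x, sd_y, rel_y):
    """SEM of difference scores X - Y (assumes independent errors)."""
    return math.sqrt(sem(sd_x, rel_x) ** 2 + sem(sd_y, rel_y) ** 2)

# Hypothetical paired scores and test parameters.
x = [10, 12, 14, 16]
y = [11, 12, 13, 18]
print(msd(x, y))

# With equal SDs and reliabilities, SEM_{x-y}^2 reduces to 2 * SEM^2;
# with unequal values it does not, which is the case the abstract targets.
print(sem_diff(10, 0.9, 10, 0.9) ** 2, 2 * sem(10, 0.9) ** 2)
print(sem_diff(10, 0.9, 15, 0.8) ** 2)
```

Under these assumptions the squared SEM of difference scores is the natural quantity to set beside the MSD, since both are expressed in squared score units.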

2.
The structural theory of cerebral lateralization has been typically used to explain hemispheric asymmetries. However, the attentional model of brain functioning may help resolve some of the inconsistent findings with groups of learning-disabled children. To test this hypothesis, a visual half-field paradigm for word recognition was employed in a group of 26 learning-disabled and 26 normal children matched for sex, chronological age, and handedness. Three experimental conditions (unilateral, cued unilateral, and bilateral) and two word error types (visually and acoustically confusable words) were analyzed. The results indicated that normals produced the expected RVHF superiority under all experimental conditions, but the learning-disabled produced the expected RVHF superiority only under the cued unilateral condition. Learning-disabled children also made significantly more visually and acoustically confusable errors than normals and unlike normal children increased the number of acoustic errors in the RVF under bilateral stimulation. These results provide evidence that learning-disabled children may process information inefficiently and have brain activation patterns that are more susceptible to attentional effects.

3.
Given the significant increase in the number of students identified as learning-disabled (LD) and the growing concern about the overidentification of LD cases, attention has been focused on methods for determining a severe discrepancy between ability and achievement. Two such methods (a z-score discrepancy and a regression procedure) were compared by means of two different cutoff procedures on scores for 236 LD referrals. These results were then contrasted with a policy dictating that the lowest-achieving of those referred be considered as LD. Each student was evaluated with an individual intelligence scale and an achievement test. The results indicated that the regression procedure identified fewer students than did the z-score method. When the percentage of identified children was held constant, the methods were similar with respect to the types of errors made (false positives and false negatives). Data indicated that selecting the lowest-achieving students would have yielded about the same percentage of correct decisions, as defined by the multidisciplinary team, as did the two discrepancy methods. The policy implications of these findings are also considered.
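
The two discrepancy methods named in this abstract can be illustrated with a short sketch. The abstract does not give the formulas or cutoffs used in the study, so everything here is an assumption: both tests are taken to share a mean of 100 and SD of 15, the IQ-achievement correlation `r` is hypothetical, and the 1.5 cutoff is illustrative only.

```python
import math

def z_discrepancy(iq, ach, mean=100.0, sd=15.0):
    """Ability z-score minus achievement z-score (same scale assumed)."""
    return (iq - mean) / sd - (ach - mean) / sd

def regression_discrepancy(iq, ach, r, mean=100.0, sd=15.0):
    """Predicted minus actual achievement, in standard-error-of-estimate units."""
    predicted = mean + r * (iq - mean)      # prediction regresses toward the mean
    see = sd * math.sqrt(1.0 - r ** 2)      # standard error of estimate
    return (predicted - ach) / see

def is_severe(value, cutoff=1.5):
    """Hypothetical cutoff for calling a discrepancy 'severe'."""
    return value >= cutoff

# Hypothetical student: IQ 130, achievement 107, assumed correlation 0.6.
print(is_severe(z_discrepancy(130, 107)))                 # z-score method
print(is_severe(regression_discrepancy(130, 107, 0.6)))   # regression method
```

For this above-average-IQ example the z-score method flags the student while the regression method does not, because the regression prediction is pulled toward the mean. For below-average IQ scores the direction can reverse, so the sketch shows the mechanism rather than the study's overall result.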

4.
Selective attention to visual and auditory stimuli and reflection-impulsivity were studied in normal and learning-disabled 8- and 12-year-old boys. Multivariate analyses, followed by univariate and paired-comparison tests, indicated that the normal children increased in selective attention efficiency with age to both visual and auditory stimuli. Learning-disabled children increased in selective attention efficiency with age to auditory, but not to visual, stimuli. Both groups increased with age in reflection as measured by Kagan's Matching Familiar Figures Test (MFF). The 8-year-old learning-disabled children were more impulsive than the 8-year-old normals on MFF error scores, but not on MFF latency scores. No difference occurred between the 12-year-old learning-disabled and normal children on either MFF error or MFF latency scores. Correlations between the selective attention scores and MFF error and latency scores were not significant. This research was supported in part by BEH grant G007507227. The authors are indebted to Eleanor McCandless for her assistance in securing the learning-disabled subjects and to James McLeskey and Michael Popkin for their assistance in collecting and analyzing data.

5.
Dyslexic and nondyslexic boys within a single community's learning-disabled class were given a set of tests; performance on each of these tests has been reported to be significantly impaired in other dyslexic children compared to learning-disabled and normal groups. Linear discriminant function analysis revealed that error types rather than levels of performance best separated the carefully matched learning-disabled groups. Slow naming and high percentage of “dysphasic” errors characterized dyslexic boys. Visual temporal-spatial matching and “configuration-deficient” perceptual errors characterized the adequate readers who have other learning disabilities.

6.
Learning-disabled (LD) children are often identified for services on the basis of a discrepancy between their IQ and achievement scores. Comparisons between regression and standard score difference methods for determining IQ-achievement discrepancies have not considered the comparative effects of these methods on the racial composition of LD classes. A comparison of these methods shows that the standard score difference method produces disproportionate racial representation, whereas the regression method produces proportionate racial representation in LD classes. In addition, the regression method demonstrates advantages in measurement of discrepancies, LD program planning, and conformity with constitutional guarantees of equal protection under the law.

7.
Given the substantial rise in the number of students identified as learning-disabled, increasing attention has centered on methods for determining a severe discrepancy between ability and achievement. Using scores from 86 learning disabilities referrals, we compared four such methods (a z-score discrepancy, an estimated true score discrepancy, an unadjusted regression procedure, and an adjusted regression procedure). Each student was evaluated with the WISC-R, PIAT, and K-ABC. A high degree of agreement was found between z-score and estimated true score difference approaches. Less agreement was found between the unadjusted regression procedure and the other methods. It was concluded that the four methods cannot be used interchangeably in the calculation of severe discrepancies. Of the four methods that were analyzed, the unadjusted regression procedure selected the smallest percentage of students.

8.
A direct comparison was made of the reliability and validity of the standard Matching Familiar Figures Test (MFF) to a recent longer version of the task (MFF20). Subjects comprised two samples of learning-disabled children, matched on age, sex, IQ, and SES. The Salkind and Wright (1977) formulation was used to generate continuous data, and IQ was statistically controlled. Internal reliability estimates showed the MFF20 to be more consistent than the standard version on both error and latency scores. Validity was addressed by comparing the two versions of the task in their ability to predict cognitive and behavioral skills of conceptual relevance to impulsivity. Results indicated that the MFF20 is a more sensitive predictor of academic achievement and attention as observed in a natural setting than is the standard version of the task. This study was supported by a contract (300-77-0495) from the Bureau of Education for the Handicapped, Office of Education, for the University of Virginia Learning Disabilities Research Institute. The authors would like to express appreciation to the school officials who generously assisted in the present study: Elizabeth Bailey, Thomas Cox, Charles Dempsey, Julian King.

9.
This study explored the value of obtaining a just noticeable difference (JND) for a test (the difference in scores needed before observers detect a difference in examinees' behavior) as a means of interpreting the practical meaning of scores. Classical psychophysical methods were adapted and applied to the scores of foreign teaching assistants (TAs) on an achievement test, the Test of Spoken English (TSE), and the ratings for English proficiency that the TAs received from their students. The JND for the TSE scores was substantial, as large as the standard deviation of the scores and much larger than the standard error of measurement and guidelines for the d index of effect size for mean differences, suggesting that both sets of standards may highlight score differences that are not practically significant. This study demonstrates the applicability of JNDs for evaluating scores on educational and psychological tests.

10.
The study addresses the external validity of the Woodcock-Johnson Tests of Cognitive Ability (WJTCA) in learning-disabled (LD) elementary school children by controlling for two methodological errors Woodcock identified in previous studies: (a) Intellectual ability range was restricted for both normal and LD samples to counteract an artificial inflation of mean WISC-R scores without concomitant effect on WJTCA scores, and (b) the WISC-R was readministered during data collection. In addition, normals were used as controls for LD students. WJTCA scores were correlated and compared with WISC-R scores and reading achievement test scores in 20 normal, 20 mild-to-moderate LD, and 20 severe LD third-, fourth-, and fifth-grade students. Results indicate comparability of mean WISC-R and WJTCA Full Scale scores in the normal sample, but show significantly lower WJTCA Full Scale scores in the LD samples, despite a strong degree of correlation between the two tests in each sample. The significant linear trend of increasing mean WISC-R/WJTCA discrepancy across the severity of LD strongly suggests that the lower WJTCA scores in the LD samples are a function of the instrument's achievement emphasis and refutes the possibility of systematic error in the WJTCA norms. Results suggest that the WJTCA's achievement emphasis jeopardizes its validity for assessing and classifying LD students within the currently accepted and mandated ability-achievement discrepancy model of specific learning disabilities.

11.
Formulas are derived for the standard error of measurement of three measures of change: simple difference scores, residualized difference scores, and the measure introduced by Tucker, Damarin, and Messick. Equating these formulas by pairs yields additional explicit formulas which provide a practical guide for determining the relative error of the three measures in any pretest-posttest design. The functional relationship between the standard error of measurement and the correlation between pretest and posttest observed scores remains essentially the same for each of the three measures despite variations in other test parameters (reliability coefficients, standard deviations), even when pretest and posttest errors of measurement are correlated.

12.
Statistically based banding is often considered a viable method for minimizing adverse impact in test-based employment decisions. By utilizing the standard error of the difference (SED), scores are equated based on the assumption that there is substantial unreliability in any single observed score. However, based on the derivations of Dudek, the formula commonly used to calculate the standard error of measurement (SEM), a component that is typically used to calculate the SED, is incorrect. Specifically, utilizing the SEM when calculating the SED produces a band of observed scores around a true score, not a band of true scores around an observed score as would be appropriate for banding. This study compares the differences between banding-based selection decisions when the appropriate SED formula (which utilizes the standard error of estimate) is and is not applied. Overall, results suggest that utilizing the appropriate formula for calculating the SED produces substantial variations in employment decisions. The potential legal and ethical implications of these discrepancies are discussed.
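
The contrast this abstract draws between the two SED formulas can be sketched numerically. The abstract does not print the formulas, so the ones below are assumptions taken from classical test theory: SEM = sd * sqrt(1 - r), standard error of estimate of true scores SEE = sd * sqrt(r * (1 - r)), and SED = SE * sqrt(2). The function names, the 1.96 multiplier, and the example values are hypothetical.

```python
import math

def sem(sd, rel):
    """Standard error of measurement: spread of observed scores around a true score."""
    return sd * math.sqrt(1.0 - rel)

def see(sd, rel):
    """Standard error of estimate: spread of true scores around an observed score."""
    return sd * math.sqrt(rel * (1.0 - rel))

def band_width(se, c=1.96):
    """Band width c * SED, where SED = se * sqrt(2)."""
    return c * se * math.sqrt(2.0)

# Hypothetical test: SD = 10, reliability = 0.80.
sd, rel = 10.0, 0.80
print(band_width(sem(sd, rel)))  # band based on the SEM
print(band_width(see(sd, rel)))  # band based on the standard error of estimate
```

Because SEE equals SEM multiplied by sqrt(r), the SEE-based band is narrower whenever reliability is below 1, so fewer scores fall inside a single band under these assumptions.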

13.
All learning-disabled children, dyslexic and nondyslexic, were found to be impaired relative to controls on a variety of naming tests: (1) naming pictured objects (visual name), (2) responding with an object name to a definition (auditory definition), (3) completing a sentence with an object name (auditory sentence), or (4) naming palpated objects (tactual). Only on the sentence completion task (auditory sentence), which has been found to be the simplest response mode, were the dyslexic subjects selectively less accurate than the nondyslexic learning disabled, relative to the control group. Although dyslexic subjects tend to circumlocute when naming objects, they did not find it easier, relative to other groups, to give the function rather than the name of objects. Time scores were not in the same direction. The nondyslexic learning-disabled group responded more rapidly than either the dyslexic subjects or controls and made more perceptual errors, findings that may be related to some other factor, possibly the hyperactivity of many of the children in the nondyslexic learning-disabled group. The finding, also, that most of their naming error scores correlate highly with each other as well as with standardized language measures (WISC-R Vocabulary and PPVT), whereas those of the dyslexic and control groups do not, further suggests some underlying pathology to which their language disability is related. Language impairment, then, may be a common factor in all learning disability, dyslexic and nondyslexic, possibly for different reasons.

14.
As usually interpreted, the standard error of measurement is assumed to be constant throughout the test-score range. In this investigation the standard error of measurement was assumed to be not higher than a second-degree function of the test score. By conceiving a test score to be made up of the scores on two parallel tests, an equation was derived for predicting the standard error of measurement from the test score. In the derivation the corresponding first four moments of the score distributions for the parallel tests were assumed to be identical, and certain errors of estimate involved in predicting the second test score from the first were assumed to be uncorrelated with powers of the score on the first test. An empirical verification was carried out, using nine synthetic tests and a 1000-case sample, and showed good agreement between predicted and observed results. The findings indicated that the standard error of measurement was constant only for a symmetrical, mesokurtic distribution of scores. This study was carried out while the author was a National Research Council Predoctoral Fellow in psychology at Princeton University. The author wishes to express his appreciation for the guidance given by his thesis adviser, Professor Harold Gulliksen. He wishes also to acknowledge his gratitude to the Educational Testing Service for extensive assistance in the empirical phase of the study, and to Dr. Ledyard Tucker for suggesting efficient methods of handling special computational problems.

15.
Using a modified reception paradigm, normal and learning-disabled children were required to solve unidimensional, disjunctive, or conditional connectives under standard attending or enforced attending instructions. The major result was that enforced attending procedures facilitated solution of disjunctive and conditional concepts for normal children, while having minimal effects on rule attainment for the learning-disabled children. Within Sternberg's (1979) model, a number of subcomponent analyses were made on attribute combinations (e.g., TT, TF...), but only in the FF instance was there a difference between instructional conditions. Learning-disabled children were deficient in TT, TF, FF instances regardless of attending instructions. Results support a “reductive coding deficiency” in that learning-disabled children were unable to effectively utilize attentional instructions to encode certain attribute combinations.

16.
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The objective was to provide bounds of the likely DIF effects on these measurement consequences. Five factors were manipulated: test length, percentage of DIF items per form, item type, sample size, and level of group ability difference. Results indicate that the greatest DIF effect was less than 2 points on the 0 to 60 total score scale and about 0.15 on the IRT ability scale. DIF had a limited effect on the ratio of true-score variance to observed-score variance, but its influence on the standard error of estimation for the IRT ability parameter was evident for certain ability values.

17.
The method of selecting among job applicants using statistically based banding has been proposed over the last 10 years as a way to increase workforce diversity. The method continues to be reviewed by academics and considered by practitioners. Although the goal of increasing workforce diversity is important, statistical banding of scores remains controversial. We present a set of unique, statistically and theoretically based criticisms of a form of banding (top-score-referenced banding) that is widely used in hundreds of jobs in the public sector throughout the United States. We suggest that even within the premises of such banding, the wrong formula is used to estimate the standard error of measurement and standard error of the difference. One consequence is that too many individuals are labeled as essentially equal with respect to test scores. A related consequence is that test scores within a single band are statistically different and should therefore be treated as such for selection purposes. A more logically and statistically defensible procedure for responding to diversity concerns is to continue to attend to adverse impact issues at each step of the recruiting and test development process.

18.
The confidence intervals for the Minnesota Multiphasic Personality Inventory (MMPI-2) clinical scales were investigated. Based on the clinical scale reliabilities published in the MMPI-2 manual, estimated true scores, standard errors of measurement for estimated true scores, and 95% confidence intervals centered around estimated true scores were calculated at 5-point MMPI-2 T-score intervals. The relationships between obtained T-scores, estimated true T-scores, scale reliabilities, and confidence intervals are discussed. The possible role of measurement error in defining scale high point and code types is noted.

19.
Detailed time and error analyses of the Tower of Hanoi (TOH) test were performed using four repeated assessments of eight children (ages 9-12 years) who had perceptual and problem-solving deficits. The time before each move was measured. In addition to the traditionally counted time scores, new, relative time scores were computed in order to separate the planning time from the general reaction speed. New error scores were defined and sum scores of serious errors (perseverative moves, illegal moves, and wrong results) and mild errors (self-corrected moves, almost performed moves, and interrupted trials) were computed. The relative planning time correlated positively with the achieved score, and negatively with the serious errors. The serious errors correlated negatively with the achieved score. The relative planning time seems to measure the quality of planning better than does the raw planning time, and it is a recommended score for TOH analysis. The value of new error scores requires additional research.

20.
This article introduces new statistics for evaluating score consistency. Psychologists usually use correlations to measure the degree of linear relationship between 2 sets of scores, ignoring differences in means and standard deviations. In medicine, biology, chemistry, and physics, a more stringent criterion is often used: the extent to which scores are identically equal. For each test taker (or other unit of measurement), the difference between the 2 scores is calculated. The root mean square difference (RMSD) represents the average change from 1 set of scores to the other, and the concordance correlation coefficient (CCC) rescales this coefficient to have a maximum value of 1. This article shows the relationship of the RMSD and CCC to the intraclass correlation coefficients, product-moment correlation, and standard error of measurement. Finally, this article adapts the RMSD and the CCC for linear, consistency, and absolute definitions of agreement.
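
The two statistics described in this abstract can be computed in a few lines. This is a minimal sketch assuming the standard (Lin) form of the concordance correlation coefficient with 1/n moments; the article's adapted variants for linear, consistency, and absolute agreement are not reproduced here, and the example scores are hypothetical.

```python
import math

def rmsd(x, y):
    """Root mean square difference between two paired lists of scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def ccc(x, y):
    """Concordance correlation coefficient (Lin's form, 1/n moments)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * cov / (vx + vy + (mx - my) ** 2)

# Hypothetical scores: y is x shifted up by 1 point, so the product-moment
# correlation is 1 even though no pair of scores is identical.
x = [1, 2, 3, 4]
y = [2, 3, 4, 5]
print(rmsd(x, y))
print(ccc(x, y))
```

In this example the RMSD is 1 and the CCC falls below 1, which shows the stricter criterion the abstract describes: a constant shift in means leaves the correlation untouched but lowers agreement.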
