Similar Literature
Found 20 similar documents (search time: 31 ms)
1.
The Group Embedded Figures Test and the Myers-Briggs Type Indicator were administered to 210 undergraduate and graduate students. Bivariate relations between the embedded figures test and the Indicator scales of Extraversion-Introversion (EI), Thinking-Feeling (TF), and Judgment-Perception (JP) were nonsignificant while the relation between scores on embedded figures and Sensing-Intuition (SN) was statistically significant. ESFP, ISFJ, and ESFJ types were significantly more field-dependent than the INFP and ENTP types.

2.
Our previous research using both Japanese Children's Trait and State Worry and Emotionality Scales indicated that there were several issues that needed resolution in a high-stakes testing environment. High-stakes testing environments are those in which there are serious consequences to the individual of how well he or she scores on a particular test. In this study, the high-stakes environment was characterized by higher state worry and emotionality but not higher trait worry and emotionality scores than in the previous study. A clear two-factor solution was found for the state measures but not for the trait measures. Males performed better on the achievement tests than females. The relationship of state anxiety to performance was non-linear. State worry was more highly predictive of poor performance than state emotionality.

3.
Forty‐seven fourth‐ and fifth‐grade students whose raw scores on the Gates‐MacGinitie Level D (on‐level test) were determined to be unsuitable were administered an out‐of‐level test (Level C) to ascertain whether the out‐of‐level test would be a more appropriate test for those students. Two questions were addressed in this study. First, are there significant differences in students’ derived scores on the Gates‐MacGinitie when an on‐level test (Level D) that is judged to be unsuitable is compared to an out‐of‐level test (Level C)? Second, is use of an out‐of‐level test more suitable in terms of Roberts’ criterion of the raw scores achieved by the students? There were no significant differences between the derived scores from on‐level and out‐of‐level testing for each of the subtests. The out‐of‐level raw scores did fall within the accepted range for the test to be considered suitable and reliable.

4.
Although curriculum-based measures of oral reading (CBM-R) have strong technical adequacy, there is still reason to believe that student performance may be influenced by factors of the testing situation, such as errors examiners make in administering and scoring the test. This study examined the construct-irrelevant variance introduced by examiners using a cross-classified multilevel model. We sought to determine the extent of variance in student CBM-R scores attributable to examiners and, if present, the extent to which it was moderated by students' grade level and English learner (EL) status. Fit indices indicated that a cross-classified random effects model (CCREM) best fits the data with measures nested within students, students nested within schools, and examiners crossing schools. Intraclass correlations of the CCREM revealed that roughly 16% of the variance in student CBM-R scores was between examiners. The remaining variance was associated with the measurement level, 3.59%; between students, 75.23%; and between schools, 5.21%. Results were moderated by grade level but not by EL status. The discussion addresses the implications of this error for low-stakes and high-stakes decisions about students, teacher evaluation systems, and hypothesis testing in reading intervention research.
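As a rough illustration of the variance decomposition described in this abstract, the sketch below turns a set of variance components into proportions of total variance, which is how an intraclass correlation per level is usually read. The function name `variance_shares` is hypothetical, the reported percentages are used as stand-in components, and "roughly 16%" is assumed to be exactly 16.0; none of this reproduces the authors' actual model or data.

```python
def variance_shares(components):
    """Return each variance component as a proportion of the total variance."""
    total = sum(components.values())
    return {level: value / total for level, value in components.items()}


# Percentages reported in the abstract, used here as stand-in variance components.
# Assumption: "roughly 16%" is entered as exactly 16.0.
reported = {
    "examiners": 16.0,
    "measures": 3.59,
    "students": 75.23,
    "schools": 5.21,
}

shares = variance_shares(reported)
for level, share in shares.items():
    print(f"{level}: {share:.3f}")
```

Because the examiner figure is rounded, the four percentages sum to slightly more than 100, which is why the sketch renormalizes by the total rather than dividing by 100.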

5.
A sample of 422 female undergraduate students, attending a university-sector college in Wales specialising in teacher education and liberal arts subjects, completed the Myers-Briggs Type Indicator together with the Troldahl-Powell Dogmatism Scale. The data demonstrated that higher dogmatism scores are most clearly associated with sensing rather than intuition. Higher dogmatism scores are also associated with extraversion rather than introversion, and with judging rather than perceiving. No significant difference in dogmatism scores was found between thinking and feeling.

6.
Mindfulness enhances emotion regulation and cognitive performance. A mindful approach may be especially beneficial in high-stakes academic testing environments, in which anxious thoughts disrupt cognitive control. The current studies examined whether mindfulness improves the emotional response to anxiety-producing testing situations, freeing working memory resources, and improving performance. In Study 1, we examined performance in a high-pressure laboratory setting. Mindfulness indirectly benefited math performance by reducing the experience of state anxiety. This benefit occurred selectively for problems that required greater working memory resources. Study 2 extended these findings to a calculus course taken by undergraduate engineering majors. Mindfulness indirectly benefited students’ performance on high-stakes quizzes and exams by reducing their cognitive test anxiety. Mindfulness did not impact performance on lower-stakes homework assignments. These findings reveal an important mechanism by which mindfulness benefits academic performance, and suggest that mindfulness may help attenuate the negative effects of test anxiety.

7.
Using a latent variable approach, the authors examined whether retesting on a cognitive ability measure resulted in measurement and predictive bias. A sample of 941 candidates completed a cognitive ability test in a high-stakes context. Results of both the within-group between-occasions comparison and the between-groups within-occasion comparison indicated that no measurement bias existed during the initial testing but that retesting induced both measurement and predictive bias. Specifically, the results suggest that the factor underlying the retest scores was less saturated with g and more associated with memory than the latent factor underlying initial test scores and that these changes eliminated the test's criterion-related validity. This study's implications for retesting theory, practice, and research are discussed.

8.
Most of the previous studies on test anxiety have focused on students in higher institutions with little research on test anxiety in secondary school students. The present study examined the contributions of gender, age, parent's occupation and self-esteem to test anxiety among secondary school students. Participants were 281 students (males = 156, females = 125; mean age = 17.05, SD = 1.87) who were candidates for centralised, high-stakes examinations in two randomly selected secondary schools in Onitsha, Anambra state, Nigeria. Data were collected using questionnaires comprising the State Self-esteem Scale, the Test Anxiety Inventory and spaces for the provision of relevant socio-demographic information. Results of a hierarchical multiple regression analysis indicated that age and gender did not significantly contribute to test anxiety. Parent's occupation explained 2% of the variance in test anxiety and self-esteem contributed 10% in explaining test anxiety. Based on the findings, personal predispositions explain test anxiety among school students more than do their demographics.

9.
The authors investigated subgroup differences on a multiple-choice and constructed-response test of scholastic achievement in a sample of 197 African American and 258 White test takers. Although both groups had lower mean scores on the constructed-response test, the results showed a 39% reduction in subgroup differences compared with the multiple-choice test. The results demonstrate that the lower subgroup differences were explained by more favorable test perceptions for African Americans on the constructed-response test. In addition, the two test formats displayed comparable levels of criterion-related validity. The results suggest that the constructed-response test format may be a viable alternative to the traditional multiple-choice test format in efforts to simultaneously use valid predictors of performance and minimize subgroup differences in high-stakes testing.

10.
The use of response cards during whole‐class English vocabulary instruction was evaluated. Five low‐participating students were observed during hand‐raising conditions and response‐card conditions to observe the effects of response cards on student responding and test scores and teacher questions and feedback. Responding and test scores were higher for all targeted students in the response‐card condition. The teacher asked a similar number of questions in both conditions; however, she provided more feedback in the response‐card condition.

11.
SUMMARY

Although it is important to evaluate the intended outcomes of high-stakes testing, it is also important to evaluate the unintended outcomes, which might be as important as or more important than the intended outcomes. The purpose of this paper is to examine some of the unintended outcomes of high-stakes testing, including those related to: (a) using tests as a means to hold educators accountable, (b) the effects on instruction, (c) the effects on student and teacher motivation, and (d) the effects on students who are at risk of school failure. In examining the evidence, I conclude that while some unintended outcomes of high-stakes testing have been positive, many of the unintended outcomes have been negative. Hopefully, through a greater awareness of the unintended outcomes, school psychologists can work to minimize the negative effects of testing on students and educators.

12.
This study examined the relationship between subjects' actual test-derived scores and their estimates of what those scores would be. Fifty-six subjects completed three questionnaires (the Morningness-Eveningness Questionnaire, the FIRO-B, and the Myers-Briggs Type Indicator, MBTI), and then estimated the scores on each dimension (15 in all) for themselves and another person that they knew well. The results showed significant positive correlations on 10 of the 15 dimensions for themselves. The dimensions that they were best at estimating were Morningness-Eveningness; Extraversion and Thinking on the MBTI; and Wanted and Expressed Inclusion on the FIRO-B. Eight correlations reached significance concerning their ability to predict another known person's scores but were lower than for their own estimate-actual score correlations. Whereas subjects believed that they were like the other person they nominated (12 of the 15 correlations were significantly positive), in actual fact their test-derived scores showed only five significant findings, two positive and the others negative. The results are discussed in terms of lay theories of personality and their relationship to personality assessment.

13.
A contingency contracting program designed to increase study rate and subsequent test performance was implemented with a group of undergraduate psychology students. The function of the contingency contracting program in producing increased study rate was evaluated by individual experiments with each student in an experimental contracting group. The overall effect of the program on test performance was assessed by comparing the final scores for the course earned by the experimental group with those earned by two matched control groups. A reversal procedure established that contingency contracting did significantly increase the study rate of students of a wide range of ability. However, it was selectively effective in improving the test performance of below-average students only. Study rate gains in contracted courses did not generalize to noncontracted courses. Self-recording of study time in the absence of scheduled differential consequences did not improve test performance. Study rate under no-consequence conditions varied with test schedule. For both consequence and no-consequence groups, the correlation between study time and final score for the course was only moderate.

14.
Background. Despite a large body of international literature concerning the antecedents and correlates of, and treatments for, test anxiety, there has been little research until recently using samples of students drawn from the UK. There is a need to establish some basic normative data for test anxiety scores in this population of students, in order to establish whether international research findings may generalize to UK schoolchildren. Aim. To collect some exploratory data regarding test anxiety scores in a sample of UK schoolchildren, along with socio‐demographic variables identified in the existing literature as theoretically significant sources of individual and group differences in test anxiety scores. Sample. Key Stage 4 students (1348): 690 students in the Year 10 cohort and 658 students in the Year 11 cohort, drawn from seven secondary schools in the North of the UK. Method. Data on test anxiety were collected using a self‐report questionnaire, the Test Anxiety Inventory (Spielberger, 1980), and additional demographic variables through the Student Profile Questionnaire. The factor structure of the Test Anxiety Inventory was explored using principal components analysis, and multiple regression analysis was used to predict variance in self‐reported test anxiety scores from individual and group variables. Results. The principal components analysis extracted two factors, worry and emotionality, in line with theoretical predictions. Gender, ethnic and socio‐economic background were identified as significant predictors of variance in test anxiety scores in this dataset. Whether English was an additional, or native, language of students did not predict variance in test anxiety scores and year group was identified as a predictor of emotionality scores only. Conclusion. Variance in the test anxiety scores of Key Stage 4 students can be predicted from a number of socio‐demographic variables. Further research is now required to assess the implications for assessment performance, examination arrangements and appropriateness of using a North American measure of test anxiety in a UK context.

15.
On the consistency of individual classification using short scales (cited 2 times in total: 0 self-citations, 2 citations by others)
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level, proportions of correct classifications were computed for varying test length, cut-scores, item scoring, and choices of item parameters. Short tests were found to classify at most 50% of a group consistently. Results were much better for tests containing 20 or 40 items. Small differences were found between dichotomous and polytomous (5 ordered scores) items. It is recommended that short tests for high-stakes decision making be used in combination with other information so as to increase reliability and classification consistency.

16.
Maintaining a stable score scale over time is critical for all standardized educational assessments. Traditional quality control tools and approaches for assessing scale drift either require special equating designs, or may be too time-consuming to be considered on a regular basis with an operational test that has a short time window between an administration and its score reporting. Thus, the traditional methods are not sufficient to catch unusual testing outcomes in a timely manner. This paper presents a new approach for score monitoring and assessment of scale drift. It involves quality control charts, model-based approaches, and time series techniques to accommodate the following needs of monitoring scale scores: continuous monitoring, adjustment of customary variations, identification of abrupt shifts, and assessment of autocorrelation. Performance of the methodologies is evaluated using manipulated data based on real responses from 71 administrations of a large-scale high-stakes language assessment.
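To make the idea of a quality control chart for score monitoring concrete, here is a minimal sketch of a generic Shewhart-style check: control limits are set at the baseline mean plus or minus `k` standard deviations, and later administrations falling outside them are flagged as possible abrupt shifts. This is a textbook technique swapped in for illustration, not the method proposed in the paper; the function name `flag_out_of_control`, the parameter names, and the mean scores are all made up.

```python
from statistics import mean, stdev


def flag_out_of_control(scores, baseline_n, k=3.0):
    """Return indices of post-baseline scores outside mean +/- k * SD of the baseline."""
    baseline = scores[:baseline_n]
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline window
    lower, upper = center - k * spread, center + k * spread
    return [
        i
        for i, score in enumerate(scores)
        if i >= baseline_n and not (lower <= score <= upper)
    ]


# Hypothetical mean scale scores for eight administrations (not real data).
mean_scores = [500.0, 502.0, 498.0, 501.0, 499.0, 500.5, 512.0, 499.5]
flagged = flag_out_of_control(mean_scores, baseline_n=6)
print(flagged)
```

A plain check like this ignores seasonal variation and autocorrelation, two of the monitoring needs the abstract lists, so it would serve only as a first screen.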

17.
In high-stakes testing, often multiple test forms are used and a common time limit is enforced. Test fairness requires that ability estimates must not depend on the administration of a specific test form. Such a requirement may be violated if speededness differs between test forms. The impact of not taking speed sensitivity into account on the comparability of test forms regarding speededness and ability estimation was investigated. The lognormal measurement model for response times by van der Linden was compared with its extension by Klein Entink, van der Linden, and Fox, which includes a speed sensitivity parameter. An empirical data example was used to show that the extended model can fit the data better than the model without speed sensitivity parameters. A simulation was conducted, which showed that test forms with different average speed sensitivity yielded substantially different ability estimates for slow test takers, especially for test takers with high ability. Therefore, the use of the extended lognormal model for response times is recommended for the calibration of item pools in high-stakes testing situations. Limitations to the proposed approach and further research questions are discussed.

18.
A significant body of research has demonstrated that IQs obtained from different intelligence tests substantially correlate at the group level. Yet, there is minimal research investigating whether different intelligence tests yield comparable results for individuals. Examining this issue is paramount given that high-stakes decisions are based on individual test results. Consequently, we investigated whether seven current and widely used intelligence tests yielded comparable results for individuals between the ages of 4–20 years. Results mostly indicated substantial correlations between tests, although several significant mean differences at the group level were identified. Results associated with individual-level comparability indicated that the interpretation of exact IQ scores cannot be empirically supported, as the 95% confidence intervals could not be reliably replicated with different intelligence tests. Similar patterns also appeared for the individual-level comparability of nonverbal and verbal intelligence factor scores. Furthermore, the nominal level of intelligence systematically predicted IQ differences between tests, with above- and below-average IQ scores associated with larger differences as compared to average IQ scores. Analyses based on continuous data confirmed that differences appeared to increase toward the above-average IQ score range. These findings are critical as these are the ranges in which diagnostic questions most often arise in practice. Implications for test interpretation and test construction are discussed.

19.
Studied the reliability of the Washington University Sentence Completion Test by giving 51 9th graders and 26 college students the test twice, a week apart. For 9th graders the design included a test-retest group and two groups given half of the test at each session. Although test-retest correlations were high for the 9th graders, retest scores dropped significantly. With college students (a) test-retest correlations, though positive and significant, were lower, (b) retest scores did not change systematically, and (c) percentage agreement between test and retest scores was high. Discrepant results were related to motivational set and variance in test scores. Split-half correlations and internal consistency coefficients were high. Likelihood of lower retest scores makes problematic the use of this test for short-term pretest-posttest studies seeking to stimulate ego development.

20.
The present study was designed to replicate McCrae and Costa's research findings on the relation of NEO-4 domains with the Myers-Briggs Type Indicator scales in a Polish sample of 300 psychology student volunteers (175 women, 125 men). Their mean age was 22.3 yr. (SD = 4.5). Correlations for scores on the MBTI scales with NEO-4 domains ranged from .72 to .02 for Extraversion, from -.60 to -.16 for Openness to experience, from -.56 to -.04 for Agreeableness, and from .55 to -.07 for Conscientiousness. Two domains assessed with the NEO-4 correspond to preferences measured by the Myers-Briggs Type Indicator.
