Similar literature
1.
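A many-facet Rasch model (MFRM) was used to analyze rater bias among 28 raters in the evaluation of educational quality in childcare and kindergarten institutions. The results showed that the 28 raters differed significantly in severity; 3 raters showed poor internal consistency, while the other 25 were relatively stable; the rater-by-class interaction was not significant, whereas the rater-by-item interaction was significant. The findings indicate that MFRM supports specific, individual-level analysis of rater bias in evaluating the educational quality of childcare and kindergarten institutions. From the perspective of item response theory, it provides a modern educational and psychometric basis for targeted rater training and for assessing rater qualification in order to build a pool of qualified raters.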

2.
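Using the many-facet Rasch model (MFRM) from item response theory (IRT), this study analyzed assessment results from leaderless group discussion (LGD), an assessment center technique. It examined examinee ability, rater severity, internal consistency of ratings, dimension difficulty, and rating categories, and then discussed various biases. Analyzing personnel assessment results with MFRM makes it possible to understand the true differences in examinee ability, identify dimension difficulty, and detect sources of measurement error, thereby improving the construction of assessment tasks, evaluating or diagnosing rater qualification, and improving the match between assessment dimensions and assessment purposes. This offers a distinctive perspective for extending the application of item response theory in personnel assessment.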

3.
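Guan Dandan (关丹丹). 《心理学探新》 (Psychological Exploration), 2014, 34(5): 437-440
To evaluate and improve writing scoring in the general ability test of the master's degree entrance examination, the researcher used generalizability theory and many-facet Rasch analysis to examine sources of scoring error and scoring reliability for writing samples from 113 examinees. The generalizability study showed that raters and tasks had little effect on scoring accuracy; for a test design with two writing tasks, 2 raters suffice to keep scoring reliability above 0.75. The many-facet Rasch analysis showed that the estimates of rater severity and their errors were within acceptable ranges, that raters did not differ significantly in severity, and that individual raters were on the whole stable in their scoring. However, a few raters showed particular bias toward specific examinees on specific tasks. Generalizability theory and many-facet Rasch analysis enriched the quantitative indicators available for research on writing scoring and confirmed that writing scoring in the general ability test of the master's degree entrance examination has relatively high reliability.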

4.
We present a method for studying experimental data based on a psychometric model, the “Rasch model” (Rasch, 1966; Thissen & Steinberg, 1986). We illustrate the method with the use of a data set in the field of concept research. More specifically, we investigate whether a conjunctive concept can be seen as an additive combination of its constituents. High correlations between model and data are obtained, but a formal goodness-of-fit test indicates that the model does not completely account for the data. We then alter the Rasch model in such a way as to capture our idea of why the model deviates from the data. This results in higher correlations and a strong increase in goodness-of-fit. It is concluded that our ideas, as incorporated in the model, adequately summarize the data. More generally, this research illustrates that applying the Rasch model and altering it according to one’s hypotheses is an excellent way to analyze experimental data.

5.
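Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and remove rater effects and thereby improve the reliability of subjective scoring. This article introduces the theory and application framework of the many-facet Rasch model, presents typical applications from outside China, and discusses the conditions under which the model can be applied.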
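For orientation, the many-facet Rasch model is commonly written in the rating scale form below. This is the standard textbook formulation, not necessarily the exact specification used in the article summarized above; the symbols $B_n$, $D_i$, $C_j$, and $F_k$ are notation chosen here for illustration.

```latex
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Here $P_{nijk}$ is the probability that rater $j$ gives examinee $n$ category $k$ on item $i$, $B_n$ is the examinee's ability, $D_i$ the item's difficulty, $C_j$ the rater's severity, and $F_k$ the threshold between categories $k-1$ and $k$. Because severity enters as its own additive term, it can be estimated separately and adjusted for in the examinee measures.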

6.
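An experimental Rasch analysis of HSK subjective test scoring   Cited by: 1 (self-citations: 0, citations by others: 1)
Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and remove rater effects and thereby improve the reliability of subjective scoring. This article introduces the theory and application framework of the many-facet Rasch model, designs a quality-control framework for HSK subjective test scoring based on the model, and validates it experimentally using HSK essay scoring data.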

7.
In this paper, the distributional properties and power rates of the Lz, Eci2z, and Eci4z statistics when they are used as item fit statistics were explored. The results were compared to the t-transformation of the Outfit and Infit mean squares. Four sample sizes were selected: 100, 250, 500, and 1000 examinees. The abilities were uniform and normal with mean 0 and standard deviation 1, and uniform and normal with mean -1 and standard deviation 1. The pseudo-guessing parameter was fixed at .25. Two ranges of difficulty parameters were selected: +/- 1 logit and +/- 2 logits. Two test lengths were selected: 15 and 30 items. The results showed important differences between the T-infit, T-outfit, Lz, Eci2z, and Eci4z statistics. The T-outfit, T-infit, and Lz statistics showed poor standardization with estimated parameters because their distributional properties were not close to the expected values. However, the Eci2z and Eci4z statistics showed satisfactory standardization in all conditions. Further, the power rates of Eci2z and Eci4z were 5% to 10% higher than the power rates of Lz, T-outfit, and T-infit for detecting items that do not fit the Rasch model.
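As background for the Infit and Outfit statistics named above, here is a minimal sketch of the untransformed mean squares for one dichotomous item. It assumes the ability and difficulty values are already known, covers only the plain Rasch model (no pseudo-guessing), and leaves out the t-transformation studied in the paper; the function names are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_fit_mean_squares(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one item.

    responses: 0/1 scores, one per examinee
    abilities: ability estimates in logits, same order as responses
    """
    squared_residuals = []
    variances = []
    for score, ability in zip(responses, abilities):
        p = rasch_probability(ability, difficulty)
        squared_residuals.append((score - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(variances)
    # Infit: information-weighted mean square (residuals weighted by variance).
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit
```

Both statistics have an expected value near 1 when the data fit the model. Outfit reacts more strongly to unexpected responses from examinees far from the item's difficulty, while Infit gives more weight to examinees near it.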

8.
There have been two basic approaches for the study of minority group prejudice against the majority: to adapt instruments from the majority group, and to use qualitative techniques by analyzing the content of the discourse of the groups involved. Neither of these procedures solves the problem of measuring intergroup attitudes of majorities and minorities in interaction. This study shows the result of a prejudice scale which was developed to measure the attitude of both the minority and majority groups. Prejudice is conceived as an attitude which requires the beliefs or opinions about the out-group, the emotions it elicits, and the behavior or intentional behavior toward it to be known for its evaluation. The innovation in this work is that the psychometric development of the scale was based on the item response theory, and more specifically, the rating scale model.

9.
Rasch analysis is a popular statistical tool for developing and validating instruments that aim to measure human performance, attitudes and perceptions. Despite the availability of various software packages, constructing a good instrument based on Rasch analysis is still considered to be a complex, labour-intensive task, requiring human expertise and rather subjective judgements along the way. In this paper we propose a semi-automated method for Rasch analysis based on first principles that reduces the need for human input. To this end, we introduce a novel criterion, called in-plus-out-of-questionnaire log likelihood (IPOQ-LL). On artificial data sets, we confirm that optimization of IPOQ-LL leads to the desired behaviour in the case of multi-dimensional and inhomogeneous surveys. On three publicly available real-world data sets, our method leads to instruments that are, for all practical purposes, indistinguishable from those obtained by Rasch analysis experts through a manual procedure.

10.
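Yan Zi (晏子). 《心理科学进展》 (Advances in Psychological Science), 2010, 18(8): 1298-1305
The Rasch model is a latent trait model that has received wide attention and in-depth study in the international academic community. It offers a highly feasible solution to the problem of objective measurement in psychological science. Within China, however, theoretical discussion and applied research on the Rasch model remain scarce. Unlike general item response theory, the Rasch model requires the collected data to meet the model's a priori requirements, rather than using different parameters to fit the characteristics of the data. The model's main features (a common scale for persons and items, linear data, and parameter separation) ensure that objective measurement is achieved. Future research directions for the Rasch model include multidimensional Rasch models, test equating and linking, computerized adaptive testing, and large-scale applied measurement systems such as the Lexile system.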
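The common scale and parameter separation mentioned above can be illustrated with a small sketch of the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The function names and numeric values are hypothetical and chosen only for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response; ability and difficulty share one logit scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(p: float) -> float:
    """Convert a probability to log-odds (logits)."""
    return math.log(p / (1.0 - p))


def person_contrast(ability_a: float, ability_b: float, difficulty: float) -> float:
    """Log-odds difference between two persons answering the same item."""
    return (log_odds(rasch_probability(ability_a, difficulty))
            - log_odds(rasch_probability(ability_b, difficulty)))


# The contrast between two persons is the same whichever item is used,
# because the item difficulty cancels out of the difference.
contrasts = [person_contrast(1.0, -0.5, b) for b in (-2.0, 0.0, 2.0)]
```

Every entry of `contrasts` equals the ability difference of 1.5 logits, up to floating-point error. This is the sense in which person comparisons do not depend on the particular items administered.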

11.

Objective

The Coping Scale for Chinese Athletes (CSCA) was developed and validated using classical test theory in 2004 (Chung, Si, Lee, & Liu, 2004). This study aimed to validate the CSCA using multidimensional Rasch analysis with the ConQuest software programme.

Method

The sample in this study comprised 367 athletes from mainland China. A Multidimensional Rating Scale model was applied to investigate the validity of the four-dimension scale. Standard fit statistics (Infit and Outfit MNSQ) and Differential item functioning (DIF) were computed to examine the model-data fit. Test reliability and category functioning were also checked.

Results

The item difficulty and the athletes’ trait level of coping were calibrated along the same latent trait scale. Three items were removed from the scale due to misfit with the Rasch model. No DIF across gender was found for the remaining 21 items. Test reliabilities for the four subscales ranged from 0.66 to 0.76. The results also indicated that the original 5-category rating scale structure did not function well.

Conclusion

The multidimensional Rasch analysis supported the conclusion that the 21-item CSCA measures four latent traits of coping in Chinese athletes, as expected. The results also demonstrated the advantages of multidimensional Rasch analysis over both unidimensional Rasch analysis and the traditional approach in examining the quality of multidimensional scales in sport settings.

12.
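This study applied generalizability theory (GT) and the many-facet Rasch model to structured interview data and offers some recommendations. For interview data from a counselor recruitment process, GT was used to analyze, at the macro level, the overall error contributed by applicants, interviewers, and items. On that basis, the many-facet Rasch model was used to examine, at the micro level, interviewer severity, differences in applicant ability, item difficulty, and facet bias. The results showed that: 1) the GT analysis indicated that applicants accounted for a large share of the variance (90.65%), suggesting high interview reliability, and reliability was already good with 2 interviewers; 2) the many-facet Rasch model identified misfitting elements within each facet and biased elements in the interaction effects, indicating that interview error came mainly from differences in severity among interviewers and from instability in their self-consistency. Combining GT with the many-facet Rasch model to analyze interview data not only detects problem factors in each aspect of the evaluation process but also gives a better overall picture.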

13.
Empathic responding is implicated in antisocial behaviors such as bullying, sexual offending, and violent crime. Identifying children and adolescents at risk for antisocial behavior and evaluating interventions designed to address problem behaviors require valid and reliable measures. Definitional controversies and limited measurement models have hindered measurement. This study describes the development and analysis of the Children's Empathic Attitudes Questionnaire (CEAQ) using both classical and modern techniques. Rasch analyses provided probabilistic results over large item and person groups, enabling meaningful inferences from patterns of responses at the construct level. Analyses of fifth through seventh graders' responses to the final version of the CEAQ provide support for its reliability, validity, and functionality. Four meaningful item clusters were identified, each reflecting more cognitively advanced empathic attitudes. These analyses suggest that the CEAQ provides a theoretically sound, hierarchically meaningful measure of empathic attitudes that will be useful in identification and intervention with children and adolescents at risk for antisocial behavior.

14.
Background. Bullying is a problem in schools in many countries. There would be a benefit in the availability of a psychometrically sound instrument for its measurement, for use by teachers and researchers. The Olweus Bully/Victim Questionnaire has been used in a number of studies but comprehensive evidence on its validity is not available. Aims. To examine the conceptual design, construct validity and reliability of the Revised Olweus Bully/Victim Questionnaire (OBVQ) and to provide further evidence on the prevalence of different forms of bullying behaviour. Sample. All 335 pupils (160 [47.8%] girls; 175 [52.2%] boys; mean age 11.9 years [range 11.2–12.8 years]) in 21 classes of a stratified sample of 7 Greek Cypriot primary schools. Method. The OBVQ was administered to the sample. Separate scales were created comprising (a) the items of the questionnaire concerning the extent to which pupils are being victimized; and (b) those concerning the extent to which pupils express bullying behaviour. Using the Rasch model, both scales were analysed for reliability, fit to the model, meaning, and validity. Both scales were also analysed separately for each of two sample groups (i.e. boys and girls) to test their invariance. Results. Analysis of the data revealed that the instrument has satisfactory psychometric properties; namely, construct validity and reliability. The conceptual design of the instrument was also confirmed. The analysis also leads to suggestions for improving the targeting of items against student measures. Support was also provided for the relative prevalence of verbal, indirect and physical bullying. As in other countries, Cypriot boys used and experienced more bullying than girls, and boys used more physical and less indirect forms of bullying than girls. Conclusions. The OBVQ is a psychometrically sound instrument that measures two separate aspects of bullying, and whose use is supported for international studies of bullying in different countries. However, improvements to the questionnaire were also identified to provide increased usefulness to teachers tackling this significant problem facing schools in many countries.

15.
This paper proposes a structural analysis for generalized linear models when some explanatory variables are measured with error and the measurement error variance is a function of the true variables. The focus is on latent variables investigated on the basis of questionnaires and estimated using item response theory models. Latent variable estimates are then treated as observed measures of the true variables. This leads to a two-stage estimation procedure which constitutes an alternative to a joint model for the outcome variable and the responses given to the questionnaire. Simulation studies explore the effect of ignoring the true error structure and the performance of the proposed method. Two illustrative examples concern achievement data of university students. Particular attention is given to the Rasch model.

16.
Inhibition-reduction theory (L. Hasher & R. Zacks, 1988) hypothesizes that the age-related decline in working memory (WM) span is a result of a decrease in the ability to inhibit irrelevant information in WM. Using the Rasch psychometric model, this study found that later trials on 2 WM span tasks were more difficult for older adults than for younger adults, consistent with inhibition-reduction theory's hypothesis that older adults are more susceptible to the effects of proactive interference (PI). Furthermore, after accounting for differential susceptibility to the effects of PI, age-related variance in WM span was reduced by about half. These results suggest that differential susceptibility to PI may account for a substantial portion, although not all, of the age-related decline in WM span.

17.
This paper shows how to use the log-linear subroutine of SPSS to fit the Rasch model. It also shows how to fit less restrictive models obtained by relaxing specific assumptions of the Rasch model. Conditional maximum likelihood estimation was achieved by including dummy variables for the total scores as covariates in the models. This approach greatly simplifies the specification of the Rasch models. We illustrate these procedures in an analysis of four items selected from the Reiss Premarital Sexual Permissiveness Scale. We found that a modified version of the Rasch model with item dependencies fits the data significantly better than the simple Rasch model. We also found that the item difficulties are the same for men and women, but that the item dependencies are significantly greater for men. Apart from any substantive issues these results raise, the value of this exercise lies in its demonstration of how researchers can use the procedures of popular, accessible software packages to study an increasingly important set of measurement models.

18.
The current study aimed to extend the evaluation of the utility of the Social Performance Rating Scale (SPRS) [Behav. Res. Ther. 36 (1998) 995]. We examined the utility of a modified SPRS for the behavioral assessment of public-speaking anxiety among patients with social phobia (n = 49). The videotaped performance of public-speaking fearful patients in a public-speaking task was rated using four of the five SPRS ratings and was compared to global ratings by patients and observers, as well as to self-report and clinician-administered measures of social anxiety. The pattern of correlations with criterion measures of social anxiety provided evidence for the convergent and divergent validity of this modified SPRS for the behavioral assessment of public-speaking anxiety.

19.
In this paper, optimal designs will be derived for estimating the ability parameters of the Rasch model when difficulty parameters are known. It is well established that a design is locally D-optimal if the ability and difficulty coincide. But locally optimal designs require that the ability parameters to be estimated are known. To attenuate this very restrictive assumption, prior knowledge on the ability parameter may be incorporated within a Bayesian approach. Several symmetric weight distributions, e.g., uniform, normal and logistic distributions, will be considered. Furthermore, maximin efficient designs are developed where the minimal efficiency is maximized over a specified range of ability parameters.
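The statement that a design is locally D-optimal when ability and difficulty coincide can be checked numerically for a single ability parameter, where D-optimality reduces to maximizing the Fisher information $p(1-p)$ of an item. The sketch below assumes known difficulties and one examinee; the function names and candidate values are hypothetical, and it does not cover the Bayesian or maximin designs from the paper.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_information(ability: float, difficulty: float) -> float:
    """Fisher information about ability from one item: p * (1 - p)."""
    p = rasch_probability(ability, difficulty)
    return p * (1.0 - p)


def best_difficulty(ability: float, candidate_difficulties):
    """Pick the candidate difficulty that is most informative at the given ability."""
    return max(candidate_difficulties, key=lambda b: item_information(ability, b))


# Among these candidates, the item whose difficulty equals the ability is chosen.
chosen = best_difficulty(0.5, [-1.0, 0.0, 0.5, 1.0, 2.0])
```

Information peaks at 0.25 when the difficulty equals the ability and falls off symmetrically on either side. In practice the ability is unknown before testing, which is why the abstract turns to prior distributions and maximin efficiency.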

20.
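Cao Yiwei (曹亦薇), Mao Chengmei (毛成美). 《心理学报》 (Acta Psychologica Sinica), 2008, 40(4): 427-435
An adjustment survey was administered to 1,952 first-year university students, 285 of whom completed two or more follow-up surveys; the resulting polytomous repeated-measures data were analyzed with a longitudinal Rasch model. The study made a new attempt at estimating the parameters of a multilevel Rasch model using the GLIMMIX procedure in SAS. The results showed that: (1) during the first academic year, academic and emotional adjustment showed an overall upward trend, while interpersonal adjustment showed a downward trend; (2) individuals differed significantly in adjustment at enrollment, but the direction and rate of change over time were the same; (3) items on the academic adjustment subscale were relatively stable, whereas the difficulty of some interpersonal and emotional adjustment items showed a time effect. The findings have implications for counseling first-year students.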
