Similar literature
 20 similar articles found (search time: 31 ms)
1.
Methods for the identification of differential item functioning (DIF) in Rasch models are typically restricted to the case of two subgroups. A boosting algorithm is proposed that is able to handle the more general setting where DIF can be induced by several covariates at the same time. The covariates can be both continuous and (multi‐)categorical, and interactions between covariates can also be considered. The method works for a general parametric model for DIF in Rasch models. Since the boosting algorithm selects variables automatically, it is able to detect the items which induce DIF. It is demonstrated that boosting competes well with traditional methods in the case of subgroups. The method is illustrated by an extensive simulation study and an application to real data.
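
The abstract above does not spell out its parametric DIF model, so the following is only a minimal sketch under a stated assumption: item difficulty shifts linearly with person covariates, and an item is DIF-free when all of its shift coefficients are zero. The function names `rasch_prob` and `dif_rasch_prob` and the parameter `gamma` are hypothetical and are not taken from the paper.

```python
import math


def rasch_prob(theta, beta):
    """Probability of a correct response under the Rasch model.

    theta: person ability, beta: item difficulty (both on the logit scale).
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def dif_rasch_prob(theta, beta, covariates, gamma):
    """Rasch probability with a covariate-dependent difficulty shift.

    ASSUMPTION (not from the abstract): the shift is linear in the
    covariates, i.e. effective difficulty = beta + sum(x_j * gamma_j).
    If every gamma_j is zero, the item shows no DIF.
    """
    shift = sum(x * g for x, g in zip(covariates, gamma))
    return rasch_prob(theta, beta + shift)
```

Under this assumed form, detecting DIF amounts to deciding which items have a nonzero `gamma`, which is the kind of selection problem a variable-selecting procedure such as boosting addresses.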

2.
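Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and eliminate rater effects and thereby improve the reliability of subjective scoring. This paper introduces the theory and application framework of the many-facet Rasch model, reviews typical applications abroad, and discusses the conditions under which the model can be applied.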

3.
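An experimental Rasch analysis of scoring in the HSK subjective tests   Cited by: 1 (self-citations: 0, citations by others: 1)
Inconsistency in subjective scoring lowers its reliability. The many-facet Rasch model, which is based on item response theory, can be used to identify and eliminate rater effects and thereby improve the reliability of subjective scoring. This paper introduces the theory and application framework of the many-facet Rasch model, designs a quality-control framework based on the model for scoring the HSK subjective tests, and validates it experimentally on HSK essay scoring data.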

4.
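A many-facet Rasch model (MFRM) was used to analyze rater bias among 28 raters in the evaluation of educational quality in childcare and preschool institutions. The results showed that the 28 raters differed significantly in severity; 3 raters showed poor internal consistency, while the other 25 were fairly stable; the rater-by-class interaction was not significant, whereas the rater-by-item interaction was significant. The findings indicate that MFRM can support a specific, individual-level analysis of rater bias in evaluating the educational quality of such institutions and, from an item response theory perspective, provides a basis in modern educational and psychological measurement for targeted rater training and for assessing rater qualification in order to build a pool of qualified raters.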

5.
With the purpose of increasing the knowledge of the psychometric properties of the 70-item Danish Word Association Test, data from three samples of non-patients and psychiatric patients (N = 326) were used to provide two measures of affectivity of the stimulus words, response heterogeneity and reaction time prolongation. It was possible to fit an item response theory one-parameter measurement (Rasch) model to the number of reaction time prolongations (≥ 3 seconds) for 54 of the stimulus words. Correlation between Rasch-model item parameters and response heterogeneity was high (r = 0.86), while no correlation was found between either of these measures and frequency of the stimulus words in the Danish language. Both measures of stimulus affectivity supported a theoretically based classification of stimulus words as emotional or neutral. Response heterogeneity measures and Rasch measurement item and person parameters for reaction time prolongations are provided.

6.
A model-based modification (SIBTEST) of the standardization index based upon a multidimensional IRT bias modeling approach is presented that detects and estimates DIF or item bias simultaneously for several items. A distinction between DIF and bias is proposed. SIBTEST detects bias/DIF without the usual Type 1 error inflation due to group target ability differences. In simulations, SIBTEST performs comparably to Mantel-Haenszel for the one item case. SIBTEST investigates bias/DIF for several items at the test score level (multiple item DIF called differential test functioning: DTF), thereby allowing the study of test bias/DIF, in particular bias/DIF amplification or cancellation and the cognitive bases for bias/DIF. This research was partially supported by Office of Naval Research Cognitive and Neural Sciences Grant N0014-90-J-1940, 4421-548 and National Science Foundation Mathematics Grant NSF-DMS-91-01436. The research reported here is collaborative in every respect and the order of authorship is alphabetical. The assistance of Hsin-hung Li and Louis Roussos in conducting the simulation studies was of great help. Discussions with Terry Ackerman, Paul Holland, and Louis Roussos were very helpful.

7.
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood estimation methods (conditional, marginal, and joint). Three information criteria fit indices (Akaike information criterion, Bayesian information criterion, and sample size adjusted BIC) were used in a simulation study and an empirical study. Findings of this study showed that the spurious latent class problem was observed with marginal maximum likelihood and joint maximum likelihood estimations. However, conditional maximum likelihood estimation showed no over-extraction problem with non-normal ability distributions.
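
The three fit indices named in the abstract above have standard textbook definitions, sketched below. How the study counted parameters or defined sample size is not stated in the abstract, so `n_params` and `n_obs` are left to the caller, and the helper names `information_criteria` and `preferred_class_count` are hypothetical.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC and sample-size adjusted BIC for one fitted model.

    Standard definitions, with deviance = -2 * log-likelihood:
      AIC   = deviance + 2 * k
      BIC   = deviance + k * ln(n)
      SABIC = deviance + k * ln((n + 2) / 24)
    """
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "SABIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


def preferred_class_count(fits, n_obs, criterion):
    """Pick the number of latent classes with the smallest criterion value.

    fits maps a class count to (log_likelihood, n_params); these would come
    from whatever software fitted the mixed Rasch models.
    """
    return min(
        fits,
        key=lambda c: information_criteria(fits[c][0], fits[c][1], n_obs)[criterion],
    )
```

Because the penalty terms differ, the three indices can favor different class counts for the same set of fitted models, which is why studies of over-extraction typically report all of them.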

8.
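Cao Yiwei  Mao Chengmei 《Acta Psychologica Sinica》2008,40(4):427-435
An adjustment survey was administered to 1,952 college freshmen, 285 of whom took part in two or more follow-up surveys. The resulting polytomous repeated-measures data were analyzed with a longitudinal Rasch model. The study made a new attempt at estimating the parameters of a multilevel Rasch model with the GLIMMIX procedure in SAS. The results showed that: (1) during the first academic year, freshmen's academic and emotional adjustment generally increased, while interpersonal adjustment declined; (2) individuals differed significantly in adjustment status at enrollment, but the direction and rate of change over time were the same; (3) the items of the academic adjustment subscale were fairly stable, whereas the difficulty of some items in the interpersonal and emotional adjustment subscales showed time effects. The findings have implications for freshman counseling.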

9.
Research has demonstrated that individual differences in numeracy may have important consequences for decision making. In the present paper, we develop a shorter, psychometrically improved measure of numeracy: the ability to understand, manipulate, and use numerical information, including probabilities. Across two large independent samples that varied widely in age and educational level, participants completed 18 items from existing numeracy measures. In Study 1, we conducted a Rasch analysis on the item pool and created an eight‐item numeracy scale that assesses a broader range of difficulty than previous scales. In Study 2, we replicated this eight‐item scale in a separate Rasch analysis using data from an independent sample. We also found that the new Rasch‐based numeracy scale, compared with previous measures, could predict decision‐making preferences obtained in past studies, supporting its predictive validity. In Study 3, we further established the predictive validity of the Rasch‐based numeracy scale. Specifically, we examined the associations between numeracy and risk judgments, compared with previous scales. Overall, we found that the Rasch‐based scale was a better linear predictor of risk judgments than prior measures. Moreover, this study is the first to present the psychometric properties of several popular numeracy measures across a diverse sample of ages and educational levels. We discuss the usefulness and the advantages of the new scale, which we feel can be used in a wide range of subject populations, allowing for a clearer understanding of how numeracy is associated with decision processes. Copyright © 2012 John Wiley & Sons, Ltd.

10.
In the present paper a model for describing dynamic processes is constructed by combining the common Rasch model with the concept of structurally incomplete designs. This is accomplished by mapping each item on a collection of virtual items, one of which is assumed to be presented to the respondent dependent on the preceding responses and/or the feedback obtained. It is shown that, in the case of subject control, no unique conditional maximum likelihood (CML) estimates exist, whereas marginal maximum likelihood (MML) proves a suitable estimation procedure. A hierarchical family of dynamic models is presented, and it is shown how to test special cases against more general ones. Furthermore, it is shown that the model presented is a generalization of a class of mathematical learning models, known as Luce's beta-model.

11.
The justice perspective is the current dominant framework for research on applicant perceptions of test fairness. Recently, an emerging perspective suggests that self-serving bias mechanisms may be operative in the development of test fairness perceptions. Using data from 494 actual applicants to an entry-level State Police Trooper position, this study integrates both the justice and self-serving bias perspectives to achieve a better understanding of test fairness perceptions. Results from structural equation modeling show that perceived job-relevance affects perceived fairness. In addition, test performance affects both perceptions indirectly through perceived performance.

12.
Up to the current time, psychology has provided very limited space for qualitative research to contribute to the discipline, even though psychology has a lot to do with subjectivity and intersubjectivity in its work. This article discusses autoethnography, which is still debated by some qualitative researchers. The author argues that autoethnography contains strong narrative components or analysis with the potential to provide new understanding and to build knowledge. The article discusses criticisms against autoethnography, followed by its distinctive characteristics, which at the same time bring significant potential power. Clear guidelines and steps are needed to minimize biases and to bring about the potential power of autoethnography, and this article aims to address the issues through discussions on intersubjectivity, reflexivity, and ethics. At the end, it might be concluded that autoethnography is a method to investigate not merely the researchers, but to reveal certain phenomena and issues. Autoethnography is one good alternative among other methods that can contribute to developing understanding and knowledge through the construction of substantive theories about a particular issue.

13.
The concept of transliminality ("a hypothesized tendency for psychological material to cross thresholds into or out of consciousness") was anticipated by William James (1902/1982), but it was only recently given an empirical definition by Thalbourne in terms of a 29-item Transliminality Scale. This article presents the 17-item Revised Transliminality Scale (or RTS) that corrects age and gender biases, is unidimensional by a Rasch criterion, and has a reliability of .82. The scale defines a probabilistic hierarchy of items that address magical ideation, mystical experience, absorption, hyperaesthesia, manic experience, dream interpretation, and fantasy proneness. These findings validate the suggestions by James and Thalbourne that some mental phenomena share a common underlying dimension with selected sensory experiences (such as being overwhelmed by smells, bright lights, sights, and sounds). Low scores on transliminality remain correlated with "tough mindedness" on the Cattell 16PF test, as well as "self-control" and "rule consciousness," whereas high scores are associated with "abstractedness" and an "openness to change" on that test. An independent validation study confirmed the predictions implied by our definition of transliminality. Implications for test construction are discussed.

14.
Anagrams are frequently used by experimental psychologists interested in how the mental lexicon is organized. Until very recently, research has overlooked the importance of syllable structure in solving anagrams and assumed that solution difficulty was mainly due to frequency factors (e.g., bigram statistics). The present study uses Rasch analysis to demonstrate that the number of syllables is a very important factor influencing anagram solution difficulty for both good and poor problem solvers, with polysyllabic words being harder to solve. Furthermore, it suggests that syllable frequency may have an impact on solution times for polysyllabic words, with more frequent syllables being more difficult to solve. The study illustrates the advantages of Rasch analysis for reliable and unidimensional measurement of item difficulty.

15.
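Using happy, angry, and neutral face pictures as stimuli and a spatial cueing task, this study employed event-related potentials (ERPs) to examine the underlying mechanism and physiological basis of attentional bias in individuals with low self-esteem; that is, to determine from an electrophysiological perspective whether the bias reflects rapid attentional orienting, difficulty in attentional disengagement, or both. Behavioral data showed that both high and low self-esteem individuals responded significantly faster under valid cues than under invalid cues. ERP data showed that, under invalid cues, targets following angry faces elicited larger P1 and smaller N1 amplitudes than targets following happy or neutral faces in low self-esteem individuals, with no significant differences under valid cues; high self-esteem individuals showed no significant effects on N1 or P1 amplitudes. For the late P300 component, invalid cues elicited more positive amplitudes than valid cues, and no significant self-esteem-related differences were found. The results indicate that the attentional bias of low self-esteem individuals toward evaluative threat information (anger) reflects difficulty in disengaging attention from that information.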

16.
Kohlberg's characterization of moral development as displaying an invariant hierarchical order of structurally consistent stages is losing ground. However, by applying Rasch analysis, Dawson recently gave new interpretation and support to his characterization of stage development. Using Rasch models, we replicated and strengthened her findings in a re-analysis of three sets of longitudinal socio-moral reasoning data collected in Iceland. A new application of Rasch analysis provided support for upward development. Our results supported Kohlberg's characterization of stage development and the cross-cultural stability of Dawson's findings that were exclusively based on US samples. We conclude that proposals to replace Kohlberg's characterization of moral development are premature.

17.
We give an account of Classical Test Theory (CTT) in terms of the more fundamental ideas of Item Response Theory (IRT). This approach views classical test theory as a very general version of IRT, and the commonly used IRT models as detailed elaborations of CTT for special purposes. We then use this approach to CTT to derive some general results regarding the prediction of the true-score of a test from an observed score on that test as well as from an observed score on a different test. This leads us to a new view of linking tests that were not developed to be linked to each other. In addition we propose true-score prediction analogues of the Dorans and Holland measures of the population sensitivity of test linking functions. We illustrate the accuracy of the first-order theory using simulated data from the Rasch model, and illustrate the effect of population differences using a set of real data. This research is collaborative in every respect and the order of authorship is alphabetical. It was begun when both authors were on the faculty of the Graduate School of Education at the University of California, Berkeley. We would like to thank Neil Dorans, Skip Livingston, and two anonymous referees for many suggestions that have greatly improved this paper.

18.
Huynh Huynh 《Psychometrika》1994,59(1):111-119
Given a Masters partial credit item with n known step difficulties, conditions are stated for the existence of a set of (locally) independent Rasch binary items such that their raw score and the partial credit raw score have identical probability density functions. The conditions are those for the existence of n positive values with predetermined elementary symmetric functions and include the requirement that the n step difficulties form an increasing sequence.
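
The link to elementary symmetric functions in the abstract above can be illustrated in the easier, converse direction: starting from independent Rasch binary items, compute the step difficulties of the partial credit item that has the same raw-score distribution. This is a sketch under the usual parameterizations (Rasch logit `theta - b`; partial credit weight `exp(k * theta - sum of the first k steps)`), not the paper's own derivation, and all function names are hypothetical.

```python
import math


def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_n], the elementary symmetric functions of values."""
    e = [1.0] + [0.0] * len(values)
    for v in values:
        # Update from high order to low so each value is used at most once per term.
        for k in range(len(values), 0, -1):
            e[k] += v * e[k - 1]
    return e


def steps_from_binary_items(difficulties):
    """Step difficulties delta_k = ln(e_{k-1} / e_k), with e_k taken over exp(-b_i)."""
    e = elementary_symmetric([math.exp(-b) for b in difficulties])
    return [math.log(e[k - 1] / e[k]) for k in range(1, len(e))]


def pcm_score_probs(theta, steps):
    """Score distribution of a partial credit item with the given step difficulties."""
    weights, cumulative = [1.0], 0.0
    for k, delta in enumerate(steps, start=1):
        cumulative += delta
        weights.append(math.exp(k * theta - cumulative))
    total = sum(weights)
    return [w / total for w in weights]


def binary_sum_probs(theta, difficulties):
    """Raw-score distribution of independent Rasch binary items (by convolution)."""
    probs = [1.0]
    for b in difficulties:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        new = [0.0] * (len(probs) + 1)
        for score, q in enumerate(probs):
            new[score] += q * (1.0 - p)
            new[score + 1] += q * p
        probs = new
    return probs
```

For any ability value, `pcm_score_probs(theta, steps_from_binary_items(b))` and `binary_sum_probs(theta, b)` agree up to rounding. The step difficulties produced this way come out in increasing order, consistent with the requirement stated in the abstract; the existence question the paper addresses is the reverse one, namely which step difficulties can arise from such positive values at all.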

19.
《Behavior Therapy》2022,53(5):843-857
Clinical perfectionism contributes to the onset and maintenance of multiple psychological concerns. We conducted a randomized, longitudinal test of the efficacy of a web-based intervention for perfectionism (specifically, cognitive bias modification, interpretation retraining; CBM-I), compared to an active treatment comparison condition (specifically, guided visualization relaxation training) for reducing perfectionism and related psychopathology. College students (N = 167) with elevated perfectionism were randomized to one of the two study conditions and were asked to complete their assigned intervention twice weekly for 4 weeks. Participants completed measures of perfectionism and psychological symptoms at baseline, 2 weeks (midway through the intervention period), 4 weeks (at the conclusion of the intervention period), and 8 weeks (1 month follow-up). CBM-I was rated as acceptable overall, though relaxation training was rated slightly more favorably. CBM-I outperformed relaxation training on improving perfectionism-relevant interpretation biases (i.e., increasing nonperfectionistic interpretations and decreasing perfectionistic interpretations), though with small effect sizes and inconsistency across study timepoints. Self-reported perfectionism showed small decreases across time in both intervention conditions. Support was found for a key hypothesized mechanism of CBM-I, such that randomization to CBM-I had a longitudinal, indirect effect on decreasing psychopathology symptom scores through improving perfectionism-relevant interpretation biases. However, in light of small effect sizes, the present study failed to provide compelling evidence that CBM-I for perfectionism contributes meaningfully to the treatment of perfectionism.

20.
Depressed mood is associated with making negatively biased interpretations of ambiguous everyday events. Experimental modification towards a more optimistic interpretation has become a focus of recent research. However, to date, no measures exist that have been tested with respect to their psychometric properties that justify repeated administration to capture change. We aimed to develop and evaluate a pragmatic assessment instrument, consisting of a 30-item questionnaire (long version) and two 15-item parallel short versions (A and B). Items were generated as ambiguous sentences, reflecting three relevant content areas based on Beck's cognitive triad. The sentences were rated for pleasantness, and this emotional appraisal task indicates the emotional valence of the interpretation. Due to the intention to develop a parallel test version, item-twins were generated. All three versions of the instrument were found to be structurally stable, internally consistent and valid. In line with Beck's cognitive triad in depression, confirmatory factor analyses determined a three-factor solution (i.e. self, experiences and future). Significant correlations were found between all scales and depressive mood. The two short versions represent the same underlying constructs, share identical psychometric properties and possess high parallel-test reliability. This study is the first to evaluate and confirm the factorial validity as well as the parallel-test reliability of an interpretation bias measure. It is suitable to measure bias modification and has therefore great potential for research and clinical practice.
