Similar literature
 Found 20 similar articles; search time 109 ms
1.
It is shown that in the context of the Model with Internal Restrictions on the Item Difficulties (MIRID), different componential theories about an item set may lead to equivalent models. Furthermore, we provide conditions for the identifiability of the MIRID model parameters, and it will be shown how the MIRID model relates to the Linear Logistic Test Model (LLTM). While it is known that the LLTM is a special case of the MIRID, we show that it is possible to construct an LLTM that encompasses the MIRID. The MIRID model places a bilinear restriction on the item parameters of the Rasch model. It is explained how this fact is used to simplify the results of Bechger, Verhelst, and Verstralen (2001) and Bechger, Verstralen, and Verhelst (2002), and extend their scope to a wider class of models.
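The two kinds of restriction named in this abstract can be illustrated with a minimal sketch. It assumes the usual textbook forms: in the LLTM each item difficulty is a linear combination of basic parameters with known weights, and in the MIRID the difficulty of a composite item is a weighted sum of the estimated difficulties of its component items plus an intercept (hence "bilinear"). The function names and all numeric values are hypothetical and are not taken from the paper.

```python
def lltm_difficulties(q_matrix, eta):
    """LLTM restriction: difficulty_i = sum_j q_ij * eta_j, with known weights q_ij."""
    return [sum(q * e for q, e in zip(row, eta)) for row in q_matrix]


def mirid_composite_difficulty(component_difficulties, sigma, tau):
    """MIRID restriction: composite difficulty = sum_k sigma_k * beta_k + tau.

    Both the weights sigma_k and the component difficulties beta_k are
    parameters, so the restriction is bilinear rather than linear.
    """
    weighted = sum(s * b for s, b in zip(sigma, component_difficulties))
    return weighted + tau


# Hypothetical illustration: three items explained by two basic parameters.
q_matrix = [[1, 0], [0, 1], [1, 1]]
eta = [0.5, -0.25]
betas = lltm_difficulties(q_matrix, eta)

# Hypothetical item family with two component items and one composite item.
composite = mirid_composite_difficulty([0.5, -0.25], sigma=[2.0, 1.0], tau=0.25)
```

In the LLTM sketch only `eta` would be estimated, while in the MIRID sketch `sigma`, `tau`, and the component difficulties would all be estimated from the data.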

2.
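Yu Jiayuan (余嘉元). Acta Psychologica Sinica (《心理学报》), 1994, 27(2): 219-224
To explore the conditions under which the linear logistic test model (LLTM) fits and how that fit relates to the homogeneity of solution strategies, participants were asked to compare the magnitudes of two powers with negative integer exponents. The data from the full sample fit neither the Rasch model nor the LLTM. After participants were grouped by solution strategy, the data within each strategy group fit the Rasch model. For the LLTM, however, some items within a strategy group fit well while others fit poorly. These results indicate that homogeneity of solution strategy is a necessary but not a sufficient condition for LLTM fit.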

3.
Visual perceptual skills of school-age children are often assessed using the Supplemental Developmental Test of Visual Perception of the Developmental Test of Visual-Motor Integration. The study purpose was to consider the construct validity of this test by evaluating its scalability (interval level measurement), unidimensionality, differential item functioning, and hierarchical ordering of its items. Visual perceptual performance scores from a sample of 356 typically developing children (171 boys and 185 girls ages 5 to 11 years) were used to complete a Rasch analysis of the test. Seven items were discarded for poor fit, while none of the items exhibited differential item functioning by sex. The construct validity, scalability, hierarchical ordering, and lack of differential item functioning requirements were met by the final test version. Since 7 test items did not fit the Rasch analysis specifications, the clinical value of the test is questionable and limited.

4.
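Whereas multi-parameter, multidimensional IRT models improve fit and explanatory power by adding parameters, the Rasch school emphasizes "theory-driven research" and "data fitting the model". It holds that a one-parameter, unidimensional measurement model minimizes the influence of extraneous factors on the intended measurement and thereby safeguards objectivity and accuracy. The Rasch model focuses on the correspondence between the measurement target and the measurement instrument. Its "simplicity" helps researchers evaluate and interpret the fit between target and instrument more accurately, and it has a natural advantage when converting nonlinear data into interval-level data.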

5.
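Development of a vocabulary comprehension ability scale for junior middle school students   Cited by: 4 (self-citations: 2, other citations: 2)
Cao Yiwei (曹亦薇). Acta Psychologica Sinica (《心理学报》), 1999, 32(2): 215-221
Item response theory was applied to construct vocabulary comprehension tests for each grade of junior middle school, comprising 143 multiple-choice vocabulary items in total. After repeated pretesting and a large-scale formal administration, the scales of the three tests were confirmed to fit the 2PL model: items with well-fitting item characteristic curves made up more than 90% of all items, and the unidimensionality of ability was also confirmed. After equating, the mean discrimination for each grade was 0.61 (first year), 0.59 (second year), and 0.55 (third year), and the mean difficulty was -1.61, -1.30, and -0.56, respectively.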

6.
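The Rosenberg Self-Esteem Scale (RSES) was administered to 425 university students, and the Rasch model from item response theory was used to analyze item statistics and to test for DIF. The results show that the RSES is unidimensional, with a reliability of 0.84. Apart from item 8, all items showed good fit statistics, and the scale is better suited to distinguishing individuals with moderate to low self-esteem. Tests of differential item functioning found DIF on items 1 and 5, with male students showing higher self-esteem than female students. Compared with classical test theory, analyzing the RSES with the Rasch model has advantages and provides a basis for further refining and using the scale.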

7.
A general latent trait model for response processes   Cited by: 1 (self-citations: 0, other citations: 1)
The purpose of the current paper is to propose a general multicomponent latent trait model (GLTM) for response processes. The proposed model combines the linear logistic latent trait model (LLTM) with the multicomponent latent trait model (MLTM). As with both LLTM and MLTM, the general multicomponent latent trait model can be used to (1) test hypotheses about the theoretical variables that underlie response difficulty and (2) estimate parameters that describe test items by basic substantive properties. However, GLTM contains both component outcomes and complexity factors in a single model and may be applied to data that neither LLTM nor MLTM can handle. Joint maximum likelihood estimators are presented for the parameters of GLTM and an application to cognitive test items is described. This research was partially supported by the National Institute of Education grant number NIE-6-7-0156 to Susan Embretson (Whitely), principal investigator. However, the opinions expressed herein do not necessarily reflect the position or policy of the National Institute of Education, and no official endorsement by the National Institute of Education should be inferred.
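The way a model of this kind can combine component outcomes with complexity factors may be easier to see in a small sketch. It rests on simplifying assumptions that the abstract does not spell out: the item is solved only if every component is solved, each component follows a Rasch model with its own ability, and each component difficulty is a linear function of known complexity factors (the LLTM part). Guessing or strategy parameters are ignored, and all names and values are hypothetical.

```python
import math


def rasch_prob(theta, difficulty):
    """Rasch probability of success on one component."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def component_difficulty(complexity, eta, intercept):
    """LLTM-style difficulty: weighted sum of complexity factors plus an intercept."""
    return sum(q * e for q, e in zip(complexity, eta)) + intercept


def multicomponent_prob(thetas, complexities, etas, intercepts):
    """Probability of solving the item as a product over its components."""
    prob = 1.0
    for theta, q, eta, a in zip(thetas, complexities, etas, intercepts):
        prob *= rasch_prob(theta, component_difficulty(q, eta, a))
    return prob


# Hypothetical item with two components; each ability equals its component
# difficulty, so each component is solved with probability 0.5.
p_item = multicomponent_prob(
    thetas=[1.0, 0.0],
    complexities=[[1, 1], [0]],
    etas=[[0.5, 0.25], [1.0]],
    intercepts=[0.25, 0.0],
)
```

The product form means the item can be no easier than its hardest component, which is the kind of multicomponent structure a single-difficulty model cannot represent.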

8.
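Liu Hao (刘昊), Liu Xiaocen (刘肖岑), Feng Xiaoxia (冯晓霞). Journal of Psychological Science (《心理科学》), 2013, 36(2): 484-488
This study applied the Rasch model to develop and analyze a mathematics school-readiness test, in order to examine the validity and advantages of the Rasch model. A self-developed mathematics school-readiness test was administered to 150 children with a mean age of 6.6 years, and the Rasch model was used to revise the items and rating categories and to analyze the results. The results show that the revised test has good reliability and validity, fits the Rasch model well, has appropriately set rating categories, and is relatively easy overall. Children's Rasch scores were unrelated to gender but were affected by age and family socioeconomic status. Compared with classical test theory, applying the Rasch model to the development and analysis of school-readiness tests has advantages.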

9.
Various definitions and different approaches for assessing the complex construct of parental involvement (PI) have led to inconsistent findings regarding the impact of PI on child development. To date, limited information is available regarding the measurement invariance of PI measures across time and groups (e.g., children’s gender, ethnicity, and socioeconomic status), leaving a concern that group differences in PI might reflect item bias instead of true differences in PI. The present study aimed to obtain a set of optimal items for measuring PI from kindergarten through the elementary school years and investigate whether they could be used for parents from different groups. A Rasch measurement model was implemented to investigate item difficulty, step calibrations, and measurement invariance (here, differential item functioning; DIF). The results from the Early Childhood Longitudinal Study, Kindergarten Class of 1998–1999 data set showed that 20 items can be used to measure three dimensions of PI (school/home involvement, family educational investment, and family routines) across four time points. Administration time, children’s gender, ethnicity, and socioeconomic status showed different levels of effect on item difficulty for half of these items. Practitioners and researchers should be cautious when using these items and are advised to freely estimate the item parameters of DIF items and to add more items to the PI scale to improve reliability.

10.
This article attempts to present emotioncy as a potential source of test bias to inform the analysis of test item performance. Emotioncy is defined as a hierarchy, ranging from exvolvement (auditory, visual, and kinesthetic) to involvement (inner and arch), to emphasize the emotions evoked by the senses. This study hypothesizes that when individuals have high levels of emotioncy for specific words, their test performance may systematically change, resulting in test bias. To this end, 355 individuals were asked to take a 40-item vocabulary test along with the emotioncy scale. A mixed Rasch model was employed to flag items showing differential item functioning. Results illustrated that test takers with high emotioncy toward specific words outperformed those in the low-emotioncy group, characterizing emotioncy as a potential source of test bias.

11.
This paper uses an extension of the network algorithm originally introduced by Mehta and Patel to construct exact tail probabilities for testing the general hypothesis that item responses are distributed according to the Rasch model. By assuming that item difficulties are known, the algorithm is applicable to the statistical tests either given the maximum likelihood ability estimate or conditioned on the total score. A simulation study indicates that the network algorithm is an efficient tool for computing the significance level of a person fit statistic based on test lengths of 30 items or fewer.

12.
The paper addresses three neglected questions from IRT. In section 1, the properties of the “measurement” of ability or trait parameters and item difficulty parameters in the Rasch model are discussed. It is shown that the solution to this problem is rather complex and depends both on general assumptions about properties of the item response functions and on assumptions about the available item universe. Section 2 deals with the measurement of individual change or “modifiability” based on a Rasch test. A conditional likelihood approach is presented that (a) yields an ML estimator of modifiability for given item parameters, (b) allows one to test hypotheses about change by means of a Clopper-Pearson confidence interval for the modifiability parameter, and (c) allows modifiability to be estimated jointly with the item parameters. Uniqueness results for all three methods are also presented. In section 3, the Mantel-Haenszel method for detecting DIF is discussed under a novel perspective: What is the most general framework within which the Mantel-Haenszel method correctly detects DIF of a studied item? The answer is that this is a 2PL model where, however, all discrimination parameters are known and the studied item has the same discrimination in both populations. Since these requirements would hardly be satisfied in practical applications, the case of constant discrimination parameters, that is, the Rasch model, is the only realistic framework. A simple Pearson χ² test for DIF of one studied item is proposed as an alternative to the Mantel-Haenszel test; moreover, this test is generalized to the case of two items simultaneously studied for DIF.
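For readers unfamiliar with the Mantel-Haenszel method discussed in section 3 of this abstract, the following is a minimal sketch of the standard Mantel-Haenszel common odds ratio, computed over strata of test takers matched on total score. It illustrates the general technique, not the paper's own derivation or its proposed Pearson test. The function name and the counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio for a studied item across matched score strata.

    Each stratum is a tuple of counts:
    (reference correct, reference incorrect, focal correct, focal incorrect).
    A value near 1 suggests no DIF; values far from 1 suggest DIF.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in strata:
        n = ref_right + ref_wrong + foc_right + foc_wrong
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


# Hypothetical counts: equal odds in both groups within each stratum.
no_dif = mantel_haenszel_odds_ratio([(10, 10, 5, 5), (30, 10, 15, 5)])

# Hypothetical counts: the reference group has twice the odds of success.
with_dif = mantel_haenszel_odds_ratio([(20, 10, 10, 10)])
```

Stratifying on total score matters because, under the Rasch model, the total score is the sufficient statistic for ability, so test takers in one stratum are comparable.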

13.
Rasch models are characterised by sufficient statistics for all parameters. In the Rasch unidimensional model for two ordered categories, the parameterisation of the person and item is symmetrical and it is readily established that the total scores of a person and item are sufficient statistics for their respective parameters. In contrast, in the unidimensional polytomous Rasch model for more than two ordered categories, the parameterisation is not symmetrical. Specifically, each item has a vector of item parameters, one for each category, and each person only one person parameter. In addition, different items can have different numbers of categories and, therefore, different numbers of parameters. The sufficient statistic for the parameters of an item is itself a vector. In estimating the person parameters in presently available software, these sufficient statistics are not used to condition out the item parameters. This paper derives a conditional, pairwise, pseudo-likelihood and constructs estimates of the parameters of any number of persons which are independent of all item parameters and of the maximum scores of all items. It also shows that these estimates are consistent. Although Rasch’s original work began with equating tests using test scores, and not with items of a test, the polytomous Rasch model has not been applied in this way. Operationally, this is because the current approaches, in which item parameters are estimated first, cannot handle test data where there may be many scores with zero frequencies. A small simulation study shows that, when using the estimation equations derived in this paper, such a property of the data is no impediment to the application of the model at the level of tests. This opens up the possibility of using the polytomous Rasch model directly in equating test scores.
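The sufficiency property mentioned at the start of this abstract can be checked numerically in the dichotomous case. The sketch below uses brute-force enumeration, which is only an illustration and not the pairwise pseudo-likelihood derived in the paper. It shows that the probability of a response pattern, conditional on the person's total score, does not depend on the person parameter. Function names and item difficulties are hypothetical.

```python
import itertools
import math


def pattern_prob(pattern, theta, difficulties):
    """Probability of a 0/1 response pattern under the dichotomous Rasch model."""
    prob = 1.0
    for x, b in zip(pattern, difficulties):
        p_correct = 1.0 / (1.0 + math.exp(-(theta - b)))
        prob *= p_correct if x == 1 else 1.0 - p_correct
    return prob


def conditional_pattern_prob(pattern, theta, difficulties):
    """Probability of the pattern given its total score, by enumeration."""
    score = sum(pattern)
    same_score_total = sum(
        pattern_prob(other, theta, difficulties)
        for other in itertools.product((0, 1), repeat=len(difficulties))
        if sum(other) == score
    )
    return pattern_prob(pattern, theta, difficulties) / same_score_total


# Hypothetical item difficulties; the same pattern is evaluated for a low
# and a high ability value.
difficulties = [-0.5, 0.0, 1.2]
low = conditional_pattern_prob((1, 0, 1), -1.0, difficulties)
high = conditional_pattern_prob((1, 0, 1), 2.0, difficulties)
```

The two conditional probabilities agree up to rounding, which is what allows the person parameter to be conditioned out when estimating item parameters, and vice versa.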

14.
In this study, we compared classical test theory (CTT) and item response theory (IRT) approaches in analyzing the Center for Epidemiological Studies Depression (CES-D) Scale (Radloff, 1977). Standard item analyses, as well as Rasch (1960) analyses, both revealed item departures from unidimensionality in a sample of 2,455 older persons responding to the CES-D. Positive affect items in the scale performed poorly overall, their removal reducing the scale's bandwidth only slightly. Modeling depression scores derived from Rasch measures and raw totals showed subtle but important differences for statistical inference. The assessment of depressive risk was slightly enhanced by using 16-item scale measures obtained from the results of the Rasch analysis as the dependent variable. Confirmatory factor analysis and parallel analysis verified the advantages of removing positively worded items. IRT and CTT techniques proved to be complementary in this study and can be usefully combined to improve measuring depression.

15.
Anagrams are frequently used by experimental psychologists interested in how the mental lexicon is organized. Until very recently, research has overlooked the importance of syllable structure in solving anagrams and assumed that solution difficulty was mainly due to frequency factors (e.g., bigram statistics). The present study uses Rasch analysis to demonstrate that the number of syllables is a very important factor influencing anagram solution difficulty for both good and poor problem solvers, with polysyllabic words being harder to solve. Furthermore, it suggests that syllable frequency may have an impact on solution times for polysyllabic words, with more frequent syllables being more difficult to solve. The study illustrates the advantages of Rasch analysis for reliable and unidimensional measurement of item difficulty.

16.
Anagrams are frequently used by experimental psychologists interested in how the mental lexicon is organized. Until very recently, research has overlooked the importance of syllable structure in solving anagrams and assumed that solution difficulty was mainly due to frequency factors (e.g., bigram statistics). The present study uses Rasch analysis to demonstrate that the number of syllables is a very important factor influencing anagram solution difficulty for both good and poor problem solvers, with polysyllabic words being harder to solve. Furthermore, it suggests that syllable frequency may have an impact on solution times for polysyllabic words, with more frequent syllables being more difficult to solve. The study illustrates the advantages of Rasch analysis for reliable and unidimensional measurement of item difficulty.

17.
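A multidimensional testlet-effect Rasch model   Cited by: 2 (self-citations: 0, other citations: 2)
First, this paper explains that a "testlet" is, in essence, a set of items that share a common stimulus. On this basis, testlet effects are divided into within-item unidimensional testlet effects and within-item multidimensional testlet effects. Second, building on the Rasch model, the paper develops multidimensional testlet-effect Rasch models for dichotomous and polytomous scoring, in order to handle within-item multidimensional testlet effects better. Finally, a simulation study shows that the new model is valid and reasonable. Comparison with the Rasch testlet model and the partial credit model indicates that: (1) when a test contains within-item multidimensional testlet effects, separating out only the obvious bundled testlet effect while ignoring other latent testlet effects still leads to biased parameter estimates and may even overestimate test reliability; (2) the new model is more generally applicable: even when the response data contain no testlet effect, or only within-item unidimensional testlet effects, analyzing the test with the new model still yields good parameter estimates.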

18.
The Rasch model is an item analysis model with logistic item characteristic curves of equal slope, i.e., with constant item discriminating powers. The proposed goodness of fit test is based on a comparison between difficulties estimated from different scoregroups and over-all estimates. Based on the within scoregroup estimates and the over-all estimates of item difficulties a conditional likelihood ratio is formed. It is shown that −2 times the logarithm of this ratio is χ²-distributed when the Rasch model is true. The power of the proposed goodness of fit test is discussed for alternative models with logistic item characteristic curves, but unequal discriminating items from a scholastic aptitude test.
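The "equal slope" property described in this abstract can be made concrete with a short sketch. It assumes the standard logistic form of the item characteristic curve, with a discrimination parameter that is fixed at 1 under the Rasch model and free in a 2PL-style alternative. The function names and parameter values are hypothetical. The sketch does not implement the conditional likelihood ratio test itself.

```python
import math


def icc(theta, difficulty, discrimination=1.0):
    """Logistic item characteristic curve; discrimination=1.0 gives the Rasch model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def logit_gap(theta, item_a, item_b):
    """Difference in log-odds of success between two items at a given ability.

    Each item is a (difficulty, discrimination) pair.
    """
    return logit(icc(theta, *item_a)) - logit(icc(theta, *item_b))


# Equal slopes (Rasch): the gap equals the difficulty difference at every ability.
rasch_gap_low = logit_gap(-1.0, (0.0, 1.0), (0.8, 1.0))
rasch_gap_high = logit_gap(2.0, (0.0, 1.0), (0.8, 1.0))

# Unequal slopes: the gap changes with ability, so the curves cross.
unequal_gap_low = logit_gap(-1.0, (0.0, 1.0), (0.8, 2.0))
unequal_gap_high = logit_gap(2.0, (0.0, 1.0), (0.8, 2.0))
```

Because the gap is constant under equal slopes, difficulty estimates obtained from different score groups should agree up to sampling error, which is the comparison the goodness of fit test in the abstract builds on.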

19.
This article describes a generalized longitudinal mixture item response theory (IRT) model that allows for detecting latent group differences in item response data obtained from electronic learning (e-learning) environments or other learning environments that result in large numbers of items. The described model can be viewed as a combination of a longitudinal Rasch model, a mixture Rasch model, and a random-item IRT model, and it includes some features of the explanatory IRT modeling framework. The model assumes the possible presence of latent classes in item response patterns, due to initial person-level differences before learning takes place, to latent class-specific learning trajectories, or to a combination of both. Moreover, it allows for differential item functioning over the classes. A Bayesian model estimation procedure is described, and the results of a simulation study are presented that indicate that the parameters are recovered well, particularly for conditions with large item sample sizes. The model is also illustrated with an empirical sample data set from a Web-based e-learning environment.

20.
The Remote Associates Test (RAT) is a well-known measure of creativity, with each item composed of three unrelated stimulus words. The participant’s task is to find an answer in the form of a word that could combine with each of the stimulus words, thus forming three new actual nouns. Researchers have modified the RAT to develop compound remote associate problems that emphasize combining vocabulary to form compound words. In the field of creativity research for Mandarin speakers, the Chinese RAT has been widely applied for over 10 years. The original RAT, compound remote associate problems, and Chinese RAT have various common advantages, such as being convenient to use and having objective scoring; additionally, the development of items for certain tests is easy and satisfies the requirements of psychological assessments in terms of the quantity of items. Currently, many language editions of the RAT and compound remote associate problems already exist. In particular, the English and Italian versions of these tests already have derived normative data. Because approximately 20% of the world’s population are native Mandarin speakers, and because increasing numbers of people are choosing Mandarin as a second language, the need to increase Mandarin-language resources is growing; however, normative data for the Chinese RAT still do not exist. To address this issue, in the present study we developed Chinese compound remote associate problems and analyzed the passing rates by items, problem solving times, and various normative data, using the responses of 253 subjects in three experiments.
