Similar articles
 Found 20 similar articles (search time: 22 ms)
1.
The logistic function is proposed as an alternative to the integrated normal function when estimating parameters of test items. The logistic curve is described; an iterative method for finding maximum likelihood estimates of its parameters is given, and an example of its use is presented.
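
As a rough illustration of the logistic alternative described in item 1, the following Python sketch compares a two-parameter logistic curve with the normal ogive on a grid of ability values. The function names, the parameters `a` (discrimination) and `b` (difficulty), and the scaling constant `D = 1.702` are assumptions made for illustration; the abstract gives neither its parameterization nor its iterative estimation method, so none of this should be read as the author's procedure.

```python
import math

def normal_ogive(theta, a, b):
    """Integrated normal (normal-ogive) item curve: Phi(a * (theta - b))."""
    z = a * (theta - b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_icc(theta, a, b, D=1.702):
    """Logistic item curve; D = 1.702 is an assumed scaling constant."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

# Evaluate both curves for a hypothetical item (a = 1, b = 0) on [-4, 4].
grid = [i / 100.0 for i in range(-400, 401)]
max_gap = max(
    abs(normal_ogive(t, 1.0, 0.0) - logistic_icc(t, 1.0, 0.0)) for t in grid
)
print(f"largest absolute gap on the grid: {max_gap:.4f}")
```

With this scaling the two curves stay within 0.01 of each other everywhere on the grid, which is why the logistic form can stand in for the normal ogive in practice.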

2.
It is common to assume that the proportion of correct answers to an item has a normal-ogive or logistic relationship to total test score. However, this is shown to be a mistaken and undesirable notion.

3.
Indexes of skewness and kurtosis for a test-score distribution are expressed in terms of item parameters. Both are shown to depend, in part, on item means, variances, and covariances. The index of skewness depends also on trivariances. A trivariance is a product moment involving first powers of deviation scores for three items. The index of kurtosis depends on quadrivariances, as well as trivariances. A quadrivariance is a product moment involving first powers of deviation scores for four items. Empirical data are presented for responses of groups of subjects to 25 triads and 25 tetrads of items from five tests. Certain parts of this article represent the results of doctoral research conducted by Hundleby and Goldstein under the direction of Ray in the Department of Psychology at Pennsylvania State University. The authors are indebted to Professor Lester Guest and Professor William Lepley for their supervisory assistance in the final stages of the two dissertations during the absence of the senior author.

4.
Most indexes of item validity and difficulty vary systematically with changes in the mean and variance of the group. Formulas are presented showing how certain item parameters will vary with these alterations in group mean and variance. Item parameters are also suggested which should remain invariant under such changes. These parameters are developed under two different assumptions: first, the assumption that the total distribution of the item ability variable is normal, and, second, that the distribution of the item ability variable for each array of the explicit selection variable is normal. The writer wishes to acknowledge helpful discussions of this paper with Paul Horst and Herbert S. Sichel who have worked on various aspects of the problem of invariant item parameters.

5.
The item response function (IRF) for a polytomously scored item is defined as a weighted sum of the item category response functions (ICRF, the probability of getting a particular score for a randomly sampled examinee of ability θ). This paper establishes the correspondence between an IRF and a unique set of ICRFs for two of the most commonly used polytomous IRT models (the partial credit models and the graded response model). Specifically, a proof of the following assertion is provided for these models: If two items have the same IRF, then they must have the same number of categories; moreover, they must consist of the same ICRFs. As a corollary, for the Rasch dichotomous model, if two tests have the same test characteristic function (TCF), then they must have the same number of items. Moreover, for each item in one of the tests, an item in the other test with an identical IRF must exist. Theoretical as well as practical implications of these results are discussed. This research was supported by Educational Testing Service Allocation Projects No. 79409 and No. 79413. The authors wish to thank John Donoghue, Ming-Mei Wang, Rebecca Zwick, and Zhiliang Ying for their useful comments and discussions. The authors also wish to thank three anonymous reviewers for their comments.

6.
The purpose of this note is twofold: (a) to present the formula for the item information function (IIF) in any direction for the Multidimensional 3-Parameter Logistic (M3-PL) model and (b) to give the equation for the location of maximum item information (θmax) in the direction of the item discrimination vector. Several corollaries are given. Implications for future research are discussed. This research was supported in part by an Educational Testing Service (ETS) Harold T. Gulliksen Psychometric Research Fellowship to the author. This revised article was published online in August 2005 with the PDF paginated correctly.
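
The abstract in item 6 does not reproduce its multidimensional formulas, so the sketch below covers only the familiar unidimensional special case of the three-parameter logistic model, with the scaling constant taken as 1. It evaluates the item information function on a grid and compares the grid maximizer with the standard closed-form location of maximum information. All names and the example parameter values (`a = 1.2`, `b = 0.5`, `c = 0.2`) are hypothetical.

```python
import math

def p_3pl(theta, a, b, c):
    """Unidimensional 3PL response probability (scaling constant assumed 1)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Item information: a^2 * ((1 - P) / P) * ((P - c) / (1 - c))^2."""
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def theta_max_3pl(a, b, c):
    """Closed-form location of maximum information for the unidimensional 3PL."""
    return b + math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / a

# Hypothetical item parameters, chosen only for illustration.
a, b, c = 1.2, 0.5, 0.2
grid = [i / 1000.0 for i in range(-4000, 4001)]
theta_grid_max = max(grid, key=lambda t: info_3pl(t, a, b, c))
theta_closed = theta_max_3pl(a, b, c)
print(theta_grid_max, theta_closed)
```

When `c = 0` the closed form reduces to `b`, so the information peak sits at the item difficulty; a nonzero lower asymptote shifts the peak above `b`.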

7.
Given that a key function of tests is to serve as evaluation instruments and for decision making in the fields of psychology and education, the possibility that some of their items may show differential behaviour is a major concern for psychometricians. In recent decades, important progress has been made as regards the efficacy of techniques designed to detect this differential item functioning (DIF). However, the findings are scant when it comes to explaining its causes. The present study addresses this problem from the perspective of multilevel analysis. Starting from a case study in the area of transcultural comparisons, multilevel logistic regression is used: 1) to identify the item characteristics associated with the presence of DIF; 2) to estimate the proportion of variation in the DIF coefficients that is explained by these characteristics; and 3) to evaluate alternative explanations of the DIF by comparing the explanatory power or fit of different sequential models. The comparison of these models confirmed one of the two alternatives (familiarity with the stimulus) and rejected the other (the topic area) as being a cause of differential functioning with respect to the compared groups.

8.
The purpose of the present research was to compare memory for an item with memory for the item’s source. Experiment 1 investigated discrimination between two external sources: each item in a list of words was spoken in either a male or a female voice. Subjects received a test of item recognition and a test of source monitoring at each of four delay intervals (immediate, 30 min, 48 h, 1 week). In contrast with previous research, no evidence of differential forgetting rates for item and source information was found. With delay intervals of 0 and 48 h, Experiment 2 replicated Experiment 1 while adding a reality monitoring condition that required discrimination between an internal (i.e., self-generated) and an external source. Subjects were better at making internal-external discriminations than at making external-external discriminations, but both types of source monitoring declined at the same rate as memory for the items themselves.

9.
Algarabel S, Pitarque A. Psicothema, 2007, 19(1): 163-170
Conflicting theories argue that recognition is achieved either by familiarity exclusively, or by a mixture of familiarity and recollection. We explore in three experiments the goodness of fit of both positions to experimental data in which context information is manipulated. In Experiments 1 and 2, we explore the availability of context information in recognition, testing the focus stimulus, its context, and their associative relation. In Experiment 3, participants were confronted with a plurality task in an attempt to force them to use the peripheral information in recognition. The results show that people acquire specific associative information, and although overall recognition performance was not affected by the use of context, receiver operating characteristic (ROC) analysis showed that people use a duality of processes in recognition.

10.
A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.
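
The monotonicity claim in item 10 can be checked numerically for one of the listed models. The sketch below computes the true score τ as the sum of expected item scores under the generalized partial credit model and tests whether τ increases strictly over a grid of θ values. It is a numerical spot check, not the proof the abstract refers to; the item parameters, function names, and grid are hypothetical.

```python
import math

def gpcm_probs(theta, a, steps):
    """Category probabilities for scores 0..m under the GPCM.

    Category k has logit sum_{v=1..k} a * (theta - steps[v-1]); category 0 has logit 0.
    """
    z = [0.0]
    for b in steps:
        z.append(z[-1] + a * (theta - b))
    top = max(z)  # subtract the maximum before exponentiating for stability
    weights = [math.exp(v - top) for v in z]
    total = sum(weights)
    return [w / total for w in weights]

def expected_item_score(theta, a, steps):
    """Item response function: the expected score on one item."""
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, steps)))

# Hypothetical test of three items: (discrimination, step parameters).
items = [(1.0, [-1.0, 0.0, 1.0]), (0.7, [0.5, -0.5]), (1.5, [0.2])]

def true_score(theta):
    """True score tau(theta): sum of expected item scores."""
    return sum(expected_item_score(theta, a, steps) for a, steps in items)

grid = [i / 10.0 for i in range(-40, 41)]
tau = [true_score(t) for t in grid]
strictly_increasing = all(lo < hi for lo, hi in zip(tau, tau[1:]))
print(strictly_increasing)
```

The check passes even though the second item has disordered step parameters, in line with the abstract's point that non-monotonic category curves do not prevent the expected score from increasing in θ.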

11.
Current interest in the assessment of measurement equivalence emphasizes 2 major methods of analysis. The authors offer a comparison of a linear method (confirmatory factor analysis) and a nonlinear method (differential item and test functioning using item response theory) with an emphasis on their methodological similarities and differences. The 2 approaches test for the equality of true scores (or expected raw scores) across 2 populations when the latent (or factor) score is held constant. Both approaches can provide information about when measurement nonequivalence exists and the extent to which it is a problem. An empirical example is used to illustrate the 2 approaches.

12.
13.
14.
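For dual-objective CD-CAT, six item discrimination indices (the discrimination index D, the general discrimination index GDI, the odds ratio OR, the 2PL discrimination parameter a, the attribute discrimination index ADI, and the cognitive diagnostic discrimination index CDI) were each combined with the IPA method to yield new item selection strategies. A simulation study compared their performance and also examined how well stratification by discrimination controls item exposure. The results showed that all of the new methods markedly improved the classification accuracy for knowledge states and the precision of ability estimation, and that every stratified selection method substantially improved item pool utilization. Overall, OR weighting significantly improved measurement precision, and OR-stratified selection significantly improved the uniformity of item exposure while maintaining measurement precision.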

15.
The effect of prolonged practice upon item recognition performance was investigated under conditions of nested positive sets and complete response consistency. Nesting is defined by each positive set containing all the items contained in smaller positive sets. Response consistency is defined by each item in the stimulus set consistently requiring only a positive or only a negative response. A low error level was maintained. Twelve Ss worked with three positive set sizes in each of 36 sessions. Half the Ss worked with digit stimuli and half with pictures. The item recognition function (that function relating response latency and positive set size) was found to be negatively accelerated throughout the course of practice. The effect of positive set size decreased significantly (p < .001) with practice, and set size effects were significantly (p < .03) greater for positive response trials than for negative response trials. Kind of item had no effect on the set size effect. A theoretical framework consistent with these results is suggested. Results from the present study are compared with findings obtained previously from visual search studies. It is concluded that when the procedures in both tasks include response consistency, nested positive sets, and low error levels, the effects of prolonged practice upon the set size from item recognition and visual search are qualitatively very similar.

16.
Caution indices based on item response theory (total citations: 2; self-citations: 0; citations by others: 2)
A new family of indices was introduced earlier as a link between two approaches: One based on item response theory and the other on sample statistics. In this study, the statistical properties of these indices are investigated and then the relationships to Guttman Scales, and to item and person response curves are discussed. Further, these indices are standardized, and an example of their potential usefulness for diagnosing students' misconceptions is shown. This research was sponsored by the Personnel and Training Research Program, Psychological Sciences Division, Office of Naval Research, under contract No. N00014-82-K-0604.

17.
This paper discusses two forms of separability of item and person parameters in the context of response time (RT) models. The first is separate sufficiency: the existence of sufficient statistics for the item (person) parameters that do not depend on the person (item) parameters. The second is ranking independence: the likelihood of the item (person) ranking with respect to RTs does not depend on the person (item) parameters. For each form, a theorem stating sufficient conditions is proved. The two forms of separability are shown to include several (special cases of) models from psychometric and biometric literature. Ranking independence imposes no restrictions on the general distribution form, but on its parametrization. An estimation procedure based upon ranks and pseudolikelihood theory is discussed, as well as the relation of ranking independence to the concept of double monotonicity. I am indebted to Wim van der Linden for bringing Thissen's (1983) paper to my notice, and to Martijn Berger, Frans Tan, and the anonymous reviewers for their constructive comments on earlier drafts of this paper.

18.
In tailored testing, it is important to determine the optimal difficulty of the next item to present to the examinee. This paper shows that the difference that maximizes information for the three-parameter normal ogive response model is approximately 1.7 times the optimal difference b for the three-parameter logistic model. Under the normal model, calculation of the optimal difficulty for minimizing the Bayes risk is equivalent to maximizing an associated information function. The views expressed herein are those of the author and do not necessarily reflect those of the Department of the Navy.

19.
Asymptotic formulas are derived for the bias in the maximum likelihood estimators of the item parameters in the logistic item response model when examinee abilities are known. Numerical results are given for a typical verbal test for college admission.

20.
The four-parameter logistic (4PL) item response model, which includes an upper asymptote for the correct response probability, has drawn increasing interest due to its suitability for many practical scenarios. This paper proposes a new Gibbs sampling algorithm for estimation of the multidimensional 4PL model based on an efficient data augmentation scheme (DAGS). With the introduction of three continuous latent variables, the full conditional distributions are tractable, allowing easy implementation of a Gibbs sampler. Simulation studies are conducted to evaluate the proposed method and several popular alternatives. An empirical data set was analysed using the 4PL model to show its improved performance over the three-parameter and two-parameter logistic models. The proposed estimation scheme is easily accessible to practitioners through the open-source IRTlogit package.
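
To make the role of the upper asymptote in item 20 concrete, here is a minimal Python sketch of a unidimensional 4PL response function. It is not the multidimensional model or the Gibbs sampler from the abstract, and it does not use the IRTlogit package, whose interface is not described in the source. The function name and the parameter names `a`, `b`, `c`, and `d` are assumptions.

```python
import math

def p_4pl(theta, a, b, c, d):
    """Unidimensional 4PL: lower asymptote c, upper asymptote d, with c < d."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item with guessing (c = 0.2) and slipping (d = 0.95).
low = p_4pl(-10.0, 1.3, 0.0, 0.2, 0.95)   # approaches c for very low ability
mid = p_4pl(0.0, 1.3, 0.0, 0.2, 0.95)     # midpoint (c + d) / 2 at theta = b
high = p_4pl(10.0, 1.3, 0.0, 0.2, 0.95)   # approaches d for very high ability
print(low, mid, high)
```

Setting `d = 1` gives the three-parameter logistic model, and setting `c = 0` as well gives the two-parameter logistic model, the two comparison models named in the abstract.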
