Found 20 similar articles; search took 15 ms
1.
Paul R. Rosenbaum 《Psychometrika》1987,52(2):217-233
Test items are often evaluated and compared by contrasting the shapes of their item characteristic curves (ICCs) or surfaces. The current paper develops and applies three general (i.e., nonparametric) comparisons of the shapes of two item characteristic surfaces: (i) proportional latent odds, (ii) uniform relative difficulty, and (iii) item sensitivity. Two items may be compared in these ways while making no assumption about the shapes of item characteristic surfaces for other items, and no assumption about the dimensionality of the latent variable. Also studied is a method for comparing the relative shapes of two item characteristic curves in two examinee populations. The author is grateful to Paul Holland, Robert Mislevy, Tue Tjur, Rebecca Zwick, the editor and reviewers for valuable comments on the subject of this paper, to Mari A. Pearlman for advice on the pairing of items in the examples, and to Dorothy Thayer for assistance with computing.
2.
Paul R. Rosenbaum 《Psychometrika》1988,53(3):349-359
An item bundle is a small group of multiple choice items that share a common reading passage or graph, or a small group of matching items that share distractors. Item bundles are easily identified by paging through a copy of a test. Bundled items may violate the latent conditional independence assumption of unidimensional item response theory (IRT), but such a violation would not typically suggest the existence of a new fundamental human ability to read one specific reading passage or to interpret one specific graph. It is important, therefore, to have theoretical concepts and empirical checks that distinguish between, on the one hand, anticipated violations of latent conditional independence within item bundles, and, on the other hand, violations that cannot be attributed to idiosyncratic features of test format and instead suggest departures from unidimensionality. To this end, two theorems on unidimensional IRT are extended to describe observable item response distributions when there is conditional independence between but not necessarily within item bundles. The author is grateful to Ivo Molenaar and the referees for many helpful suggestions, and to D. Thayer for assistance with computing.
3.
Item response theory posits local independence, or conditional independence of item responses given item parameters and examinee proficiency parameters. The usual definition of local independence, however, addresses the context of fixed tests, and initially appears to yield incorrect response-pattern probabilities in the context of adaptive testing. The paradox is resolved by introducing additional notation to deal with the item selection mechanism. We are grateful to Charlie Lewis, Ming-Mei Wang, and Pao-Kuei Wu for discussions on this topic, and to the Editor, the reviewers, and Howard Wainer for helpful comments on an earlier version of the paper. The first author's work was supported in part by the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), Educational Research and Development Program, cooperative agreement number R117G10027 and CFDA catalog number 84.117G, as administered by the Office of Educational Research and Improvement, U.S. Department of Education.
4.
This is a reaction to Borsboom's (2006) discussion paper on the issue that psychology takes so little notice of the modern
developments in psychometrics, in particular, latent variable methods. Contrary to Borsboom, it is argued that latent variables
are summaries of interesting data properties, that construct validation should involve studying nomological networks, that
psychological research slowly but definitely will incorporate latent variable methods, and that the role of psychometrics
in psychology is that of partner, not role model.
Requests for reprints should be sent to Klaas Sijtsma, Department of Methodology and Statistics, FSW, Tilburg University,
PO Box 90153, 5000 LE, Tilburg, The Netherlands.
5.
Dylan Molenaar Daniel Oberski Jeroen Vermunt Paul De Boeck 《Multivariate behavioral research》2016,51(5):606-626
Current approaches to model responses and response times to psychometric tests solely focus on between-subject differences in speed and ability. Within subjects, speed and ability are assumed to be constants. Violations of this assumption are generally absorbed in the residual of the model. As a result, within-subject departures from the between-subject speed and ability level remain undetected. These departures may be of interest to the researcher as they reflect differences in the response processes adopted on the items of a test. In this article, we propose a dynamic approach for responses and response times based on hidden Markov modeling to account for within-subject differences in responses and response times. A simulation study is conducted to demonstrate acceptable parameter recovery and acceptable performance of various fit indices in distinguishing between different models. In addition, both a confirmatory and an exploratory application are presented to demonstrate the practical value of the modeling approach.
6.
A general latent trait model for response processes (Total citations: 1; self-citations: 0; citations by others: 1)
Susan Embretson 《Psychometrika》1984,49(2):175-186
The purpose of the current paper is to propose a general multicomponent latent trait model (GLTM) for response processes. The proposed model combines the linear logistic latent trait (LLTM) with the multicomponent latent trait model (MLTM). As with both LLTM and MLTM, the general multicomponent latent trait model can be used to (1) test hypotheses about the theoretical variables that underlie response difficulty and (2) estimate parameters that describe test items by basic substantive properties. However, GLTM contains both component outcomes and complexity factors in a single model and may be applied to data that neither LLTM nor MLTM can handle. Joint maximum likelihood estimators are presented for the parameters of GLTM and an application to cognitive test items is described. This research was partially supported by the National Institute of Education grant number NIE-6-7-0156 to Susan Embretson (Whitely), principal investigator. However, the opinions expressed herein do not necessarily reflect the position or policy of the National Institute of Education, and no official endorsement by the National Institute of Education should be inferred.
7.
Although the Defense Mechanism Test (DMT) has been in use for almost half a century, there are still quite contradictory views about whether it is a reliable instrument, and if so, what it really measures. Thus, based on data from 39 female students, we first examined DMT inter-coder reliability by analyzing the agreement among trained judges in their coding of the same DMT protocols. Second, we constructed a "parallel" photographic picture that retained all structural characteristics of the original and analyzed DMT parallel-test reliability. Third, we examined the construct validity of the DMT by (a) employing three self-report defense-mechanism inventories and analyzing the intercorrelations between DMT defense scores and corresponding defenses in these instruments, (b) studying the relationships between DMT responses and scores on trait and state anxiety, and (c) relating DMT-defense scores to measures of self-esteem. The main results showed that the DMT can be coded with high reliability by trained coders, that the parallel-test reliability is unsatisfactory compared to traditional psychometric standards, that there is a certain generalizability in the number of perceptual distortions that people display from one picture to another, and that the construct validation provided meager empirical evidence for the conclusion that the DMT measures what it purports to measure, that is, psychological defense mechanisms.
8.
Yanyan Fu Tyler Strachan Edward H. Ip John T. Willse Shyh-Huei Chen Terry Ackerman 《International Journal of Testing》2020,20(2):169-186
This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and conditions. When a test measured weakly discriminated dimensions, it became harder to recover the latent correlation. Results also showed that increasing the sample size, test length, or using simpler models (i.e., two-parameter logistic rather than three-parameter logistic, compensatory rather than noncompensatory) could improve the recovery of latent correlation.
9.
John H. Wolfe 《Psychometrika》1981,46(4):461-464
In tailored testing, it is important to determine the optimal difficulty of the next item to present to the examinee. This paper shows that the difference that maximizes information for the three-parameter normal ogive response model is approximately 1.7 times the optimal difference θ − b for the three-parameter logistic model. Under the normal model, calculation of the optimal difficulty for minimizing the Bayes risk is equivalent to maximizing an associated information function. The views expressed herein are those of the author and do not necessarily reflect those of the Department of the Navy.
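As an illustration of the kind of optimization this abstract refers to, the sketch below evaluates the standard three-parameter logistic item information function and locates the difference between ability and difficulty that maximizes it, once by grid search and once with Birnbaum's closed-form expression. It is not taken from Wolfe's paper: the function names, the scaling constant D = 1.7, and the example values of a and c are assumptions chosen for illustration, and the normal ogive comparison is not reproduced here.

```python
import math


def info_3pl(theta, a, b, c, D=1.7):
    """Item information of the three-parameter logistic model at ability theta.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    D is the usual scaling constant (assumed 1.7 here).
    """
    p_star = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    p = c + (1.0 - c) * p_star  # probability of a correct response
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def optimal_difference_closed_form(a, c, D=1.7):
    """Birnbaum's closed form for the theta - b that maximizes 3PL information."""
    return math.log((1.0 + math.sqrt(1.0 + 8.0 * c)) / 2.0) / (D * a)


def optimal_difference_grid(a, c, D=1.7, lo=-4.0, hi=4.0, steps=80001):
    """Brute-force check: search theta - b on an evenly spaced grid (b fixed at 0)."""
    width = (hi - lo) / (steps - 1)
    best = max(range(steps), key=lambda i: info_3pl(lo + i * width, a, 0.0, c, D))
    return lo + best * width


if __name__ == "__main__":
    # Hypothetical item: discrimination 1.0, guessing parameter 0.2.
    print(optimal_difference_closed_form(a=1.0, c=0.2))
    print(optimal_difference_grid(a=1.0, c=0.2))
```

With c = 0 the closed form gives a difference of zero, so the most informative item has difficulty equal to ability; with c > 0 the most informative item is somewhat easier than the examinee's ability.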
10.
Thomas E. Love 《Psychometrika》1997,62(1):51-62
A latent variable representation for multiple-choice item and option characteristic curves is presented. Under standard assumptions of conditional independence of item responses and monotonicity of item characteristic curves, a criterion for distractors is proposed based on distractor selection ratios. A connection is made between the proposed criterion and the theory of individual choice behavior, providing new insight. The main results allow for the testing of the criterion from observable data without first specifying a parametric form for the characteristic curves. A series of examples apply the method.
11.
We consider latent variable models for an infinite sequence (or universe) of manifest (observable) variables that may be discrete, continuous or some combination of these. The main theorem is a general characterization by empirical conditions of when it is possible to construct latent variable models that satisfy unidimensionality, monotonicity, conditional independence, and tail-measurability. Tail-measurability means that the latent variable can be estimated consistently from the sequence of manifest variables even though an arbitrary finite subsequence has been removed. The characterizing, necessary and sufficient, conditions that the manifest variables must satisfy for these models are conditional association and vanishing conditional dependence (as one conditions upon successively more other manifest variables). Our main theorem considerably generalizes and sharpens earlier results of Ellis and van den Wollenberg (1993), Holland and Rosenbaum (1986), and Junker (1993). It is also related to the work of Stout (1990). The main theorem is preceded by many results for latent variable models in general, not necessarily unidimensional and monotone. They pertain to the uniqueness of latent variables and are connected with the conditional independence theorem of Suppes and Zanotti (1981). We discuss new definitions of the concepts of true-score and subpopulation, which generalize these notions from the stochastic subject, random sampling, and domain sampling formulations of latent variable models (e.g., Holland, 1990; Lord & Novick, 1968). These definitions do not require the a priori specification of a latent variable model. The authors made equivalent contributions to the results of this article. Ellis' research was supported by the Dutch Interuniversitary Graduate School of Psychometrics and Sociometrics. Junker's research was supported by ONR Grant N00014-87-K-0277, NIMH Grant MH15758, and a Carnegie Mellon University Faculty Development Grant. In addition, Junker would like to acknowledge the hospitality of the Nijmegen Institute for Cognition and Information during his visit to the University of Nijmegen on August 5–10, 1993.
12.
Fumiko Samejima 《Psychometrika》2000,65(3):319-335
The paper addresses and discusses whether the tradition of accepting point-symmetric item characteristic curves is justified by uncovering the inconsistent relationship between the difficulties of items and the order of maximum likelihood estimates of ability. This inconsistency is intrinsic in models that provide point-symmetric item characteristic curves, and in this paper focus is put on the normal ogive model for observation. It is also questioned if in the logistic model the sufficient statistic has forfeited the rationale that is appropriate to the psychological reality. It is observed that the logistic model can be interpreted as the case in which the inconsistency in ordering the maximum likelihood estimates is degenerated. The paper proposes a family of models, called the logistic positive exponent family, which provides asymmetric item characteristic curves. A model in this family has a consistent principle in ordering the maximum likelihood estimates of ability. The family is divided into two subsets each of which has its own principle, and includes the logistic model as a transition from one principle to the other. Rationale and some illustrative examples are given.
13.
Maria Bolsinova Gunter Maris 《The British journal of mathematical and statistical psychology》2016,69(1):62-79
An important distinction between different models for response time and accuracy is whether conditional independence (CI) between response time and accuracy is assumed. In the present study, a test for CI given an exponential family model for accuracy (for example, the Rasch model or the one‐parameter logistic model) is proposed and evaluated in a simulation study. The procedure is based on the non‐parametric Kolmogorov–Smirnov tests. As an illustrative example, the CI test was applied to data from an arithmetics test for secondary education.
14.
Mark G. Ehrhart Karen Holcombe Ehrhart Scott C. Roesch Beth G. Chung-Herrera Kristy Nadler Kelsey Bradshaw 《Personality and individual differences》2009,47(8):900-905
Recent efforts have aimed to develop relatively short measures of the Five-Factor Model (FFM) of personality, particularly for when time and/or space is limited. We evaluate the Ten-Item Personality Inventory (TIPI), a non-proprietary FFM measure with two items per dimension. We use a latent variable methodology to examine the TIPI’s factor structure and convergent validity with the 50-item International Personality Item Pool (IPIP) FFM measure. We provide correlations between the scale scores and latent factors, and compare each measure’s pattern of correlations with measures of other individual difference constructs. Results were favorable in terms of the factor structure and convergent validity of the TIPI, particularly regarding the correlations between the respective latent factors of the TIPI and the IPIP–FFM measures.
15.
Peter van Rijn Frank Rijmen 《The British journal of mathematical and statistical psychology》2015,68(1):1-22
Many probabilistic models for psychological and educational measurements contain latent variables. Well‐known examples are factor analysis, item response theory, and latent class model families. We discuss what is referred to as the ‘explaining‐away’ phenomenon in the context of such latent variable models. This phenomenon can occur when multiple latent variables are related to the same observed variable, and can elicit seemingly counterintuitive conditional dependencies between latent variables given observed variables. We illustrate the implications of explaining away for a number of well‐known latent variable models by using both theoretical and real data examples.
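The explaining-away effect described in this abstract can be seen in a very small example. The sketch below is not from the article: it assumes two independent binary latent variables A and B, one binary observed variable X, and a made-up response table, and computes posterior probabilities by exact enumeration. All names and numbers are hypothetical.

```python
from itertools import product

# Hypothetical priors for two independent binary latent variables.
PRIOR_A = 0.5
PRIOR_B = 0.5

# Hypothetical P(X = 1 | A = a, B = b): either latent variable alone
# makes a positive response likely.
P_X1 = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.9}


def joint(a, b, x):
    """Joint probability P(A = a, B = b, X = x) under the assumed model."""
    pa = PRIOR_A if a == 1 else 1.0 - PRIOR_A
    pb = PRIOR_B if b == 1 else 1.0 - PRIOR_B
    px = P_X1[(a, b)] if x == 1 else 1.0 - P_X1[(a, b)]
    return pa * pb * px


def prob_a1(x=None, b=None):
    """P(A = 1 | evidence), where evidence on X and/or B is optional."""
    num = 0.0
    den = 0.0
    for a, bb, xx in product((0, 1), repeat=3):
        if x is not None and xx != x:
            continue
        if b is not None and bb != b:
            continue
        weight = joint(a, bb, xx)
        den += weight
        if a == 1:
            num += weight
    return num / den


if __name__ == "__main__":
    print("P(A=1)            =", prob_a1())
    print("P(A=1 | B=1)      =", prob_a1(b=1))
    print("P(A=1 | X=1)      =", prob_a1(x=1))
    print("P(A=1 | X=1, B=1) =", prob_a1(x=1, b=1))
    print("P(A=1 | X=1, B=0) =", prob_a1(x=1, b=0))
```

Before X is observed, learning B says nothing about A. Once X = 1 is observed, learning B = 1 lowers the probability of A = 1 and learning B = 0 raises it, which is the conditional dependence between latent variables that the abstract describes.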
16.
17.
Sayed H. Kadhem Aristidis K. Nikoloulopoulos 《The British journal of mathematical and statistical psychology》2021,74(3):365-403
We develop factor copula models to analyse the dependence among mixed continuous and discrete responses. Factor copula models are canonical vine copulas that involve both observed and latent variables, hence they allow tail, asymmetric and nonlinear dependence. They can be explained as conditional independence models with latent variables that do not necessarily have an additive latent structure. We focus on important issues of interest to the social data analyst, such as model selection and goodness of fit. Our general methodology is demonstrated with an extensive simulation study and illustrated by reanalysing three mixed response data sets. Our studies suggest that there can be a substantial improvement over the standard factor model for mixed data and make the argument for moving to factor copula models.
18.
Mark Hansen Li Cai Scott Monroe Zhen Li 《The British journal of mathematical and statistical psychology》2016,69(3):225-252
Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full‐information test statistics such as Pearson's X² and the likelihood ratio statistic G² suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited‐information fit statistics such as Maydeu‐Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu‐Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ statistic to diagnostic classification models. Through a series of simulation studies, we found that M₂ is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q‐matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M₂ was largely insensitive to misspecifications in the distribution of higher‐order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M₂, we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. The local dependence statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M₂ and local dependence statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al. (2011, IJT, 11, 144).
19.
20.
A number of models for categorical item response data have been proposed in recent years. The models appear to be quite different.
However, they may usefully be organized as members of only three distinct classes, within which the models are distinguished
only by assumptions and constraints on their parameters. “Difference models” are appropriate for ordered responses, “divide-by-total”
models may be used for either ordered or nominal responses, and “left-side added” models are used for multiple-choice responses
with guessing. The details of the taxonomy and the models are described in this paper.
The present study was supported in part by two postdoctoral fellowships awarded to Lynne Steinberg: an Educational Testing
Service Postdoctoral Fellowship at ETS, Princeton, NJ and an NIMH Individual National Research Service Award at Stanford University,
Stanford, CA. Helpful comments by the editor and three anonymous reviewers are gratefully acknowledged.