Similar Literature
Found 20 similar documents (search time: 31 ms)
1.
任赫, 黄颖诗, 陈平. 《心理科学进展》 (Advances in Psychological Science), 2022, 30(5): 1168-1182
Computerized Classification Testing (CCT) can classify examinees efficiently and has been widely applied in mastery testing and clinical psychology. As a key component of CCT, the termination rule determines when the test stops and into which category an examinee is ultimately classified, and therefore directly affects test efficiency and classification accuracy. The three existing families of termination rules (likelihood-ratio rules, Bayesian decision-theoretic rules, and confidence-interval rules) are built, respectively, on constructing hypothesis tests, designing loss functions, and comparing the relative positions of confidence intervals. In addition, the termination rules of CCT have developed different concrete forms for different testing scenarios. Future research could further develop Bayesian rules, address multidimensional and multi-category settings, and incorporate response times and machine-learning algorithms. With respect to practical testing needs, all three families of termination rules show application potential for mastery testing, whereas clinical questionnaires tend to favour Bayesian rules.
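The likelihood-ratio family of termination rules mentioned in this abstract is typified by Wald's sequential probability ratio test (SPRT). The following is a rough, self-contained sketch, not the authors' implementation; the cutoff, indifference region, error rates, and Rasch item difficulties are all hypothetical values chosen for illustration.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def sprt_classify(responses, difficulties, theta0=0.0, delta=0.5,
                  alpha=0.05, beta=0.05):
    """SPRT termination rule: stop once the cumulative log likelihood
    ratio for theta0+delta vs. theta0-delta crosses Wald's bounds."""
    upper = math.log((1 - beta) / alpha)   # classify as "master"
    lower = math.log(beta / (1 - alpha))   # classify as "non-master"
    llr = 0.0
    for n, (x, b) in enumerate(zip(responses, difficulties), start=1):
        p_hi = rasch_p(theta0 + delta, b)  # likelihood under "master"
        p_lo = rasch_p(theta0 - delta, b)  # likelihood under "non-master"
        llr += math.log(p_hi / p_lo) if x == 1 else math.log((1 - p_hi) / (1 - p_lo))
        if llr >= upper:
            return "master", n
        if llr <= lower:
            return "non-master", n
    return "undecided", len(responses)   # item pool exhausted

# A run of correct answers drives the LLR to the upper bound early,
# so the test terminates before all 12 items are administered.
label, items_used = sprt_classify([1] * 12, [0.0] * 12)
```

The early stop is the point of the rule: a clearly classifiable examinee is released after a handful of items rather than the full pool.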

2.
The issue of measurement invariance commonly arises in factor-analytic contexts, with methods for assessment including likelihood ratio tests, Lagrange multiplier tests, and Wald tests. These tests all require advance definition of the number of groups, group membership, and offending model parameters. In this paper, we study tests of measurement invariance based on stochastic processes of casewise derivatives of the likelihood function. These tests can be viewed as generalizations of the Lagrange multiplier test, and they are especially useful for: (i) identifying subgroups of individuals that violate measurement invariance along a continuous auxiliary variable without prespecified thresholds, and (ii) identifying specific parameters impacted by measurement invariance violations. The tests are presented and illustrated in detail, including an application to a study of stereotype threat and simulations examining the tests’ abilities in controlled conditions.
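A toy sketch of the idea behind such score-based tests, using a simple normal-mean model rather than a factor model (all data and the auxiliary ordering are invented): cases are sorted along the auxiliary variable, the casewise score contributions are cumulated, and the maximum of the scaled cumulative process serves as the test statistic.

```python
import math

def score_process_stat(x, order):
    """Double-maximum statistic from the cumulative sum of casewise
    score contributions, with cases sorted by an auxiliary variable."""
    n = len(x)
    pairs = sorted(zip(order, x))              # order cases along the auxiliary variable
    xs = [v for _, v in pairs]
    mu = sum(xs) / n                           # ML estimate under the null (invariance)
    sigma2 = sum((v - mu) ** 2 for v in xs) / n
    scores = [(v - mu) / sigma2 for v in xs]   # casewise derivatives of the log-likelihood
    info = 1.0 / sigma2                        # Fisher information for the mean
    scale = 1.0 / math.sqrt(n * info)
    stat, csum = 0.0, 0.0
    for s in scores:
        csum += s
        stat = max(stat, abs(scale * csum))    # sup of the scaled score process
    return stat

# A mean shift halfway along the auxiliary variable inflates the statistic.
stable = [(-1) ** i * 0.5 for i in range(40)]
shifted = [(-1) ** i * 0.5 for i in range(20)] + [3.0 + (-1) ** i * 0.5 for i in range(20)]
aux = list(range(40))
stat_stable = score_process_stat(stable, aux)
stat_shift = score_process_stat(shifted, aux)
```

Under the null, the process behaves like a Brownian bridge, which is what supplies critical values in the actual tests; here only the contrast between the two data sets is shown.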

3.
When considering dyadic data, one of the questions is whether the roles of the two dyad members can be considered equal. This question may be answered empirically using indistinguishability tests in the actor–partner interdependence model. In this paper several issues related to such indistinguishability tests are discussed: the difference between maximum likelihood and restricted maximum likelihood based tests for equality in variance parameters; the choice between the structural equation modelling and multilevel modelling framework; and the use of sequential testing rather than one global test for a set of indistinguishability tests. Based on simulation studies, we provide guidelines for best practice. All different types of tests are illustrated with cross-sectional and longitudinal data, and corroborated with corresponding R code.

4.
The present paper is concerned with testing the fit of the Rasch model. It is shown that this can be achieved by constructing functions of the data, on which model tests can be based that have power against specific model violations. It is shown that the asymptotic distribution of these tests can be derived by using the theoretical framework of testing model fit in general multinomial and product-multinomial models. The model tests are presented in two versions: one that can be used in the context of marginal maximum likelihood estimation and one that can be applied in the context of conditional maximum likelihood estimation.

I am indebted to Norman Verhelst and Niels Veldhuijzen for their helpful comments. Requests for reprints should be sent to Cees A. W. Glas, Cito, PO Box 1034, 6801 MG Arnhem, THE NETHERLANDS.

5.
Two new tests for a model for the response times on pure speed tests by Rasch (1960) are proposed. The model is based on the assumption that the test response times are approximately gamma distributed, with known index parameters and unknown rate parameters. The rate parameters are decomposed into a subject ability parameter and a test difficulty parameter. By treating the ability as a gamma distributed random variable, maximum marginal likelihood (MML) estimators for the test difficulty parameters and the parameters of the ability distribution are easily derived. The model tests proposed here also pertain to the MML framework. Two tests, or modification indices, are proposed: the first focuses on the assumption of local stochastic independence, the second on the assumed test characteristic functions. The tests are based on Lagrange multiplier statistics and can therefore be computed using the parameter estimates under the null model, so model violations for all items and pairs of items can be assessed as a by-product of a single estimation run. Power studies and applications to real data are included as numerical examples.
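The pure speed model described here can be sketched with a small simulation. The parameterisation rate = ability / difficulty (so weaker subjects and harder tests yield longer expected times, E[T] = index × difficulty / ability) is one plausible reading of the decomposition, and all parameter values below are hypothetical.

```python
import random
import statistics

def simulate_times(n, index_k, ability, difficulty, seed=1):
    """Draw gamma-distributed response times with a known index (shape)
    parameter and rate = ability / difficulty."""
    rng = random.Random(seed)
    scale = difficulty / ability          # scale = 1 / rate
    return [rng.gammavariate(index_k, scale) for _ in range(n)]

# Hypothetical parameters: index 3, ability 2, difficulty 1,
# so the expected response time is 3 * 1.0 / 2.0 = 1.5.
times = simulate_times(20000, 3, 2.0, 1.0)
mean_t = statistics.fmean(times)
```

Doubling the ability parameter halves the expected time, which is the kind of moment restriction the MML machinery in the abstract exploits.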

6.
Changes in dichotomous data caused by treatments can be analyzed by means of the so-called linear logistic model with relaxed assumptions (LLRA). The LLRA does not require observable criteria representing a single underlying latent trait, but it postulates the generalizability of the treatment effects over criteria and subjects. To test this latter crucial assumption, the mixture LLRA was proposed that allows directly unobservable types of subjects to have different treatment effects. As the earlier methods for estimating the parameters of the mixture LLRA have specific drawbacks, a further method based on the conditional maximum likelihood principle will be presented here. In contrast to the earlier conditional methods, it uses all of the dichotomous change data while having fewer parameters. Further, its goodness-of-fit tests become more sensitive to a falsely specified number of change-types even though the treatment effects are biased. For typically occurring small to moderate sample sizes, however, parametric bootstrapping of the distributions of the fit statistics is recommended for performing hypotheses tests. Finally, three applications of the new method to empirical data are described: first, about the effect of the so-called Trager psychophysical integration, second, about the effect of autogenic therapy on patients with psychosomatic symptoms, and, third, about the effect of religious education on the attitude towards sects. The mixture LLRA is implemented in the menu-driven program MIXLLRA which can be obtained from Ivo Ponocny via e-mail (ivo.ponocny@univie.ac.at).

7.
For computer-administered tests, response times can be recorded conjointly with the corresponding responses. This broadens the scope of potential modelling approaches because response times can be analysed in addition to analysing the responses themselves. For this purpose, we present a new latent trait model for response times on tests. This model is based on the Cox proportional hazards model. According to this model, latent variables alter a baseline hazard function. Two different approaches to item parameter estimation are described: the first approach uses a variant of the Cox model for discrete time, whereas the second approach is based on a profile likelihood function. Properties of each estimator will be compared in a simulation study. Compared to the estimator for discrete time, the profile likelihood estimator is more efficient, that is, has smaller variance. Additionally, we show how the fit of the model can be evaluated and how the latent traits can be estimated. Finally, the applicability of the model to an empirical data set is demonstrated.

8.
Null hypothesis significance tests are commonly used to provide a link between empirical evidence and theoretical interpretation. However, this strategy is prone to the "p-value fallacy" in which effects and interactions are classified as either "noise" or "real" based on whether the associated p value is greater or less than .05. This dichotomous classification can lead to dramatic misconstruals of the evidence provided by an experiment. For example, it is quite possible to have similar patterns of means that lead to entirely different patterns of significance, and one can easily find the same patterns of significance that are associated with completely different patterns of means. Describing data in terms of an inventory of significant and nonsignificant effects can thus completely misrepresent the results. An alternative analytical technique is to identify competing interpretations of the data and then use likelihood ratios to assess which interpretation provides the better account. Several different methods of calculating the likelihood ratios are illustrated. It is argued that this approach satisfies a principle of "graded evidence," according to which similar data should provide similar evidence.
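The likelihood-ratio strategy advocated here can be illustrated with a minimal sketch comparing a "two means" against a "common mean" account of two groups; a known unit variance is assumed and the data are invented for illustration.

```python
import math
import statistics

def normal_loglik(data, mu, sigma):
    """Log-likelihood of the data under N(mu, sigma^2)."""
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

def likelihood_ratio(group_a, group_b, sigma=1.0):
    """LR for 'two distinct means' vs. 'one common mean'."""
    mu_a = statistics.fmean(group_a)
    mu_b = statistics.fmean(group_b)
    mu_pooled = statistics.fmean(group_a + group_b)
    ll_two = normal_loglik(group_a, mu_a, sigma) + normal_loglik(group_b, mu_b, sigma)
    ll_one = normal_loglik(group_a + group_b, mu_pooled, sigma)
    return math.exp(ll_two - ll_one)

# Clearly separated groups: the evidence strongly favours the two-mean account.
lr_far = likelihood_ratio([1.9, 2.1, 2.0], [4.0, 3.9, 4.1])
# Overlapping groups: the ratio stays near 1 -- graded, not all-or-none.
lr_near = likelihood_ratio([1.9, 2.1, 2.0], [2.0, 1.9, 2.1])
```

Unlike a significance threshold, the ratio degrades smoothly as the two data patterns become more similar, which is exactly the "graded evidence" property the abstract argues for.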

9.
Martin-Löf, P. Synthese, 1977, 36(2): 195-206
This paper proposes a uniform method for constructing tests, confidence regions and point estimates which is called exact since it reduces to Fisher's so-called exact test in the case of the hypothesis of independence in a 2 × 2 contingency table. All the well-known standard tests based on exact sampling distributions are instances of the exact test in its general form. The likelihood ratio and χ2 tests as well as the maximum likelihood estimate appear as asymptotic approximations to the corresponding exact procedures.
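Fisher's exact test for a 2 × 2 table, to which the general method reduces, can be computed directly from the hypergeometric distribution. The two-sided convention below (summing the probabilities of all tables no more probable than the observed one) is one common choice, and the tables are invented examples.

```python
import math

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    def hyper(k):
        # Hypergeometric probability of k successes in the first cell.
        return (math.comb(col1, k) * math.comb(n - col1, row1 - k)
                / math.comb(n, row1))
    p_obs = hyper(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum over all tables at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(hyper(k) for k in range(lo, hi + 1)
               if hyper(k) <= p_obs * (1 + 1e-12))

p = fisher_exact_2x2(3, 1, 1, 3)          # mild association: p = 34/70
p_strong = fisher_exact_2x2(9, 1, 1, 9)   # strong association: small p
```

For the table [[3, 1], [1, 3]] the enumeration is tiny (five possible tables), which makes the exact sampling-distribution idea easy to verify by hand.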

10.
Measurement invariance is a fundamental assumption in item response theory models, where the relationship between a latent construct (ability) and observed item responses is of interest. Violation of this assumption would render the scale misinterpreted or cause systematic bias against certain groups of persons. While a number of methods have been proposed to detect measurement invariance violations, they typically require advance definition of problematic item parameters and respondent grouping information. In practice, however, this information is usually unknown. As an alternative, this paper focuses on a family of recently proposed tests based on stochastic processes of casewise derivatives of the likelihood function (i.e., scores). These score-based tests only require estimation of the null model (when measurement invariance is assumed to hold), and they have been previously applied in factor-analytic, continuous data contexts as well as in models of the Rasch family. In this paper, we aim to extend these tests to two-parameter item response models, with strong emphasis on pairwise maximum likelihood. The tests’ theoretical background and implementation are detailed, and the tests’ abilities to identify problematic item parameters are studied via simulation. An empirical example illustrating the tests’ use in practice is also provided.

11.
In the study of perceptual organization, the Occamian simplicity principle (which promotes efficiency) and the Helmholtzian likelihood principle (which promotes veridicality) have been claimed to be equivalent. Proposed models of these principles may well yield similar outcomes (especially in everyday situations), but as argued here, claims that the principles are equivalent confused subjective probabilities (which are used in Bayesian models of the Occamian simplicity principle) and objective probabilities (which are needed in Bayesian models of the Helmholtzian likelihood principle). Furthermore, Occamian counterparts of Bayesian priors and conditionals have led to another confusion, which seems to have been triggered by a dual role of regularity in perception. This confusion is discussed by contrasting complete and incomplete Occamian approaches to perceptual organization.

12.
Applications of item response theory, which depend upon its parameter invariance property, require that parameter estimates be unbiased. A new method, weighted likelihood estimation (WLE), is derived, and proved to be less biased than maximum likelihood estimation (MLE) with the same asymptotic variance and normal distribution. WLE removes the first order bias term from MLE. Two Monte Carlo studies compare WLE with MLE and Bayesian modal estimation (BME) of ability in conventional tests and tailored tests, assuming the item parameters are known constants. The Monte Carlo studies favor WLE over MLE and BME on several criteria over a wide range of the ability scale.
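Warm's weighted likelihood estimator maximises the log-likelihood plus half the log of the test information, which removes the first-order bias term of the MLE. A crude grid-search sketch for the Rasch case follows, with hypothetical item difficulties and a perfect response pattern, the classic situation in which the MLE diverges while the WLE stays finite.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses, difficulties, weighted, grid_step=0.001):
    """Grid-search ability estimate on [-6, 6]: the MLE maximises the
    log-likelihood; the WLE adds the penalty 0.5 * log(test information)."""
    best_theta, best_val = None, -math.inf
    for i in range(int(12 / grid_step) + 1):
        theta = -6.0 + i * grid_step
        ll, info = 0.0, 0.0
        for x, b in zip(responses, difficulties):
            p = rasch_p(theta, b)
            ll += math.log(p) if x == 1 else math.log(1 - p)
            info += p * (1 - p)          # Rasch test information
        val = ll + (0.5 * math.log(info) if weighted else 0.0)
        if val > best_val:
            best_theta, best_val = theta, val
    return best_theta

# Perfect score: the MLE escapes to the grid boundary, the WLE stays interior.
items = [-1.0, -0.5, 0.0, 0.5, 1.0]
mle = estimate_theta([1] * 5, items, weighted=False)
wle = estimate_theta([1] * 5, items, weighted=True)
```

The information penalty vanishes asymptotically, which is why WLE keeps the same asymptotic variance as MLE while shrinking extreme estimates.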

13.
A goodness-of-fit test based on the maximum likelihood criterion is derived for use in evaluating models of choice reaction time that predict choice probabilities and means and variances of latency. Special cases of the test involving models that predict only one or two of these statistics are considered and shown to be asymptotically identical to the traditional goodness-of-fit tests appropriate for these special cases.

14.
We demonstrate the use of a multidimensional extension of the latent Markov model to analyse data from studies with repeated binary responses in developmental psychology. In particular, we consider an experiment based on a battery of tests which was administered to pre-school children, at three time periods, in order to measure their inhibitory control (IC) and attentional flexibility (AF) abilities. Our model represents these abilities by two latent traits which are associated to each state of a latent Markov chain. The conditional distribution of the test outcomes given the latent process depends on these abilities through a multidimensional one-parameter or two-parameter logistic parameterisation. We outline an EM algorithm for likelihood inference on the model parameters; we also focus on likelihood ratio testing of hypotheses on the dimensionality of the model and on the transition matrices of the latent process. Through the approach based on the proposed model, we find evidence that supports that IC and AF can be conceptualised as distinct constructs. Furthermore, we outline developmental aspects of participants’ performance on these abilities based on inspection of the estimated transition matrices.

15.
Latent trait models for responses and response times in tests often lack a substantial interpretation in terms of a cognitive process model. This is a drawback because process models are helpful in clarifying the meaning of the latent traits. In the present paper, a new model for responses and response times in tests is presented. The model is based on the proportional hazards model for competing risks. Two processes are assumed, one reflecting the increase in knowledge and the second the tendency to discontinue. The processes can be characterized by two proportional hazards models whose baseline hazard functions correspond to the temporary increase in knowledge and discouragement. The model can be calibrated with marginal maximum likelihood estimation and an application of the ECM algorithm. Two tests of model fit are proposed. The amenability of the proposed approaches to model calibration and model evaluation is demonstrated in a simulation study. Finally, the model is used for the analysis of two empirical data sets.

16.
In applications of item response theory, assessment of model fit is a critical issue. Recently, limited-information goodness-of-fit testing has received increased attention in the psychometrics literature. In contrast to full-information test statistics such as Pearson’s X2 or the likelihood ratio G2, these limited-information tests utilize lower-order marginal tables rather than the full contingency table. A notable example is Maydeu-Olivares and colleagues’ M2 family of statistics based on univariate and bivariate margins. When the contingency table is sparse, tests based on M2 retain better Type I error rate control than the full-information tests and can be more powerful. While in principle the M2 statistic can be extended to test hierarchical multidimensional item factor models (e.g., bifactor and testlet models), the computation is non-trivial. To obtain M2, a researcher often has to obtain (many thousands of) marginal probabilities, derivatives, and weights. Each of these must be approximated with high-dimensional numerical integration. We propose a dimension reduction method that can take advantage of the hierarchical factor structure so that the integrals can be approximated far more efficiently. We also propose a new test statistic that can be substantially better calibrated and more powerful than the original M2 statistic when the test is long and the items are polytomous. We use simulations to demonstrate the performance of our new methods and illustrate their effectiveness with applications to real data.

17.
The triple-match principle (TMP) proposes that the strongest, interactive relationships between job demands and job resources are observed when job demands, job resources and job-related outcomes are based on qualitatively identical dimensions. This principle is tested with regard to three outcomes: cognitive failure, emotional exhaustion, and physical health complaints. Data were collected in a large sample of employees in the technology sector (n = 1533). Results demonstrate that the positive association between emotional job demands and emotional exhaustion is compensated by the availability of emotional job resources. No triple-match interactions are found with regard to cognitive failure or physical health complaints. In line with the TMP, results show that the likelihood of finding theoretically valid interactions is related to the degree of match between job demands, job resources, and outcomes.

18.
Networks of relationships between individuals influence individual and collective outcomes and are therefore of interest in social psychology, sociology, the health sciences, and other fields. We consider network panel data, a common form of longitudinal network data. In the framework of estimating functions, which includes the method of moments as well as the method of maximum likelihood, we propose score-type tests. These share with other score-type tests, including Pearson’s classic goodness-of-fit test, the property of comparing the observed value of a function of the data to values predicted by a model. The score-type tests are most useful in forward model selection and as tests of homogeneity assumptions, and possess substantial computational advantages. We derive one-step estimators which are useful as starting values of parameters in forward model selection and therefore complement the usefulness of the score-type tests. The finite-sample behaviour of the score-type tests is studied by Monte Carlo simulation and compared to t-type tests.

19.
20.

Copyright©北京勤云科技发展有限公司  京ICP备09084417号