Similar Documents
20 similar documents retrieved.
1.
For item responses fitting the Rasch model, the assumptions underlying the Mokken model of double monotonicity are met. This makes non-parametric item response theory a natural starting point for Rasch item analysis. This paper studies scalability coefficients based on Loevinger's H coefficient, which summarizes the number of Guttman errors in the data matrix. These coefficients are shown to yield efficient tests of the Rasch model, with p-values computed by Markov chain Monte Carlo methods. The power of the tests of unequal item discrimination, and their ability to distinguish between local dependence and unequal item discrimination, are discussed. The methods are illustrated and motivated using a simulation study and a real data example.
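For readers unfamiliar with the coefficient, here is a minimal sketch of the overall Loevinger H for a binary item-score matrix: observed Guttman errors are compared against the count expected under marginal independence of the items. The function name and this particular pairwise formulation are illustrative, not taken from the paper.

```python
import numpy as np

def loevinger_h(X):
    """Overall Loevinger H for a binary item-score matrix X (persons x items).

    H = 1 - (observed Guttman errors) / (errors expected under marginal
    independence), summed over all item pairs, items ordered from most
    to least popular.
    """
    X = np.asarray(X)
    n, k = X.shape
    X = X[:, np.argsort(-X.mean(axis=0))]   # easiest (most popular) items first
    f_err = e_err = 0.0
    for g in range(k - 1):
        for h in range(g + 1, k):
            # Guttman error: fail the easier item g but pass the harder item h
            f_err += np.sum((X[:, g] == 0) & (X[:, h] == 1))
            e_err += n * np.mean(X[:, g] == 0) * np.mean(X[:, h] == 1)
    return 1.0 - f_err / e_err
```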

2.
Multiple imputation under a two-way model with error is a simple and effective method that has been used to handle missing item scores in unidimensional test and questionnaire data. Extensions of this method to multidimensional data are proposed. A simulation study is used to investigate whether these extensions produce biased estimates of important statistics in multidimensional data, and to compare them with listwise deletion (a lower benchmark), two-way with error, and multivariate normal imputation. The new methods produce smaller bias in several psychometrically interesting statistics than the existing methods of two-way with error and multivariate normal imputation. One of these new methods is clearly preferable for handling missing item scores in multidimensional test data.
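A hedged sketch of the unidimensional two-way-with-error imputation step as it is commonly described (person mean + item mean − overall mean, plus a normal residual); the residual-variance estimate used here is an assumption, not necessarily the exact variant studied in the paper.

```python
import numpy as np

def two_way_with_error(X, rng):
    """Single imputation of missing item scores (np.nan) under a two-way
    model: person mean + item mean - overall mean + normal error.
    Assumes every person and every item has at least one observed score."""
    X = np.array(X, dtype=float)
    miss = np.isnan(X)
    pm = np.nanmean(X, axis=1, keepdims=True)        # person means
    im = np.nanmean(X, axis=0, keepdims=True)        # item means
    om = np.nanmean(X)                               # overall mean
    fitted = pm + im - om
    sigma = np.sqrt(np.nanmean((X - fitted) ** 2))   # residual SD, observed cells
    X[miss] = fitted[miss] + rng.normal(0.0, sigma, size=miss.sum())
    return X
```

Repeating the draw m times with fresh errors yields the multiple imputations that are then analyzed and pooled.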

3.
Missing data, such as item responses in multilevel data, are ubiquitous in educational research settings. Researchers in the item response theory (IRT) context have shown that ignoring such missing data can create problems in the estimation of the IRT model parameters. Consequently, several imputation methods for dealing with missing item data have been proposed and shown to be effective when applied with traditional IRT models. Additionally, a nonimputation direct likelihood analysis has been shown to be an effective tool for handling missing observations in clustered data settings. This study investigates the performance of six simple imputation methods, which have been found to be useful in other IRT contexts, versus a direct likelihood analysis, in multilevel data from educational settings. Multilevel item response data were simulated on the basis of two empirical data sets, and some of the item scores were deleted, such that they were either missing completely at random (MCAR) or missing at random (MAR). An explanatory IRT model was used for modeling the complete, incomplete, and imputed data sets. We showed that direct likelihood analysis of the incomplete data sets produced unbiased parameter estimates that were comparable to those from a complete data analysis. Multiple-imputation approaches of the two-way mean and corrected item mean substitution methods displayed varying degrees of effectiveness in imputing data that in turn could produce unbiased parameter estimates. The simple random imputation, adjusted random imputation, item mean substitution, and regression imputation methods seemed to be less effective in imputing missing item scores in multilevel data settings.
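As one illustration, here is a sketch of corrected item-mean substitution as it is usually defined: the missing item's mean is scaled by the person's level on the items he or she did answer, relative to those items' means. Treat the exact formula as an assumption, since the abstract does not reproduce it.

```python
import numpy as np

def corrected_item_mean(X):
    """Corrected item-mean substitution: replace a missing score on item j
    by the item mean, scaled by person i's observed total relative to the
    means of the items person i actually answered."""
    X = np.array(X, dtype=float)
    miss = np.isnan(X)                      # fixed mask so fills don't interact
    im = np.nanmean(X, axis=0)              # item means from observed data
    for i, j in zip(*np.where(miss)):
        obs = ~miss[i]                      # items person i answered
        X[i, j] = im[j] * X[i, obs].sum() / im[obs].sum()
    return X
```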

4.
This paper discusses the limitations of Cronbach's alpha as a sole index of reliability, showing how Cronbach's alpha is analytically limited in its ability to capture important measurement errors and scale dimensionality, and how it is not invariant under variations of scale length, interitem correlation, and sample characteristics. In addition, the limitations and strengths of several recommendations on how to ameliorate these problems were critically reviewed. It was shown that reliance on Cronbach's alpha as a sole index of reliability is no longer sufficiently warranted. This requires that other indices of internal consistency be reported along with the alpha coefficient and that, when a scale is composed of a large number of items, factor analysis should be performed and an appropriate internal-consistency estimation method applied. This approach, if adopted, will largely guard against uncritical use of the Cronbach's alpha coefficient.
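For reference, a minimal sketch of the standard sample formula for the coefficient the article critiques:

```python
import numpy as np

def cronbach_alpha(X):
    """Sample Cronbach's alpha for an item-score matrix X (persons x items):
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_var = X.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = X.sum(axis=1).var(ddof=1)       # variance of total scores
    return k / (k - 1) * (1.0 - item_var / total_var)
```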

5.
The perceived stress scale (PSS) has been translated into several languages and validated in many cultures. The longer 14-item version (PSS-14) has been translated into Swedish and validated for Swedish use. However, the Swedish version of the shorter 10-item version (PSS-10) has not been validated before. Therefore, the aim of this study was to evaluate the Swedish version of the PSS-10 with regard to reliability and validity, and to provide normative data. Data from 3,406 individuals who took part in the Västerbotten Environmental Health Study in Sweden were used. The respondents constitute a random sample, aged 18 to 79 years, stratified for age and sex. They responded to the Swedish version of the PSS-10 as well as to the hospital anxiety and depression scale and the Shirom-Melamed burnout questionnaire for assessment of construct validity. The results show that the PSS-10 provides approximately normally distributed data, has good internal reliability (Cronbach's alpha 0.84), and has good construct validity with anxiety (r = 0.68), depression (r = 0.57), and mental/physical exhaustion (r = 0.71). The favorable psychometric properties of the Swedish version of the PSS-10 support use of the instrument for assessing perceived stress in Swedish and similar populations.

6.
For item response theory (IRT) models, which belong to the class of generalized linear or non-linear mixed models, reliability at the scale of observed scores (i.e., manifest correlation) is more difficult to calculate than latent-correlation-based reliability, but it is usually of greater scientific interest. This is not least because it cannot be calculated explicitly when the logit link is used in conjunction with normal random effects. As such, approximations such as Fisher's information coefficient, Cronbach's α, or the latent correlation are calculated instead, ostensibly because they are easy to compute. Cronbach's α has well-known and serious drawbacks, Fisher's information is not meaningful under certain circumstances, and there is an important but often overlooked difference between latent and manifest correlations. Here, manifest correlation refers to correlation between observed scores, while latent correlation refers to correlation between scores at the latent (e.g., logit or probit) scale. Thus, using one in place of the other can lead to erroneous conclusions. Taylor series based reliability measures, which are based on manifest correlation functions, are derived, and a careful comparison of reliability measures based on latent correlations, Fisher's information, and exact reliability is carried out. The latent correlations are virtually always considerably higher than their manifest counterparts, Fisher's information measure shows no coherent behaviour (it is even negative in some cases), while the newly introduced Taylor series based approximations reflect the exact reliability very closely. Comparisons among the various types of correlations, for various IRT models, are made using algebraic expressions, Monte Carlo simulations, and data analysis. Given the light computational burden and the performance of Taylor series based reliability measures, their use is recommended.
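The latent/manifest distinction can be made concrete with a small Monte Carlo check of parallel-test ("exact") reliability at the observed-score scale. This sketch is not the paper's Taylor series approximation; the 2PL item parameters below are hypothetical.

```python
import numpy as np

def manifest_reliability_2pl(a, b, n=100_000, seed=7):
    """Parallel-test reliability at the observed-score scale for a 2PL model,
    by Monte Carlo: correlate sum scores from two replicate test forms that
    share the same latent trait values."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n)                        # latent traits
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))   # response probabilities
    s1 = (rng.random(p.shape) < p).sum(axis=1)            # form 1 sum scores
    s2 = (rng.random(p.shape) < p).sum(axis=1)            # form 2 sum scores
    return np.corrcoef(s1, s2)[0, 1]

# Hypothetical 20-item test: equal discriminations, spread-out difficulties.
print(manifest_reliability_2pl(a=np.full(20, 1.2), b=np.linspace(-2, 2, 20)))
```

Comparing this observed-score correlation with the correlation of the latent trait across forms (which is 1 here by construction, since theta is shared) shows why the two scales must not be conflated.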

7.
The occupational value with predefined items scale is a Swedish tool, originally comprising 26 items, used to assess the values people find in their everyday doings. The present study validated this scale on a Turkish sample and described the values that Turks perceived in their daily doings. The participants were a convenience sample of 446 adults with a mean age of 26 years (SD = 7.3). Initial item analysis, followed by principal component analysis (Promax rotation) and internal reliability analyses of the components, was conducted. The analyses yielded a 19-item solution distributed across four factors. Cronbach's alpha was .86, indicating good reliability. Confirming earlier applications of the scale in European and American samples, factors related to recuperation, goal direction, and social interaction emerged. Additionally, another occupational value subfactor appeared, conservation, which did not show up in the Swedish and American data analyses.

8.
Literature addressing missing data handling for random coefficient models is particularly scant, and the few studies to date have focused on the fully conditional specification framework and “reverse random coefficient” imputation. Although it has not received much attention in the literature, a joint modeling strategy that uses random within-cluster covariance matrices to preserve cluster-specific associations is a promising alternative for random coefficient analyses. This study is apparently the first to directly compare these procedures. Analytic results suggest that both imputation procedures can introduce bias-inducing incompatibilities with a random coefficient analysis model. Problems with fully conditional specification result from an incorrect distributional assumption, whereas joint imputation uses an underparameterized model that assumes uncorrelated intercepts and slopes. Monte Carlo simulations suggest that biases from these issues are tolerable if the missing data rate is 10% or lower and the sample is composed of at least 30 clusters with 15 observations per group. Furthermore, fully conditional specification tends to be superior with intraclass correlations that are typical of cross-sectional data (e.g., ICC = .10), whereas the joint model is preferable with high values typical of longitudinal designs (e.g., ICC = .50).

9.
When datasets are affected by nonresponse, imputation of the missing values is a viable solution. However, most imputation routines implemented in commonly used statistical software packages do not accommodate the multilevel models that are popular in education research and other settings involving clustering of units. A common strategy to take the hierarchical structure of the data into account is to include cluster-specific fixed effects in the imputation model. Still, this ad hoc approach has never been compared analytically to congenial multilevel imputation in a random-slopes setting. In this paper, we evaluate the impact of the cluster-specific fixed-effects imputation model on multilevel inference. We show analytically that the cluster-specific fixed-effects imputation strategy will generally bias inferences obtained from random coefficient models. The bias of random-effects variances and global fixed-effects confidence intervals depends on the cluster size, the relation between within- and between-cluster variance, and the missing data mechanism. We illustrate the negative implications of cluster-specific fixed-effects imputation using simulation studies and an application based on data from the National Educational Panel Study (NEPS) in Germany.
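A minimal sketch of the ad hoc strategy under discussion, single-draw regression imputation with cluster dummies, under the assumption of one covariate and normal residuals; variable names are illustrative.

```python
import numpy as np

def fixed_effects_impute(y, x, cluster, rng):
    """Single-draw regression imputation of missing y (np.nan) from one
    covariate x plus cluster-specific dummy intercepts -- the ad hoc
    fixed-effects strategy evaluated in the paper."""
    y = np.array(y, dtype=float)
    miss = np.isnan(y)
    dummies = (np.asarray(cluster)[:, None] == np.unique(cluster)).astype(float)
    X = np.column_stack([np.asarray(x, dtype=float), dummies])
    beta, *_ = np.linalg.lstsq(X[~miss], y[~miss], rcond=None)
    resid = y[~miss] - X[~miss] @ beta
    sigma = resid.std(ddof=X.shape[1])           # residual SD
    y[miss] = X[miss] @ beta + rng.normal(0.0, sigma, miss.sum())
    return y
```

Note that this fixes only the structure of the ad hoc competitor: a congenial alternative would impute from a multilevel model with random intercepts and slopes, and a proper multiple-imputation version would also draw beta and sigma from their posterior.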

10.
11.
This article presents a new methodology for solving problems resulting from missing data in large-scale item performance behavioral databases. Useful statistics corrected for missing data are described, and a new method of imputation for missing data is proposed. This methodology is applied to the Dutch Lexicon Project database recently published by Keuleers, Diependaele, and Brysbaert (Frontiers in Psychology, 1, 174, 2010), allowing us to conclude that this database fulfills the conditions for use of the method recently proposed by Courrieu, Brand-D'Abrescia, Peereman, Spieler, and Rey (2011) for testing item performance models. Two application programs in MATLAB code are provided: one for the imputation of missing data in databases and one for the computation of corrected statistics to test models.

12.
Since 1998, soldiers deployed to war zones with the Danish Defense (≈31,000) have been invited to fill out a questionnaire on post-mission reactions. This provides a unique data source for studying the psychological toll of war. Here, we validate a measure of PTSD symptoms from the questionnaire. Soldiers from two cohorts deployed to Afghanistan with the International Security Assistance Force (ISAF) in 2009 (ISAF7, N = 334) and 2013 (ISAF15, N = 278) filled out a standard questionnaire (Psychological Reactions following International Missions, PRIM) concerning a range of post-deployment reactions, including symptoms of PTSD (PRIM-PTSD). They also filled out a validated measure of PTSD symptoms in DSM-IV, the PTSD checklist (PCL). We tested the reliability of the PRIM-PTSD by estimating Cronbach's alpha, and tested validity by correlating items, clusters, and the overall scale with the corresponding items in the PCL. Furthermore, we fitted two confirmatory factor analytic models to test the factor structure of the PRIM-PTSD, and tested measurement invariance of the selected model. Finally, we established a screening cutoff and a clinical cutoff score using ROC analysis. We found high internal consistency of the PRIM-PTSD (Cronbach's alpha = 0.88 in both cohorts) and strong item-item (0.48–0.83), item-cluster (0.43–0.72), cluster-cluster (0.71–0.82), and full-scale (0.86–0.88) correlations between the PRIM-PTSD and the PCL. The factor analyses showed adequate fit of a one-factor model, which was also found to display strong measurement invariance across cohorts. ROC curve analysis established cutoff scores for screening (sensitivity = 1, specificity = 0.93) and clinical use (sensitivity = 0.71, specificity = 0.98). In conclusion, we find that the PRIM-PTSD is a valid measure for assessing PTSD symptoms in Danish soldiers following deployment.
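Cutoff selection from a ROC analysis can be sketched as follows. The abstract does not say which criterion the authors optimized, so Youden's J here is an illustrative choice, not the study's method.

```python
import numpy as np

def youden_cutoff(scores, is_case):
    """Choose the cutoff maximizing sensitivity + specificity - 1 (Youden's J),
    one common way to derive a screening cutoff from a ROC analysis."""
    scores = np.asarray(scores, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    best_c, best_j = None, -np.inf
    for c in np.unique(scores):
        flagged = scores >= c
        sens = flagged[is_case].mean()        # true positive rate
        spec = (~flagged[~is_case]).mean()    # true negative rate
        if sens + spec - 1 > best_j:
            best_c, best_j = c, sens + spec - 1
    return best_c, best_j
```

A clinical cutoff like the one reported (high specificity at some cost in sensitivity) would instead be chosen by constraining specificity and maximizing sensitivity subject to it.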

13.
Tests of homogeneity of covariances (or homoscedasticity) among several groups have many applications in statistical analysis. In the context of incomplete data analysis, tests of homoscedasticity among groups of cases with identical missing data patterns have been proposed to test whether data are missing completely at random (MCAR). These tests of MCAR require large sample sizes n and/or large group sample sizes n_i, and they usually fail when applied to nonnormal data. Hawkins (Technometrics 23:105–110, 1981) proposed a test of multivariate normality and homoscedasticity that is an exact test for complete data when the n_i are small. This paper proposes a modification of this test for complete data to improve its performance, and extends its application to tests of homoscedasticity and MCAR when data are multivariate normal and incomplete. Moreover, it is shown that the statistic used in the Hawkins test, in conjunction with a nonparametric k-sample test, can be used to obtain a nonparametric test of homoscedasticity that works well for both normal and nonnormal data. It is explained how a combination of the proposed normal-theory Hawkins test and the nonparametric test can be employed to test for homoscedasticity, MCAR, and multivariate normality. Simulation studies show that the newly proposed tests generally outperform their existing competitors in terms of Type I error rejection rates, and a power study of the proposed tests indicates good power. The proposed methods use appropriate missing data imputations to impute missing data. Methods of multiple imputation are described, and one of them is employed to confirm the results of our single imputation methods. Examples are provided where multiple imputation enables one to identify a group or groups whose covariance matrices differ from the majority of other groups.

14.
Analyses of multivariate data are frequently hampered by missing values. Until recently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and computational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simulation technique that replaces each missing datum with a set of m > 1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997a) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from the Adolescent Alcohol Prevention Trial (Hansen & Graham, 1991).
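The "simple rules" mentioned here are Rubin's combining rules, which are compact enough to show directly; a minimal sketch for a scalar parameter:

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Rubin's (1987) rules for combining m completed-data analyses of a
    scalar parameter: estimates are the m point estimates, variances the
    m squared standard errors."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()                       # pooled point estimate
    u_bar = u.mean()                       # within-imputation variance
    b = q.var(ddof=1)                      # between-imputation variance
    t = u_bar + (1 + 1 / m) * b            # total variance
    df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2   # Rubin's df
    return q_bar, np.sqrt(t), df           # estimate, pooled SE, df for t-tests
```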

15.
Item response theory (IRT) is one of the modern educational and psychological measurement theories used for objective measurement, and it is widely applied in the analysis of large-scale tests, where missing data are very common. For the two-parameter logistic model (2PLM) in IRT, an EM algorithm previously existed only for handling missing responses and missing abilities under the missing-completely-at-random mechanism. This study derives an EM algorithm for the 2PLM in which missing responses are ignored, and proposes an EM algorithm for handling missing responses and missing abilities under the missing-at-random mechanism, as well as a multiple imputation method that accounts for the uncertainty in ability estimates and item responses. The results show that, across various missing data mechanisms, missing proportions, and test designs, the EM algorithm that ignores missing responses and the multiple imputation method perform well.

16.
Little research has been conducted on Loevinger's Washington University Sentence Completion Test of Ego Development in adult psychiatric outpatients. The measure is a promising method of assessing a construct of personality and character functioning that should be useful in research on psychopathology and in choosing treatment modalities. The data presented in this study address the question of the psychometric adequacy of the measure in this segment of the subject population. Specifically, estimates of interrater reliability, internal consistency, and test-retest reliability are presented for a sample of 42 adult outpatients. In addition, the relationship between total protocol ratings and item sum scores is explored.

17.
Measurements of domain knowledge very often use and report Cronbach's alpha or similar indicators of internal consistency for test construction. In this short article, we argue that this approach is often at odds with the theoretical conception of knowledge underlying the measure. While domain knowledge is usually described theoretically as a formative construct (formed by the manifest observations), the use of Cronbach's alpha to construct and evaluate an empirical measure implies a reflective model (the construct is reflected in manifest behaviors). After illustrating the difference between reflective and formative models, we show how this mismatch between theoretical conception and empirical operationalization can have substantial implications for the assessment and modeling of domain knowledge. Specifically, the construct may be operationalized too narrowly, or even be misinterpreted, by applying criteria for item selection that focus on homogeneity, such as Cronbach's alpha. Rather than maximizing items' internal consistency, researchers constructing measures of domain knowledge should therefore make strong arguments for the theoretical merit of their items, even if the items are not correlated with each other.

18.
Several authors have suggested that, prior to conducting a confirmatory factor analysis, it may be useful to group items into a smaller number of item 'parcels' or 'testlets'. The present paper shows mathematically that coefficient alpha based on these parcel scores will exceed alpha based on the entire set of items only if W, the ratio of the average covariance of items between parcels to the average covariance of items within parcels, is greater than unity. If W is less than unity, however, and errors of measurement are uncorrelated, then stratified alpha will be a better lower bound to the reliability of a measure than the other two coefficients. Stratified alpha is also equal to the true reliability of a test when items within parcels are essentially tau-equivalent, if one assumes that errors of measurement are not correlated.
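To make the comparison concrete, here is a sketch of stratified alpha alongside ordinary alpha, using the standard textbook formula; parcels are passed as lists of column indices, and the function names are illustrative.

```python
import numpy as np

def cronbach_alpha(X):
    """Ordinary alpha for an item-score matrix X (persons x items)."""
    k = X.shape[1]
    return k / (k - 1) * (1 - X.var(axis=0, ddof=1).sum()
                          / X.sum(axis=1).var(ddof=1))

def stratified_alpha(X, parcels):
    """Stratified alpha: 1 - sum_j var(Y_j) * (1 - alpha_j) / var(total),
    where Y_j is the score on parcel j (given as a list of column indices)."""
    X = np.asarray(X, dtype=float)
    total_var = X.sum(axis=1).var(ddof=1)
    shortfall = sum(X[:, p].sum(axis=1).var(ddof=1) * (1 - cronbach_alpha(X[:, p]))
                    for p in parcels)
    return 1 - shortfall / total_var
```

Computing item-level alpha, parcel-score alpha, and stratified alpha on the same matrix lets one check the W-ratio ordering the paper derives.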

19.
In the diagnostic evaluation of educational systems, self-reports are commonly used to collect data, both cognitive and orectic. For various reasons, some of the students' data are frequently missing from these self-reports. The main goal of this research is to compare the performance of different imputation methods for missing data in the context of the evaluation of educational systems. On an empirical database of 5,000 subjects, 72 conditions were simulated: three levels of missing data, three types of loss mechanisms, and eight methods of imputation. The levels of missing data were 5%, 10%, and 20%. The loss mechanisms were: missing completely at random, moderately conditioned, and strongly conditioned. The eight imputation methods used were: listwise deletion; replacement by the scale mean, the item mean, the subject mean, or the corrected subject mean; multiple regression; and the expectation-maximization (EM) algorithm with and without auxiliary variables. The results indicate that recovery of the data is more accurate when an appropriate combination of different methods is used. When a case is incomplete, the subject mean works very well, whereas for completely lost data, multiple imputation with the EM algorithm is recommended. This combination is especially recommended when data loss is greater and the loss mechanism is more strongly conditioned. Lastly, the results are discussed, and some future lines of research are outlined.

20.
In some situations where reliability must be estimated, it is impossible to divide the measuring instrument into more than two separately scorable parts. When this is the case, the parts may be homogeneous in content but clearly unequal in length. The resulting scores will not be essentially τ-equivalent, and hence total test reliability cannot be satisfactorily estimated via Cronbach's coefficient alpha. The limitation on the number of parts rules out Kristof's three-part approach. A technique is developed for estimating reliability in such situations, and the approach is shown to function very well when applied to five achievement tests.
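One classical estimator for exactly this two-part, unequal-length situation is the Angoff-Feldt coefficient; the abstract does not name the technique it develops, so take this sketch as context rather than the paper's method.

```python
import numpy as np

def angoff_feldt(part1, part2):
    """Angoff-Feldt reliability estimate for a test split into two parts of
    unequal length; part1 and part2 are per-person part scores.

    r = 4*cov(X1, X2) / (var(X) - (var(X1) - var(X2))**2 / var(X)),
    where X = X1 + X2 is the total score.
    """
    x1 = np.asarray(part1, dtype=float)
    x2 = np.asarray(part2, dtype=float)
    var_x = (x1 + x2).var(ddof=1)        # total-score variance
    cov12 = np.cov(x1, x2)[0, 1]         # between-part covariance
    return 4 * cov12 / (var_x - (x1.var(ddof=1) - x2.var(ddof=1)) ** 2 / var_x)
```

When the two parts happen to have equal variances, the formula reduces to the familiar Spearman-Brown-corrected split-half coefficient.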
