Similar Literature
20 similar documents retrieved (search time: 250 ms).
1.
A row (or column) of an n×n matrix complies with Regular Minimality (RM) if it has a unique minimum entry which is also a unique minimum entry in its column (respectively, row). The number of violations of RM in a matrix is defined as the number of rows (equivalently, columns) that do not comply with RM. We derive a formula for the proportion of n×n matrices with a given number of violations of RM among all n×n matrices with no tied entries. The proportion of matrices with no more than a given number of violations can be treated as the p-value of a permutation test whose null hypothesis states that all permutations of the entries of a matrix without ties are equiprobable, and the alternative hypothesis states that RM violations occur with lower probability than predicted by the null hypothesis. A matrix with ties is treated as being represented by all matrices without ties that have the same set of strict inequalities among their entries.
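To make the definition concrete, here is a minimal sketch (in Python with NumPy; the function name and the random demo matrix are illustrative, not from the paper) that counts RM violations in a square matrix with no tied entries:

```python
import numpy as np

def rm_violations(m: np.ndarray) -> int:
    """Count rows that fail Regular Minimality: a row complies when its
    unique minimum entry is also the unique minimum of its column."""
    n = m.shape[0]
    violations = 0
    for i in range(n):
        j = int(np.argmin(m[i]))       # column position of the row minimum
        if np.argmin(m[:, j]) != i:    # is it also the column minimum?
            violations += 1
    return violations

rng = np.random.default_rng(0)
m = rng.random((5, 5))                 # ties occur with probability zero
print(rm_violations(m))
```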

2.
Correspondence analysis and optimal structural representations
Many well-known measures for the comparison of distinct partitions of the same set of n objects are based on the structure of class overlap presented in the form of a contingency table (e.g., Pearson's chi-square statistic, Rand's measure, or Goodman–Kruskal's τ_b), but they all can be rephrased through the use of a simple cross-product index defined between the corresponding entries from two n×n proximity matrices that provide particular a priori (numerical) codings of the within- and between-class relationships for each of the partitions. We consider the task of optimally constructing the proximity matrices characterizing the partitions (under suitable restriction) so as to maximize the cross-product measure, or equivalently, the Pearson correlation between their entries. The major result presented states that within the broad classes of matrices that are either symmetric, skew-symmetric, or completely arbitrary, optimal representations are already derivable from what is given by a simple one-dimensional correspondence analysis solution. Besides severely limiting the type of structures that might be of interest to consider for representing the proximity matrices, this result also implies that correspondence analysis beyond one dimension must always be justified from logical bases other than the optimization of a single correlational relationship between the matrices representing the two partitions.
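As a hedged illustration of the cross-product index, the sketch below codes each partition as an n×n matrix using one simple a priori coding (1 for a within-class pair, 0 for a between-class pair) and compares the two codings by the Pearson correlation of their off-diagonal entries; the paper optimizes over much broader classes of codings:

```python
import numpy as np

def partition_coding(labels) -> np.ndarray:
    """n x n matrix: 1 if two objects share a class, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def coding_correlation(labels_a, labels_b) -> float:
    a = partition_coding(labels_a)
    b = partition_coding(labels_b)
    iu = np.triu_indices_from(a, k=1)     # off-diagonal entries only
    return float(np.corrcoef(a[iu], b[iu])[0, 1])

print(coding_correlation([0, 0, 1, 1, 2], [0, 0, 1, 2, 2]))
```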

3.
宋枝璘, 郭磊, 郑天鹏. Acta Psychologica Sinica, 2022, 54(4):426-440
Missing data frequently occur in testing, and cognitive diagnostic assessment is no exception; missing data bias the diagnostic results. First, a simulation study compared commonly used missing data handling methods under a variety of experimental conditions. The results show that: (1) missing data reduce estimation accuracy; as the numbers of examinees and items decrease, the missingness rate increases, or item quality declines, the PCCR of every method drops while the absolute Bias and the RMSE rise. (2) For estimating item parameters, the EM method performed best, followed by MI, while FIML and ZR were unstable. (3) For estimating examinees' knowledge states, EM and FIML performed best, while MI and ZR were unstable. Second, the performance of the different methods was further explored with the empirical PISA 2015 data. Combining the simulation and empirical results, the EM or FIML method is recommended for handling missing data.

4.
Bentler PM, Yuan KH. Psychometrika, 2011, 76(1):119-123
Indefinite symmetric matrices that are estimates of positive-definite population matrices occur in a variety of contexts such as correlation matrices computed from pairwise present missing data and multinormal based methods for discretized variables. This note describes a methodology for scaling selected off-diagonal rows and columns of such a matrix to achieve positive definiteness. As a contrast to recently developed ridge procedures, the proposed method does not need variables to contain measurement errors. When minimum trace factor analysis is used to implement the theory, only correlations that are associated with Heywood cases are shrunk.
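The following is a much-simplified sketch of the general idea only (uniform shrinkage rather than the paper's minimum trace factor analysis implementation): the off-diagonal row and column entries of selected variables are scaled down until the matrix becomes positive definite.

```python
import numpy as np

def shrink_to_pd(r: np.ndarray, idx, step: float = 0.99, eps: float = 1e-8):
    """Scale the off-diagonal row/column entries of the variables in
    `idx` until the smallest eigenvalue of the matrix is positive."""
    r = r.copy()
    mask = np.zeros_like(r, dtype=bool)
    mask[idx, :] = True
    mask[:, idx] = True
    np.fill_diagonal(mask, False)         # never touch the diagonal
    while np.linalg.eigvalsh(r)[0] < eps:
        r[mask] *= step
    return r

r = np.array([[1.0, 0.9, 0.7],
              [0.9, 1.0, 0.9],
              [0.7, 0.9, 1.0]])
r[0, 1] = r[1, 0] = 1.05                  # make the matrix indefinite
print(np.linalg.eigvalsh(shrink_to_pd(r, [0])))
```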

5.
Neil H. Timm. Psychometrika, 1970, 35(4):417-437
Employing simulated data, several methods for estimating correlation and variance-covariance matrices are studied for observations missing at random from data matrices. The effects of sample size, number of variables, percent of missing data, and average intercorrelations of variables are examined for several proposed methods. The author is indebted to Professors Leonard A. Marascuilo, Gus W. Haggstrom, and especially Henry F. Kaiser for their invaluable suggestions throughout this work. Appreciation is also extended to the Computer Center facility of the University of California at Berkeley for the use of computer time to complete the necessary computations.

6.
In many research domains different pieces of information are collected regarding the same set of objects. Each piece of information constitutes a data block, and all these (coupled) blocks have the object mode in common. When analyzing such data, an important aim is to obtain an overall picture of the structure underlying the whole set of coupled data blocks. A further challenge consists of accounting for the differences in information value that exist between and within (i.e., between the objects of a single block) data blocks. To tackle these issues, analysis techniques may be useful in which all available pieces of information are integrated and in which at the same time noise heterogeneity is taken into account. For the case of binary coupled data, however, the only existing methods analyze all data blocks simultaneously but do not account for noise heterogeneity. Therefore, in this paper, the SIMCLAS model, being a Hierarchical Classes model for the simultaneous analysis of coupled binary two-way matrices, is presented. In this model, noise heterogeneity between and within the data blocks is accounted for by downweighting entries from noisy blocks/objects within a block. In a simulation study it is shown that (1) the SIMCLAS technique recovers the underlying structure of coupled data to a very large extent, and (2) the SIMCLAS technique outperforms a Hierarchical Classes technique in which all entries contribute equally to the analysis (i.e., noise homogeneity within and between blocks). The latter is also demonstrated in an application of both techniques to empirical data on categorization of semantic concepts.

7.
In many areas of science, research questions imply the analysis of a set of coupled data blocks, with, for instance, each block being an experimental unit by variable matrix, and the variables being the same in all matrices. To obtain an overall picture of the mechanisms that play a role in the different data matrices, the information in these matrices needs to be integrated. This may be achieved by applying a data-analytic strategy in which a global model is fitted to all data matrices simultaneously, as in some forms of simultaneous component analysis (SCA). Since such a strategy implies that all data entries, regardless of the matrix they belong to, contribute equally to the analysis, it may obfuscate the overall picture of the mechanisms underlying the data when the different data matrices are subject to different amounts of noise. One way out is to downweight entries from noisy data matrices in favour of entries from less noisy matrices. Information regarding the amount of noise that is present in each matrix, however, is, in most cases, not available. To deal with these problems, in this paper a novel maximum-likelihood-based simultaneous component analysis method, referred to as MxLSCA, is proposed. Being a stochastic extension of SCA, in MxLSCA the amount of noise in each data matrix is estimated and entries from noisy data matrices are downweighted. Both in an extensive simulation study and in an application to data stemming from cross-cultural emotion psychology, it is shown that the novel MxLSCA strategy outperforms the SCA strategy with respect to disclosing the mechanisms underlying the coupled data.
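A rough sketch of the downweighting idea (not the paper's maximum-likelihood algorithm): each block is divided by a noise scale before an ordinary simultaneous component analysis via SVD of the concatenated blocks. Here the noise scales are assumed known, whereas MxLSCA estimates them from the data.

```python
import numpy as np

def weighted_sca(blocks, noise_sd, n_comp):
    """Concatenate blocks column-wise, each weighted by 1/noise_sd,
    and extract shared component scores from the SVD."""
    weighted = np.hstack([b / s for b, s in zip(blocks, noise_sd)])
    u, d, vt = np.linalg.svd(weighted, full_matrices=False)
    return u[:, :n_comp] * d[:n_comp]      # shared component scores

rng = np.random.default_rng(6)
t = rng.normal(size=(40, 2))               # true shared components
b1 = t @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(40, 5))
b2 = t @ rng.normal(size=(2, 8)) + 1.0 * rng.normal(size=(40, 8))
print(weighted_sca([b1, b2], [0.1, 1.0], 2).shape)
```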

8.
For analyses with missing data, some popular procedures delete cases with missing values, perform analysis with missing value correlation or covariance matrices, or estimate missing values by sample means. There are objections to each of these procedures. Several procedures are outlined here for replacing missing values by regression values obtained in various ways, and for adjusting coefficients (such as factor score coefficients) when data are missing. None of the procedures are complex or expensive. This research was supported by NIH Special Research Resources Grant RR-3. The author expresses his gratitude to Robert I. Jennrich and the referees for their suggestions.
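A minimal sketch of one regression-replacement variant (assuming the predictor columns are complete; the function name and demo data are illustrative): NaNs in one variable are replaced by OLS predictions fitted on the complete cases.

```python
import numpy as np

def regression_impute(x: np.ndarray, col: int) -> np.ndarray:
    """Fill NaNs in column `col` using OLS on the remaining columns."""
    x = x.copy()
    miss = np.isnan(x[:, col])
    others = np.delete(np.arange(x.shape[1]), col)
    design = np.column_stack([np.ones(x.shape[0]), x[:, others]])
    beta, *_ = np.linalg.lstsq(design[~miss], x[~miss, col], rcond=None)
    x[miss, col] = design[miss] @ beta     # replace with predicted values
    return x

rng = np.random.default_rng(1)
data = rng.multivariate_normal([0, 0], [[1, .8], [.8, 1]], size=50)
data[:5, 1] = np.nan
print(regression_impute(data, 1)[:5])
```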

9.
In intervention studies having multiple outcomes, researchers often use a series of univariate tests (e.g., ANOVAs) to assess group mean differences. Previous research found that this approach properly controls Type I error and generally provides greater power compared to MANOVA, especially under realistic effect size and correlation combinations. However, when group differences are assessed for a specific outcome, these procedures are strictly univariate and do not consider the outcome correlations, which may be problematic with missing outcome data. Linear mixed or multivariate multilevel models (MVMMs), implemented with maximum likelihood estimation, present an alternative analysis option where outcome correlations are taken into account when specific group mean differences are estimated. In this study, we use simulation methods to compare the performance of separate independent samples t tests estimated with ordinary least squares and analogous t tests from MVMMs to assess two-group mean differences with multiple outcomes under small sample and missingness conditions. Study results indicated that a MVMM implemented with restricted maximum likelihood estimation combined with the Kenward–Roger correction had the best performance. Therefore, for intervention studies with small N and normally distributed multivariate outcomes, the Kenward–Roger procedure is recommended over traditional methods and conventional MVMM analyses, particularly with incomplete data.

10.
Large sample properties of four methods of handling multivariate missing data are compared. The criterion for comparison is how well the loadings from a single factor model can be estimated. It is shown that efficiencies of the methods depend on the pattern or arrangement of missing data, and an evaluation study is used to generate predictive efficiency equations to guide one's choice of an estimating procedure. A simple regression-type estimator is introduced which shows high efficiency relative to the maximum likelihood method over a large range of patterns and covariance matrices.

11.
Many researchers face the problem of missing data in longitudinal research. High-risk samples especially are characterized by missing data, which can complicate analyses and the interpretation of results. In the current study, our aim was to find the best method for dealing with missing data in a specific study with a large amount of missing data on the outcome variable. Therefore, different techniques to handle missing data were evaluated, and a solution to efficiently handle substantial amounts of missing data was provided. A simulation study was conducted to determine the most suitable method for dealing with the missing data. Results revealed that multiple imputation (MI) using predictive mean matching performed best, with the lowest bias and the smallest confidence intervals (CIs) while maintaining power. Listwise deletion and last observation carried backward also scored acceptably with respect to bias; however, CIs were much larger and the sample size was almost halved with these methods. Longitudinal research in high-risk samples could benefit from using MI in future research to handle missing data. The paper ends with a checklist for handling missing data.
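The sketch below shows a single predictive-mean-matching imputation, the building block that MI repeats over several imputed data sets; the donor-pool size k = 5 and the one-predictor setup are simplifications, not the study's implementation.

```python
import numpy as np

def pmm_impute(x: np.ndarray, y: np.ndarray, rng) -> np.ndarray:
    """Impute NaNs in y by predictive mean matching on predictor x:
    fit OLS on complete cases, then borrow the observed value of a
    donor whose predicted mean is close to the missing case's."""
    y = y.copy()
    miss = np.isnan(y)
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design[~miss], y[~miss], rcond=None)
    pred = design @ beta
    donors_pred, donors_obs = pred[~miss], y[~miss]
    for i in np.flatnonzero(miss):
        # pick one of the k = 5 closest donors at random
        nearest = np.argsort(np.abs(donors_pred - pred[i]))[:5]
        y[i] = donors_obs[rng.choice(nearest)]
    return y

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 2 * x + rng.normal(size=100)
y[:10] = np.nan
print(pmm_impute(x, y, rng)[:10])
```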

12.
A Monte Carlo study was carried out in order to investigate the ability of ALSCAL to recover true structure inherent in simulated proximity measures when portions of the data are missing. All sets of simulated proximity measures were based on 30 stimuli and three dimensions, and selection of missing elements was done randomly. Properties of the simulated data varied according to (a) the number of individuals, (b) the level of random error, (c) the proportion of missing data, and (d) whether the same entries or different entries were deleted for each individual. Results showed that very accurate recovery of true distances, stimulus coordinates, and weight vectors could be achieved with as much as 60% missing data as long as sample size was sufficiently large and the level of random error was low.

13.
A least-squares strategy is proposed for representing a two-mode proximity matrix as an approximate sum of a small number of matrices that satisfy certain simple order constraints on their entries. The primary class of constraints considered define Q-forms (or anti-Q-forms) for a two-mode matrix, where after suitable and separate row and column reorderings, the entries within each row and within each column are nondecreasing (or nonincreasing) to a maximum (or minimum) and thereafter nonincreasing (or nondecreasing). Several other types of order constraints are also mentioned to show how alternative structures can be considered using the same computational strategy.
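To make the order constraint concrete, the following sketch checks whether a matrix, in a given row and column order, is a Q-form (every row and every column nondecreasing up to a maximum and nonincreasing after it). Finding the optimal reorderings and the least-squares decomposition itself are the hard parts the paper addresses; this only verifies a candidate ordering.

```python
import numpy as np

def is_unimodal(v: np.ndarray) -> bool:
    """Nondecreasing up to the maximum, nonincreasing afterwards."""
    peak = int(np.argmax(v))
    return (np.all(np.diff(v[:peak + 1]) >= 0)
            and np.all(np.diff(v[peak:]) <= 0))

def is_q_form(m: np.ndarray) -> bool:
    return (all(is_unimodal(row) for row in m)
            and all(is_unimodal(col) for col in m.T))

m = np.array([[1, 3, 2],
              [2, 5, 4],
              [1, 4, 3]])
print(is_q_form(m))   # True for this ordering
```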

14.
Item response theory (IRT) is one of the modern educational and psychological measurement theories used for objective measurement, and it is widely applied in large-scale testing, where missing data are very common. Under the two-parameter logistic model (2PLM) in IRT, an EM algorithm previously existed only for handling missing responses and missing abilities under the missing-completely-at-random mechanism. This study derives an EM algorithm for the 2PLM that ignores missing responses, and proposes both an EM algorithm for handling missing responses and missing abilities under the missing-at-random mechanism and a multiple imputation method that accounts for the uncertainty in ability estimates and item responses. The results show that, across missingness mechanisms, missingness proportions, and test designs, the missing-response-ignoring EM algorithm and the multiple imputation method perform well.
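A compact sketch of the "ignore missing responses" idea in a 2PLM marginal log-likelihood (the E-step ingredient of an EM algorithm): missing entries simply contribute no factor to a person's likelihood, which is valid under ignorable missingness. The quadrature setup and all symbols are generic illustrations, not the paper's derivation.

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL success probability."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def marginal_loglik(resp, a, b, nodes, weights):
    """resp: persons x items, NaN = missing; N(0,1) ability prior
    approximated by Gauss-Hermite quadrature."""
    total = 0.0
    for x in resp:
        obs = ~np.isnan(x)                         # skip missing items
        p = p2pl(nodes[:, None], a[obs], b[obs])   # nodes x observed items
        like = np.prod(np.where(x[obs] == 1, p, 1 - p), axis=1)
        total += np.log(np.sum(weights * like))
    return total

nodes, weights = np.polynomial.hermite_e.hermegauss(21)
weights = weights / weights.sum()                  # normalize to N(0,1)
rng = np.random.default_rng(7)
a, b = np.ones(10), rng.normal(size=10)
theta = rng.normal(size=200)
resp = (rng.random((200, 10)) < p2pl(theta[:, None], a, b)).astype(float)
resp[rng.random(resp.shape) < 0.2] = np.nan        # punch MCAR holes
print(marginal_loglik(resp, a, b, nodes, weights))
```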

15.
A method is presented for generalized canonical correlation analysis of two or more matrices with missing rows. The method is a combination of Carroll’s (1968) method and the missing data approach of the OVERALS technique (Van der Burg, 1988). In a simulation study we assess the performance of the method and compare it to an existing procedure called GENCOM, proposed by Green and Carroll (1988). We find that the proposed method outperforms the GENCOM algorithm both with respect to model fit and recovery of the true structure. The research of Michel van de Velden was partly funded through EU Grant HPMF-CT-2000-00664. The authors would like to thank the associate editor and three anonymous referees for their constructive comments and suggestions that led to a considerable improvement of the paper.

16.
A common representation of data within the context of multidimensional scaling (MDS) is a collection of symmetric proximity (similarity or dissimilarity) matrices for each of M subjects. There are a number of possible alternatives for analyzing these data, which include: (a) conducting an MDS analysis on a single matrix obtained by pooling (averaging) the M subject matrices, (b) fitting a separate MDS structure for each of the M matrices, or (c) employing an individual differences MDS model. We discuss each of these approaches, and subsequently propose a straightforward new method (CONcordance PARtitioning—ConPar), which can be used to identify groups of individual-subject matrices with concordant proximity structures. This method collapses the three-way data into a subject×subject dissimilarity matrix, which is subsequently clustered using a branch-and-bound algorithm that minimizes partition diameter. Extensive Monte Carlo testing revealed that, when compared to K-means clustering of the proximity data, ConPar generally provided better recovery of the true subject cluster memberships. A demonstration using empirical three-way data is also provided to illustrate the efficacy of the proposed method.
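A sketch of the collapsing step under an assumed dissimilarity: each subject's proximity matrix is vectorized (upper triangle) and subjects are compared by one minus the Pearson correlation of those vectors. ConPar's actual concordance measure and its branch-and-bound diameter partitioning are not reproduced here.

```python
import numpy as np

def subject_dissimilarity(prox: np.ndarray) -> np.ndarray:
    """prox has shape (M, n, n); return an M x M dissimilarity matrix
    from the correlations of the vectorized upper triangles."""
    m, n, _ = prox.shape
    iu = np.triu_indices(n, k=1)
    vecs = np.array([p[iu] for p in prox])
    return 1.0 - np.corrcoef(vecs)

rng = np.random.default_rng(3)
prox = rng.random((4, 6, 6))
prox = (prox + prox.transpose(0, 2, 1)) / 2    # symmetrize each matrix
print(np.round(subject_dissimilarity(prox), 2))
```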

17.
A method of hierarchical clustering for relational data is presented, which begins by forming a new square matrix of product-moment correlations between the columns (or rows) of the original data (represented as an n × m matrix). Iterative application of this simple procedure will in general converge to a matrix that may be permuted into the blocked form [[1, -1], [-1, 1]], with all entries +1 within the two diagonal blocks and -1 between them. This convergence property may be used as the basis of an algorithm (CONCOR) for hierarchical clustering. The CONCOR procedure is applied to several illustrative sets of social network data and is found to give results that are highly compatible with analyses and interpretations of the same data using the blockmodel approach of White (White, Boorman & Breiger, 1976). The results using CONCOR are then compared with results obtained using alternative methods of clustering and scaling (MDSCAL, INDSCAL, HICLUS, ADCLUS) on the same data sets.
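A minimal sketch of the CONCOR iteration as described: repeatedly replace the matrix by the correlations of its columns until the entries stabilize near ±1; the sign pattern of any row then splits the items into two blocks. The tolerance and split rule below are arbitrary choices, not from the paper.

```python
import numpy as np

def concor_split(data: np.ndarray, tol: float = 1e-8, max_iter: int = 100):
    """One CONCOR bisection: returns a boolean block membership vector."""
    c = np.corrcoef(data, rowvar=False)        # correlations of columns
    for _ in range(max_iter):
        new = np.corrcoef(c)                   # correlations of the rows
        if np.max(np.abs(new - c)) < tol:      # (same as columns: c is
            c = new                            # symmetric)
            break
        c = new
    return c[0] > 0      # items correlated +1 with item 0 form one block

rng = np.random.default_rng(4)
base = rng.normal(size=(50, 2))
data = np.column_stack([base[:, 0], base[:, 0], base[:, 1], base[:, 1]])
data += 0.1 * rng.normal(size=data.shape)
print(concor_split(data))   # e.g. [ True  True False False]
```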

18.
Test of homogeneity of covariances (or homoscedasticity) among several groups has many applications in statistical analysis. In the context of incomplete data analysis, tests of homoscedasticity among groups of cases with identical missing data patterns have been proposed to test whether data are missing completely at random (MCAR). These tests of MCAR require large sample sizes n and/or large group sample sizes n_i, and they usually fail when applied to nonnormal data. Hawkins (Technometrics 23:105–110, 1981) proposed a test of multivariate normality and homoscedasticity that is an exact test for complete data when the n_i are small. This paper proposes a modification of this test for complete data to improve its performance, and extends its application to test of homoscedasticity and MCAR when data are multivariate normal and incomplete. Moreover, it is shown that the statistic used in the Hawkins test in conjunction with a nonparametric k-sample test can be used to obtain a nonparametric test of homoscedasticity that works well for both normal and nonnormal data. It is explained how a combination of the proposed normal-theory Hawkins test and the nonparametric test can be employed to test for homoscedasticity, MCAR, and multivariate normality. Simulation studies show that the newly proposed tests generally outperform their existing competitors in terms of Type I error rejection rates. Also, a power study of the proposed tests indicates good power. The proposed methods use appropriate missing data imputations to impute missing data. Methods of multiple imputation are described and one of the methods is employed to confirm the result of our single imputation methods. Examples are provided where multiple imputation enables one to identify a group or groups whose covariance matrices differ from the majority of other groups.

19.
The Non-Equivalent groups with Anchor Test (NEAT) design involves missing data that are missing by design. Three nonlinear observed score equating methods used with a NEAT design are the frequency estimation equipercentile equating (FEEE), the chain equipercentile equating (CEE), and the item-response-theory observed-score-equating (IRT OSE). These three methods each make different assumptions about the missing data in the NEAT design. The FEEE method assumes that the conditional distribution of the test score given the anchor test score is the same in the two examinee groups. The CEE method assumes that the equipercentile functions equating the test score to the anchor test score are the same in the two examinee groups. The IRT OSE method assumes that the IRT model employed fits the data adequately, and the items in the tests and the anchor test do not exhibit differential item functioning across the two examinee groups. This paper first describes the missing data assumptions of the three equating methods. Then it describes how the missing data in the NEAT design can be filled in a manner that is coherent with the assumptions made by each of these equating methods. Implications for equating are also discussed.
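A rough sketch of the chain equipercentile (CEE) mechanics: a score x is mapped to the anchor score with the same percentile rank in group 1, and that anchor score is mapped to the Y score with the same percentile rank in group 2. Operational equating adds presmoothing and careful percentile-rank definitions; plain empirical quantiles stand in for them here, and the score distributions are invented for the demo.

```python
import numpy as np

def equipercentile(from_scores, to_scores, x):
    """Map x to the `to` scale by matching percentile ranks."""
    from_scores = np.sort(from_scores)
    rank = np.searchsorted(from_scores, x, side="right") / len(from_scores)
    return np.quantile(to_scores, np.clip(rank, 0.0, 1.0))

rng = np.random.default_rng(5)
x_g1, a_g1 = rng.normal(50, 10, 500), rng.normal(20, 5, 500)
a_g2, y_g2 = rng.normal(22, 5, 500), rng.normal(55, 10, 500)
a_equiv = equipercentile(x_g1, a_g1, 60)     # X -> anchor in group 1
print(equipercentile(a_g2, y_g2, a_equiv))   # anchor -> Y in group 2
```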

20.
Missing data are ubiquitous in psychological surveys and experiments. Missing data create a series of problems for estimating the variance components of unbalanced data under generalizability theory. Within the generalizability theory framework, a program written in Matlab 7.0 was used to simulate missing data from the random two-facet crossed p×i×r design, and the formula method, the REML method, the subdividing method, and the MCMC method were compared with respect to how well they estimate each variance component. The results show that: (1) the MCMC method shows a clear advantage over the other three methods in estimating the variance components from p×i×r missing data; (2) items and raters are important factors influencing the estimation of variance components from missing data.
