Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
Correspondence analysis and optimal structural representations (total citations: 1; self-citations: 0; citations by others: 1)
Many well-known measures for the comparison of distinct partitions of the same set of n objects are based on the structure of class overlap presented in the form of a contingency table (e.g., Pearson's chi-square statistic, Rand's measure, or Goodman-Kruskal's τ_b), but they all can be rephrased through the use of a simple cross-product index defined between the corresponding entries from two n × n proximity matrices that provide particular a priori (numerical) codings of the within- and between-class relationships for each of the partitions. We consider the task of optimally constructing the proximity matrices characterizing the partitions (under suitable restriction) so as to maximize the cross-product measure, or equivalently, the Pearson correlation between their entries. The major result presented states that within the broad classes of matrices that are either symmetric, skew-symmetric, or completely arbitrary, optimal representations are already derivable from what is given by a simple one-dimensional correspondence analysis solution. Besides severely limiting the type of structures that might be of interest to consider for representing the proximity matrices, this result also implies that correspondence analysis beyond one dimension must always be justified from logical bases other than the optimization of a single correlational relationship between the matrices representing the two partitions.
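
As a rough illustration of the cross-product index described above, the following Python sketch codes each partition as an n × n matrix and compares the off-diagonal entries. The 0/1 co-membership coding and all function names are assumptions made for this example only; the abstract leaves the coding open and is concerned with choosing it optimally.

```python
from math import sqrt

def co_membership(labels):
    """Assumed coding: entry is 1 if two distinct objects share a class, else 0."""
    n = len(labels)
    return [[int(i != j and labels[i] == labels[j]) for j in range(n)]
            for i in range(n)]

def off_diagonal(m):
    """Flatten the off-diagonal entries of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]

def cross_product_index(a, b):
    """Sum of products of corresponding off-diagonal entries."""
    return sum(x * y for x, y in zip(off_diagonal(a), off_diagonal(b)))

def pearson(a, b):
    """Pearson correlation between corresponding off-diagonal entries."""
    xs, ys = off_diagonal(a), off_diagonal(b)
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Two partitions of the same four objects (hypothetical data).
p = co_membership([0, 0, 1, 1])
q = co_membership([0, 1, 0, 1])
print(cross_product_index(p, p), pearson(p, p))
print(cross_product_index(p, q), pearson(p, q))
```

For fixed matrix means and variances the two quantities move together, which is why maximizing one is equivalent to maximizing the other.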

2.
There are a number of important problems in quantitative psychology that require the identification of a permutation of the n rows and columns of an n × n proximity matrix. These problems encompass applications such as unidimensional scaling, paired‐comparison ranking, and anti‐Robinson forms. The importance of simultaneously incorporating multiple objective criteria in matrix permutation applications is well recognized in the literature; however, to date, there has been a reliance on weighted‐sum approaches that transform the multiobjective problem into a single‐objective optimization problem. Although exact solutions to these single‐objective problems produce supported Pareto efficient solutions to the multiobjective problem, many interesting unsupported Pareto efficient solutions may be missed. We illustrate the limitation of the weighted‐sum approach with an example from the psychological literature and devise an effective heuristic algorithm for estimating both the supported and unsupported solutions of the Pareto efficient set.

3.
The clustering of two-mode proximity matrices is a challenging combinatorial optimization problem that has important applications in the quantitative social sciences. We focus on one particular type of problem related to the clustering of a two-mode binary matrix, which is relevant to the establishment of generalized blockmodels for social networks. In this context, clusters for the rows of the two-mode matrix intersect with clusters of the columns to form blocks, which should ideally be either complete (all 1s) or null (all 0s). A new procedure based on variable neighborhood search is presented and compared to an existing two-mode K-means clustering algorithm. The new procedure generally provided slightly greater explained variation; however, both methods yielded exceptional recovery of cluster structure.

4.
A common representation of data within the context of multidimensional scaling (MDS) is a collection of symmetric proximity (similarity or dissimilarity) matrices for each of M subjects. There are a number of possible alternatives for analyzing these data, which include: (a) conducting an MDS analysis on a single matrix obtained by pooling (averaging) the M subject matrices, (b) fitting a separate MDS structure for each of the M matrices, or (c) employing an individual differences MDS model. We discuss each of these approaches, and subsequently propose a straightforward new method (CONcordance PARtitioning, or ConPar), which can be used to identify groups of individual-subject matrices with concordant proximity structures. This method collapses the three-way data into a subject × subject dissimilarity matrix, which is subsequently clustered using a branch-and-bound algorithm that minimizes partition diameter. Extensive Monte Carlo testing revealed that, when compared to K-means clustering of the proximity data, ConPar generally provided better recovery of the true subject cluster memberships. A demonstration using empirical three-way data is also provided to illustrate the efficacy of the proposed method.

5.
To date, most methods for direct blockmodeling of social network data have focused on the optimization of a single objective function. However, there are a variety of social network applications where it is advantageous to consider two or more objectives simultaneously. These applications can broadly be placed into two categories: (1) simultaneous optimization of multiple criteria for fitting a blockmodel based on a single network matrix and (2) simultaneous optimization of multiple criteria for fitting a blockmodel based on two or more network matrices, where the matrices being fit can take the form of multiple indicators for an underlying relationship, or multiple matrices for a set of objects measured at two or more different points in time. A multiobjective tabu search procedure is proposed for estimating the set of Pareto efficient blockmodels. This procedure is used in three examples that demonstrate possible applications of the multiobjective blockmodeling paradigm.

6.
A recursive dynamic programming strategy is discussed for optimally reorganizing the rows and simultaneously the columns of an n × n proximity matrix when the objective function measuring the adequacy of a reorganization has a fairly simple additive structure. A number of possible objective functions are mentioned along with several numerical examples using Thurstone's paired comparison data on the relative seriousness of crime. Finally, the optimization tasks we propose to attack with dynamic programming are placed in a broader theoretical context of what is typically referred to as the quadratic assignment problem and its extension to cubic assignment. Partial support for this research was provided by NIJ Grant 80-IJ-CX-0061.

7.
The seriation of proximity matrices is an important problem in combinatorial data analysis and can be conducted using a variety of objective criteria. Some of the most popular criteria for evaluating an ordering of objects are based on (anti-) Robinson forms, which reflect the pattern of elements within each row and/or column of the reordered matrix when moving away from the main diagonal. This paper presents a branch-and-bound algorithm that can be used to seriate a symmetric dissimilarity matrix by identifying a reordering of rows and columns of the matrix optimizing an anti-Robinson criterion. Computational results are provided for several proximity matrices from the literature using four different anti-Robinson criteria. The results suggest that with respect to computational efficiency, the branch-and-bound algorithm is generally competitive with dynamic programming. Further, because it requires much less storage than dynamic programming, the branch-and-bound algorithm can provide guaranteed optimal solutions for matrices that are too large for dynamic programming implementations.

8.
Dynamic programming methods for matrix permutation problems in combinatorial data analysis can produce globally-optimal solutions for matrices up to size 30×30, but are computationally infeasible for larger matrices because of enormous computer memory requirements. Branch-and-bound methods also guarantee globally-optimal solutions, but computation time considerations generally limit their applicability to matrix sizes no greater than 35×35. Accordingly, a variety of heuristic methods have been proposed for larger matrices, including iterative quadratic assignment, tabu search, simulated annealing, and variable neighborhood search. Although these heuristics can produce exceptional results, they are prone to converge to local optima where the permutation is difficult to dislodge via traditional neighborhood moves (e.g., pairwise interchanges, object-block relocations, object-block reversals, etc.). We show that a heuristic implementation of dynamic programming yields an efficient procedure for escaping local optima. Specifically, we propose applying dynamic programming to reasonably-sized subsequences of consecutive objects in the locally-optimal permutation, identified by simulated annealing, to further improve the value of the objective function. Experimental results are provided for three classic matrix permutation problems in the combinatorial data analysis literature: (a) maximizing a dominance index for an asymmetric proximity matrix; (b) least-squares unidimensional scaling of a symmetric dissimilarity matrix; and (c) approximating an anti-Robinson structure for a symmetric dissimilarity matrix. We are extremely grateful to the Associate Editor and two anonymous reviewers for helpful suggestions and corrections.

9.
The problem of comparing the agreement of two n × n matrices has a variety of applications in experimental psychology. A well-known index of agreement is based on the sum of the element-wise products of the matrices. Although less familiar to many researchers, measures of agreement based on within-row and/or within-column gradients can also be useful. We provide a suite of MATLAB programs for computing agreement indices and performing matrix permutation tests of those indices. Programs for computing exact p-values are available for small matrices, whereas resampling programs for approximate p-values are provided for larger matrices.

10.
Measurements of similarity have typically been obtained through the use of rating, sorting, and perceptual confusion tasks. In the present paper, a new method for measuring similarity is described, in which subjects rearrange items so that their proximity on a computer screen is proportional to their similarity. This method provides very efficient data collection. If a display has n objects, then, after subjects have rearranged the objects (requiring slightly more than n movements), n(n-1)/2 pairwise similarities can be recorded. As long as the constraints imposed by two-dimensional space are not too different from those intrinsic to psychological similarity, the technique appears to offer an efficient, user-friendly, and intuitive process for measuring psychological similarity.

11.
A common criterion for seriation of asymmetric matrices is the maximization of the dominance index, which sums the elements above the main diagonal of a reordered matrix. Similarly, a popular seriation criterion for symmetric matrices is the maximization of an anti‐Robinson gradient index, which is associated with the patterning of elements in the rows and columns of a reordered matrix. Although perfect dominance and perfect anti‐Robinson structure are rarely achievable for empirical matrices, we can often identify a sizable subset of objects for which a perfect structure is realized. We present and demonstrate an algorithm for obtaining a maximum cardinality (i.e. the largest number of objects) subset of objects such that the seriation of the proximity matrix corresponding to the subset will have perfect structure. MATLAB implementations of the algorithm are available for dominance, anti‐Robinson and strongly anti‐Robinson structures.
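
The dominance index defined above can be sketched in a few lines of Python. The exhaustive search over all permutations is only a stand-in that works for tiny matrices; it is not the maximum-cardinality subset algorithm of the abstract, and the function names and the example matrix are hypothetical.

```python
from itertools import permutations

def dominance_index(matrix, order):
    """Sum of entries above the main diagonal after reordering rows and columns by `order`."""
    n = len(order)
    return sum(matrix[order[i]][order[j]]
               for i in range(n) for j in range(i + 1, n))

def best_order_brute_force(matrix):
    """Try every permutation; feasible only for very small n."""
    return max(permutations(range(len(matrix))),
               key=lambda order: dominance_index(matrix, order))

# Hypothetical asymmetric proximity matrix (e.g., paired-comparison counts).
m = [[0, 1, 4],
     [3, 0, 2],
     [0, 5, 0]]

identity_value = dominance_index(m, (0, 1, 2))
best = best_order_brute_force(m)
print(identity_value, best, dominance_index(m, best))
```

The number of permutations grows as n!, which is why the abstracts in this list turn to dynamic programming, branch-and-bound, or heuristics for realistic matrix sizes.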

12.
A row (or column) of an n×n matrix complies with Regular Minimality (RM) if it has a unique minimum entry which is also a unique minimum entry in its column (respectively, row). The number of violations of RM in a matrix is defined as the number of rows (equivalently, columns) that do not comply with RM. We derive a formula for the proportion of n×n matrices with a given number of violations of RM among all n×n matrices with no tied entries. The proportion of matrices with no more than a given number of violations can be treated as the p-value of a permutation test whose null hypothesis states that all permutations of the entries of a matrix without ties are equiprobable, and the alternative hypothesis states that RM violations occur with lower probability than predicted by the null hypothesis. A matrix with ties is treated as being represented by all matrices without ties that have the same set of strict inequalities among their entries.
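
The row-wise definition of an RM violation given above translates directly into code. The sketch below only counts violations for a single matrix; it does not implement the closed-form proportion or the permutation test from the abstract, and the function name and example matrices are hypothetical.

```python
def rm_violations(m):
    """Count rows of a square matrix that do not comply with Regular Minimality."""
    n = len(m)
    violations = 0
    for i in range(n):
        row = m[i]
        lo = min(row)
        # The row minimum must be unique within the row ...
        if row.count(lo) != 1:
            violations += 1
            continue
        # ... and must also be the unique minimum of its column.
        j = row.index(lo)
        column = [m[k][j] for k in range(n)]
        if min(column) != lo or column.count(lo) != 1:
            violations += 1
    return violations

compliant = [[1, 5, 6],
             [7, 2, 8],
             [9, 4, 3]]
one_violation = [[1, 5, 6],
                 [7, 2, 8],
                 [9, 0, 3]]
print(rm_violations(compliant), rm_violations(one_violation))
```

In the second matrix, row 1 has its minimum (2) in column 1, but that column's minimum is the 0 in row 2, so row 1 is the single violating row.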

13.
There are two well-known methods for obtaining a guaranteed globally optimal solution to the problem of least-squares unidimensional scaling of a symmetric dissimilarity matrix: (a) dynamic programming, and (b) branch-and-bound. Dynamic programming is generally more efficient than branch-and-bound, but the former is limited to matrices with approximately 26 or fewer objects because of computer memory limitations. We present some new branch-and-bound procedures that improve computational efficiency, and enable guaranteed globally optimal solutions to be obtained for matrices with up to 35 objects. Experimental tests were conducted to compare the relative performances of the new procedures, a previously published branch-and-bound algorithm, and a dynamic programming solution strategy. These experiments, which included both synthetic and empirical dissimilarity matrices, yielded the following findings: (a) the new branch-and-bound procedures were often drastically more efficient than the previously published branch-and-bound algorithm, (b) when computationally feasible, the dynamic programming approach was more efficient than each of the branch-and-bound procedures, and (c) the new branch-and-bound procedures require minimal computer memory and can provide optimal solutions for matrices that are too large for dynamic programming implementation. The authors gratefully acknowledge the helpful comments of three anonymous reviewers and the Editor. We especially thank Larry Hubert and one of the reviewers for providing us with the MATLAB files for optimal and heuristic least-squares unidimensional scaling methods. This revised article was published online in June 2005 with all corrections incorporated.

14.
Based on a simple nonparametric procedure for comparing two proximity matrices, a measure of concordance is introduced that is appropriate when K independent proximity matrices are available. In addition to the development of a general concept of concordance and specific techniques for its evaluation within and between the subsets of a partition of the K matrices, several methods are also suggested for comparing and/or for fitting a particular structure to the given data. Finally, brief indications are provided as to how the well-known notion of concordance for K rank orders can be included within the more general framework. Partial support for this research was supplied by the National Science Foundation through SOC-77-28227.

15.
The degree of reciprocity of a proximity order is the proportion, P(1), of elements for which the closest neighbor relation is symmetric, and the R value of each element is its rank in the proximity order from its closest neighbor. Assuming a random sampling of points, we show that Euclidean n-spaces produce a very high degree of reciprocity, P(1) ≥ 1/2, and correspondingly low R values, E(R) ≤ 2, for all n. The same bounds also apply to homogeneous graphs, in which the same number of edges meet at every node. Much less reciprocity and higher R values, however, can be attained in finite tree models and in the contrast model in which the “distance” between objects is a linear function of the numbers of their common and distinctive features.
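
The two quantities defined above, the reciprocity P(1) and the R values, can be computed for a small point set as follows. This is a minimal sketch under stated assumptions: distances are Euclidean, there are no tied distances, and the function names and sample points are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
from math import dist

def nearest_neighbor(points, i):
    """Index of the point closest to points[i] (assumes no tied distances)."""
    others = [j for j in range(len(points)) if j != i]
    return min(others, key=lambda j: dist(points[i], points[j]))

def reciprocity(points):
    """Proportion of points whose nearest neighbor has them as its nearest neighbor."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    mutual = sum(1 for i, j in enumerate(nn) if nn[j] == i)
    return mutual / len(points)

def r_values(points):
    """R value of each point: its rank in the proximity order of its nearest neighbor."""
    values = []
    for i in range(len(points)):
        j = nearest_neighbor(points, i)
        ranked = sorted((k for k in range(len(points)) if k != j),
                        key=lambda k: dist(points[j], points[k]))
        values.append(ranked.index(i) + 1)
    return values

pts = [(0, 0), (1, 0), (3, 0), (7, 0)]
print(reciprocity(pts), r_values(pts))
```

Here only the first two points are mutual nearest neighbors, so half of the elements are reciprocal; a point with R = 1 is exactly a point in a reciprocal pair.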

16.
Establishing blockmodels for one- and two-mode binary network matrices has typically been accomplished using multiple restarts of heuristic algorithms that minimize functions of inconsistency with an ideal block structure. Although these algorithms likely yield exceptional performance, they are not assured to provide blockmodels that optimize the functional indices. In this paper, we present integer programming models that, for a prespecified image matrix, can produce guaranteed optimal solutions for matrices of nontrivial size. Accordingly, analysts performing a confirmatory analysis of a prespecified blockmodel structure can apply our models directly to obtain an optimal solution. In exploratory cases where a blockmodel structure is not prespecified, we recommend a two-stage procedure, where a heuristic method is first used to identify an image matrix and the integer program is subsequently formulated and solved to identify the optimal solution for that image matrix. Although best suited for ideal block structures associated with structural equivalence, the integer programming models have the flexibility to accommodate functional indices pertaining to regular equivalence. Computational results are reported for a variety of one- and two-mode matrices from the blockmodeling literature.

17.
For some proximity matrices, multidimensional scaling yields a roughly circular configuration of the stimuli. Being not symmetric, a row-conditional matrix is not fit for such an analysis. However, suppose its proximities are all different within rows. Calling {{x,y},{x,z}} a conjoint pair of unordered pairs of stimuli, let {x,y}→{x,z} mean that row x shows a stronger proximity for {x,y} than for {x,z}. We have a cyclic permutation π of the set of stimuli characterize a subset of the conjoint pairs. If the arcs {x,y}→{x,z} between the pairs thus characterized are in a specific sense monotone with π, the matrix determines π uniquely, and is, in that sense, a circumplex with π as underlying cycle. In the strongest of the 3 circumplexes thus obtained, → has circular paths. We give examples of analyses of, in particular, conditional proximities by these concepts, and implications for the analysis of presumably circumplical proximities. Circumplexes whose underlying permutation is multi-cyclic are touched.

18.
19.
Although the K-means algorithm for minimizing the within-cluster sums of squared deviations from cluster centroids is perhaps the most common method for applied cluster analyses, a variety of other criteria are available. The p-median model is an especially well-studied clustering problem that requires the selection of p objects to serve as cluster centers. The objective is to choose the cluster centers such that the sum of the Euclidean distances (or some other dissimilarity measure) of objects assigned to each center is minimized. Using 12 data sets from the literature, we demonstrate that a three-stage procedure consisting of a greedy heuristic, Lagrangian relaxation, and a branch-and-bound algorithm can produce globally optimal solutions for p-median problems of nontrivial size (several hundred objects, five or more variables, and up to 10 clusters). We also report the results of an application of the p-median model to an empirical data set from the telecommunications industry.
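
To make the p-median objective above concrete, the sketch below evaluates it for a candidate set of centers and finds the best set by exhaustive enumeration. Enumeration is an assumption chosen for brevity and is usable only on toy data; it is not the three-stage greedy, Lagrangian relaxation, and branch-and-bound procedure of the abstract. Function names and data are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
from itertools import combinations
from math import dist

def p_median_cost(points, centers):
    """Sum over all objects of the Euclidean distance to the nearest chosen center."""
    return sum(min(dist(pt, points[c]) for c in centers) for pt in points)

def brute_force_p_median(points, p):
    """Enumerate every set of p objects as centers; feasible only for small inputs."""
    return min(combinations(range(len(points)), p),
               key=lambda centers: p_median_cost(points, centers))

# Two well-separated groups on a line (hypothetical data).
data = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
centers = brute_force_p_median(data, 2)
print(centers, p_median_cost(data, centers))
```

Unlike K-means centroids, the centers are restricted to actual objects, so the method needs only a dissimilarity matrix rather than coordinates.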

20.
Consider a set of data consisting of measurements of n objects with respect to p variables displayed in an n × p matrix. A monotone transformation of the values in each column, represented as a linear combination of integrated basis splines, is assumed determined by a linear combination of a new set of values characterizing each row object. Two different models are used: one, an Eckart-Young decomposition model, and the other, a multivariate normal model. Examples for artificial and real data are presented. The results indicate that both methods are helpful in choosing dimensionality and that the Eckart-Young model is also helpful in displaying the relationships among the objects and the variables. Also, results suggest that the resulting transformations are themselves illuminating.
