首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
There are two well-known methods for obtaining a guaranteed globally optimal solution to the problem of least-squares unidimensional scaling of a symmetric dissimilarity matrix: (a) dynamic programming, and (b) branch-and-bound. Dynamic programming is generally more efficient than branch-and-bound, but the former is limited to matrices with approximately 26 or fewer objects because of computer memory limitations. We present some new branch-and-bound procedures that improve computational efficiency, and enable guaranteed globally optimal solutions to be obtained for matrices with up to 35 objects. Experimental tests were conducted to compare the relative performances of the new procedures, a previously published branch-and-bound algorithm, and a dynamic programming solution strategy. These experiments, which included both synthetic and empirical dissimilarity matrices, yielded the following findings: (a) the new branch-and-bound procedures were often drastically more efficient than the previously published branch-and-bound algorithm, (b) when computationally feasible, the dynamic programming approach was more efficient than each of the branch-and-bound procedures, and (c) the new branch-and-bound procedures require minimal computer memory and can provide optimal solutions for matrices that are too large for dynamic programming implementation.The authors gratefully acknowledge the helpful comments of three anonymous reviewers and the Editor. We especially thank Larry Hubert and one of the reviewers for providing us with the MATLAB files for optimal and heuristic least-squares unidimensional scaling methods.This revised article was published online in June 2005 with all corrections incorporated.  相似文献   

2.
The seriation of proximity matrices is an important problem in combinatorial data analysis and can be conducted using a variety of objective criteria. Some of the most popular criteria for evaluating an ordering of objects are based on (anti-) Robinson forms, which reflect the pattern of elements within each row and/or column of the reordered matrix when moving away from the main diagonal. This paper presents a branch-and-bound algorithm that can be used to seriate a symmetric dissimilarity matrix by identifying a reordering of rows and columns of the matrix optimizing an anti-Robinson criterion. Computational results are provided for several proximity matrices from the literature using four different anti-Robinson criteria. The results suggest that with respect to computational efficiency, the branch-and-bound algorithm is generally competitive with dynamic programming. Further, because it requires much less storage than dynamic programming, the branch-and-bound algorithm can provide guaranteed optimal solutions for matrices that are too large for dynamic programming implementations.  相似文献   

3.
This paper focuses on the problem of developing a partition of n objects based on the information in a symmetric, non‐negative dissimilarity matrix. The goal is to partition the objects into a set of non‐overlapping subsets with the objective of minimizing the sum of the within‐subset dissimilarities. Optimal solutions to this problem can be obtained using dynamic programming, branch‐and‐bound and other mathematical programming methods. An improved branch‐and‐bound algorithm is shown to be particularly efficient. The improvements include better upper bounds that are obtained via a fast exchange algorithm and, more importantly, sharper lower bounds obtained through sequential solution of submatrices. A modified version of the branch‐and‐bound algorithm for minimizing the diameter of a partition is also presented. Computational results for both synthetic and empirical dissimilarity matrices reveal the effectiveness of the branch‐and‐bound methodology.  相似文献   

4.
The clustering of two-mode proximity matrices is a challenging combinatorial optimization problem that has important applications in the quantitative social sciences. We focus on one particular type of problem related to the clustering of a two-mode binary matrix, which is relevant to the establishment of generalized blockmodels for social networks. In this context, clusters for the rows of the two-mode matrix intersect with clusters of the columns to form blocks, which should ideally be either complete (all 1s) or null (all 0s). A new procedure based on variable neighborhood search is presented and compared to an existing two-mode K-means clustering algorithm. The new procedure generally provided slightly greater explained variation; however, both methods yielded exceptional recovery of cluster structure.  相似文献   

5.
Establishing blockmodels for one- and two-mode binary network matrices has typically been accomplished using multiple restarts of heuristic algorithms that minimize functions of inconsistency with an ideal block structure. Although these algorithms likely yield exceptional performance, they are not assured to provide blockmodels that optimize the functional indices. In this paper, we present integer programming models that, for a prespecified image matrix, can produce guaranteed optimal solutions for matrices of nontrivial size. Accordingly, analysts performing a confirmatory analysis of a prespecified blockmodel structure can apply our models directly to obtain an optimal solution. In exploratory cases where a blockmodel structure is not prespecified, we recommend a two-stage procedure, where a heuristic method is first used to identify an image matrix and the integer program is subsequently formulated and solved to identify the optimal solution for that image matrix. Although best suited for ideal block structures associated with structural equivalence, the integer programming models have the flexibility to accommodate functional indices pertaining to regular equivalence. Computational results are reported for a variety of one- and two-mode matrices from the blockmodeling literature.  相似文献   

6.
Blockmodel approaches to network analysis as developed by Harrison White are shown to fall in a broader class of established data analysis methods based on matrix permutations (e.g., clique detection, seriation, permutation algorithms for sparse matrices). Blockmodels are seen as an important generalization of these earlier methods since they permit the data to characterize their own structure, instead of seeking to manifest some preconceived structure which is imposed by the investigator (e.g., cliques, hierarchies, or structural balance). General algorithms for the inductive construction of blockmodels thus occupy a central position in the development of the area. We discuss theoretical and practical aspects of the blockmodel search procedure which has been most widely used (CONCOR algorithm). It is proposed that the distinctive and advantageous feature of CONCOR is that it solves what is initially presented as a combinatorial problem (permutations of matrices to reveal zeroblocks) by representing the problem as a continuous one (analysis of correlation matrices). When this representation strategy receives further development, it is predicted that the fairly crude empirical approach of CONCOR will be supplanted by more powerful procedures within this same class.  相似文献   

7.
In this paper, we propose a cluster-MDS model for two-way one-mode continuous rating dissimilarity data. The model aims at partitioning the objects into classes and simultaneously representing the cluster centers in a low-dimensional space. Under the normal distribution assumption, a latent class model is developed in terms of the set of dissimilarities in a maximum likelihood framework. In each iteration, the probability that a dissimilarity belongs to each of the blocks conforming to a partition of the original dissimilarity matrix, and the rest of parameters, are estimated in a simulated annealing based algorithm. A model selection strategy is used to test the number of latent classes and the dimensionality of the problem. Both simulated and classical dissimilarity data are analyzed to illustrate the model.  相似文献   

8.
There are a number of important problems in quantitative psychology that require the identification of a permutation of the n rows and columns of an n × n proximity matrix. These problems encompass applications such as unidimensional scaling, paired‐comparison ranking, and anti‐Robinson forms. The importance of simultaneously incorporating multiple objective criteria in matrix permutation applications is well recognized in the literature; however, to date, there has been a reliance on weighted‐sum approaches that transform the multiobjective problem into a single‐objective optimization problem. Although exact solutions to these single‐objective problems produce supported Pareto efficient solutions to the multiobjective problem, many interesting unsupported Pareto efficient solutions may be missed. We illustrate the limitation of the weighted‐sum approach with an example from the psychological literature and devise an effective heuristic algorithm for estimating both the supported and unsupported solutions of the Pareto efficient set.  相似文献   

9.
Perhaps the most common criterion for partitioning a data set is the minimization of the within-cluster sums of squared deviation from cluster centroids. Although optimal solution procedures for within-cluster sums of squares (WCSS) partitioning are computationally feasible for small data sets, heuristic procedures are required for most practical applications in the behavioral sciences. We compared the performances of nine prominent heuristic procedures for WCSS partitioning across 324 simulated data sets representative of a broad spectrum of test conditions. Performance comparisons focused on both percentage deviation from the “best-found” WCSS values, as well as recovery of true cluster structure. A real-coded genetic algorithm and variable neighborhood search heuristic were the most effective methods; however, a straightforward two-stage heuristic algorithm, HK-means, also yielded exceptional performance. A follow-up experiment using 13 empirical data sets from the clustering literature generally supported the results of the experiment using simulated data. Our findings have important implications for behavioral science researchers, whose theoretical conclusions could be adversely affected by poor algorithmic performances.  相似文献   

10.
A common representation of data within the context of multidimensional scaling (MDS) is a collection of symmetric proximity (similarity or dissimilarity) matrices for each of M subjects. There are a number of possible alternatives for analyzing these data, which include: (a) conducting an MDS analysis on a single matrix obtained by pooling (averaging) the M subject matrices, (b) fitting a separate MDS structure for each of the M matrices, or (c) employing an individual differences MDS model. We discuss each of these approaches, and subsequently propose a straightforward new method (CONcordance PARtitioning—ConPar), which can be used to identify groups of individual-subject matrices with concordant proximity structures. This method collapses the three-way data into a subject×subject dissimilarity matrix, which is subsequently clustered using a branch-and-bound algorithm that minimizes partition diameter. Extensive Monte Carlo testing revealed that, when compared to K-means clustering of the proximity data, ConPar generally provided better recovery of the true subject cluster memberships. A demonstration using empirical three-way data is also provided to illustrate the efficacy of the proposed method.  相似文献   

11.
The clique partitioning problem (CPP) requires the establishment of an equivalence relation for the vertices of a graph such that the sum of the edge costs associated with the relation is minimized. The CPP has important applications for the social sciences because it provides a framework for clustering objects measured on a collection of nominal or ordinal attributes. In such instances, the CPP incorporates edge costs obtained from an aggregation of binary equivalence relations among the attributes. We review existing theory and methods for the CPP and propose two versions of a new neighborhood search algorithm for efficient solution. The first version (NS-R) uses a relocation algorithm in the search for improved solutions, whereas the second (NS-TS) uses an embedded tabu search routine. The new algorithms are compared to simulated annealing (SA) and tabu search (TS) algorithms from the CPP literature. Although the heuristics yielded comparable results for some test problems, the neighborhood search algorithms generally yielded the best performances for large and difficult instances of the CPP.  相似文献   

12.
The problem of comparing the agreement of two n × n matrices has a variety of applications in experimental psychology. A well-known index of agreement is based on the sum of the element-wise products of the matrices. Although less familiar to many researchers, measures of agreement based on within-row and/or within-column gradients can also be useful. We provide a suite of MATLAB programs for computing agreement indices and performing matrix permutation tests of those indices. Programs for computing exact p-values are available for small matrices, whereas resampling programs for approximate p-values are provided for larger matrices.  相似文献   

13.
The minimum‐diameter partitioning problem (MDPP) seeks to produce compact clusters, as measured by an overall goodness‐of‐fit measure known as the partition diameter, which represents the maximum dissimilarity between any two objects placed in the same cluster. Complete‐linkage hierarchical clustering is perhaps the best‐known heuristic method for the MDPP and has an extensive history of applications in psychological research. Unfortunately, this method has several inherent shortcomings that impede the model selection process, such as: (1) sensitivity to the input order of the objects, (2) failure to obtain a globally optimal minimum‐diameter partition when cutting the tree at K clusters, and (3) the propensity for a large number of alternative minimum‐diameter partitions for a given K. We propose that each of these problems can be addressed by applying an algorithm that finds all of the minimum‐diameter partitions for different values of K. Model selection is then facilitated by considering, for each value of K, the reduction in the partition diameter, the number of alternative optima, and the partition agreement among the alternative optima. Using five examples from the empirical literature, we show the practical value of the proposed process for facilitating model selection for the MDPP.  相似文献   

14.
The implementation of multiobjective programming methods in combinatorial data analysis is an emergent area of study with a variety of pragmatic applications in the behavioural sciences. Most notably, multiobjective programming provides a tool for analysts to model trade offs among competing criteria in clustering, seriation, and unidimensional scaling tasks. Although multiobjective programming has considerable promise, the technique can produce numerically appealing results that lack empirical validity. With this issue in mind, the purpose of this paper is to briefly review viable areas of application for multiobjective programming and, more importantly, to outline the importance of cross‐validation when using this method in cluster analysis.  相似文献   

15.
The decomposition of an asymmetric proximity matrix into its symmetric and skew‐symmetric components is a well‐known principle in combinatorial data analysis. The seriation of the skew‐symmetric component can emphasize information corresponding to the sign or absolute magnitude of the matrix elements, and the choice of objective criterion can have a profound impact on the ordering. In this research note, we propose a bicriterion approach for seriation of a skew‐symmetric matrix incorporating both sign and magnitude information. Two numerical demonstrations reveal that the bicriterion procedure is an effective alternative to direct seriation of the skew‐symmetric matrix, facilitating favourable trade‐offs among sign and magnitude information.  相似文献   

16.
Bentler PM  Yuan KH 《Psychometrika》2011,76(1):119-123
Indefinite symmetric matrices that are estimates of positive-definite population matrices occur in a variety of contexts such as correlation matrices computed from pairwise present missing data and multinormal based methods for discretized variables. This note describes a methodology for scaling selected off-diagonal rows and columns of such a matrix to achieve positive definiteness. As a contrast to recently developed ridge procedures, the proposed method does not need variables to contain measurement errors. When minimum trace factor analysis is used to implement the theory, only correlations that are associated with Heywood cases are shrunk.  相似文献   

17.
This paper is concerned with a problem where K (n×n) proximity matrices are available for a set of n objects. The goal is to identify a single permutation of the n objects that provides an adequate structural fit, as measured by an appropriate index, for each of the K matrices. A multiobjective programming approach for this problem, which seeks to optimize a weighted function of the K indices, is proposed, and illustrative examples are provided using a set of proximity matrices from the psychological literature. These examples show that, by solving the multiobjective programming model under different weighting schemes, the quantitative analyst can uncover information about the relationships among the matrices and often identify one or more permutations that provide good to excellent index values for all matrices.  相似文献   

18.
Several definitions are in use for the derivative of an m × p matrix function F(X) with respect to its n × q matrix argument X. We argue that only one of these definitions is a viable one, and that to study smooth maps from the space of n × q matrices to the space of m × p matrices it is often more convenient to study the map from nq-space to mp-space. Also, several procedures exist for a calculus of functions of matrices. It is argued that the procedure based on differentials is superior to other methods of differentiation, and leads inter alia to a satisfactory chain rule for matrix functions.  相似文献   

19.
It is well known that minimum-diameter partitioning of symmetric dissimilarity matrices can be framed within the context of coloring the vertices of a graph. Although confusion data are typically represented in the form of asymmetric similarity matrices, they are also amenable to a graph-coloring perspective. In this paper, we propose the integration of the minimum-diameter partitioning method with a neighborhood-based coloring approach for analyzing digraphs corresponding to confusion data. This procedure is capable of producing minimum-diameter partitions with the added desirable property that vertices with the same color have similar in-neighborhoods (i.e., directed edges entering the vertex) and out-neighborhoods (i.e., directed edges exiting the vertex) for the digraph corresponding to the minimum partition diameter.  相似文献   

20.
Two-mode binary data matrices arise in a variety of social network contexts, such as the attendance or non-attendance of individuals at events, the participation or lack of participation of groups in projects, and the votes of judges on cases. A popular method for analyzing such data is two-mode blockmodeling based on structural equivalence, where the goal is to identify partitions for the row and column objects such that the clusters of the row and column objects form blocks that are either complete (all 1s) or null (all 0s) to the greatest extent possible. Multiple restarts of an object relocation heuristic that seeks to minimize the number of inconsistencies (i.e., 1s in null blocks and 0s in complete blocks) with ideal block structure is the predominant approach for tackling this problem. As an alternative, we propose a fast and effective implementation of tabu search. Computational comparisons across a set of 48 large network matrices revealed that the new tabu-search heuristic always provided objective function values that were better than those of the relocation heuristic when the two methods were constrained to the same amount of computation time.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号