Similar literature
 20 similar documents found (search time: 15 ms)
1.
Due to the effects of outliers, mixture model tests that require all objects to be classified can severely underestimate the accuracy of hierarchical clustering algorithms. More valid and relevant comparisons between algorithms can be made by calculating accuracy at several levels in the hierarchical tree and considering accuracy as a function of the coverage of the classification. Using this procedure, several algorithms were compared on their ability to resolve ten multivariate normal mixtures. All of the algorithms were significantly more accurate than a random linkage algorithm, and accuracy was inversely related to coverage. Algorithms using correlation as the similarity measure were significantly more accurate than those using Euclidean distance (p < .001). A subset of high accuracy algorithms, including single, average, and centroid linkage using correlation, and Ward's minimum variance technique, was identified.

2.
3.
Additive similarity trees   Cited by: 20 (self-citations: 0, citations by others: 20)
Similarity data can be represented by additive trees. In this model, objects are represented by the external nodes of a tree, and the dissimilarity between objects is the length of the path joining them. The additive tree is less restrictive than the ultrametric tree, commonly known as the hierarchical clustering scheme. The two representations are characterized and compared. A computer program, ADDTREE, for the construction of additive trees is described and applied to several sets of data. A comparison of these results to the results of multidimensional scaling illustrates some empirical and theoretical advantages of tree representations over spatial representations of proximity data. We thank Nancy Henley and Vered Kraus for providing us with data, and Jan deLeeuw for calling our attention to relevant literature. The work of the first author was supported in part by the Psychology Unit of the Israel Defense Forces.
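
The additive-tree model in the abstract above can be illustrated with the standard four-point condition: a dissimilarity matrix is representable as path lengths in a tree exactly when, for every four objects, the two largest of the three pairwise sums are equal. The sketch below is only an illustration of that condition, not the ADDTREE program; the function name, the tolerance parameter, and the example matrix are assumptions made here.

```python
from itertools import combinations


def satisfies_four_point(d, tol=1e-9):
    """Return True if every quadruple of objects passes the four-point condition.

    d is a symmetric dissimilarity matrix given as a list of lists.
    (Hypothetical helper for illustration; not part of ADDTREE.)
    """
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        # The three ways of pairing up four objects.
        sums = sorted((d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]))
        # Additivity requires the two largest sums to coincide.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Path lengths in a small tree (an assumed example): leaves 0 and 1 hang off
# one internal node with branch lengths 1 and 2, leaves 2 and 3 hang off
# another with branch lengths 1 and 4, and the internal edge has length 3.
tree_distances = [
    [0, 3, 5, 8],
    [3, 0, 6, 9],
    [5, 6, 0, 5],
    [8, 9, 5, 0],
]

# Same matrix with one distance perturbed, so no tree reproduces it exactly.
perturbed = [row[:] for row in tree_distances]
perturbed[1][3] = perturbed[3][1] = 10

print(satisfies_four_point(tree_distances))  # True
print(satisfies_four_point(perturbed))       # False
```

The example matrix is additive but not ultrametric: for objects 0, 1, and 2 the distances are 3, 5, and 6, so the two largest are not tied. This is one concrete sense in which the additive tree is the less restrictive model.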

4.
Some extensions of Johnson's hierarchical clustering algorithms   Cited by: 1 (self-citations: 0, citations by others: 1)
Considerable attention has been given in the psychological literature to techniques of data reduction that partition a set of objects into optimally homogeneous groups. This paper is an attempt to extend the hierarchical partitioning algorithms proposed by Johnson and to emphasize a general connection between these clustering procedures and the mathematical theory of lattices. A goodness-of-fit statistic is first proposed that is invariant under monotone increasing transformations of the basic similarity matrix. This statistic is then applied to three illustrative hierarchical clusterings: two obtained by the Johnson algorithms and one obtained by an algorithm that produces the same chain under hypermonotone increasing transformations of the similarity measures.

5.
In one well-known model for psychological distances, objects such as stimuli are placed in a hierarchy of clusters like a phylogenetic tree; in another common model, objects are represented as points in a multidimensional Euclidean space. These models are shown theoretically to be mutually exclusive and exhaustive in the following sense. The distances among a set of n objects will be strictly monotonically related either to the distances in a hierarchical clustering system, or else to the distances in a Euclidean space of less than n - 1 dimensions, but not to both. Consequently, a lower bound on the number of Euclidean dimensions necessary to represent a set of objects is one less than the size of the largest subset of objects whose distances satisfy the ultrametric inequality, which characterizes the hierarchical model. This work was supported in part by Grant GB-13588X from the National Science Foundation. I would like to thank L. M. Kelly and A. A. J. Marley for their helpful comments and suggestions.
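
The lower bound stated in this abstract can be made concrete with a brute-force sketch: find the largest subset of objects whose distances satisfy the ultrametric inequality, d(x, z) <= max(d(x, y), d(y, z)), and subtract one. The function names and the example matrix are assumptions introduced here for illustration, and the exhaustive search is only practical for small sets of objects.

```python
from itertools import combinations


def subset_is_ultrametric(d, items):
    """Check the ultrametric inequality on every triple drawn from items.

    For a triple, the inequality holds in all orderings exactly when the
    two largest of the three distances are equal.
    """
    for i, j, k in combinations(items, 3):
        sides = sorted((d[i][j], d[i][k], d[j][k]))
        if sides[1] != sides[2]:
            return False
    return True


def dimension_lower_bound(d):
    """One less than the size of the largest ultrametric subset (brute force)."""
    n = len(d)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if subset_is_ultrametric(d, subset):
                return size - 1
    return 0


# Assumed example: objects 0, 1, 2 form an ultrametric triple (2, 3, 3),
# but every triple that includes object 3 violates the inequality.
distances = [
    [0, 2, 3, 4],
    [2, 0, 3, 5],
    [3, 3, 0, 6],
    [4, 5, 6, 0],
]

print(dimension_lower_bound(distances))  # 2
```

Because the ultrametric inequality only compares distances by order, the check gives the same answer for any strictly monotone transformation of the matrix, which matches the nonmetric setting the abstract describes.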

6.
A class of related nonmetric (“monotone invariant”) hierarchical grouping methods is presented. The methods are defined in terms of generalized cliques, based on a systematically varying specification of the degree of indirectness of permitted relationships (i.e., degree of “chaining”). This approach to grouping is shown to provide a useful framework for grouping methods based on an a priori specification of the properties of the desired subsets, and includes a natural generalization for “complete linkage” and “single linkage” clustering, such as the methods of Johnson [1967]. The central feature of the class of methods is a simple iterative matrix operation on the original disparities (“inverse-proximities” or “dissimilarities”) matrix, and one of the methods also constitutes a very efficient single linkage clustering procedure.

7.
An ultrametric is shown equivalent to a matrix that is idempotent under a particular definition of matrix multiplication. By considering ultrametrics in terms of semirings and lattices, the clustering methods Bk of Jardine and Sibson are reinterpreted using this nonstandard operation.
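
The abstract does not say which matrix multiplication it uses. One product for which the stated equivalence is known to hold is the min-max product, where entry (i, j) of the result is the minimum over k of max(a[i][k], b[k][j]). The sketch below assumes that product; the function names and example matrices are hypothetical, and the paper's own definition may differ.

```python
def minmax_product(a, b):
    """Min-max product of two square matrices given as lists of lists.

    Entry (i, j) is min over k of max(a[i][k], b[k][j]). This is an
    assumed choice of product, not one confirmed by the abstract.
    """
    n = len(a)
    return [
        [min(max(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def is_minmax_idempotent(d):
    """True if the min-max product of d with itself gives d back."""
    return minmax_product(d, d) == d


# An ultrametric: in the only triple, the two largest distances are tied (3, 3).
ultrametric = [
    [0, 1, 3],
    [1, 0, 3],
    [3, 3, 0],
]

# Not an ultrametric: d(0, 2) = 3 exceeds max(d(0, 1), d(1, 2)) = 2.
not_ultrametric = [
    [0, 1, 3],
    [1, 0, 2],
    [3, 2, 0],
]

print(is_minmax_idempotent(ultrametric))      # True
print(is_minmax_idempotent(not_ultrametric))  # False
```

With a zero diagonal, the product can never exceed the original entry (take k = i), so idempotence fails precisely when some intermediate object k offers a strictly smaller maximum, which is a violation of the ultrametric inequality.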

8.
The psychometric and classification literatures have illustrated the fact that a wide class of discrete or network models (e.g., hierarchical or ultrametric trees) for the analysis of ordinal proximity data is plagued by potential degenerate solutions if estimated using traditional nonmetric procedures (i.e., procedures which optimize a STRESS-based criterion of fit and whose solutions are invariant under a monotone transformation of the input data). This paper proposes a new parametric, maximum likelihood based procedure for estimating ultrametric trees for the analysis of conditional rank order proximity data. We present the technical aspects of the model and the estimation algorithm. Some preliminary Monte Carlo results are discussed. A consumer psychology application is provided examining the similarity of fifteen types of snack/breakfast items. Finally, some directions for future research are provided.

9.
A Monte Carlo evaluation of thirty internal criterion measures for cluster analysis was conducted. Artificial data sets were constructed with clusters which exhibited the properties of internal cohesion and external isolation. The data sets were analyzed by four hierarchical clustering methods. The resulting values of the internal criteria were compared with two external criterion indices which determined the degree of recovery of correct cluster structure by the algorithms. The results indicated that a subset of internal criterion measures could be identified which appear to be valid indices of correct cluster recovery. Indices from this subset could form the basis of a permutation test for the existence of cluster structure or a clustering algorithm.

10.
11.
Milligan, Glenn W. Psychometrika, 1980, 45(3): 325-342
An evaluation of several clustering methods was conducted. Artificial clusters which exhibited the properties of internal cohesion and external isolation were constructed. The true cluster structure was subsequently hidden by six types of error-perturbation. The results indicated that the hierarchical methods were differentially sensitive to the type of error perturbation. In addition, generally poor recovery performance was obtained when random seed points were used to start the K-means algorithms. However, two alternative starting procedures for the nonhierarchical methods produced greatly enhanced cluster recovery and were found to be robust with respect to all of the types of error examined.

12.
Connections are pointed out between the concept of a cut in a graph and the data analysis problems of hierarchical clustering and seriation encountered in the social and behavioral sciences. An emphasis is placed on hierarchical clustering by the criterion of k-edge connectivity and on the relationship between several criteria for object seriation proposed in the literature and the appropriate graph-theoretical structures.

13.
In Experiment I, subjects made similarity judgments about all 56 category terms listed in the Battig and Montague (1969) norms. These judgments were then subjected to a hierarchical clustering analysis. Experiment II demonstrated that the relations among the category labels are very similar to the relations among the high dominance exemplars of these categories. Experiment III showed that the distances between the category terms in the hierarchical clustering analysis could predict RTs in a same-different paradigm.

14.
Theories of attitude change have failed to identify the architecture of interattitudinal structures and relate it to attitude change. This article examines two models (a hierarchical and a spatial–linkage model) of interattitudinal structure that explicitly posit consequences for attitude change. An experiment (N = 391) was conducted that manipulated type of hierarchy (explicit versus implicit), whether the hierarchy was primed or not, and the location in the hierarchy to which a message was directed. Whereas the hierarchical model predicts only top–down influence of attitudes on each other, a spatial–linkage model predicts that linked attitudes may influence each other regardless of hierarchical position. The results support the spatial–linkage model in that interattitudinal change is constrained less by a concept's relative position in a hierarchical structure than by the concept's association with other concepts in that structure. Furthermore, within these interattitudinal structures, concepts directly targeted by a persuasive message often exhibit less attitude change than related concepts to which the focal concept appears to be linked. Finally, an explicit hierarchy of concepts appears to facilitate interattitudinal influence much more than an implicit hierarchy of concepts does; the key to this facilitation seems to be the mental accessibility of the organizational structure.

15.
This paper illustrates two formal models for psychiatric classification. The first model, called a hierarchical or tree structure, requires patient categories to be disjoint or strictly nested. The second model, called the generally overlapping or network model, allows patient categories to cut across each other in a variety of different ways. Thus, patient groups can be disjoint, strictly nested (as in a hierarchy), or partially overlapping. To derive classification schemes consistent with the structural models, two different clustering techniques were applied to interpatient similarity data collected on 50 psychiatric patients. A hierarchical clustering technique was applied to the similarity data to obtain a hierarchical classification. To obtain a generally overlapping classification, Peay's cliquing procedure was applied to the same data. Two criteria were used to compare the clustering solutions. First, a solution's goodness-of-fit to the original data was examined by calculating the proportion of variance accounted for by cluster categories. Second, the predictive accuracy of a solution was analyzed by looking at the categories' ability to predict treatment assignment. The generally overlapping solution produced the best fit to the original similarity data; however, the hierarchical solution's clusters tended to be more readily interpretable in terms of psychiatric syndromes. Both clustering solutions were relatively poor predictors of treatment assignment. It was concluded that the hierarchical and generally overlapping approaches, although not conclusively demonstrated, represented promising models for psychiatric classification.

16.
For the clustering problem with general (not necessarily symmetric) relational constraints, different sets of feasible clusterings, also called clustering types, determined by the same relation, can be defined. In this paper some clustering types are discussed and adaptations of the hierarchical clustering method compatible with these clustering types are proposed.

17.
This paper proposes a novel secure routing mechanism, the Spatial and Energy Aware Trusted Dynamic Distance Source Routing (SEAT-DSR) algorithm, for extending the network lifetime of wireless sensor networks. Here, the spatial information, energy level, and effectiveness of data quality are equalized by Quality of Service (QoS) based energy-aware routing algorithms. In addition, a standard clustering algorithm is incorporated to group the wireless sensor nodes based on trust score, spatial information, energy level, and the distance between nodes. SEAT-DSR is also capable of making decisions over the evaluation metrics that determine and express the QoS. Moreover, a new hierarchical trust mechanism is introduced in this model, which adopts multiple attributes of the wireless sensor nodes according to data communication speed, data size, energy consumption, and recommendation. This new hierarchical trust method relies on an improved sliding time window that considers the frequency of various attacks in order to identify attackers by discovering their anomalous behaviour. The proposed SEAT-DSR is evaluated through many experiments in a simulation environment created using Network Simulator-2 (NS2). The experimental results show that the proposed algorithm increases the average packet transfer rate drastically compared with existing secure routing methodologies.

18.
Statisticians typically estimate the parameters of latent class and latent profile models using the Expectation-Maximization algorithm. This paper proposes an alternative two-stage approach to model fitting. The first stage uses the modified k-means and hierarchical clustering algorithms to identify the latent classes that best satisfy the conditional independence assumption underlying the latent variable model. The second stage then uses mixture modeling treating the class membership as known. The proposed approach is theoretically justifiable, directly checks the conditional independence assumption, and converges much faster than the full likelihood approach when analyzing high-dimensional data. This paper also develops a new classification rule based on latent variable models. The proposed classification procedure reduces the dimensionality of measured data and explicitly recognizes the heterogeneous nature of the complex disease, which makes it perfect for analyzing high-throughput genomic data. Simulation studies and real data analysis demonstrate the advantages of the proposed method.

19.
This paper presents a new procedure called TREEFAM for estimating ultrametric tree structures from proximity data confounded by differential stimulus familiarity. The objective of the proposed TREEFAM procedure is to quantitatively filter out the effects of stimulus unfamiliarity in the estimation of an ultrametric tree. A conditional, alternating maximum likelihood procedure is formulated to simultaneously estimate an ultrametric tree, under the unobserved condition of complete stimulus familiarity, and subject-specific parameters capturing the adjustments due to differential unfamiliarity. We demonstrate the performance of the TREEFAM procedure under a variety of alternative conditions via a modest Monte Carlo experimental study. An empirical application provides evidence that TREEFAM outperforms traditional models that ignore the effects of unfamiliarity in terms of superior tree recovery and overall goodness-of-fit.

20.
Two types of graph-theoretic representations of psychological distances or dissimilarities are proposed: weighted free trees and weighted bidirectional trees. A weighted free tree is a generalization of the sort of graph representation used in hierarchical clustering. A weighted bidirectional tree is a further generalization which allows for asymmetric dissimilarities. The properties of these structures are discussed and numerical methods are presented that can be used to derive a representation for any given set of dissimilarities. The applicability of these structures is illustrated by using them to represent the data from experiments on the similarities among animal terms and on memory for sentences. An analysis based upon free trees is compared and contrasted with analyses using hierarchical clustering and multidimensional scaling.
