Hierarchical Clustering Ensemble preserving the T-transitivity
Clustering, Hierarchical clustering, Ultrametric, Dendrograms, Aggregation function, T-transitivity.
The main idea of ensemble learning is to improve machine learning results by combining several models. Initially applied to supervised learning, this approach usually produces better results than single methods. Similarly, unsupervised ensemble learning, or consensus clustering, creates a single clustering that is more robust than the output of any individual method. The most common methods are designed for flat clustering and show superior quality compared with single clustering methods. Thus, a consensus of hierarchical clusterings can also be expected to yield higher-quality hierarchical clustering. Recent studies have not considered the particularities inherent in the different hierarchical clustering methods during the consensus process. This work investigates the impact of ensemble consistency on the final consensus results, considering the different hierarchical methods used in the ensemble. We propose a process that retains intermediate transitivity in dendrograms. In this algorithm, the dendrograms describing the base clusterings are first converted to ultrametric matrices. Then, after a fuzzification step, a consensus function based on an aggregation operator that preserves the transitivity property is applied to the matrices to form the final consensus matrix. The final clustering is a dendrogram obtained from this aggregated matrix. Experiments on well-known datasets, together with visualization of the algorithm's process on two-dimensional datasets, show that this approach can significantly improve accuracy while retaining the consistency property.
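The pipeline outlined in the abstract (dendrograms, ultrametric matrices, fuzzification, transitivity-preserving aggregation, final dendrogram) can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not fix: fuzzification is taken to be a linear rescaling of cophenetic distances into similarities in [0, 1], the t-norm T is taken to be the minimum, and the aggregation operator is the element-wise minimum, which is known to preserve min-transitivity. The function names (`ultrametric_matrix`, `fuzzify`, `is_min_transitive`, `consensus_dendrogram`) are hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import squareform


def ultrametric_matrix(points, method):
    """Cophenetic (ultrametric) distance matrix of one base dendrogram."""
    Z = linkage(points, method=method)
    return squareform(cophenet(Z))


def fuzzify(U):
    """Assumed fuzzification: rescale distances to similarities in [0, 1]."""
    return 1.0 - U / U.max()


def is_min_transitive(S, tol=1e-9):
    """Check S[i, k] >= min(S[i, j], S[j, k]) for all i, j, k."""
    # Max-min composition of S with itself, computed by broadcasting.
    maxmin = np.minimum(S[:, :, None], S[None, :, :]).max(axis=1)
    return bool(np.all(S >= maxmin - tol))


def consensus_dendrogram(points, methods=("single", "complete", "average")):
    """Aggregate base dendrograms with the element-wise minimum (assumption)."""
    relations = [fuzzify(ultrametric_matrix(points, m)) for m in methods]
    # The minimum of min-transitive relations is again min-transitive.
    consensus = np.minimum.reduce(relations)
    # Back to distances; 1 - consensus is an ultrametric, so single linkage
    # reproduces it exactly as a dendrogram.
    distances = 1.0 - consensus
    Z = linkage(squareform(distances, checks=False), method="single")
    return consensus, Z


# Usage on a small synthetic two-dimensional dataset.
rng = np.random.default_rng(0)
points = rng.normal(size=(12, 2))
consensus, Z = consensus_dendrogram(points)
print(is_min_transitive(consensus))
```

The minimum is used here only because its transitivity-preserving behaviour is easy to verify; other aggregation operators, such as the arithmetic mean, do not preserve min-transitivity in general, which is the kind of inconsistency the proposed process is meant to avoid.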