Density peak clustering algorithms and their variants have achieved promising results in many fields over the last few years. However, most of these algorithms have parameters that users must fine-tune. For real-world data without ground truth, identifying good parameter values for parametric clustering algorithms is often challenging and time-consuming. To address this, we propose a density peak clustering algorithm guided by pseudo labels (PLDPC), which avoids manually pre-specified parameters by applying the mutual information criterion. Specifically, we first design a novel pseudo-label generation method based on the theory of co-occurrence. We then maximize mutual information to obtain better clustering results. To evaluate the effectiveness of PLDPC, we conduct extensive experiments on 23 datasets, comprising six synthetic and seventeen real-world datasets. The experimental results show that PLDPC outperforms three classical algorithms (K-means, DPC, and DBSCAN) and eight state-of-the-art (SOTA) clustering algorithms in most cases.
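As a rough illustration of how a mutual information criterion can stand in for manual parameter tuning, the sketch below scores several candidate clusterings against a set of pseudo labels and keeps the one with the highest mutual information. It is not the PLDPC procedure itself: the abstract does not specify how pseudo labels are generated or how the maximization is carried out, so the function names (`mutual_information`, `select_by_mutual_information`), the toy labels, and the candidate settings are all hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two labelings of the same points."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n == 0:
        return 0.0
    count_a = Counter(labels_a)                  # marginal counts of labeling A
    count_b = Counter(labels_b)                  # marginal counts of labeling B
    count_ab = Counter(zip(labels_a, labels_b))  # joint counts
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def select_by_mutual_information(pseudo_labels, candidates):
    """Return the key of the candidate labeling with the highest MI
    against the pseudo labels (hypothetical selection rule)."""
    return max(
        candidates,
        key=lambda name: mutual_information(pseudo_labels, candidates[name]),
    )


# Toy data: pseudo labels for six points and three candidate clusterings,
# each imagined to come from a different parameter setting (assumed values).
pseudo = [0, 0, 0, 1, 1, 1]
candidates = {
    "setting_a": [0, 0, 0, 0, 0, 0],  # everything in one cluster
    "setting_b": [1, 1, 1, 0, 0, 0],  # same partition, cluster ids swapped
    "setting_c": [0, 0, 1, 1, 0, 1],  # partially agreeing partition
}
best = select_by_mutual_information(pseudo, candidates)
print(best)
```

Mutual information ignores how cluster ids are named, so `setting_b` scores highest even though its ids are swapped relative to the pseudo labels. Raw mutual information does not penalize over-segmentation, so a practical criterion would likely need normalization or a constraint on the number of clusters; that choice is left open here.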
Keywords
- Density peak clustering
- Maximizing mutual information
- Pseudo labels
ASJC Scopus subject areas
- Management Information Systems
- Information Systems and Management
- Artificial Intelligence