An improved density peak clustering algorithm guided by pseudo labels

Yizhang Wang, Wei Pang, Jingchu Zhou

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)
54 Downloads (Pure)

Abstract

Density peak clustering algorithms and their variants have achieved promising results in many fields over the last few years. However, most of these algorithms have parameters that must be fine-tuned by users. When facing real-world data without ground truth, it is often challenging and time-consuming to identify good parameter values for parametric clustering algorithms. Considering this, we propose a density peak clustering algorithm guided by pseudo labels (PLDPC), which avoids manually pre-specified parameters by applying the mutual information criterion. Specifically, we first design a novel pseudo-label generation method based on the theory of co-occurrence. Then, we obtain better clustering results by maximizing mutual information. To evaluate the effectiveness of the proposed PLDPC algorithm, we conduct extensive experiments on 23 datasets, including six synthetic and seventeen real-world datasets. The experimental results show that PLDPC outperforms three classical algorithms (i.e., K-means, DPC, and DBSCAN) and eight state-of-the-art (SOTA) clustering algorithms in most cases.
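The abstract does not spell out how the mutual information criterion is applied, so the following is only a minimal sketch of the general idea, not the authors' PLDPC implementation. It assumes that a set of pseudo labels is already available and that several candidate clusterings have been produced under different parameter values; it then keeps the candidate whose labels share the most mutual information with the pseudo labels. The function names, the candidate parameter values, and the toy labelings are all hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("labelings must be non-empty and of equal length")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # Each term is p(a, b) * log(p(a, b) / (p(a) * p(b))).
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def select_by_mutual_information(candidates, pseudo_labels):
    """Return the key of the candidate labeling with maximal MI against pseudo_labels.

    `candidates` maps a (hypothetical) parameter value to the labeling it produced.
    """
    return max(
        candidates,
        key=lambda param: mutual_information(candidates[param], pseudo_labels),
    )


# Toy data: pseudo labels plus three candidate clusterings,
# keyed by made-up parameter values.
pseudo = [0, 0, 0, 1, 1, 1]
candidates = {
    0.1: [0, 0, 0, 0, 0, 0],  # everything merged into one cluster
    0.5: [0, 0, 0, 1, 1, 1],  # agrees with the pseudo labels
    0.9: [0, 0, 1, 1, 1, 1],  # one point assigned differently
}
print(select_by_mutual_information(candidates, pseudo))  # prints 0.5
```

Raw mutual information tends to favor partitions with many small clusters, so a normalized variant is often preferred in practice; whether PLDPC uses one is not stated in the abstract.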
Original language: English
Article number: 109374
Journal: Knowledge-Based Systems
Volume: 252
Early online date: 8 Jul 2022
Publication status: Published - 27 Sept 2022

Keywords

  • Density peak clustering
  • Maximizing mutual information
  • Pseudo labels

ASJC Scopus subject areas

  • Software
  • Management Information Systems
  • Information Systems and Management
  • Artificial Intelligence
