TY - JOUR
T1 - A Multi-View Clustering Algorithm for Mixed Numeric and Categorical Data
AU - Ji, Jinchao
AU - Li, Ruonan
AU - Pang, Wei
AU - He, Fei
AU - Feng, Guozhong
AU - Zhao, Xiaowei
N1 - Funding Information:
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61502093, Grant 61802057, Grant 61977015, and Grant 62077012; in part by the Fundamental Research Funds for the Central Universities under Grant 2412019FZ048, Grant 2412019FZ047, Grant 2412019FZ052, Grant 2412019ZD014, and Grant 2412020XK003; and in part by the Science and Technology Research Project of "13th Five-Year" of the Education Department, Jilin, under Grant JJKH20190290KJ.
Publisher Copyright:
© 2013 IEEE.
PY - 2021/2/4
Y1 - 2021/2/4
N2 - Clustering data with both numeric and categorical attributes is of great importance, as such data are ubiquitous in real-world problems. Multi-view learning approaches have proven to be more effective and to have better generalisation ability than single-view learning in many problems. However, most existing clustering algorithms developed for mixed numeric and categorical data are single-view. In this research, we propose a novel multi-view clustering algorithm based on k-prototypes (which we term Multi-view K-Prototypes) for clustering mixed data. To the best of our knowledge, Multi-view K-Prototypes is the first multi-view version of the well-known k-prototypes algorithm. To cluster mixed data over multiple views, we present a novel representation prototype of cluster centres for the multi-view scenario, and we devise formulas for updating the cluster centres over each view. We then propose the concept of consensus cluster centres to output the final clustering result. Finally, we carried out a series of experiments on four benchmark datasets to assess the performance of the proposed Multi-view K-Prototypes algorithm. Experimental results show that Multi-view K-Prototypes outperforms seven state-of-the-art algorithms in most cases.
AB - Clustering data with both numeric and categorical attributes is of great importance, as such data are ubiquitous in real-world problems. Multi-view learning approaches have proven to be more effective and to have better generalisation ability than single-view learning in many problems. However, most existing clustering algorithms developed for mixed numeric and categorical data are single-view. In this research, we propose a novel multi-view clustering algorithm based on k-prototypes (which we term Multi-view K-Prototypes) for clustering mixed data. To the best of our knowledge, Multi-view K-Prototypes is the first multi-view version of the well-known k-prototypes algorithm. To cluster mixed data over multiple views, we present a novel representation prototype of cluster centres for the multi-view scenario, and we devise formulas for updating the cluster centres over each view. We then propose the concept of consensus cluster centres to output the final clustering result. Finally, we carried out a series of experiments on four benchmark datasets to assess the performance of the proposed Multi-view K-Prototypes algorithm. Experimental results show that Multi-view K-Prototypes outperforms seven state-of-the-art algorithms in most cases.
KW - Classification algorithms
KW - Clustering algorithms
KW - Data clustering
KW - Licenses
KW - Machine learning algorithms
KW - Mathematical model
KW - Partitioning algorithms
KW - Prototypes
KW - mixed data
KW - multi-view learning
KW - numeric and categorical attributes
UR - http://www.scopus.com/inward/record.url?scp=85106807735&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3057113
DO - 10.1109/ACCESS.2021.3057113
M3 - Article
SN - 2169-3536
VL - 9
SP - 24913
EP - 24924
JO - IEEE Access
JF - IEEE Access
ER -