Clustering Single-cell RNA-sequencing Data based on Matching Clusters Structures

Yizhang Wang, You Zhou, Wei Pang, Yanchun Liang, Shu Wang

Research output: Contribution to journal › Article › peer-review



Single-cell sequencing technology can generate RNA-sequencing data at the single-cell level, and one important task in the analysis of such data is to identify cell types without supervised information. Clustering is an unsupervised approach that can yield new biological insights, especially when exploring the biological functions of specific cell types. However, it is challenging for traditional clustering methods to obtain high-quality cell type recognition results. In this research, we propose a novel clustering method based on Matching Clusters Structures (MCSC) for identifying cell types in single-cell RNA-sequencing data. First, MCSC obtains two different groups of clustering results from the same K-means algorithm; the groups differ because the initial centroids are randomly selected. Then, for one group, MCSC uses shared nearest neighbour information to calculate a label transition matrix, which gives the label transition probability between any two initial clusters. An initial cluster may be reassigned if the merged result after label transition satisfies a consensus function that maximizes the structural matching degree between the two groups of clustering results. In essence, MCSC may be interpreted as a label training process. We evaluate the proposed MCSC on five commonly used datasets and compare it with several classical and state-of-the-art algorithms. The experimental results show that MCSC outperforms the other algorithms.
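The first steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formula for the label transition matrix, so the row-normalised shared-nearest-neighbour mass used here is an assumption, as are the function names (`kmeans`, `snn_counts`, `label_transition_matrix`), the synthetic data, and all parameter values. The consensus function and the reassignment step are not sketched, because the abstract does not specify them.

```python
import numpy as np


def kmeans(X, k, seed, n_iter=50):
    """Plain Lloyd's K-means with randomly selected initial centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each cell to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster became empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels


def snn_counts(X, n_neighbors):
    """S[i, j] = number of cells that appear in the k-NN lists of both i and j."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a cell is not its own neighbour
    knn = np.argsort(dists, axis=1)[:, :n_neighbors]
    member = np.zeros((len(X), len(X)), dtype=int)
    member[np.arange(len(X))[:, None], knn] = 1
    return member @ member.T


def label_transition_matrix(snn, labels, k):
    """ASSUMED definition: SNN mass between clusters, normalised per row."""
    onehot = np.eye(k, dtype=int)[labels]            # shape (n_cells, k)
    mass = (onehot.T @ snn @ onehot).astype(float)   # shape (k, k)
    row_sums = mass.sum(axis=1, keepdims=True)
    # Rows of empty clusters stay at zero instead of dividing by zero.
    return np.divide(mass, row_sums, out=np.zeros_like(mass), where=row_sums > 0)


# Synthetic stand-in for an expression matrix: 60 cells, 5 features, 3 blobs.
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(loc=c, scale=1.0, size=(20, 5)) for c in (0.0, 4.0, 8.0)])

k = 4
labels_a = kmeans(X, k, seed=1)  # first clustering result
labels_b = kmeans(X, k, seed=2)  # second result, different random initial centroids

snn = snn_counts(X, n_neighbors=5)
P = label_transition_matrix(snn, labels_a, k)
print(np.round(P, 3))
```

Under this assumed definition, a large off-diagonal entry `P[a, b]` indicates that cells in initial cluster `a` share many neighbours with cells in cluster `b`, which makes the pair a plausible candidate for the merging step that the abstract describes.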

Original language: English
Pages (from-to): 89-95
Number of pages: 7
Journal: Tehnički vjesnik – Technical Gazette
Issue number: 1
Publication status: Published - 15 Feb 2020


Keywords

  • Clustering
  • Consensus function
  • Single-cell sequencing

ASJC Scopus subject areas

  • General Engineering

