Single-Cell Clustering Based on Shared Nearest Neighbor and Graph Partitioning

Authors :
Xiaoshu Zhu
Hong-Dong Li
Jianxin Wang
Jie Zhang
Yunpei Xu
Xiaoqing Peng
Source :
Interdisciplinary Sciences: Computational Life Sciences. 12:117-130
Publication Year :
2020
Publisher :
Springer Science and Business Media LLC, 2020.

Abstract

Clustering single-cell RNA sequencing (scRNA-seq) data enables the discovery of cell subtypes, which helps in understanding and analyzing disease processes. Determining edge weights is an essential component of graph-based clustering methods. Although several graph-based clustering algorithms for scRNA-seq data have been proposed, they are generally based on k-nearest neighbor (KNN) and shared nearest neighbor (SNN) graphs and do not consider the structural information of the graph. Here, to improve clustering accuracy, we present a novel method for single-cell clustering, called structural shared nearest neighbor-Louvain (SSNN-Louvain), which integrates the structural information of the graph with module detection. In SSNN-Louvain, the weight of an edge is based on the distance between a node and its shared nearest neighbors and incorporates the ratio of the number of shared nearest neighbors to the number of nearest neighbors, thereby integrating the structural information of the graph. A modified Louvain community detection algorithm is then proposed and applied to identify modules in the graph; each community represents a cell subtype. Notably, the proposed method combines the advantages of the SNN graph and community detection and requires tuning no parameter other than the number of neighbors. To test the performance of SSNN-Louvain, we compare it with five existing methods (nonnegative matrix factorization, single-cell interpretation via multi-kernel learning, SNN-Cliq, Seurat and PhenoGraph) on 16 real datasets. The experimental results show that our approach achieves the best average performance on these datasets.
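
The shared-nearest-neighbor ratio mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's weighting formula, which the abstract does not give: the sketch assumes Euclidean distance and brute-force neighbor search, uses only the ratio of shared neighbors to k as the edge weight, and leaves out both the distance-based term and the modified Louvain step. The function names `knn` and `snn_ratio_weights` are hypothetical.

```python
import math


def knn(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors.

    Assumption: Euclidean distance, brute-force search (fine for small inputs).
    """
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_ratio_weights(points, k):
    """Weight edge (i, j) by (number of shared nearest neighbors) / k.

    Illustrative only: the published SSNN weight also depends on distances
    to the shared neighbors, which is not reproduced here.
    """
    nn = knn(points, k)
    weights = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = nn[i] & nn[j]
            if shared:  # keep an edge only if the two cells share a neighbor
                weights[(i, j)] = len(shared) / k
    return weights


# Two well-separated toy "cell" groups in a 2-D expression space.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
edges = snn_ratio_weights(cells, k=2)
```

On this toy input, edges appear only between cells of the same group, so a community detection step run on the weighted graph would have two disconnected components to work with. The only parameter is k, the number of neighbors, which matches the abstract's remark that no other parameter needs tuning.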

Details

ISSN :
1867-1462 and 1913-2751
Volume :
12
Database :
OpenAIRE
Journal :
Interdisciplinary Sciences: Computational Life Sciences
Accession number :
edsair.doi.dedup.....c4c8d3804d5a7226a98af296488b55e9
Full Text :
https://doi.org/10.1007/s12539-019-00357-4