Single-Cell Clustering Based on Shared Nearest Neighbor and Graph Partitioning
- Source :
- Interdisciplinary Sciences: Computational Life Sciences. 12:117-130
- Publication Year :
- 2020
- Publisher :
- Springer Science and Business Media LLC, 2020.
Abstract
- Clustering single-cell RNA sequencing (scRNA-seq) data enables the discovery of cell subtypes, which helps in understanding and analyzing disease processes. Determining edge weights is an essential component of graph-based clustering methods. Although several graph-based clustering algorithms for scRNA-seq data have been proposed, they are generally based on k-nearest neighbor (KNN) and shared nearest neighbor (SNN) graphs and do not consider the structural information of the graph. Here, to improve clustering accuracy, we present a novel method for single-cell clustering, called structural shared nearest neighbor-Louvain (SSNN-Louvain), which integrates the structural information of the graph with module detection. In SSNN-Louvain, the weight of an edge is defined from the distance between a node and its shared nearest neighbors, together with the ratio of the number of shared nearest neighbors to the number of nearest neighbors, thereby incorporating the structural information of the graph. Then, a modified Louvain community detection algorithm is proposed and applied to identify modules in the graph; each community represents a cell subtype. The proposed method combines the advantages of the SNN graph and community detection and requires tuning no parameter other than the number of neighbors. To test the performance of SSNN-Louvain, we compare it with five existing methods (nonnegative matrix factorization, single-cell interpretation via multi-kernel learning, SNN-Cliq, Seurat, and PhenoGraph) on 16 real datasets. The experimental results show that our approach achieves the best average performance on these datasets.
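The edge-weighting idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the weight used here (the shared-neighbor ratio divided by one plus the mean distance to the shared neighbors) is an assumption made for illustration, not the authors' definition. The function names `knn_sets` and `ssnn_weights` are hypothetical, and the brute-force neighbor search is only suitable for small inputs.

```python
import math
from itertools import combinations


def knn_sets(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors.

    Brute-force search with Euclidean distance; fine for a small example.
    """
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def ssnn_weights(points, k):
    """Build illustrative SNN edge weights (ASSUMED formula, not the paper's).

    An edge (i, j) exists only if i and j share at least one nearest neighbor.
    weight = (|shared| / k) / (1 + mean distance from i and j to the shared set)
    """
    nn = knn_sets(points, k)
    weights = {}
    for i, j in combinations(range(len(points)), 2):
        shared = nn[i] & nn[j]
        if not shared:
            continue
        # Ratio of shared nearest neighbors to nearest neighbors.
        ratio = len(shared) / k
        # Mean distance from both endpoints to their shared neighbors.
        total = sum(
            math.dist(points[i], points[s]) + math.dist(points[j], points[s])
            for s in shared
        )
        mean_dist = total / (2 * len(shared))
        weights[(i, j)] = ratio / (1.0 + mean_dist)
    return weights


if __name__ == "__main__":
    # Two well-separated groups of four points each (made-up toy data).
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
    for edge, w in sorted(ssnn_weights(pts, k=3).items()):
        print(edge, round(w, 3))
```

The resulting weighted graph could then be passed to a community detection routine, for instance the `louvain_communities` function in NetworkX. That would be the standard Louvain algorithm rather than the modified variant the abstract describes.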
- Subjects :
- 0303 health sciences
Theoretical computer science
Sequence Analysis, RNA
Computer science
Cells
030302 biochemistry & molecular biology
Graph partition
Health Informatics
General Biochemistry, Genetics and Molecular Biology
Computer Science Applications
Non-negative matrix factorization
k-nearest neighbors algorithm
03 medical and health sciences
Cell clustering
Cluster Analysis
Humans
RNA
Graph (abstract data type)
Computational Science and Engineering
Cluster analysis
Algorithms
030304 developmental biology
Details
- ISSN :
- 1867-1462 and 1913-2751
- Volume :
- 12
- Database :
- OpenAIRE
- Journal :
- Interdisciplinary Sciences: Computational Life Sciences
- Accession number :
- edsair.doi.dedup.....c4c8d3804d5a7226a98af296488b55e9
- Full Text :
- https://doi.org/10.1007/s12539-019-00357-4