
A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor [version 2; referees: 3 approved, 2 approved with reservations]

Authors :
Aaron T.L. Lun
Davis J. McCarthy
John C. Marioni
Author Affiliations :
1. Cancer Research UK Cambridge Institute, Cambridge, UK
2. EMBL European Bioinformatics Institute, Cambridge, UK
3. St Vincent’s Institute of Medical Research, Fitzroy, Australia
4. Wellcome Trust Sanger Institute, Cambridge, UK
Source :
F1000Research. 5:2122
Publication Year :
2016
Publisher :
London, UK: F1000 Research Limited, 2016.

Abstract

Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq data mean that the analysis of the former cannot be performed by recycling bioinformatics pipelines for the latter. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. Analyses were demonstrated on gene-level count data from several publicly available datasets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. This will provide a range of usage scenarios from which readers can construct their own analysis pipelines.

Details

ISSN :
2046-1402
Volume :
5
Database :
F1000Research
Journal :
F1000Research
Notes :
Revised. Amendments from Version 1: This version of the workflow contains a number of improvements based on the referees' comments. We have re-compiled the workflow using the latest packages from Bioconductor release 3.4, and stated more explicitly the dependence on these package versions. We have added a reference to the Bioconductor workflow page, which provides user-friendly instructions for installation and execution of the workflow. We have also moved cell cycle classification before gene filtering as this provides more precise cell cycle phase classifications. Some minor rewording and elaborations have also been performed in various parts of the article. [version 2; referees: 3 approved, 2 approved with reservations]
Publication Type :
Academic Journal
Accession number :
edsfor.10.12688.f1000research.9501.2
Document Type :
software-tool
Full Text :
https://doi.org/10.12688/f1000research.9501.2