
BASiCS workflow: a step-by-step analysis of expression variability using single cell RNA sequencing data [version 1; peer review: 3 approved with reservations]

Authors :
Alan O'Callaghan
Nils Eling
John C. Marioni
Catalina A. Vallejos
Author Affiliations :
1. MRC Human Genetics Unit, Institute of Genetics & Cancer, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
2. Department of Quantitative Biomedicine, University of Zurich, Winterthurerstrasse 190, Zürich, CH-8057, Switzerland
3. Institute for Molecular Health Sciences, ETH Zürich, Otto-Stern Weg 7, Zürich, 8093, Switzerland
4. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
5. Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
6. The Alan Turing Institute, British Library, 96 Euston Road, London, NW1 2DB, UK
Source :
F1000Research. 11:59
Publication Year :
2022
Publisher :
London, UK: F1000 Research Limited, 2022.

Abstract

Cell-to-cell gene expression variability is an inherent feature of complex biological systems, such as immunity and development. Single-cell RNA sequencing is a powerful tool to quantify this heterogeneity, but it is prone to strong technical noise. In this article, we describe a step-by-step computational workflow that uses the BASiCS Bioconductor package to robustly quantify expression variability within and between known groups of cells (such as experimental conditions or cell types). BASiCS uses an integrated framework for data normalisation, technical noise quantification and downstream analyses, propagating statistical uncertainty across these steps. Within a single seemingly homogeneous cell population, BASiCS can identify highly variable genes that exhibit strong heterogeneity as well as lowly variable genes with stable expression. BASiCS also uses a probabilistic decision rule to identify changes in expression variability between cell populations, whilst avoiding confounding effects related to differences in technical noise or in overall abundance. Using a publicly available dataset, we guide users through a complete pipeline that includes preliminary steps for quality control, as well as data exploration using the scater and scran Bioconductor packages. The workflow is accompanied by a Docker image that ensures the reproducibility of our results.
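
The probabilistic decision rule mentioned in the abstract can be illustrated with a minimal sketch. This is not the BASiCS API (BASiCS is an R package) and it is not taken from the article: the function names, the simulated posterior draws, and both thresholds are hypothetical, chosen only to show the general idea of flagging a gene when the posterior probability of a sufficiently large change in a variability parameter exceeds an evidence threshold.

```python
import math
import random


def tail_posterior_probability(draws_a, draws_b, log2_fc_threshold=0.5):
    """Fraction of paired posterior draws whose |log2(a / b)| exceeds the threshold.

    draws_a, draws_b: positive posterior draws of a variability parameter
    for one gene in two groups of cells (hypothetical inputs).
    """
    if len(draws_a) != len(draws_b):
        raise ValueError("expected the same number of draws per group")
    exceed = sum(
        abs(math.log2(a / b)) > log2_fc_threshold
        for a, b in zip(draws_a, draws_b)
    )
    return exceed / len(draws_a)


def flag_change(draws_a, draws_b, log2_fc_threshold=0.5, evidence_threshold=0.9):
    """Return (probability, flagged) under illustrative, assumed thresholds."""
    prob = tail_posterior_probability(draws_a, draws_b, log2_fc_threshold)
    return prob, prob > evidence_threshold


# Simulated stand-ins for MCMC output; real draws would come from a fitted model.
rng = random.Random(1)
group_a = [rng.lognormvariate(math.log(4.0), 0.1) for _ in range(1000)]
group_b = [rng.lognormvariate(math.log(1.0), 0.1) for _ in range(1000)]
group_c = [rng.lognormvariate(math.log(1.0), 0.1) for _ in range(1000)]

prob_changed, flagged_changed = flag_change(group_a, group_b)
prob_same, flagged_same = flag_change(group_c, group_b)
```

In this sketch the first comparison simulates a roughly four-fold difference in variability and the second simulates no difference, so only the first should be flagged. The package's own procedure may select its thresholds differently.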

Details

ISSN :
20461402
Volume :
11
Database :
F1000Research
Journal :
F1000Research
Notes :
[version 1; peer review: 3 approved with reservations]
Publication Type :
Academic Journal
Accession number :
edsfor.10.12688.f1000research.74416.1
Document Type :
software-tool
Full Text :
https://doi.org/10.12688/f1000research.74416.1