
ALFA: annotation landscape for aligned reads

Authors :
Alice Lebreton
Charles Bernard
Benoît Noël
Leïla Bastianelli
Mathieu Bahin
Auguste Genovesio
Valentine Murigneux
Hervé Le Hir
Institut de biologie de l'ENS Paris (UMR 8197/1024) (IBENS)
Département de Biologie - ENS Paris
École normale supérieure - Paris (ENS Paris)-École normale supérieure - Paris (ENS Paris)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)
Institut National de la Recherche Agronomique (INRA)
Fondation pour la Recherche Médicale (FRM-AJE20131128944)
Inserm ATIP-Avenir
Programme Émergences – Recherche médicale (Mairie de Paris)
ANR-10-IDEX-0001-02/10-LABX-0054,MEMOLIFE,Memory in living systems: an integrated approach(2010)
ANR-10-IDEX-0001-02/10-IDEX-0001,PSL,Paris Sciences et Lettres(2010)
Institut de biologie de l'ENS Paris (IBENS)
École normale supérieure - Paris (ENS-PSL)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS-PSL)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)
ANR-10-IDEX-0001,PSL,Paris Sciences et Lettres(2010)
Initiative d'excellence - Paris Sciences et Lettres - - PSL2010 - ANR-10-IDEX-0001 - IDEX - VALID
École normale supérieure - Paris (ENS Paris)
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris)
Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Département de Biologie - ENS Paris
Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)
Source :
BMC Genomics, BioMed Central, 2019, 20 (250), pp.1-11. ⟨10.1186/s12864-019-5624-2⟩
Publication Year :
2019
Publisher :
HAL CCSD, 2019.

Abstract

Background: The last 10 years have seen the rise of countless functional genomics studies based on Next-Generation Sequencing (NGS). In the vast majority of cases, whatever the species or the experiment, the first two steps of data analysis are a quality control of the raw reads followed by a mapping of those reads to a reference genome or transcriptome. Subsequent steps then depend on the type of study being conducted. While some tools have been proposed for investigating data quality after the mapping step, there is no commonly adopted framework that is easy to use and broadly applicable to any NGS data type.

Results: We present ALFA, a simple but universal tool that can be used after the mapping step on any kind of NGS experiment data for any organism with available genomic annotations. In a single command line, ALFA can compute and display the distribution of reads by categories (exon, intron, UTR, etc.) and biotypes (protein coding, miRNA, etc.) for a given aligned dataset with nucleotide precision. We present applications of ALFA to Ribo-Seq and RNA-Seq on Homo sapiens, CLIP-Seq on Mus musculus, RNA-Seq on Saccharomyces cerevisiae, bisulfite sequencing on Arabidopsis thaliana and ChIP-Seq on Caenorhabditis elegans.

Conclusions: We show that ALFA provides a powerful and broadly applicable approach for post-mapping quality control and for producing a global overview of a dataset using common or dedicated annotations. It is made available to the community as an easy-to-install command line tool and from the Galaxy Tool Shed.
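To make the idea of a read distribution "by categories with nucleotide precision" concrete, the following is a minimal Python sketch. It is not ALFA's implementation or interface: the toy annotation, the read coordinates, and the helper names `category_at` and `count_by_category` are all hypothetical, chosen only to illustrate counting each aligned nucleotide under the category that covers it.

```python
from collections import Counter

# Hypothetical toy annotation for one chromosome (not real data):
# (start, end, category), 0-based coordinates, end-exclusive.
ANNOTATION = [
    (0, 100, "five_prime_utr"),
    (100, 300, "exon"),
    (300, 500, "intron"),
    (500, 650, "exon"),
]

def category_at(pos, annotation):
    """Return the category covering a position, or 'intergenic' if none does."""
    for start, end, category in annotation:
        if start <= pos < end:
            return category
    return "intergenic"

def count_by_category(reads, annotation):
    """Count aligned nucleotides per category, one nucleotide at a time."""
    counts = Counter()
    for start, end in reads:
        for pos in range(start, end):
            counts[category_at(pos, annotation)] += 1
    return counts

# Hypothetical aligned reads as (start, end) intervals, 50 nucleotides each.
READS = [(90, 140), (280, 330), (640, 690)]

counts = count_by_category(READS, ANNOTATION)
total = sum(counts.values())
for category, n in counts.most_common():
    print(f"{category}: {n} nt ({n / total:.1%})")
```

Counting per nucleotide rather than per read means that a read spanning a boundary, such as an exon-intron junction, contributes to both categories in proportion to its overlap. A practical tool would use an indexed structure instead of the linear scan shown here.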

Details

Language :
English
ISSN :
1471-2164
Database :
OpenAIRE
Journal :
BMC Genomics, Vol 20, Iss 1, Pp 1-11 (2019)
Accession number :
edsair.doi.dedup.....8e5134e3e4b76c2aef6844487ed1b258