ALFA: annotation landscape for aligned reads
- Source :
- BMC Genomics, BioMed Central, 2019, 20 (250), pp.1-11. ⟨10.1186/s12864-019-5624-2⟩
- Publication Year :
- 2019
- Publisher :
- HAL CCSD, 2019.
Abstract
- Background: The last 10 years have seen the rise of countless functional genomics studies based on Next-Generation Sequencing (NGS). In the vast majority of cases, whatever the species and whatever the experiment, the first two steps of data analysis consist of a quality control of the raw reads followed by a mapping of those reads to a reference genome/transcriptome. Subsequent steps then depend on the type of study being carried out. While some tools have been proposed for investigating data quality after the mapping step, there is no commonly adopted framework that is easy to use and broadly applicable to any NGS data type.
Results: We present ALFA, a simple but universal tool that can be used after the mapping step on any kind of NGS experiment data for any organism with available genomic annotations. In a single command line, ALFA can compute and display the distribution of reads by categories (exon, intron, UTR, etc.) and biotypes (protein coding, miRNA, etc.) for a given aligned dataset with nucleotide precision. We present applications of ALFA to Ribo-Seq and RNA-Seq on Homo sapiens, CLIP-Seq on Mus musculus, RNA-Seq on Saccharomyces cerevisiae, bisulfite sequencing on Arabidopsis thaliana and ChIP-Seq on Caenorhabditis elegans.
Conclusions: We show that ALFA provides a powerful and broadly applicable approach for post-mapping quality control and for producing a global overview using common or dedicated annotations. It is made available to the community as an easy-to-install command line tool and from the Galaxy Tool Shed.
- Subjects :
- 0106 biological sciences
lcsh:QH426-470
lcsh:Biotechnology
Arabidopsis
transcription
noncoding rna
tool
universal
post mapping
quality control
ngs
Saccharomyces cerevisiae
Computational biology
Biology
[SDV.BBM.BM] Life Sciences [q-bio]/Biochemistry, Molecular Biology/Molecular biology
01 natural sciences
Data type
Mice
03 medical and health sciences
Annotation
lcsh:TP248.13-248.65
[SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN]
Genetics
Animals
Humans
Caenorhabditis elegans
030304 developmental biology
Protein coding
0303 health sciences
[SDV.BIBS] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]
Sequence Analysis, RNA
Gene Expression Profiling
High-Throughput Nucleotide Sequencing
Molecular Sequence Annotation
lcsh:Genetics
Homo sapiens
Data quality
DNA microarray
Functional genomics
Software
010606 plant biology & botany
Biotechnology
Reference genome
Details
- Language :
- English
- ISSN :
- 14712164
- Database :
- OpenAIRE
- Journal :
- BMC Genomics, BioMed Central, 2019, 20 (250), pp.1-11. ⟨10.1186/s12864-019-5624-2⟩
- Accession number :
- edsair.doi.dedup.....8e5134e3e4b76c2aef6844487ed1b258