1. aPEAch: Automated Pipeline for End-to-End Analysis of Epigenomic and Transcriptomic Data.
- Author
- Xiropotamos, Panagiotis; Papageorgiou, Foteini; Manousaki, Haris; Sinnis, Charalampos; Antonatos, Charalabos; Vasilopoulos, Yiannis; Georgakilas, Georgios K.
- Subjects
- *RNA sequencing, *DNA sequencing, *GENE expression, *NUCLEOTIDE sequencing, *NON-coding RNA
- Abstract
- Simple Summary: The emergence of next-generation sequencing (NGS) signified a revolution in biology research by capturing the significance of DNA loci and RNA molecules on a genome-wide scale. The complexity and volume of NGS data highlight the need for robust and user-friendly computational tools. The framework presented here, aPEAch, is an automated pipeline for end-to-end analysis of DNA and RNA sequencing assays, including small RNA sequencing. Implemented in Python, it allows users to customize the analysis and results, handling single or multiple replicates in batches, while also automating advanced unsupervised learning analyses. aPEAch offers quality control reports, fragment size distribution plots, and intermediate files supporting reproducibility and interoperability, along with publication-ready visualizations.
- With the advent of next-generation sequencing (NGS), experimental techniques that capture the biological significance of DNA loci or RNA molecules have emerged as fundamental tools for studying the epigenome and transcriptional regulation on a genome-wide scale. The volume of the generated data and the underlying complexity regarding their analysis highlight the need for robust and easy-to-use computational analytic methods that can streamline the process and provide valuable biological insights. Our solution, aPEAch, is an automated pipeline that facilitates the end-to-end analysis of both DNA- and RNA-sequencing assays, including small RNA sequencing, from assessing the quality of the input sample files to answering meaningful biological questions by exploiting the rich information embedded in biological data. Our method is implemented in Python, based on a modular approach that enables users to choose the path and extent of the analysis and the representations of the results. The pipeline can process samples with single or multiple replicates in batches, allowing the ease of use and reproducibility of the analysis across all samples. aPEAch provides a variety of sample metrics such as quality control reports, fragment size distribution plots, and all intermediate output files, enabling the pipeline to be re-executed with different parameters or algorithms, along with the publication-ready visualization of the results. Furthermore, aPEAch seamlessly incorporates advanced unsupervised learning analyses by automating clustering optimization and visualization, thus providing invaluable insight into the underlying biological mechanisms. [ABSTRACT FROM AUTHOR]
- Published
- 2024