1. Accelerating Identification of Chromatin Accessibility from noisy ATAC-seq Data using Modern CPUs
- Authors
- Menachem Adelman, Narendra Chaudhary, Dhiraj D. Kalamkar, Barukh Ziv, Bharat Kaul, Sanchit Misra, Alexander Heinecke, and Evangelos Georganas
- Subjects
Identification (information), Speedup, Computer science, Filter (video), Deep learning, ATAC-seq, Artificial intelligence, Parallel computing, Training performance, Chromatin, Convolution
- Abstract
Identifying accessible chromatin regions is a fundamental problem in epigenomics, and ATAC-seq is a commonly used assay for it. The exponential rise in single-cell ATAC-seq experiments has made it critical to accelerate the processing of ATAC-seq data. ATAC-seq data can have a low signal-to-noise ratio for various reasons, including low coverage or low cell count. To denoise noisy ATAC-seq data and identify accessible chromatin regions from it, the use of deep learning on 1D data (with large filter sizes, long tensor widths, and/or dilation) has recently been proposed. Here, we present ways to accelerate the end-to-end training performance of these deep-learning-based methods on CPUs. We evaluate our approach on the recently released AtacWorks toolkit. Compared to an Nvidia DGX-1 box with 8 V100 GPUs, we get up to a 2.27× speedup using just 16 CPU sockets. To achieve this, we build an efficient 1D dilated convolution layer and demonstrate reduced-precision (BFloat16) training.
- Published
- 2021
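The abstract refers to a 1D dilated convolution layer without defining the operation. As a rough illustration only, the sketch below shows the reference semantics of a single-channel, valid-mode dilated 1D convolution (computed as cross-correlation, as is usual in deep learning frameworks). It is not the authors' optimized CPU layer; the function name `dilated_conv1d` and the toy `coverage` values are assumptions made up for this example.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Reference 1D dilated convolution (valid mode, single channel).

    Illustrative only: a plain-Python definition of the operation,
    not an optimized or vectorized implementation.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    # A kernel with K taps and dilation d spans d * (K - 1) + 1 input positions.
    span = dilation * (len(kernel) - 1) + 1
    out_len = len(signal) - span + 1
    if out_len <= 0:
        return []
    # Each output sums kernel taps applied to inputs spaced `dilation` apart.
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(out_len)
    ]


# Hypothetical per-base coverage track (made-up values).
coverage = [0, 1, 0, 2, 5, 6, 5, 1, 0, 1]
print(dilated_conv1d(coverage, [1, 1, 1], dilation=1))
print(dilated_conv1d(coverage, [1, 1, 1], dilation=2))
```

With the same three-tap kernel, raising `dilation` from 1 to 2 widens the span of input positions each output sees from 3 to 5 without adding weights, which is why dilation is attractive for long 1D genomic signals.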