1. Convolutional neural networks for structured omics: OmicsCNN and the OmicsConv layer
- Author
- Jurman, Giuseppe; Maggio, Valerio; Fioravanti, Diego; Giarratano, Ylenia; Landi, Isotta; Francescatto, Margherita; Agostinelli, Claudio; Chierici, Marco; De Domenico, Manlio; and Furlanello, Cesare
- Subjects
FOS: Computer and information sciences, FOS: Biological sciences, Machine Learning (stat.ML), Quantitative Methods (q-bio.QM)
- Abstract
Convolutional Neural Networks (CNNs) are a popular deep learning architecture widely applied in different domains, in particular image classification, for which the concept of convolution with a filter comes naturally. Unfortunately, the requirement of a distance (or, at least, of a neighbourhood function) in the input feature space has so far prevented their direct use on data types such as omics data. However, a number of omics data types are metrizable, i.e., they can be endowed with a metric structure, which makes it possible to adopt a convolution-based deep learning framework, e.g., for prediction. We propose a generalized solution for CNNs on omics data, implemented through a dedicated Keras layer. In particular, for metagenomics data, a metric can be derived from the patristic distance on the phylogenetic tree. For transcriptomics data, we combine Gene Ontology (GO) semantic similarity and gene co-expression to define a distance; the function is defined through a multilayer network in which three layers are defined by the GO mutual semantic similarity and the fourth by gene co-expression. As a general tool, feature distance on omics data is enabled by OmicsConv, a novel Keras layer, which yields OmicsCNN, a dedicated deep learning framework. Here we demonstrate OmicsCNN on gut microbiota 16S sequencing data for Inflammatory Bowel Disease (IBD), first on synthetic data and then on a metagenomics collection of gut microbiota of 222 IBD patients.
7 pages, 3 figures. arXiv admin note: text overlap with arXiv:1709.02268
- Published
- 2017
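The abstract's central idea, convolving over neighbourhoods defined by a feature distance rather than by pixel adjacency, can be illustrated with a minimal pure-Python sketch. This is not the OmicsConv Keras layer. The distance matrix, the choice of k-nearest-neighbour neighbourhoods, the filter weights, and the function names `k_nearest` and `distance_conv` are all assumptions made for illustration.

```python
# Illustrative sketch only: NOT the authors' OmicsConv layer.
# All values and names below are hypothetical toy choices.

def k_nearest(dist, k):
    """For each feature i, return the indices of the k features closest to i.

    Feature i itself is included (distance 0); ties are broken by index.
    """
    n = len(dist)
    return [sorted(range(n), key=lambda j: (dist[i][j], j))[:k] for i in range(n)]


def distance_conv(x, neighbours, weights, bias=0.0):
    """Apply one shared filter to every feature's neighbourhood.

    weights[0] multiplies the feature itself, weights[1] its nearest
    neighbour, and so on, in order of increasing distance.
    """
    return [
        sum(w * x[j] for w, j in zip(weights, nbrs)) + bias
        for nbrs in neighbours
    ]


# Toy stand-in for patristic distances between 4 taxa
# (symmetric, zero diagonal).
dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 4.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 4.0, 2.0, 0.0],
]

x = [10.0, 20.0, 30.0, 40.0]  # abundance of each taxon in one sample

neighbours = k_nearest(dist, k=2)
out = distance_conv(x, neighbours, weights=[0.5, 0.5])
print(neighbours)  # [[0, 1], [1, 0], [2, 3], [3, 2]]
print(out)         # [15.0, 15.0, 35.0, 35.0]
```

In this toy setting the two closely related pairs of taxa are averaged together, which is the analogue of a 1D filter sliding over adjacent pixels. In a trainable layer the weights would be learned rather than fixed.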