1. NetMiner-an ensemble pipeline for building genome-wide and high-quality gene co-expression network using massive-scale RNA-seq samples
- Author
- Bingke Jiao, Hua Yu, Lu Lu, Wei Liu, Pengfei Wang, Chengzhi Liang, and Shuangcheng Chen
- Subjects
0301 basic medicine ,Candidate gene ,Computer science ,Gene Identification and Analysis ,lcsh:Medicine ,Gene Expression ,RNA-Seq ,Genetic Networks ,Genome ,Biochemistry ,Gene Regulatory Networks ,Cell Cycle and Cell Division ,lcsh:Science ,Multidisciplinary ,Eukaryota ,Agriculture ,Plants ,Nucleic acids ,Experimental Organism Systems ,Cell Processes ,RNA, Long Noncoding ,Network Analysis ,Algorithms ,Research Article ,Computer and Information Sciences ,Arabidopsis Thaliana ,Computational biology ,Brassica ,Research and Analysis Methods ,Novel gene ,03 medical and health sciences ,Model Organisms ,Circular RNA ,Plant and Algal Models ,Genetics ,Grasses ,Non-coding RNA ,Gene ,Sequence Analysis, RNA ,lcsh:R ,Organisms ,RNA ,Biology and Life Sciences ,Computational Biology ,Cell Biology ,Genetic architecture ,Agronomy ,030104 developmental biology ,Long non-coding RNAs ,Gene co-expression network ,lcsh:Q ,Rice
- Abstract
Accurately reconstructing gene co-expression networks is of great importance for uncovering the genetic architecture underlying complex and diverse phenotypes. The recent availability of high-throughput RNA-seq has made genome-wide detection and quantification of novel, rare and low-abundance transcripts practical. However, its potential for reconstructing gene co-expression networks has not yet been well explored. Using massive-scale RNA-seq samples, we designed an ensemble pipeline, called NetMiner, for building genome-scale, high-quality gene co-expression networks (GCNs) by integrating three frequently used inference algorithms. We constructed an RNA-seq-based GCN for rice, a monocot species. The quality of the network obtained by our method was evaluated against curated gene functional association data sets, on which it outperformed each single method. In addition, the capability of the network for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our method for predicting the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes and circular RNA (circRNA) genes. Our results provide a valuable and highly reliable data source for selecting key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at https://github.com/czllab/NetMiner.
- Published
- 2018