Gang Fang, LaDeana W. Hillier, Brenton R. Graveley, Ali Mortazavi, Norbert Perrimon, Nathan Boley, Jingyi Jessica Li, William C. Spencer, James B. Brown, Chau Huynh, Roger A. Hoskins, Mark Gerstein, Ann S. Hammonds, Sarah Djebali, Sonali Jha, Kenneth H. Wan, Cédric Howald, Raymond K. Auerbach, Chenghai Xue, Haiyan Huang, Jorg Drenkow, Elise A. Feingold, Julien Lagarde, Daifeng Wang, Dmitri D. Pervouchine, Thomas R. Gingeras, Guilin Wang, Peter Cherbas, Brent Ewing, Chao Di, Gary Saunders, Benjamin W. Booth, Joel Rozowsky, Yan Zhang, Anastasia Samsonova, Dionna M. Kasper, Cristina Sisu, Marcus H. Stoiber, Jiayu Wen, Michael O. Duff, Felix Schlesinger, Gennifer E. Merrihew, Sara Olson, Susan E. Celniker, Burak H. Alver, Chao Cheng, Gemma E. May, Alexandre Reymond, Carrie A. Davis, Alexander Dobin, Max E. Boeck, Roger P. Alexander, Michael J. Pazin, Peter J. Park, Adam Frankish, Lucy Cherbas, Zhi Lu, Kevin Y. Yip, Henry Zheng, Owen Thompson, Jing Leng, Kathie L. Watkins, Andrea Tanzer, Valerie Reinke, Rebecca McWhirter, Eric C. Lai, Steven E. Brenner, Robert H. Waterston, Koon-Kiu Yan, Masaomi Kato, Roderic Guigó, Huaien Wang, Kimberly Bell, Pnina Strasbourger, Baikang Pei, Jen Harrow, Long Hu, Chris Zaleski, Rabi Murad, Thomas C. Kaufman, Erik Ladewig, Robert R. Kitchen, Anurag Sethi, Kejia Wen, Guanjun Gao, Arif Harmanci, Megan Fastuca, Brian Oliver, Frank J. Slack, David M. Miller, Tim Hubbard, Garrett Robinson, Peter J. Good, Peter J. Bickel, Michael J. MacCoss, and Li Yang
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features [1-6]. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a ‘universal model’ based on a single set of organism-independent parameters.

This work was funded by a contract from the National Human Genome Research Institute modENCODE Project, contracts U01 HG004271 and U54 HG006944, to S.E.C. (principal investigator) and P.C., T.R.G., R.A.H. and B.R.G. (co-principal investigators) with additional support from R01 GM076655 (S.E.C.) both under Department of Energy contract no. DE-AC02-05CH11231, and U54 HG007005 to B.R.G. J.B.B.’s work was supported by NHGRI K99 HG006698 and DOE DE-AC02-05CH11231. Work in P.J.B.’s group was supported by the modENCODE DAC sub award 5710003102, 1U01HG007031-01 and the ENCODE DAC 5U01HG004695-04. Work in M.B.G.’s group was supported by NIH grants HG007000 and HG007355.
Work in Bloomington was supported in part by the Indiana METACyt Initiative of Indiana University, funded by an award from the Lilly Endowment, Inc. Work in E.C.L.’s group was supported by U01-HG004261 and RC2-HG005639. P.J.P. acknowledges support from the National Institutes of Health (grant no. U01HG004258). We thank the HAVANA team for providing annotation of the human reference genome; their work is supported by the National Institutes of Health (grant no. 5U54HG004555) and the Wellcome Trust (grant no. WT098051). R.G. acknowledges support from the Spanish Ministry of Education (grant BIO2011-26205). We also acknowledge use of the Yale University Biomedical High Performance Computing Center. R.W.’s lab was supported by grant no. U01 HG 004263.