1. Construction, Visualisation, and Clustering of Transcription Networks from Microarray Expression Data
- Author
- Shiri Freilich, Anton J. Enright (ORCID 0000-0002-6090-3100), Janet M. Thornton, Leon Goldovsky, Tom C. Freeman, Markus Brosch, Russell J. Grocock, Pierre Mazière, and Stijn van Dongen; record from Apollo - University of Cambridge Repository
- Subjects
Transcription, Genetic; QH301-705.5; Eukaryotes; Gene regulatory network; Gene Expression; Biology; computer.software_genre; Pattern Recognition, Automated; 03 medical and health sciences; Cellular and Molecular Neuroscience; Mice; 0302 clinical medicine; Data visualization; Imaging, Three-Dimensional; Genetics; Microarray databases; Animals; Cluster Analysis; Gene Regulatory Networks; Biology (General); Cluster analysis; Molecular Biology; Ecology, Evolution, Behavior and Systematics; 030304 developmental biology; Oligonucleotide Array Sequence Analysis; Mammals; 0303 health sciences; Ecology; business.industry; Microarray analysis techniques; Gene Expression Profiling; Computational Biology; Mus (Mouse); Computational Theory and Mathematics; Gene Expression Regulation; Modeling and Simulation; Vertebrates; Gene chip analysis; Data mining; DNA microarray; business; computer; 030217 neurology & neurosurgery; Algorithms; Software; Network analysis; Research Article
- Abstract
Network analysis transcends conventional pairwise approaches to data analysis as the context of components in a network graph can be taken into account. Such approaches are increasingly being applied to genomics data, where functional linkages are used to connect genes or proteins. However, while microarray gene expression datasets are now abundant and of high quality, few approaches have been developed for analysis of such data in a network context. We present a novel approach for 3-D visualisation and analysis of transcriptional networks generated from microarray data. These networks consist of nodes representing transcripts connected by virtue of their expression profile similarity across multiple conditions. Analysing genome-wide gene transcription across 61 mouse tissues, we describe the unusual topography of the large and highly structured networks produced, and demonstrate how they can be used to visualise, cluster, and mine large datasets. This approach is fast, intuitive, and versatile, and allows the identification of biological relationships that may be missed by conventional analysis techniques. This work has been implemented in a freely available open-source application named BioLayout Express 3D.
- Author Summary
This paper describes a novel approach for analysis of gene expression data. In this approach, normalized gene expression data is transformed into a graph where nodes in the graph represent transcripts connected to each other by virtue of their coexpression across multiple tissues or samples. The graph paradigm has many advantages for such analyses. Graph clustering of the derived network performs extremely well in comparison to traditional pairwise schemes. We show that this approach is robust and able to accommodate large datasets such as the Genomics Institute of the Novartis Research Foundation mouse tissue atlas. The entire approach and algorithms are combined into a single open-source Java application that allows users to perform this analysis and further mining on their own data and to visualize the results interactively in 3-D. The approach is not limited to gene expression data but would also be useful for other complex biological datasets. We use the method to investigate the relationship between the phylogenetic age of transcripts and their tissue specificity.
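The transformation of expression profiles into a graph, as described in the summary above, can be illustrated with a minimal sketch. It is not the BioLayout Express 3D implementation. The record says only that transcripts are connected by expression profile similarity, so the similarity measure (Pearson correlation), the cutoff of 0.9, the function names, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def coexpression_edges(profiles, threshold=0.9):
    """Return {(transcript_a, transcript_b): r} for pairs with r >= threshold.

    The threshold is an assumed, illustrative cutoff, not a value from the paper.
    """
    edges = {}
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical toy data: three transcripts measured across four tissues.
profiles = {
    "t1": [1.0, 2.0, 3.0, 4.0],
    "t2": [2.0, 4.1, 6.0, 8.2],
    "t3": [4.0, 1.0, 3.5, 0.5],
}
edges = coexpression_edges(profiles)
```

Here `t1` and `t2` rise together across tissues and are joined by an edge, while `t3` follows a different pattern and stays unconnected. A graph clustering step of the kind the summary mentions would then operate on the resulting edge list. The all-pairs loop is quadratic in the number of transcripts, so a dataset of the size described would need a vectorised or otherwise optimised similarity computation.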
- Published
- 2007