7 results on '"Mushegian, Arcady R"'
Search Results
2. Positionally Cloned Human Disease Genes: Patterns of Evolutionary Conservation and Functional Motifs
- Author
Mushegian, Arcady R., Bassett, Douglas E., Boguski, Mark S., Bork, Peer, and Koonin, Eugene V.
- Published
- 1997
3. Computational methods for Gene Orthology inference.
- Author
Kristensen, David M., Wolf, Yuri I., Mushegian, Arcady R., and Koonin, Eugene V.
- Subjects
GENES, GENOMICS, GENOMES, HEURISTIC algorithms, PHYLOGENY
- Abstract
Accurate inference of orthologous genes is a pre-requisite for most comparative genomics studies, and is also important for functional annotation of new genomes. Identification of orthologous gene sets typically involves phylogenetic tree analysis, heuristic algorithms based on sequence conservation, synteny analysis, or some combination of these approaches. The most direct tree-based methods typically rely on the comparison of an individual gene tree with a species tree. Once the two trees are accurately constructed, orthologs are straightforwardly identified by the definition of orthology as those homologs that are related by speciation, rather than gene duplication, at their most recent point of origin. Although ideal for the purpose of orthology identification in principle, phylogenetic trees are computationally expensive to construct for large numbers of genes and genomes, and they often contain errors, especially at large evolutionary distances. Moreover, in many organisms, in particular prokaryotes and viruses, evolution does not appear to have followed a simple ‘tree-like’ mode, which makes conventional tree reconciliation inapplicable. Other, heuristic methods identify probable orthologs as the closest homologous pairs or groups of genes in a set of organisms. These approaches are faster and easier to automate than tree-based methods, with efficient implementations provided by graph-theoretical algorithms enabling comparisons of thousands of genomes. Comparisons of these two approaches show that, despite conceptual differences, they produce similar sets of orthologs, especially at short evolutionary distances. Synteny also can aid in identification of orthologs. Often, tree-based, sequence similarity- and synteny-based approaches can be combined into flexible hybrid methods. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
4. Comparison of Pattern Detection Methods in Microarray Time Series of the Segmentation Clock.
- Author
Dequéant, Mary-Lee, Ahnert, Sebastian, Edelsbrunner, Herbert, Fink, Thomas M. A., Glynn, Earl F., Hattem, Gaye, Kudlicki, Andrzej, Mileyko, Yuriy, Morton, Jason, Mushegian, Arcady R., Pachter, Lior, Rowicka, Maga, Shiu, Anne, Sturmfels, Bernd, and Pourquié, Olivier
- Subjects
EMBRYONIC periodicity, GENES, NOISE, GENOMES, FOURIER analysis, PERSISTENCE, HYPOTHESIS, MICE, SOMITE
- Abstract
While genome-wide gene expression data are generated at an increasing rate, the repertoire of approaches for pattern discovery in these data is still limited. Identifying subtle patterns of interest in large amounts of data (tens of thousands of profiles) associated with a certain level of noise remains a challenge. A microarray time series was recently generated to study the transcriptional program of the mouse segmentation clock, a biological oscillator associated with the periodic formation of the segments of the body axis. A method related to Fourier analysis, the Lomb-Scargle periodogram, was used to detect periodic profiles in the dataset, leading to the identification of a novel set of cyclic genes associated with the segmentation clock. Here, we applied to the same microarray time series dataset four distinct mathematical methods to identify significant patterns in gene expression profiles. These methods are called: Phase consistency, Address reduction, Cyclohedron test and Stable persistence, and are based on different conceptual frameworks that are either hypothesis- or data-driven. Some of the methods, unlike Fourier transforms, are not dependent on the assumption of periodicity of the pattern of interest. Remarkably, these methods identified blindly the expression profiles of known cyclic genes as the most significant patterns in the dataset. Many candidate genes predicted by more than one approach appeared to be true positive cyclic genes and will be of particular interest for future research. In addition, these methods predicted novel candidate cyclic genes that were consistent with previous biological knowledge and experimental validation in mouse embryos. 
Our results demonstrate the utility of these novel pattern detection strategies, notably for detection of periodic profiles, and suggest that combining several distinct mathematical approaches to analyze microarray datasets is a valuable strategy for identifying genes that exhibit novel, interesting transcriptional patterns. [ABSTRACT FROM AUTHOR]
- Published
- 2008
5. The evolution of Runx genes I. A comparative study of sequences from phylogenetically diverse model organisms.
- Author
Rennert, Jessica, Coffman, James A., Mushegian, Arcady R., and Robertson, Anthony J.
- Subjects
GENES, BIOLOGICAL evolution, GENETICS, NUCLEOTIDE sequence, PHYLOGENY, METAZOA, PROMOTERS (Genetics)
- Abstract
Background: Runx genes encode proteins defined by the highly conserved Runt DNA-binding domain. Studies of Runx genes and proteins in model organisms indicate that they are key transcriptional regulators of animal development. However, little is known about Runx gene evolution. Results: A phylogenetically broad sampling of publicly available Runx gene sequences was collected. In addition to the published sequences from mouse, sea urchin, Drosophila melanogaster and Caenorhabditis elegans, we collected several previously uncharacterised Runx sequences from public genome sequence databases. Among deuterostomes, mouse and pufferfish each contain three Runx genes, while the tunicate Ciona intestinalis and the sea urchin Strongylocentrotus purpuratus were each found to have only one Runx gene. Among protostomes, C. elegans has a single Runx gene, while Anopheles gambiae has three and D. melanogaster has four, including two genes that have not been previously described. Comparative sequence analysis reveals two highly conserved introns, one within and one just downstream of the Runt domain. All vertebrate Runx genes utilize two alternative promoters. Conclusions: In the current public sequence database, the Runt domain is found only in bilaterians, suggesting that it may be a metazoan invention. Bilaterians appear to ancestrally contain a single Runx gene, suggesting that the multiple Runx genes in vertebrates and insects arose by independent duplication events within those respective lineages. At least two introns were present in the primordial bilaterian Runx gene. Alternative promoter usage arose prior to the duplication events that gave rise to three Runx genes in vertebrates. [ABSTRACT FROM AUTHOR]
- Published
- 2003
6. Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea.
- Author
Koonin, Eugene V., Mushegian, Arcady R., Galperin, Michael Y., and Walker, D. Roland
- Subjects
GENOMES, ARCHAEBACTERIA, BACTERIA, AMINO acid sequence, GENES, PROTEINS
- Abstract
Protein sequences encoded in three complete bacterial genomes, those of Haemophilus influenzae, Mycoplasma genitalium and Synechocystis sp., and the first available archaeal genome sequence, that of Methanococcus jannaschii, were analysed using the BLAST2 algorithm and methods for amino acid motif detection. Between 75% and 90% of the predicted proteins encoded in each of the bacterial genomes and 73% of the M. jannaschii proteins showed significant sequence similarity to proteins from other species. The fraction of bacterial and archaeal proteins containing regions conserved over long phylogenetic distances is nearly the same and close to 70%. Functions of 70-85% of the bacterial proteins and about 70% of the archaeal proteins were predicted with varying precision. This contrasts with the previous report that more than half of the archaeal proteins have no homologues and shows that, with more sensitive methods and detailed analysis of conserved motifs, archaeal genomes become as amenable to meaningful interpretation by computer as bacterial genomes. The analysis of conserved motifs resulted in the prediction of a number of previously undetected functions of bacterial and archaeal proteins and in the identification of novel protein families. In spite of the generally high conservation of protein sequences, orthologues of 25% or less of the M. jannaschii genes were detected in each individual completely sequenced genome, supporting the uniqueness of archaea as a distinct domain of life. About 53% of the M. jannaschii proteins belong to families of paralogues, a fraction similar to that in bacteria with larger genomes, such as Synechocystis sp. and Escherichia coli, but higher than that in H. influenzae, which has approximately the same number of genes as M. jannaschii. Certain groups of proteins, e.g. molecular chaperones and DNA repair enzymes, thought to be ubiquitous and represented in the minimal gene set derived by bacterial genome comparison, are missing in M. jannaschii, indicating massive non-orthologous displacement of genes responsible for essential functions. An unexpectedly large fraction of the M. jannaschii gene products, 44%, shows significantly higher similarity to bacterial than to eukaryotic proteins, compared with 13% that have eukaryotic proteins as their closest homologues (the rest of the proteins show approximately the same level of similarity to bacterial and eukaryotic homologues or have no homologues). Proteins involved in translation, transcription, replication and protein secretion are most closely related to eukaryotic proteins, whereas metabolic enzymes, metabolite uptake systems, enzymes for cell wall biosynthesis and many uncharacterized proteins appear to be 'bacterial'. A similar prevalence of proteins of apparent bacterial origin was observed among the currently available sequences from the distantly related archaeal genus, Sulfolobus. It is likely that the evolution of archaea included at least one major merger between ancestral cells from the bacterial lineage and the lineage leading to the eukaryotic nucleocytoplasm. [ABSTRACT FROM AUTHOR]
- Published
- 1997
7. Measuring similarity between gene interaction profiles.
- Author
Barido-Sottani, Joëlle, Chapman, Samuel D., Kosman, Evsey, and Mushegian, Arcady R.
- Subjects
GENETIC vectors, BIOLOGICAL networks, GENES, PROTEIN-protein interactions, ENDOPLASMIC reticulum, RESEMBLANCE (Philosophy), GENE regulatory networks
- Abstract
Background: Gene and protein interaction data are often represented as interaction networks, where nodes stand for genes or gene products and each edge stands for a relationship between a pair of gene nodes. Commonly, that relationship within a pair is specified by high similarity between profiles (vectors) of experimentally defined interactions of each of the two genes with all other genes in the genome; only gene pairs that interact with similar sets of genes are linked by an edge in the network. The tight groups of genes/gene products that work together in a cell can be discovered by the analysis of those complex networks. Results: We show that the choice of the similarity measure between pairs of gene vectors impacts the properties of networks and of gene modules detected within them. We re-analyzed well-studied data on yeast genetic interactions, constructed four genetic networks using four different similarity measures, and detected gene modules in each network using the same algorithm. The four networks induced different numbers of putative functional gene modules, and each similarity measure induced some unique modules. In an example of a putative functional connection suggested by comparing genetic interaction vectors, we predict a link between SUN-domain proteins and protein glycosylation in the endoplasmic reticulum. Conclusions: The discovery of molecular modules in genetic networks is sensitive to the way of measuring similarity between profiles of gene interactions in a cell. In the absence of a formal way to choose the "best" measure, it is advisable to explore the measures with different mathematical properties, which may identify different sets of connections between genes. [ABSTRACT FROM AUTHOR]
- Published
- 2019