43 results on '"segmental duplication"'
Search Results
2. Subtelomeric plasticity contributes to gene family expansion in the human parasitic flatworm Schistosoma mansoni
- Author
- Brann, T, Beltramini, A, Chaparro, C, Berriman, M, Doyle, SR, and Protasio, AV
- Published
- 2024
- Full Text
- View/download PDF
3. Genome-wide identification and expression analysis of the plant-specific PLATZ gene family in Tartary buckwheat (Fagopyrum tataricum).
- Author
- Li, Jing, Feng, Shan, Zhang, Yuchuan, Xu, Lei, Luo, Yan, Yuan, Yuhao, Yang, Qinghua, and Feng, Baili
- Subjects
- *BUCKWHEAT, *GENE families, *DNA-binding proteins, *PROMOTERS (Genetics), *ROOT development, *NUCLEOTIDE sequencing, *CHROMOSOME duplication
- Abstract
Background: Plant AT-rich sequence and zinc-binding (PLATZ) proteins belong to a novel class of plant-specific zinc-finger-dependent DNA-binding proteins that play essential roles in plant growth and development. Although the PLATZ gene family has been identified in several species, systematic identification and characterization of this gene family has not yet been carried out for Tartary buckwheat, which is an important medicinal and edible crop with high nutritional value. The recent completion of Tartary buckwheat genome sequencing has laid the foundation for this study. Results: A total of 14 FtPLATZ proteins were identified in Tartary buckwheat and were classified into four phylogenetic groups. The gene structure and motif composition were similar within the same group, and evident distinctions among different groups were detected. Gene duplication, particularly segmental duplication, was the main driving force in the evolution of FtPLATZs. Synteny analysis revealed that Tartary buckwheat shares more orthologous PLATZ genes with dicotyledons, particularly soybean. In addition, the expression of FtPLATZs in different tissues and developmental stages of grains showed evident specificity and preference. FtPLATZ3 may be involved in the regulation of grain size, and FtPLATZ4 and FtPLATZ11 may participate in root development. Abundant and variable hormone-responsive cis-acting elements were distributed in the promoter regions of FtPLATZs, and almost all FtPLATZs were significantly regulated after exogenous hormone treatments, particularly methyl jasmonate treatment. Moreover, FtPLATZ6 was significantly upregulated under all exogenous hormone treatments, which may indicate that this gene plays a critical role in the hormone response of Tartary buckwheat. 
Conclusions: This study lays a foundation for further exploration of the function of FtPLATZ proteins and their roles in the growth and development of Tartary buckwheat and contributes to the genetic improvement of Tartary buckwheat. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
4. Genome-wide characterization and expression analysis of Erf gene family in cotton.
- Author
- Zafar, Muhammad Mubashar, Rehman, Abdul, Razzaq, Abdul, Parvaiz, Aqsa, Mustafa, Ghulam, Sharif, Faiza, Mo, Huijuan, Youlu, Yuan, Shakeel, Amir, and Ren, Maozhi
- Subjects
- *GENE families, *COTTON, *WHOLE genome sequencing, *TRANSCRIPTION factors
- Abstract
Background: AP2/ERF transcription factors are important in a variety of biological activities, including plant growth, development, and responses to biotic and abiotic stressors. However, little research has been done on cotton's AP2/ERF genes, although cotton is an essential fibre crop. We were able to examine the tissue and expression patterns of AP2/ERF genes in cotton on a genome-wide basis because of the recently published whole genome sequence of cotton. Genome-wide analysis of the ERF gene family within two diploid species (G. arboreum & G. raimondii) and two tetraploid species (G. barbadense, G. hirsutum) was performed. Results: A total of 118, 120, 213, and 220 genes containing the sequence of a single AP2 domain were identified in G. arboreum, G. raimondii, G. barbadense and G. hirsutum, respectively. The identified genes were unevenly distributed across 13/26 chromosomes of A and D genomes of cotton. Synteny and collinearity analysis revealed that segmental duplications may have played crucial roles in the expansion of the cotton ERF gene family, whereas tandem duplications played a minor role. Cis-acting elements in the promoter sites of Ghi-ERF genes predict involvement in multiple hormone responses and abiotic stresses. Transcriptome and qRT-PCR analysis revealed that Ghi-ERF-2D.6, Ghi-ERF-12D.13, Ghi-ERF-6D.1, Ghi-ERF-7A.6 and Ghi-ERF-11D.5 are candidate genes for salinity tolerance in upland cotton. Conclusion: Overall, the present study paves the way to better understand the evolution of cotton ERF genes and lays a foundation for future investigation of ERF genes in improving salinity stress tolerance in cotton. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
5. Genome-wide identification and expression analysis of the GhIQD gene family in upland cotton (Gossypium hirsutum L.)
- Author
- DOU, Lingling, LV, Limin, KANG, Yangyang, TIAN, Ruijie, HUANG, Deqing, LI, Jiayin, LI, Siyi, LIU, Fengping, CAO, Lingyan, JIN, Yuhua, LIU, Yang, LI, Huaizhu, WANG, Wenbo, PANG, Chaoyou, SHANG, Haihong, ZOU, Changsong, SONG, Guoli, and XIAO, Guanghui
- Published
- 2021
- Full Text
- View/download PDF
6. Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters.
- Author
- Dallery, Jean-Félix, Lapalu, Nicolas, Zampounis, Antonios, Pigné, Sandrine, Luyten, Isabelle, Amselem, Joëlle, Wittenberg, Alexander H. J., Shiguo Zhou, de Queiroz, Marisa V., Robin, Guillaume P., Auger, Annie, Hainaut, Matthieu, Henrissat, Bernard, Ki-Tae Kim, Yong-Hwan Lee, Lespinet, Olivier, Schwartz, David C., Thon, Michael R., and O'Connell, Richard J.
- Subjects
- *COLLETOTRICHUM, *CHROMOSOME structure, *METABOLITES, *TRANSPOSONS, *RIBOSOMAL DNA, *PATHOGENIC microorganisms
- Abstract
Background: The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture. Results: Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology and, in combination with optical map data, this provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications. Conclusion: The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
7. Genome-wide analysis of WOX genes in upland cotton and their expression pattern under different stresses.
- Author
- Zhaoen Yang, Qian Gong, Wenqiang Qin, Zuoren Yang, Yuan Cheng, Lili Lu, Xiaoyang Ge, Chaojun Zhang, Zhixia Wu, and Fuguang Li
- Subjects
- *GENE expression in plants, *PLANT growth, *ARABIDOPSIS, *LOCUS (Genetics), COTTON genetics
- Abstract
Background: WUSCHEL-related homeobox (WOX) family members play significant roles in plant growth and development, such as in embryo patterning, stem-cell maintenance, and lateral organ formation. The recently published cotton genome sequences allow us to perform comprehensive genome-wide analysis and characterization of WOX genes in cotton. Results: In this study, we identified 21, 20, and 38 WOX genes in Gossypium arboreum (2n = 26, A2), G. raimondii (2n = 26, D5), and G. hirsutum (2n = 4x = 52, (AD)t), respectively. Sequence logos showed that homeobox domains were significantly conserved among the WOX genes in cotton, Arabidopsis, and rice. A total of 168 genes from three typical monocots and six dicots were naturally divided into three clades, which were further classified into nine sub-clades. A good collinearity was observed in the synteny analysis of the orthologs from At and Dt (t represents tetraploid) sub-genomes. Whole genome duplication (WGD) and segmental duplication within At and Dt sub-genomes played significant roles in the expansion of WOX genes, and segmental duplication mainly generated the WUS clade. Copia and Gypsy were the two major types of transposable elements distributed upstream or downstream of WOX genes. Furthermore, through comparison, we found that the exon/intron pattern was highly conserved between Arabidopsis and cotton, and the homeobox domain loci were also conserved between them. In addition, the expression pattern in different tissues indicated that the duplicated genes in cotton might have acquired new functions as a result of sub-functionalization or neo-functionalization. The expression pattern of WOX genes under different stress treatments showed that the different genes were induced by different stresses. Conclusion: In present work, WOX genes, classified into three clades, were identified in the upland cotton genome. 
Whole genome and segmental duplication were determined to be the two major impetuses for the expansion of gene numbers during the evolution. Moreover, the expression patterns suggested that the duplicated genes might have experienced a functional divergence. Together, these results shed light on the evolution of the WOX gene family, and would be helpful in future research. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
8. The similar and different evolutionary trends of MATE family occurred between rice and Arabidopsis thaliana.
- Author
- Lihui Wang, Xiujuan Bei, Jiansheng Gao, Yaxuan Li, Yueming Yan, and Yingkao Hu
- Subjects
- *ARABIDOPSIS thaliana, *CARRIER proteins, *GENE expression in plants, *PLANT genes, *RICE varieties
- Abstract
Background: Multidrug and toxic compound extrusion (MATE) transporter proteins are present in all organisms. Although the functions of some MATE gene family members have been studied in plants, few studies have investigated the gene expansion patterns, functional divergence, or the effects of positive selection. Results: Forty-five MATE genes from rice and 56 from Arabidopsis were identified and grouped into four subfamilies. MATE family genes have similar exon-intron structures in rice and Arabidopsis; MATE gene structures are conserved in each subfamily but differ among subfamilies. In both species, the MATE gene family has expanded mainly through tandem and segmental duplications. A transcriptome atlas showed considerable differences in expression among the genes, in terms of transcript abundance and expression patterns under normal growth conditions, indicating wide functional divergence in this family. In both rice and Arabidopsis, the MATE genes showed consistent functional divergence trends, with highly significant Type-I divergence in each subfamily, while Type-II divergence mainly occurred in subfamily III. The Type-II coefficients between rice subfamilies I/III, II/III, and IV/III were all significantly greater than zero, while only the Type-II coefficient between Arabidopsis IV/III subfamilies was significantly greater than zero. A site-specific model analysis indicated that MATE genes have relatively conserved evolutionary trends. A branch-site model suggested that the extent of positive selection on each subfamily of rice and Arabidopsis was different: subfamily II of Arabidopsis showed higher positive selection than other subfamilies, whereas in rice, positive selection was highest in subfamily III. In addition, the analyses identified 18 rice sites and 7 Arabidopsis sites that were responsible for positive selection and for Type-I and Type-II functional divergence; there were no common sites between rice and Arabidopsis. 
Five coevolving amino acid sites were identified in rice and three in Arabidopsis; these sites might have important roles in maintaining local structural stability and protein functional domains. Conclusions: We demonstrate that the MATE gene family expanded through tandem and segmental duplication in both rice and Arabidopsis. Overall, the results of our analyses contribute to improved understanding of the molecular evolution and functions of the MATE gene family in plants. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
9. An evolutionary driver of interspersed segmental duplications in primates
- Author
- Cantsilieris, Stuart, Sunkin, Susan M., Johnson, Matthew E., Anaclerio, Fabio, Huddleston, John, Baker, Carl, Dougherty, Max L., Underwood, Jason G., Sulovari, Arvis, Hsieh, PingHsun, Mao, Yafei, Catacchio, Claudia Rita, Malig, Maika, Welch, AnneMarie E., Sorensen, Melanie, Munson, Katherine M., Jiang, Weihong, Girirajan, Santhosh, Ventura, Mario, Lamb, Bruce T., Conlon, Ronald A., and Eichler, Evan E.
- Published
- 2020
- Full Text
- View/download PDF
10. Interlocus gene conversion explains at least 2.7% of single nucleotide variants in human segmental duplications.
- Author
- Dumont, Beth L.
- Subjects
- *GENE conversion, *CHROMOSOME duplication, *GENETIC recombination research, *SINGLE nucleotide polymorphisms, *NUCLEOTIDES
- Abstract
Background: Interlocus gene conversion (IGC) is a recombination-based mechanism that results in the unidirectional transfer of short stretches of sequence between paralogous loci. Although IGC is a well-established mechanism of human disease, the extent to which this mutagenic process has shaped overall patterns of segregating variation in multi-copy regions of the human genome remains unknown. One expected manifestation of IGC in population genomic data is the presence of one-to-one paralogous SNPs that segregate identical alleles. Results: Here, I use SNP genotype calls from the low-coverage phase 3 release of the 1000 Genomes Project to identify 15,790 parallel, shared SNPs in duplicated regions of the human genome. My approach for identifying these sites accounts for the potential redundancy of short read mapping in multi-copy genomic regions, thereby effectively eliminating false positive SNP calls arising from paralogous sequence variation. I demonstrate that independent mutation events to identical nucleotides at paralogous sites are not a significant source of shared polymorphisms in the human genome, consistent with the interpretation that these sites are the outcome of historical IGC events. These putative signals of IGC are enriched in genomic contexts previously associated with non-allelic homologous recombination, including clear signals in gene families that form tandem intra-chromosomal clusters. Conclusions: Taken together, my analyses implicate IGC, not point mutation, as the mechanism generating at least 2.7% of single nucleotide variants in duplicated regions of the human genome. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
11. Genes on B chromosomes of vertebrates.
- Author
- Makunin, Alexey I., Dementyeva, Polina V., Graphodatsky, Alexander S., Volobouev, Vitaly T., Kukekova, Anna V., and Trifonov, Vladimir A.
- Subjects
- *VERTEBRATE genetics, *CHROMOSOMES, *MOLECULAR genetics, *GENOMICS, *GENETIC mutation, *NUCLEOTIDE sequencing
- Abstract
Background: There is a growing body of evidence that B chromosomes, once regarded as totally heterochromatic and genetically inert, harbor multiple segmental duplications containing clusters of ribosomal RNA genes, processed pseudogenes and protein-coding genes. Application of novel molecular approaches further supports complex composition and possible phenotypic effects of B chromosomes. Results: Here we review recent findings of gene-carrying genomic segments on B chromosomes from different vertebrate groups. We demonstrate that the genetic content of B chromosomes is highly heterogeneous and some B chromosomes contain multiple large duplications derived from various chromosomes of the standard karyotype. Although B chromosomes seem to be mostly homologous to each other within a species, their genetic content differs between species. There are indications that some genomic regions are more likely to be located on B chromosomes. Conclusions: The discovery of multiple autosomal genes on B chromosomes opens a new discussion about their possible effects ranging from sex determination to fitness and adaptation, their complex interactions with host genome and role in evolution. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
12. Distribution of segmental duplications in the context of higher order chromatin organisation of human chromosome 7.
- Author
- Ebert, Grit, Steininger, Anne, Weißmann, Robert, Boldt, Vivien, Lind-Thomsen, Allan, Grune, Jana, Badelt, Stefan, Heßler, Melanie, Peiser, Matthias, Hitzler, Manuel, Jensen, Lars R., Müller, Ines, Hu, Hao, Arndt, Peter F., Kuss, Andreas W., Tebel, Katrin, and Ullmann, Reinhard
- Subjects
- *CHROMOSOMES, *CHROMOSOME duplication, *WILLIAMS syndrome, *SEGMENTAL analysis technique (Biomechanics), *CHROMATIN
- Abstract
Background Segmental duplications (SDs) are not evenly distributed along chromosomes. The reasons for this biased susceptibility to SD insertion are poorly understood. Accumulation of SDs is associated with increased genomic instability, which can lead to structural variants and genomic disorders such as the Williams-Beuren syndrome. Despite these adverse effects, SDs have become fixed in the human genome. Focusing on chromosome 7, which is particularly rich in interstitial SDs, we have investigated the distribution of SDs in the context of evolution and the three dimensional organisation of the chromosome in order to gain insights into the mutual relationship of SDs and chromatin topology. Results Intrachromosomal SDs preferentially accumulate in those segments of chromosome 7 that are homologous to marmoset chromosome 2. Although this formerly compact segment has been re-distributed to three different sites during primate evolution, we can show by means of public data on long distance chromatin interactions that these three intervals, and consequently the paralogous SDs mapping to them, have retained their spatial proximity in the nucleus. Focusing on SD clusters implicated in the aetiology of the Williams-Beuren syndrome locus we demonstrate by cross-species comparison that these SDs have inserted at the borders of a topological domain and that they flank regions with distinct DNA conformation. Conclusions Our study suggests a link of nuclear architecture and the propagation of SDs across chromosome 7, either by promoting regional SD insertion or by contributing to the establishment of higher order chromatin organisation themselves. The latter could compensate for the high risk of structural rearrangements and thus may have contributed to their evolutionary fixation in the human genome. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
13. Genome-wide identification and functional characterization of the Camelina sativa WRKY gene family in response to abiotic stress
- Author
- Chunhui Zhang, Runzhi Li, Hongli Cui, Ying Shi, Chunli Ji, Lixia Yuan, Yanan Song, and Jinai Xue
- Subjects
- Expression profiles; lcsh:QH426-470; lcsh:Biotechnology; WRKY transcriptional factors; Camelina (Camelina sativa (L.) Crantz); Genome; Function analysis; Gene Expression Regulation, Plant; Stress, Physiological; Arabidopsis; lcsh:TP248.13-248.65; Genetics; Gene family; Gene; Phylogeny; Segmental duplication; Plant Proteins; biology; Abiotic stress; biology.organism_classification; WRKY protein domain; lcsh:Genetics; Multigene Family; Genome-wide characterization; Tandem exon duplication; Genome, Plant; Biotechnology; Research Article
- Abstract
Background WRKY transcription factors are a superfamily of regulators involved in diverse biological processes and stress responses in plants. However, there is limited knowledge about the WRKY family in camelina (Camelina sativa), an important Brassicaceae oil crop with strong tolerance for various stresses. Here, a genome-wide characterization of WRKY proteins is performed to examine their gene structures, phylogenetics, expression, conserved motif organizations, and functional annotation to identify candidate WRKYs that mediate stress resistance regulation in camelinas. Results A total of 242 CsWRKY proteins encoded by 224 gene loci distributed unevenly over the chromosomes were identified, and they were classified into three groups by phylogenetic analysis according to their WRKY domains and zinc finger motifs. The 15 CsWRKY gene loci generated 33 spliced variants. Orthologous WRKY gene pairs were identified, with 173 pairs in the C. sativa and Arabidopsis genomes as well as 282 pairs in the C. sativa and B. napus genomes, respectively. A total of 137 segmental duplication events were observed, but there was no tandem duplication in the camelina genome. Ten major conserved motifs were examined, with WRKYGQK being the most conserved, and several variants were present in many CsWRKYs. Expression analysis revealed that 50% more CsWRKY genes were expressed constitutively, and a set of them displayed tissue-specific expression. Notably, 11 CsWRKY genes exhibited significant expression changes in seedlings under cold, salt, and drought stresses, showing a preferentially inducible expression pattern in response to the stress. Conclusions The present article describes a detailed analysis of the CsWRKY gene family and its expression profiles in 12 tissues and under several stress conditions. Segmental duplication is the major force underlying the broad expansion of this gene family, and a strong purifying pressure occurred for CsWRKY proteins during their evolution. 
CsWRKY proteins play important roles in plant development, with differential functions in different tissues. Exceptionally, eleven CsWRKYs, particularly five alternative spliced isoforms, were found to be the possible key players in mediating plant responses to various stresses. Overall, our results provide a foundation for understanding the roles of CsWRKYs and the precise mechanism through which CsWRKYs regulate high stress resistance as well as the development of stress tolerance cultivars among Cruciferae crops.
- Published
- 2020
14. Genome-wide identification, phylogeny and expression analysis of the SPL gene family in wheat
- Author
- Xiaoying Wang, Qin Ding, Lingjian Ma, Dazhong Zhang, Yucui Han, Ting Zhu, Yue Liu, and Liting Ma
- Subjects
- Crops, Agricultural; China; Subfamily; Protein domain; Plant Science; Biology; Genome; SPL gene family; Gene Expression Regulation, Plant; lcsh:Botany; Gene duplication; Gene family; Gene; Phylogeny; Triticum; Segmental duplication; Plant Proteins; Genetics; Comparative genomics; Phylogenetic analysis; Expression patterns analysis; Gene Expression Profiling; food and beverages; lcsh:QK1-989; Plant Breeding; Wheat; Carrier Proteins; Genome, Plant; Research Article; Genome-Wide Association Study; Transcription Factors
- Abstract
Background Members of the plant-specific SPL gene family (squamosa promoter-binding protein-like) contain the SBP conserved domain and are involved in the regulation of plant growth and development, including the development of plant flowers and plant epidermal hair, the plant stress response, and the synthesis of secondary metabolites. This family has been identified in various plants. However, there is no systematic analysis of the SPL gene family at the genome-wide level in wheat. Results In this study, 56 putative TaSPL genes were identified using the comparative genomics method; we renamed them TaSPL001 - TaSPL056 based on their chromosomal distribution. According to the un-rooted neighbor joining phylogenetic tree, gene structure and motif analyses, the 56 TaSPL genes were divided into 8 subgroups. A total of 81 TaSPL gene pairs were designated as arising from duplication events and 64 interacting protein branches were identified as involved in the protein interaction network. The expression patterns of 21 randomly selected TaSPL genes in different tissues (roots, stems, leaves and inflorescence) and under 4 treatments (abscisic acid, gibberellin, drought and salt) were detected by quantitative real-time polymerase chain reaction (qRT-PCR). Conclusions The wheat genome contains 56 TaSPL genes and those in the same subfamily share similar gene structure and motifs. TaSPL gene expansion occurred through segmental duplication events. Combining the results of transcriptional and qRT-PCR analyses, most of these TaSPL genes were found to regulate inflorescence and spike development. Additionally, we found that 13 TaSPLs were upregulated by abscisic acid, indicating that TaSPL genes play a positive role in the abscisic acid-mediated pathway of the seedling stage. This study provides comprehensive information on the SPL gene family of wheat and lays a solid foundation for elucidating the biological functions of TaSPLs and improvement of wheat yield.
- Published
- 2020
15. Association of microsatellite pairs with segmental duplications in insect genomes.
- Author
- Behura, Susanta K. and Severson, David W.
- Subjects
- *MICROSATELLITE repeats, *CHROMOSOME duplication, *INSECT genomes, *NUCLEOTIDE sequence, *INSECT phylogeny, *TRANSPOSONS, *INSECTS
- Abstract
Background Segmental duplications (SDs), also known as low-copy repeats, are DNA sequences of length greater than 1 kb which are duplicated with a high degree of sequence identity (greater than 90%) causing instability in genomes. SDs are generally found in the genome as mosaic forms of duplicated sequences which are generated by a two-step process: first, multiple duplicated sequences are aggregated at specific genomic regions, and then, these primary duplications undergo multiple secondary duplications. However, the mechanism of how duplicated sequences are aggregated in the first place is not well understood. Results By analyzing the distribution of microsatellite sequences among twenty insect species in a genome-wide manner it was found that pairs of microsatellites along with the intervening sequences were duplicated multiple times in each genome. They were found as low copy repeats or segmental duplications when the duplicated loci were greater than 1 kb in length and had greater than 90% sequence similarity. By performing a sliding-window genomic analysis for number of paired microsatellites and number of segmental duplications, it was observed that regions rich in repetitive paired microsatellites tend to get richer in segmental duplication suggesting a "rich-gets-richer" mode of aggregation of the duplicated loci in specific regions of the genome. Results further show that the relationship between number of paired microsatellites and segmental duplications among the species is independent of the known phylogeny suggesting that association of microsatellites with segmental duplications may be a species-specific evolutionary process. It was also observed that the repetitive microsatellite pairs are associated with gene duplications but those sequences are rarely retained in the orthologous genes between species.
Although some of the duplicated sequences with microsatellites as termini were found within transposable elements (TEs) of Drosophila, most of the duplications are found in the TE-free and gene-free regions of the genome. Conclusion The study clearly suggests that microsatellites are instrumental in extensive sequence duplications that may contribute to species-specific evolution of genome plasticity in insects. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
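The abstract of result 15 gives a working definition of a segmental duplication: duplicated loci longer than 1 kb with greater than 90% sequence identity. The sketch below is only an illustration of that threshold test, not the authors' pipeline. The `AlignedPair` record, its field names, and the example coordinates are hypothetical; only the two cutoffs come from the abstract.

```python
from dataclasses import dataclass

# Cutoffs taken from the abstract's definition of a segmental duplication.
MIN_LENGTH_BP = 1_000   # "greater than 1 kb"
MIN_IDENTITY = 0.90     # "greater than 90%" sequence identity


@dataclass(frozen=True)
class AlignedPair:
    """Hypothetical record for a pairwise alignment between two loci."""
    locus_a: str
    locus_b: str
    aligned_length_bp: int
    identity: float  # fraction of identical aligned positions, 0.0 to 1.0


def is_segmental_duplication(pair: AlignedPair) -> bool:
    """Return True when the pair exceeds both cutoffs (strictly greater than)."""
    return pair.aligned_length_bp > MIN_LENGTH_BP and pair.identity > MIN_IDENTITY


# Made-up example pairs, for illustration only.
pairs = [
    AlignedPair("chr1:1000-3500", "chr2:500-3000", 2_500, 0.95),   # passes both
    AlignedPair("chr1:9000-9800", "chr3:100-900", 800, 0.99),      # too short
    AlignedPair("chr2:4000-9000", "chr2:20000-25000", 5_000, 0.85),  # too divergent
    AlignedPair("chr4:0-1000", "chr5:0-1000", 1_000, 0.95),        # exactly 1 kb, not greater
]

candidates = [p for p in pairs if is_segmental_duplication(p)]
```

Both comparisons are strict because the abstract says "greater than"; a study that counts loci of exactly 1 kb would use `>=` instead.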
16. An Sp185/333 gene cluster from the purple sea urchin and putative microsatellite-mediated gene diversification.
- Author
- Miller, Chase A, Buckley, Katherine M, Easley, Rebecca L, and Smith, L Courtney
- Subjects
- *PARACENTROTUS lividus, *GENE clusters, *GENE conversion, *STRONGYLOCENTROTUS purpuratus, *GENES, *SHOTGUN sequencing, *GENE families, *CHROMOSOME duplication
- Abstract
Background: The immune system of the purple sea urchin, Strongylocentrotus purpuratus, is complex and sophisticated. An important component of sea urchin immunity is the Sp185/333 gene family, which is significantly upregulated in immunologically challenged animals. The Sp185/333 genes are less than 2 kb with two exons and are members of a large diverse family composed of greater than 40 genes. The S. purpuratus genome assembly, however, contains only six Sp185/333 genes. This underrepresentation could be due to the difficulties that large gene families present in shotgun assembly, where multiple similar genes can be collapsed into a single consensus gene. Results: To understand the genomic organization of the Sp185/333 gene family, a BAC insert containing Sp185/333 genes was assembled, with careful attention to avoiding artifacts resulting from collapse or artificial duplication/expansion of very similar genes. Twelve candidate BAC assemblies were generated with varying parameters and the optimal assembly was identified by PCR, restriction digests, and subclone sequencing. The validated assembly contained six Sp185/333 genes that were clustered in a 34 kb region at one end of the BAC with five of the six genes tightly clustered within 20 kb. The Sp185/333 genes in this cluster were no more similar to each other than to previously sequenced Sp185/333 genes isolated from three different animals. This was unexpected given their proximity and putative effects of gene homogenization in closely linked, similar genes. All six genes displayed significant similarity including both 5' and 3' flanking regions, which were bounded by microsatellites. Three of the Sp185/333 genes and their flanking regions were tandemly duplicated such that each repeated segment consisted of a gene plus 0.7 kb 5' and 2.4 kb 3' of the gene (4.5 kb total). Both edges of the segmental duplications were bounded by different microsatellites. 
Conclusions: The high sequence similarity of the Sp185/333 genes and flanking regions suggests that the microsatellites may promote genomic instability and are involved with gene duplication and/or gene conversion and the extraordinary sequence diversity of this family. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
17. Expansion and subfunctionalisation of flavonoid 3',5'-hydroxylases in the grapevine lineage.
- Author
- Falginella, Luigi, Castellarin, Simone D, Testolin, Raffaele, Gambetta, Gregory A, Morgante, Michele, and Di Gaspero, Gabriele
- Subjects
- *GRAPE varieties, *GRAPES, *FRUIT ripening, *GENITALIA, *VITIS vinifera, *ANTHOCYANINS, *FRUIT
- Abstract
Background: Flavonoid 3',5'-hydroxylases (F3'5'Hs) and flavonoid 3'-hydroxylases (F3'Hs) competitively control the synthesis of delphinidin and cyanidin, the precursors of blue and red anthocyanins. In most plants, F3'5'H genes are present in low-copy number, but in grapevine they are highly redundant. Results: The first increase in F3'5'H copy number occurred in the progenitor of the eudicot clade at the time of the γ triplication. Further proliferation of F3'5'Hs has occurred in one of the paleologous loci after the separation of Vitaceae from other eurosids, giving rise to 15 paralogues within 650 kb. Twelve reside in 9 tandem blocks of ~35-55 kb that share 91-99% identity. The second paleologous F3'5'H has been maintained as an orphan gene in grapevines, and lacks orthologues in other plants. Duplicate F3'5'Hs have spatially and temporally partitioned expression profiles in grapevine. The orphan F3'5'H copy is highly expressed in vegetative organs. More recent duplicate F3'5'Hs are predominately expressed in berry skins. They differ only slightly in the coding region, but are distinguished in the structure of the promoter. Differences in cis-regulatory sequences of promoter regions are paralleled by temporal specialisation of gene transcription during fruit ripening. Variation in anthocyanin profiles consistently reflects changes in the F3'5'H mRNA pool across different cultivars. More F3'5'H copies are expressed at high levels in grapevine varieties with 93-94% of 3'5'-OH anthocyanins. In grapevines depleted in 3'5'-OH anthocyanins (15-45%), fewer F3'5'H copies are transcribed, and at lower levels. Conversely, only two copies of the gene encoding the competing F3'H enzyme are present in the grape genome; one copy is expressed in both vegetative and reproductive organs at comparable levels among cultivars, while the other is transcriptionally silent.
Conclusions: These results suggest that expansion and subfunctionalisation of F3'5'Hs have increased the complexity and diversification of the fruit colour phenotype among red grape varieties. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
18. Genome-wide exploration and characterization of miR172/euAP2 genes in Brassica napus L. for likely role in flower organ development
- Author
-
Jiana Li, Yanru Cao, Yumin Gao, Liezhao Liu, Tengyue Wang, Jia Wang, Xinfu Xu, Yingchao Tan, Xiaoke Ping, Hongju Jian, and Kun Lu
- Subjects
0106 biological sciences ,0301 basic medicine ,Nonsynonymous substitution ,Sequence analysis ,Evolution ,Plant Science ,Flowers ,Biology ,Genes, Plant ,01 natural sciences ,Genome ,Expression analysis ,03 medical and health sciences ,lcsh:Botany ,Gene family ,Coding region ,miR172 ,Gene ,Conserved Sequence ,Phylogeny ,Segmental duplication ,Genetics ,Brassica napus ,food and beverages ,Chromosome Mapping ,lcsh:QK1-989 ,MicroRNAs ,030104 developmental biology ,euAP2 ,Synonymous substitution ,Sequence Alignment ,010606 plant biology & botany ,Research Article ,Genome-Wide Association Study - Abstract
Background APETALA2-like genes encode plant-specific transcription factors, some of which possess one microRNA172 (miR172) binding site. The miR172 and its target euAP2 genes are involved in the process of phase transformation and flower organ development in many plants. However, the roles of miR172 and its target AP2 genes remain largely unknown in Brassica napus (B. napus). Results In this study, 19 euAP2 and four miR172 genes were identified in the B. napus genome. A sequence analysis suggested that 17 euAP2 genes were targeted by Bna-miR172 in the 3′ coding region. EuAP2s were classified into five major groups in B. napus. This classification was consistent with the exon-intron structure and motif organization. An analysis of the nonsynonymous and synonymous substitution rates revealed that the euAP2 genes had gone through purifying selection. Whole genome duplication (WGD) or segmental duplication events played a major role in the expansion of the euAP2 gene family. A cis-regulatory element (CRE) analysis suggested that the euAP2s were involved in the response to light, hormones, stress, and developmental processes including circadian control, endosperm and meristem expression. Expression analysis of the miR172-targeted euAP2s in nine different tissues showed diverse spatiotemporal expression patterns. Most euAP2 genes were highly expressed in the floral organs, suggesting their specific functions in flower development. BnaAP2–1, BnaAP2–5 and BnaTOE1–2 had higher expression levels in late-flowering material than early-flowering material based on RNA-seq and qRT-PCR, indicating that they may act as floral suppressors. Conclusions Overall, analyses of the evolution, structure, tissue specificity and expression of the euAP2 genes were performed in B. napus. Based on the RNA-seq and experimental data, euAP2 may be involved in flower development. Three euAP2 genes (BnaAP2–1, BnaAP2–5 and BnaTOE1–2) might be regarded as floral suppressors.
The results of this study provide insights for further functional characterization of the miR172/euAP2 module in B. napus. Electronic supplementary material The online version of this article (10.1186/s12870-019-1936-2) contains supplementary material, which is available to authorized users.
- Published
- 2019
19. Genome-wide analysis of the pentatricopeptide repeat gene family in different maize genomes and its important role in kernel development
- Author
-
Chunhui Li, Lin Chen, Yunsu Shi, Yongxiang Li, Tianyu Wang, Yanchun Song, Yu Li, and Dengfeng Zhang
- Subjects
0301 basic medicine ,Candidate gene ,Plant Science ,Biology ,Genes, Plant ,Genome ,Zea mays ,Chromosomes, Plant ,Kernel development ,Expression variation ,03 medical and health sciences ,Gene Expression Regulation, Plant ,lcsh:Botany ,Gene duplication ,Gene expression ,Pentatricopeptide repeat (PPR) proteins ,Gene family ,Gene ,Segmental duplication ,Plant Proteins ,Genetics ,food and beverages ,Chromosome Mapping ,lcsh:QK1-989 ,Maize ,Gene structure ,030104 developmental biology ,Pentatricopeptide repeat ,Edible Grain ,Genome, Plant ,Research Article ,Genome-Wide Association Study - Abstract
Background The pentatricopeptide repeat (PPR) gene family is one of the largest gene families in land plants (450 PPR genes in Arabidopsis, 477 PPR genes in rice and 486 PPR genes in foxtail millet) and is important for plant development and growth. Most PPR genes are encoded by plastid and mitochondrial genomes, and the gene products regulate the expression of the related genes in higher plants. However, the functions remain largely unknown, and systematic analysis and comparison of the PPR gene family in different maize genomes have not been performed. Results In this study, systematic identification and comparison of PPR genes from two elite maize inbred lines, B73 and PH207, were performed. A total of 491 and 456 PPR genes were identified in the B73 and PH207 genomes, respectively. Basic bioinformatics analyses, including of the classification, gene structure, chromosomal location and conserved motifs, were conducted. Examination of PPR gene duplication showed that 12 and 15 segmental duplication gene pairs exist in the B73 and PH207 genomes, respectively, with eight duplication events being shared between the two genomes. Expression analysis suggested that 53 PPR genes exhibit qualitative variations in the different genetic backgrounds. Based on analysis of the correlation between PPR gene expression in kernels and kernel-related traits, four PPR genes are significantly negatively correlated with hundred kernel weight, 12 are significantly negatively correlated with kernel width, and eight are significantly correlated with kernel number. Eight of the 24 PPR genes are also located in metaQTL regions associated with yield and kernel-related traits in maize. Two important PPR genes (GRMZM2G353195 and GRMZM2G141202) might be regarded as important candidate genes associated with maize kernel-related traits. 
Conclusions Our results provide a more comprehensive understanding of PPR genes in different maize inbred lines and identify important candidate genes related to kernel development for subsequent functional validation in maize. Electronic supplementary material The online version of this article (10.1186/s12870-018-1572-2) contains supplementary material, which is available to authorized users.
- Published
- 2018
20. Expansion and evolutionary patterns of cysteine-rich peptides in plants
- Author
-
Huping Zhang, Leiting Li, Huijun Jiao, Shaoling Zhang, Musana R. Fabrice, Juyou Wu, Xin Qiao, and Xing Liu
- Subjects
0106 biological sciences ,0301 basic medicine ,Signal peptide ,lcsh:QH426-470 ,Clustered genes ,Gene duplication ,lcsh:Biotechnology ,Genomics ,Biology ,01 natural sciences ,Cysteine-rich peptide ,Self-incompatibility ,Evolution, Molecular ,Pyrus ,03 medical and health sciences ,Gene Expression Regulation, Plant ,lcsh:TP248.13-248.65 ,Genetics ,Gene family ,Cysteine ,Selection, Genetic ,Gene ,Segmental duplication ,Synteny ,Expression divergence ,Divergent evolution pattern ,food and beverages ,Positive selection ,lcsh:Genetics ,030104 developmental biology ,DNA microarray ,Peptides ,010606 plant biology & botany ,Biotechnology ,Research Article - Abstract
Background Cysteine-rich peptides (CRPs) are gaining recognition as regulators of cell–cell communication in plants. Results We identified 9556 CRPs in 12 plant species and analysed their evolutionary patterns. In most angiosperm plants, whole genome duplication and segmental duplication are the major factors driving the expansion of CRP family member genes, especially signal peptides. About 30% of the CRP genes were found clustered on the chromosomes, except in maize (Zea mays). Considerable collinearities between CRP genes between or within species reveal several syntenic regions on the chromosomes. Different subfamilies display diverse evolutionary rates, suggesting that these subfamilies are subjected to different selective pressures. CRPs in different duplication models also show contrasting evolutionary rates, although the underlying mechanism is unclear because of the complexity of gene evolution. The 1281 positively selected genes identified are probably generated within a certain period of time. While most of these belonged to maize and sorghum (Sorghum bicolor), new CRP functions would also be expected. Up-regulation of 10 CRPs was observed in self-pollinated pear pistils and pollen tubes under self S-RNase treatments in vitro. The expression divergence between different CRP gene duplication types suggests that different duplication mechanisms affected the fate of the duplicated CRPs. Conclusion Our analyses of the evolution of the CRP gene family provides a unique view of the evolution of this large gene family. Electronic supplementary material The online version of this article (doi:10.1186/s12864-017-3948-3) contains supplementary material, which is available to authorized users.
- Published
- 2017
21. Duplication and positive selection among hominin-specific PRAME genes
- Author
-
Chris P. Ponting, Zoë Birtle, and Leo Goodstadt
- Subjects
Male ,Models, Molecular ,Pan troglodytes ,lcsh:QH426-470 ,Pseudogene ,lcsh:Biotechnology ,Population ,Biology ,Genome ,Translocation, Genetic ,PRAME Gene ,Evolution, Molecular ,Gene Duplication ,lcsh:TP248.13-248.65 ,Testis ,Genetics ,Animals ,Cluster Analysis ,Humans ,Selection, Genetic ,education ,Melanoma ,Gene ,Alleles ,Phylogeny ,Segmental duplication ,education.field_of_study ,PRAME ,Polymorphism, Genetic ,Models, Genetic ,Genome, Human ,Exons ,Introns ,Gene Expression Regulation, Neoplastic ,lcsh:Genetics ,Phenotype ,Chromosomes, Human, Pair 1 ,Multigene Family ,Human genome ,Pseudogenes ,Research Article ,Biotechnology - Abstract
Background The physiological and phenotypic differences between human and chimpanzee are largely specified by our genomic differences. We have been particularly interested in recent duplications in the human genome as examples of relatively large-scale changes to our genome. We performed an in-depth evolutionary analysis of a region of chromosome 1, which is copy number polymorphic among humans, and that contains at least 32 PRAME (Preferentially expressed antigen of melanoma) genes and pseudogenes. PRAME-like genes are expressed in the testis and in a large number of tumours, and are thought to possess roles in spermatogenesis and oogenesis. Results Using nucleotide substitution rate estimates for exons and introns, we show that two large segmental duplications, of six and seven human PRAME genes respectively, occurred in the last 3 million years. These duplicated genes are thus hominin-specific, having arisen in our genome since the divergence from chimpanzee. This cluster of PRAME genes appears to have arisen initially from a translocation approximately 95–85 million years ago. We identified multiple sites within human or mouse PRAME sequences which exhibit strong evidence of positive selection. These form a pronounced cluster on one face of the predicted PRAME protein structure. Conclusion We predict that PRAME genes evolved adaptively due to strong competition between rapidly-dividing cells during spermatogenesis and oogenesis. We suggest that as PRAME gene copy number is polymorphic among individuals, positive selection of PRAME alleles may still prevail within the human population.
- Published
- 2016
22. Large-scale copy number variants (CNVs): Distribution in normal subjects and FISH/real-time qPCR analysis
- Author
-
Sarah L. Nolin, W. Ted Brown, M. E. Suzanne Lewis, Xudong Liu, Jeanette J. A. Holden, Maryam Koochek, Ying Qiao, Chansonette Harvard, and Evica Rajcan-Separovic
- Subjects
Male ,congenital, hereditary, and neonatal diseases and abnormalities ,lcsh:QH426-470 ,endocrine system diseases ,lcsh:Biotechnology ,Black People ,Genomics ,Biology ,Polymerase Chain Reaction ,White People ,law.invention ,03 medical and health sciences ,law ,lcsh:TP248.13-248.65 ,Neoplasms ,mental disorders ,Genetics ,Humans ,Copy-number variation ,Polymerase chain reaction ,In Situ Hybridization, Fluorescence ,030304 developmental biology ,Segmental duplication ,0303 health sciences ,Polymorphism, Genetic ,Gene Expression Profiling ,030305 genetics & heredity ,Genetic Variation ,Nucleic Acid Hybridization ,Hispanic or Latino ,Sequence Analysis, DNA ,Gene expression profiling ,lcsh:Genetics ,Gene Expression Regulation ,Human genome ,Female ,DNA microarray ,Comparative genomic hybridization ,Research Article ,Biotechnology - Abstract
Background Genomic copy number variants (CNVs) involving >1 kb of DNA have recently been found to be widely distributed throughout the human genome. They represent a newly recognized form of DNA variation in normal populations, discovered through screening of the human genome using high-throughput and high resolution methods such as array comparative genomic hybridization (array-CGH). In order to understand their potential significance and to facilitate interpretation of array-CGH findings in constitutional disorders and cancers, we studied 27 normal individuals (9 Caucasian; 9 African American; 9 Hispanic) using commercially available 1 Mb resolution BAC array (Spectral Genomics). A selection of CNVs was further analyzed by FISH and real-time quantitative PCR (RT-qPCR). Results A total of 42 different CNVs were detected in 27 normal subjects. Sixteen (38%) were not previously reported. Thirteen of the 42 CNVs (31%) contained 28 genes listed in OMIM. FISH analysis of 6 CNVs (4 previously reported and 2 novel CNVs) in normal subjects resulted in the confirmation of copy number changes for 1 of 2 novel CNVs and 2 of 4 known CNVs. Three CNVs tested by FISH were further validated by RT-qPCR and comparable data were obtained. This included the lack of copy number change by both RT-qPCR and FISH for clone RP11-100C24, one of the most common known copy number variants, as well as confirmation of deletions for clones RP11-89M16 and RP5-1011O17. Conclusion We have described 16 novel CNVs in 27 individuals. Further study of a small selection of CNVs indicated concordant and discordant array vs. FISH/RT-qPCR results. Although a large number of CNVs has been reported to date, quantification using independent methods and detailed cellular and/or molecular assessment has been performed on a very small number of CNVs. 
This information is, however, very much needed as it is currently common practice to consider CNVs reported in normal subjects as benign changes when detected in individuals affected with a variety of developmental disorders.
- Published
- 2007
- Full Text
- View/download PDF
23. Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence
- Author
-
Joseph Y. Cheung, Xavier Estivill, Jeffrey R. MacDonald, Stephen W. Scherer, Razi Khaja, Ken S. Lau, and Lap-Chee Tsui
- Subjects
Genetic diseases, inborn - genetics ,Single-nucleotide polymorphism ,Biology ,Genoma humà ,Genome ,Polymorphism, Single Nucleotide ,03 medical and health sciences ,0302 clinical medicine ,Gene Duplication ,Gene duplication ,Chromosomes, Human ,Humans ,Copy-number variation ,030304 developmental biology ,Segmental duplication ,Sequence (medicine) ,Genetics ,0303 health sciences ,Base Sequence ,Genome, Human ,Research ,Genetic Diseases, Inborn ,Computational Biology ,Genetic Variation ,Sequence Analysis, DNA ,Human genetics ,Malalties ,Human genome ,Artifacts ,030217 neurology & neurosurgery - Abstract
BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each with size equal or more than 5 kb and 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.
- Published
- 2003
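The proportions quoted in the abstract of result 23 can be re-derived from the raw figures it reports. The following is only a small arithmetic check, not part of the authors' method; the function and variable names are illustrative assumptions.

```python
# Re-derive the percentages quoted in the abstract of result 23
# from the raw figures given there. All names are illustrative.

def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

ASSEMBLY_MB = 3043.1  # size of the June 2002 assembly, as quoted

sd_pct = percent(107.4, ASSEMBLY_MB)          # sequence in segmental duplications
misassigned_pct = percent(38.9, ASSEMBLY_MB)  # sequence with likely misassignment
psv_pct = percent(199_965, 2_327_473)         # SNPs that may be paralogous variants

print(f"segmental duplications: {sd_pct:.2f}%")
print(f"possible misassignment: {misassigned_pct:.2f}%")
print(f"potential PSVs among SNPs: {psv_pct:.1f}%")
```

Rounded to the precision used in the abstract, the three values agree with the quoted 3.53%, 1.28% and 8.6%.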
24. Origins of chromosomal rearrangement hotspots in the human genome: evidence from the AZFa deletion hotspots
- Author
-
Hurles, Matthew E, Willey, David, Matthews, Lucy, and Hussain, Syed Sufyan
- Published
- 2004
- Full Text
- View/download PDF
25. Hotspots of mammalian chromosomal evolution
- Author
-
Bailey, Jeffrey A, Baertsch, Robert, Kent, W James, Haussler, David, and Eichler, Evan E
- Published
- 2004
- Full Text
- View/download PDF
26. DiagHunter and GenoPix2D: programs for genomic comparisons, large-scale homology discovery and visualization
- Author
-
Cannon, Steven B, Kozik, Alexander, Chan, Brian, Michelmore, Richard, and Young, Nevin D
- Published
- 2003
- Full Text
- View/download PDF
27. Refinement of a chimpanzee pericentric inversion breakpoint to a segmental duplication cluster
- Author
-
Locke, Devin P, Archidiacono, Nicoletta, Misceo, Doriana, Cardone, Maria Francesca, Deschamps, Stephane, Roe, Bruce, Rocchi, Mariano, and Eichler, Evan E
- Published
- 2003
- Full Text
- View/download PDF
28. Recent segmental and gene duplications in the mouse genome
- Author
-
Cheung, Joseph, Wilson, Michael D, Zhang, Junjun, Khaja, Razi, MacDonald, Jeffrey R, Heng, Henry HQ, Koop, Ben F, and Scherer, Stephen W
- Published
- 2003
- Full Text
- View/download PDF
29. Identifying related L1 retrotransposons by analyzing 3' transduced sequences
- Author
-
Szak, Suzanne T, Pickeral, Oxana K, Landsman, David, and Boeke, Jef D
- Published
- 2003
- Full Text
- View/download PDF
30. Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence
- Author
-
Cheung, Joseph, Estivill, Xavier, Khaja, Razi, MacDonald, Jeffrey R, Lau, Ken, Tsui, Lap-Chee, and Scherer, Stephen W
- Published
- 2003
- Full Text
- View/download PDF
31. Association of microsatellite pairs with segmental duplications in insect genomes
- Author
-
Susanta K. Behura and David W. Severson
- Subjects
Insecta ,Gene duplication ,Segmental duplication ,Genome, Insect ,Biology ,Genome ,DNA sequencing ,Evolution, Molecular ,Segmental Duplications, Genomic ,Phylogenetics ,Genetics ,Genome dynamics ,Animals ,Gene ,Phylogeny ,Microsatellite ,Low copy repeats ,Sequence Analysis, DNA ,Duplication shadowing ,Insect genomes ,Biotechnology ,Research Article ,Microsatellite Repeats - Abstract
Background Segmental duplications (SDs), also known as low-copy repeats, are DNA sequences of length greater than 1 kb which are duplicated with a high degree of sequence identity (greater than 90%) causing instability in genomes. SDs are generally found in the genome as mosaic forms of duplicated sequences which are generated by a two-step process: first, multiple duplicated sequences are aggregated at specific genomic regions, and then, these primary duplications undergo multiple secondary duplications. However, the mechanism of how duplicated sequences are aggregated in the first place is not well understood. Results By analyzing the distribution of microsatellite sequences among twenty insect species in a genome-wide manner it was found that pairs of microsatellites along with the intervening sequences were duplicated multiple times in each genome. They were found as low copy repeats or segmental duplications when the duplicated loci were greater than 1 kb in length and had greater than 90% sequence similarity. By performing a sliding-window genomic analysis for number of paired microsatellites and number of segmental duplications, it was observed that regions rich in repetitive paired microsatellites tend to get richer in segmental duplication suggesting a “rich-gets-richer” mode of aggregation of the duplicated loci in specific regions of the genome. Results further show that the relationship between number of paired microsatellites and segmental duplications among the species is independent of the known phylogeny suggesting that association of microsatellites with segmental duplications may be a species-specific evolutionary process. It was also observed that the repetitive microsatellite pairs are associated with gene duplications but those sequences are rarely retained in the orthologous genes between species. 
Although some of the duplicated sequences with microsatellites as termini were found within transposable elements (TEs) of Drosophila, most of the duplications are found in the TE-free and gene-free regions of the genome. Conclusion The study clearly suggests that microsatellites are instrumental in extensive sequence duplications that may contribute to species-specific evolution of genome plasticity in insects.
- Published
- 2013
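The working definition of a segmental duplication given in the abstract of result 31 (duplicated sequence longer than 1 kb with more than 90% identity) amounts to a simple two-threshold filter. The sketch below illustrates that filter only; the record type, field names and sample values are hypothetical and do not come from the study.

```python
from dataclasses import dataclass

# Thresholds taken from the definition quoted in the abstract of result 31.
MIN_LENGTH_BP = 1_000      # "length greater than 1 kb"
MIN_IDENTITY_PCT = 90.0    # "greater than 90%" sequence identity


@dataclass
class DuplicatedPair:
    """Hypothetical record for one pair of duplicated loci."""
    name: str
    length_bp: int
    identity_pct: float


def is_segmental_duplication(pair: DuplicatedPair) -> bool:
    """Apply both thresholds; each is a strict 'greater than' test."""
    return pair.length_bp > MIN_LENGTH_BP and pair.identity_pct > MIN_IDENTITY_PCT


# Made-up example values, chosen to exercise each threshold.
pairs = [
    DuplicatedPair("pair_a", 2_500, 95.2),   # passes both thresholds
    DuplicatedPair("pair_b", 800, 99.0),     # too short
    DuplicatedPair("pair_c", 12_000, 88.5),  # identity too low
    DuplicatedPair("pair_d", 1_001, 90.1),   # just above both thresholds
]

sd_names = [p.name for p in pairs if is_segmental_duplication(p)]
print(sd_names)
```

Other studies in this listing use different cut-offs (result 23, for example, requires at least 5 kb), so the two constants would need adjusting to match whichever definition is being applied.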
32. Identification of both copy number variation-type and constant-type core elements in a large segmental duplication region of the mouse genome
- Author
-
Takeaki Uno, Tsuyoshi Koide, Juzoh Umemori, Kenji Ichiyanagi, and Akihiro Mori
- Subjects
Repetitive element ,DNA Copy Number Variations ,Retrotransposon ,Genomics ,Biology ,Homology search ,Genome ,Homology (biology) ,Mice ,Comparative genome hybridization array ,Species Specificity ,Gene Duplication ,Sequence Homology, Nucleic Acid ,Gene duplication ,Genetics ,Animals ,Cluster Analysis ,Copy-number variation ,Chromosome 13 ,Segmental duplication ,Repetitive Sequences, Nucleic Acid ,Mouse genome ,Nucleic Acid Hybridization ,Sequence Analysis, DNA ,Evolutionary biology ,Algorithms ,Biotechnology ,Research Article - Abstract
Background Copy number variation (CNV), an important source of diversity in genomic structure, is frequently found in clusters called CNV regions (CNVRs). CNVRs are strongly associated with segmental duplications (SDs), but the composition of these complex repetitive structures remains unclear. Results We conducted self-comparative-plot analysis of all mouse chromosomes using the high-speed and large-scale-homology search algorithm SHEAP. For eight chromosomes, we identified various types of large SD as tartan-checked patterns within the self-comparative plots. A complex arrangement of diagonal split lines in the self-comparative-plots indicated the presence of large homologous repetitive sequences. We focused on one SD on chromosome 13 (SD13M), and developed SHEPHERD, a stepwise ab initio method, to extract longer repetitive elements and to characterize repetitive structures in this region. Analysis using SHEPHERD showed the existence of 60 core elements, which were expected to be the basic units that form SDs within the repetitive structure of SD13M. The demonstration that sequences homologous to the core elements (>70% homology) covered approximately 90% of the SD13M region indicated that our method can characterize the repetitive structure of SD13M effectively. Core elements were composed largely of fragmented repeats of a previously identified type, such as long interspersed nuclear elements (LINEs), together with partial genic regions. Comparative genome hybridization array analysis showed that whereas 42 core elements were components of CNVR that varied among mouse strains, 8 did not vary among strains (constant type), and the status of the others could not be determined. The CNV-type core elements contained significantly larger proportions of long terminal repeat (LTR) types of retrotransposon than the constant-type core elements, which had no CNV. 
The higher divergence rates observed in the CNV-type core elements than in the constant type indicate that the CNV-type core elements have a longer evolutionary history than constant-type core elements in SD13M. Conclusions Our methodology for the identification of repetitive core sequences simplifies characterization of the structures of large SDs and detailed analysis of CNV. The results of detailed structural and quantitative analyses in this study might help to elucidate the biological role of one of the SDs on chromosome 13.
- Published
- 2013
33. Exome sequencing resolves apparent incidental findings and reveals further complexity of SH3TC2 variant alleles causing Charcot-Marie-Tooth neuropathy
- Author
-
Alicia Hawes, Christian J. Buhay, James R. Lupski, Matthew N. Bainbridge, Jeffrey G. Reid, Richard A. Gibbs, Christie Kovar, Christine M. Eng, Donna M. Muzny, Min Wang, Claudia Gonzaga-Jauregui, Shalini N. Jhangiani, and Yaping Yang
- Subjects
Exome sequencing ,Pseudogene ,Biology ,Genome ,DNA sequencing ,03 medical and health sciences ,Genetics ,Personal genomes ,Genetics(clinical) ,Molecular Biology ,Exome ,Genetics (clinical) ,030304 developmental biology ,Segmental duplication ,Whole genome sequencing ,0303 health sciences ,Whole-genome sequencing ,Research ,030305 genetics & heredity ,SH3TC2 ,Precision medicine ,Incidental findings ,Molecular Medicine ,Personal genomics - Abstract
Background The debate regarding the relative merits of whole genome sequencing (WGS) versus exome sequencing (ES) centers around comparative cost, average depth of coverage for each interrogated base, and their relative efficiency in the identification of medically actionable variants from the myriad of variants identified by each approach. Nevertheless, few genomes have been subjected to both WGS and ES, using multiple next generation sequencing platforms. In addition, no personal genome has been so extensively analyzed using DNA derived from peripheral blood as opposed to DNA from transformed cell lines that may either accumulate mutations during propagation or clonally expand mosaic variants during cell transformation and propagation. Methods We investigated a genome that was studied previously by SOLiD chemistry using both ES and WGS, and now perform six independent ES assays (Illumina GAII (x2), Illumina HiSeq (x2), Life Technologies' Personal Genome Machine (PGM) and Proton), and one additional WGS (Illumina HiSeq). Results We compared the variants identified by the different methods and provide insights into the differences among variants identified between ES runs in the same technology platform and among different sequencing technologies. We resolved the true genotypes of medically actionable variants identified in the proband through orthogonal experimental approaches. Furthermore, ES identified an additional SH3TC2 variant (p.M1?) that likely contributes to the phenotype in the proband. Conclusions ES identified additional medically actionable variant calls and helped resolve ambiguous single nucleotide variants (SNV) documenting the power of increased depth of coverage of the captured targeted regions. Comparative analyses of WGS and ES reveal that pseudogenes and segmental duplications may explain some instances of apparent disease mutations in unaffected individuals.
- Published
- 2013
34. Genome-wide analysis of WOX genes in upland cotton and their expression pattern under different stresses.
- Author
-
Yang Z, Gong Q, Qin W, Yang Z, Cheng Y, Lu L, Ge X, Zhang C, Wu Z, and Li F
- Subjects
- Amino Acid Sequence, Conserved Sequence, Gene Duplication, Gene Expression, Gossypium metabolism, Homeodomain Proteins metabolism, Multigene Family, Phylogeny, Polymerase Chain Reaction, Stress, Physiological, Synteny, Genes, Plant, Gossypium genetics, Homeodomain Proteins genetics
- Abstract
Background: WUSCHEL-related homeobox (WOX) family members play significant roles in plant growth and development, such as in embryo patterning, stem-cell maintenance, and lateral organ formation. The recently published cotton genome sequences allow us to perform comprehensive genome-wide analysis and characterization of WOX genes in cotton. Results: In this study, we identified 21, 20, and 38 WOX genes in Gossypium arboreum (2n = 26, A2), G. raimondii (2n = 26, D5), and G. hirsutum (2n = 4x = 52, (AD)t), respectively. Sequence logos showed that homeobox domains were significantly conserved among the WOX genes in cotton, Arabidopsis, and rice. A total of 168 genes from three typical monocots and six dicots were naturally divided into three clades, which were further classified into nine sub-clades. A good collinearity was observed in the synteny analysis of the orthologs from At and Dt (t represents tetraploid) sub-genomes. Whole genome duplication (WGD) and segmental duplication within At and Dt sub-genomes played significant roles in the expansion of WOX genes, and segmental duplication mainly generated the WUS clade. Copia and Gypsy were the two major types of transposable elements distributed upstream or downstream of WOX genes. Furthermore, through comparison, we found that the exon/intron pattern was highly conserved between Arabidopsis and cotton, and the homeobox domain loci were also conserved between them. In addition, the expression pattern in different tissues indicated that the duplicated genes in cotton might have acquired new functions as a result of sub-functionalization or neo-functionalization. The expression pattern of WOX genes under different stress treatments showed that the different genes were induced by different stresses. Conclusion: In present work, WOX genes, classified into three clades, were identified in the upland cotton genome. Whole genome and segmental duplication were determined to be the two major impetuses for the expansion of gene numbers during the evolution. Moreover, the expression patterns suggested that the duplicated genes might have experienced a functional divergence. Together, these results shed light on the evolution of the WOX gene family, and would be helpful in future research.
- Published
- 2017
- Full Text
- View/download PDF
35. Genetic variation and expression diversity between grain and sweet sorghum lines
- Author
-
Zhigang Ma, Srinivasan Ramachandran, Jeevanandam Vanitha, and Shu-Ye Jiang
- Subjects
Genetics ,Genetic diversity ,Genome evolution ,Genetic Variation ,Biology ,DNA Methylation ,Genes, Plant ,Genome ,Polymorphism, Single Nucleotide ,Phenotype ,Gene Expression Regulation, Plant ,Genetic variation ,Databases, Genetic ,DNA microarray ,Promoter Regions, Genetic ,Sweet sorghum ,Gene ,Sorghum ,Segmental duplication ,Biotechnology ,Research Article ,Oligonucleotide Array Sequence Analysis - Abstract
Background Biological scientists have long sought after understanding how genes and their structural/functional changes contribute to morphological diversity. Though both grain (BT×623) and sweet (Keller) sorghum lines originated from the same species Sorghum bicolor L., they exhibit obvious phenotypic variations. However, the genome re-sequencing data revealed that they exhibited limited functional diversity in their encoding genes in a genome-wide level. The result raises the question how the obvious morphological variations between grain and sweet sorghum occurred in a relatively short evolutionary or domesticated period. Results We implemented an integrative approach by using computational and experimental analyses to provide a detail insight into phenotypic, genetic variation and expression diversity between BT×623 and Keller lines. We have investigated genome-wide expression divergence between BT×623 and Keller under normal and sucrose treatment. Through the data analysis, we detected more than 3,000 differentially expressed genes between these two varieties. Such expression divergence was partially contributed by differential cis-regulatory elements or DNA methylation, which was genetically determined by functionally divergent genes between these two varieties. Both tandem and segmental duplication played important roles in the genome evolution and expression divergence. Conclusion Substantial differences in gene expression patterns between these two varieties have been observed. Such an expression divergence is genetically determined by the divergence in genome level.
- Published
- 2013
36. Evolutionary mechanisms driving the evolution of a large polydnavirus gene family coding for protein tyrosine phosphatases
- Author
-
Stéphane Dupas, Elfie Perdereau, Catherine Dupuy, Jean-Michel Drezen, Céline Serbielle, François Héricourt, Elisabeth Huguet, Institut de recherche sur la biologie de l'insecte UMR7261 (IRBI), Université de Tours-Centre National de la Recherche Scientifique (CNRS), Institut Jacques Monod (IJM (UMR_7592)), Université Paris Diderot - Paris 7 (UPD7)-Centre National de la Recherche Scientifique (CNRS), and Université de Tours (UT)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
0106 biological sciences ,Genome evolution ,Gene duplication ,Evolution ,Molecular Sequence Data ,Wasps ,010603 evolutionary biology ,01 natural sciences ,Genome ,Protein tyrosine phosphatase ,Bracovirus ,Evolution, Molecular ,03 medical and health sciences ,Polydnavirus ,QH359-425 ,Gene family ,Animals ,[SDV.MP.PAR]Life Sciences [q-bio]/Microbiology and Parasitology/Parasitology ,Amino Acid Sequence ,Gene ,Ecology, Evolution, Behavior and Systematics ,ComputingMilieux_MISCELLANEOUS ,Phylogeny ,030304 developmental biology ,Segmental duplication ,Genetics ,0303 health sciences ,biology ,Provirus ,biology.organism_classification ,Positive selection ,Evolutionary biology ,Polydnaviridae ,Protein Tyrosine Phosphatases ,Sequence Alignment ,Research Article - Abstract
Background: Gene duplications have been proposed to be the main mechanism involved in genome evolution and in the acquisition of new functions. Polydnaviruses (PDVs), symbiotic viruses associated with parasitoid wasps, are ideal model systems for studying mechanisms of gene duplication, given that PDV genomes consist of virulence genes organized into multigene families. In these systems, the viral genome is integrated into a wasp chromosome as a provirus, and virus particles containing circular double-stranded DNA are injected into the parasitoids’ hosts and are essential for successful parasitism. The viral virulence factors, organized in gene families, are required collectively to induce host immune suppression and developmental arrest. The gene family that encodes protein tyrosine phosphatases (PTPs) has undergone spectacular expansion in several PDV genomes, reaching up to 42 genes. Results: Here, we present strong indications that PTP gene family expansion occurred via classical mechanisms: duplication of large segments of the chromosomally integrated form of the virus sequences (segmental duplication), tandem duplications within this form, and dispersed duplications. We also propose a novel duplication mechanism specific to PDVs that involves viral circle reintegration into the wasp genome. The PTP copies produced were shown to undergo conservative evolution along with episodes of adaptive evolution. In particular, recently produced copies have undergone positive selection at sites most likely involved in defining substrate selectivity. Conclusion: The results provide evidence of the dynamic nature of polydnavirus proviral genomes. Classical and PDV-specific duplication mechanisms have been involved in the production of new gene copies. Selection pressures associated with antagonistic interactions with parasitized hosts have shaped these genes, which are used to manipulate lepidopteran physiology, with evidence for positive selection involved in adaptation to host targets.
- Published
- 2012
- Full Text
- View/download PDF
37. A common copy-number breakpoint of ERBB2 amplification in breast cancer colocalizes with a complex block of segmental duplications
- Author
-
Xiongfong Chen, Anna A. Kondratova, Raymond R. Tubbs, Joseph P. Crowe, Michael Marotta, Ayako Inoshita, Robert M. Stephens, G. Thomas Budd, Hisashi Tanaka, and Joanne Lyons
- Subjects
Genome evolution ,DNA Copy Number Variations ,Receptor, ErbB-2 ,Non-allelic homologous recombination ,Gene Dosage ,Breast Neoplasms ,Biology ,Genome ,Polymorphism, Single Nucleotide ,03 medical and health sciences ,Chromosome Breakpoints ,0302 clinical medicine ,Segmental Duplications, Genomic ,Keratins, Hair-Specific ,Gene duplication ,Humans ,030304 developmental biology ,Segmental duplication ,Sequence Deletion ,Medicine(all) ,Genetics ,0303 health sciences ,Comparative Genomic Hybridization ,Base Sequence ,Genome, Human ,Breakpoint ,Gene Amplification ,3. Good health ,Haplotypes ,030220 oncology & carcinogenesis ,Human genome ,Female ,Comparative genomic hybridization ,Research Article ,Chromosomes, Human, Pair 17 - Abstract
Introduction: Segmental duplications (low-copy repeats) are recently duplicated genomic segments in the human genome that display nearly identical (> 90%) sequences and account for about 5% of euchromatic regions. In the germline, duplicated segments mediate nonallelic homologous recombination and thus cause both non-disease-causing copy-number variants and genomic disorders. To what extent duplicated segments play a role in somatic DNA rearrangements in cancer remains elusive. Duplicated segments often cluster and form genomic blocks enriched with both direct and inverted repeats (complex genomic regions). Such complex regions could be fragile and play a mechanistic role in the amplification of the ERBB2 gene in breast tumors, because repeated sequences are known to initiate gene amplification in model systems. Methods: We conducted polymerase chain reaction (PCR)-based assays on primary breast tumors and analyzed publicly available array-comparative genomic hybridization data to map a common copy-number breakpoint in ERBB2-amplified primary breast tumors. We further used molecular, bioinformatics, and population-genetics approaches to define duplication contents, structural variants, and haplotypes within the common breakpoint. Results: We found a large (> 300-kb) block of duplicated segments that colocalized with a common copy-number breakpoint for ERBB2 amplification. The breakpoint that potentially initiated ERBB2 amplification localized to a region 1.5 megabases (Mb) on the telomeric side of ERBB2. The region is very complex, with extensive duplications of KRTAP genes, structural variants, and, as a result, a paucity of single-nucleotide polymorphism (SNP) markers. Duplicated segments varied in size and degree of sequence homology, indicating that duplications have occurred recurrently during genome evolution. Conclusions: Amplification of the ERBB2 gene in breast tumors is potentially initiated by a complex region that has unusual genomic features and thus requires rigorous, labor-intensive investigation. The haplotypes we provide could be useful for identifying a potential association between the complex region and ERBB2 amplification.
- Published
- 2012
38. A high-resolution integrated map of copy number polymorphisms within and between breeds of the modern domesticated dog
- Author
-
Thomas J. Nicholas, Evan E. Eichler, Carl Baker, and Joshua M. Akey
- Subjects
0106 biological sciences ,lcsh:QH426-470 ,endocrine system diseases ,DNA Copy Number Variations ,Genotype ,lcsh:Biotechnology ,Population genetics ,Genomics ,Biology ,01 natural sciences ,Structural variation ,03 medical and health sciences ,Dogs ,Segmental Duplications, Genomic ,Species Specificity ,lcsh:TP248.13-248.65 ,mental disorders ,Genetics ,Animals ,Copy-number variation ,Gene ,030304 developmental biology ,Segmental duplication ,Oligonucleotide Array Sequence Analysis ,0303 health sciences ,Comparative Genomic Hybridization ,Tiling array ,Chromosome Mapping ,lcsh:Genetics ,Phenotype ,010606 plant biology & botany ,Comparative genomic hybridization ,Biotechnology ,Research Article - Abstract
Background: Structural variation contributes to the rich genetic and phenotypic diversity of the modern domestic dog, Canis lupus familiaris, although compared to other organisms, catalogs of canine copy number variants (CNVs) are poorly defined. To this end, we developed a customized high-density tiling array across the canine genome and used it to discover CNVs in nine genetically diverse dogs and a gray wolf. Results: In total, we identified 403 CNVs that overlap 401 genes, which are enriched for defense/immunity, oxidoreductase, protease, receptor, signaling molecule and transporter genes. Furthermore, we performed detailed comparisons between CNVs located within versus outside of segmental duplications (SDs) and find that CNVs in SDs are enriched for gene content and complexity. Finally, we compiled all known dog CNV regions and genotyped them with a custom aCGH chip in 61 dogs from 12 diverse breeds. These data allowed us to perform the first population genetics analysis of canine structural variation and identify CNVs that potentially contribute to breed-specific traits. Conclusions: Our comprehensive analysis of canine CNVs will be an important resource in genetically dissecting canine phenotypic and behavioral variation.
- Published
- 2011
39. Are ribosomal DNA clusters rearrangement hotspots? A case study in the genus Mus (Rodentia, Muridae)
- Author
-
Josette Catalan, Emmanuel J. P. Douzery, Janice Britton-Davidian, Benoîte Cazaux, and Frédéric Veyrunes
- Subjects
Comparative genomics ,Genetics ,Concerted evolution ,Evolution ,Centromere ,Chromosome Breakpoints ,Biology ,Genome ,Chromosomes, Mammalian ,DNA, Ribosomal ,Rats ,Mice ,Phylogenetics ,Evolutionary biology ,Karyotyping ,Multigene Family ,QH359-425 ,Animals ,Ribosomal DNA ,Ecology, Evolution, Behavior and Systematics ,Phylogeny ,Segmental duplication ,Research Article - Abstract
Background: Recent advances in comparative genomics have considerably improved our knowledge of the evolution of mammalian karyotype architecture. One of the breakthroughs was the discovery that evolutionary breakpoints are preferentially localized in regions enriched in repetitive sequences (segmental duplications, telomeres and centromeres). In this context, we investigated the contribution of ribosomal genes to genome reshuffling, since they are generally located in pericentromeric or subtelomeric regions and form repeat clusters on different chromosomes. The target model was the genus Mus, which exhibits a high rate of karyotypic change, a large fraction of which involves centromeres. Results: The chromosomal distribution of rDNA clusters was determined by in situ hybridization of mouse probes in 19 species. Using a molecular-based reference tree, the phylogenetic distribution of clusters within the genus was reconstructed, and the temporal association between rDNA clusters, breakpoints and centromeres was tested by maximum likelihood analyses. Our results highlighted the following features of rDNA cluster dynamics in the genus Mus: i) rDNA clusters showed extensive diversity in number between species and an almost exclusively pericentromeric location, ii) a strong association between rDNA sites and centromeres was recovered, which may be related to their shared constraint of concerted evolution, iii) 24% of the observed breakpoints mapped near an rDNA cluster, and iv) a substantial rate of rDNA cluster change (insertion, deletion) also occurred in the absence of chromosomal rearrangements. Conclusions: This study on the dynamics of rDNA clusters within the genus Mus has revealed a strong evolutionary relationship between rDNA clusters and centromeres. Both of these genomic structures coincide with breakpoints in the genus Mus, suggesting that the accumulation of a large number of repeats in the centromeric region may contribute to the high level of chromosome repatterning observed in this group. However, the elevated rate of rDNA change observed in the chromosomally invariant clade indicates that the presence of these sequences is insufficient to lead to genome instability. In agreement with recent studies, these results suggest that additional factors such as modifications of the epigenetic state of DNA may be required to trigger evolutionary plasticity.
- Published
- 2011
40. The role of duplications in the evolution of genomes highlights the need for evolutionary-based approaches in comparative genomics
- Author
-
Anthony Levasseur, Pierre Pontarotti, Unité mixte de recherche de biotechnologie des champignons filamenteux, Université de la Méditerranée - Aix-Marseille 2-Institut National de la Recherche Agronomique (INRA)-Université de Provence - Aix-Marseille 1, Laboratoire d'Analyse, Topologie, Probabilités (LATP), Université Paul Cézanne - Aix-Marseille 3-Université de Provence - Aix-Marseille 1-Centre National de la Recherche Scientifique (CNRS), Evolution Biologique et Modélisation, and Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
0106 biological sciences ,[SDV]Life Sciences [q-bio] ,Immunology ,Population ,Population genetics ,Review ,Saccharomyces cerevisiae ,Biology ,010603 evolutionary biology ,01 natural sciences ,Genome ,General Biochemistry, Genetics and Molecular Biology ,Evolution, Molecular ,Polyploidy ,03 medical and health sciences ,Gene Duplication ,[SDV.IDA]Life Sciences [q-bio]/Food engineering ,Gene duplication ,Humans ,[SPI.GPROC]Engineering Sciences [physics]/Chemical and Process Engineering ,education ,lcsh:QH301-705.5 ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,Segmental duplication ,Comparative genomics ,évolution biologique ,0303 health sciences ,education.field_of_study ,Agricultural and Biological Sciences(all) ,Phylogenetic tree ,Biochemistry, Genetics and Molecular Biology(all) ,Applied Mathematics ,Genomics ,cellule eucaryote ,Fixation (population genetics) ,lcsh:Biology (General) ,Evolutionary biology ,Modeling and Simulation ,mutation ,General Agricultural and Biological Sciences - Abstract
Understanding the evolutionary plasticity of the genome requires a global, comparative approach in which genetic events are considered both in a phylogenetic framework and with regard to population genetics and environmental variables. In the mechanisms that generate adaptive and non-adaptive changes in genomes, segmental duplications (duplication of individual genes or genomic regions) and polyploidization (whole genome duplications) are well-known driving forces. The probability of fixation and maintenance of duplicates depends on many variables, including population sizes and selection regimes experienced by the corresponding genes: a combination of stochastic and adaptive mechanisms has shaped all genomes. A survey of experimental work shows that the distinction made between fixation and maintenance of duplicates still needs to be conceptualized and mathematically modeled. Here we review the mechanisms that increase or decrease the probability of fixation or maintenance of duplicated genes, and examine the outcome of these events on the adaptation of the organisms. Reviewers: This article was reviewed by Dr. Etienne Joly, Dr. Lutz Walter and Dr. W. Ford Doolittle.
- Published
- 2011
- Full Text
- View/download PDF
41. Duplication and independent selection of cell-wall invertase genes GIF1 and OsCIN1 during rice evolution and domestication
- Author
-
Wen Wang, Hong Zhang, Ertao Wang, Qin Wang, Song Ge, Qun Li, Xun Xu, Lin Lin, Zuhua He, Lin Zhang, and Bao-Rong Lu
- Subjects
Genetics ,Phylogenetic tree ,beta-Fructofuranosidase ,Evolution ,Population genetics ,Oryza ,Biology ,Genes, Plant ,Evolutionary biology ,Cell Wall ,Gene Duplication ,Gene duplication ,Research article ,QH359-425 ,Trans-Activators ,Selection, Genetic ,Domestication ,Gene ,Ecology, Evolution, Behavior and Systematics ,Function (biology) ,Segmental duplication ,Synteny - Abstract
Background: Various evolutionary models have been proposed to interpret the fate of paralogous duplicates, which provide substrates on which evolutionary selection can act. In particular, domestication, as a special form of selection, has played an important role in crop cultivation, with divergence of many genes controlling important agronomic traits. Recent studies have indicated that pairs of duplicate genes were often sub-functionalized from the ancestral functions held by the parental genes. We previously demonstrated that the rice cell-wall invertase (CWI) gene GIF1, which plays an important role in the grain-filling process, was most likely subjected to domestication selection in the promoter region. Here, we report that GIF1 and another CWI gene, OsCIN1, constitute a pair of duplicate genes with differentiated expression and function through independent selection. Results: Through synteny analysis, we show that GIF1 and OsCIN1 are paralogues derived from a segmental duplication that originated during genome duplication of grasses. Results based on population genetic analyses and a gene phylogenetic tree of 25 cultivars and 25 wild rice sequences demonstrated that OsCIN1 was also artificially selected during rice domestication, with a fixed mutation in the coding region, in contrast to GIF1, which was selected in the promoter region. GIF1 and OsCIN1 have evolved different expression patterns and probably different kinetic parameters of enzymatic activity, with the latter displaying less enzymatic activity. Overexpression of GIF1 and OsCIN1 also resulted in different phenotypes, suggesting that OsCIN1 might regulate other unrecognized biological processes. Conclusion: How gene duplication and divergence contribute to genetic novelty and morphological adaptation has been an interesting issue for geneticists and biologists. Our discovery that the duplicated pair of GIF1 and OsCIN1 has experienced sub-functionalization implies that selection could act independently on each duplicate towards different functional specificity, which provides a vivid example of the evolution of genetic novelties in a model crop. Our results also further support the established hypothesis that gene duplication with sub-functionalization could be one solution for genetic adaptive conflict.
- Published
- 2010
42. Detection and correction of false segmental duplications caused by genome mis-assembly
- Author
-
Steven L. Salzberg and David R. Kelley
- Subjects
Pan troglodytes ,2R hypothesis ,Method ,Sequence alignment ,Genomics ,Computational biology ,Biology ,Genome ,Contig Mapping ,03 medical and health sciences ,0302 clinical medicine ,Dogs ,Segmental Duplications, Genomic ,Animals ,Humans ,030304 developmental biology ,Segmental duplication ,Genetics ,0303 health sciences ,Assembly software ,Base Sequence ,nutritional and metabolic diseases ,Sequence Analysis, DNA ,Diploidy ,nervous system diseases ,Cattle ,Ploidy ,Chickens ,Sequence Alignment ,030217 neurology & neurosurgery ,Software - Abstract
A method for determining false segmental duplications in vertebrate genomes, thus correcting mis-assemblies and providing more accurate estimates of duplications. Diploid genomes with divergent chromosomes present special problems for assembly software as two copies of especially polymorphic regions may be mistakenly constructed, creating the appearance of a recent segmental duplication. We developed a method for identifying such false duplications and applied it to four vertebrate genomes. For each genome, we corrected mis-assemblies, improved estimates of the amount of duplicated sequence, and recovered polymorphisms between the sequenced chromosomes.
- Published
- 2010
43. The similar and different evolutionary trends of MATE family occurred between rice and Arabidopsis thaliana.
- Author
-
Wang L, Bei X, Gao J, Li Y, Yan Y, and Hu Y
- Abstract
Background: Multidrug and toxic compound extrusion (MATE) transporter proteins are present in all organisms. Although the functions of some MATE gene family members have been studied in plants, few studies have investigated the gene expansion patterns, functional divergence, or the effects of positive selection. Results: Forty-five MATE genes from rice and 56 from Arabidopsis were identified and grouped into four subfamilies. MATE family genes have similar exon-intron structures in rice and Arabidopsis; MATE gene structures are conserved in each subfamily but differ among subfamilies. In both species, the MATE gene family has expanded mainly through tandem and segmental duplications. A transcriptome atlas showed considerable differences in expression among the genes, in terms of transcript abundance and expression patterns under normal growth conditions, indicating wide functional divergence in this family. In both rice and Arabidopsis, the MATE genes showed consistent functional divergence trends, with highly significant Type-I divergence in each subfamily, while Type-II divergence mainly occurred in subfamily III. The Type-II coefficients between rice subfamilies I/III, II/III, and IV/III were all significantly greater than zero, while only the Type-II coefficient between Arabidopsis IV/III subfamilies was significantly greater than zero. A site-specific model analysis indicated that MATE genes have relatively conserved evolutionary trends. A branch-site model suggested that the extent of positive selection on each subfamily of rice and Arabidopsis was different: subfamily II of Arabidopsis showed higher positive selection than other subfamilies, whereas in rice, positive selection was highest in subfamily III. In addition, the analyses identified 18 rice sites and 7 Arabidopsis sites that were responsible for positive selection and for Type-I and Type-II functional divergence; there were no common sites between rice and Arabidopsis. Five coevolving amino acid sites were identified in rice and three in Arabidopsis; these sites might have important roles in maintaining local structural stability and protein functional domains. Conclusions: We demonstrate that the MATE gene family expanded through tandem and segmental duplication in both rice and Arabidopsis. Overall, the results of our analyses contribute to improved understanding of the molecular evolution and functions of the MATE gene family in plants.
- Published
- 2016
- Full Text
- View/download PDF