609 results on '"Nucleotide composition"'
Search Results
2. GC Content Across Insect Genomes: Phylogenetic Patterns, Causes and Consequences.
- Author
- Kyriacou, Riccardo G., Mulhair, Peter O., and Holland, Peter W. H.
- Subjects
- *INSECT genomes, *GENE conversion, *ANIMAL species, *NUMBERS of species, *CHROMOSOMES, *LEPIDOPTERA, *GENOMES
- Abstract
The proportions of A:T and G:C nucleotide pairs are often unequal and can vary greatly between animal species and along chromosomes. The causes and consequences of this variation are incompletely understood. The recent release of high-quality genome sequences from the Darwin Tree of Life and other large-scale genome projects provides an opportunity for GC heterogeneity to be compared across a large number of insect species. Here we analyse GC content along chromosomes, and within protein-coding genes and codons, of 150 insect species from four holometabolous orders: Coleoptera, Diptera, Hymenoptera, and Lepidoptera. We find that protein-coding sequences have higher GC content than the genome average, and that Lepidoptera generally have higher GC content than the other three insect orders examined. GC content is higher in small chromosomes in most Lepidoptera species, but this pattern is less consistent in other orders. GC content also increases towards subtelomeric regions within protein-coding genes in Diptera, Coleoptera and Lepidoptera. Two species of Diptera, Bombylius major and B. discolor, have very atypical genomes with ubiquitous increase in AT content, especially at third codon positions. Despite dramatic AT-biased codon usage, we find no evidence that this has driven divergent protein evolution. We argue that the GC landscape of Lepidoptera, Diptera and Coleoptera genomes is influenced by GC-biased gene conversion, strongest in Lepidoptera, with some outlier taxa affected drastically by counteracting processes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
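The abstract above compares GC content within codons, with the strongest signal at third codon positions. As a minimal sketch of that kind of measurement (not the authors' pipeline; the function name `gc_by_codon_position` and the toy sequence are assumptions for illustration), GC fractions at each codon position of an in-frame coding sequence could be computed as follows:

```python
def gc_by_codon_position(cds):
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence; a trailing
    partial codon, if any, is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop trailing partial codon
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this codon position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


# Hypothetical toy sequence: ATG GCC AAA TTT GGG TAA
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAATTTGGGTAA")
print(f"GC1={gc1:.3f} GC2={gc2:.3f} GC3={gc3:.3f}")
```

The third value (GC3) is the one usually reported in codon usage studies, because most third-position changes are synonymous.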
3. Comparative mitochondrial genome analysis of three leafhopper species of the genus Abrus Dai & Zhang (Hemiptera: Cicadellidae: Deltocephalinae) from China with phylogenetic implication
- Author
- Muhammad Asghar Hassan, Zhixiang Tan, Rongrong Shen, and Jichun Xing
- Subjects
- Leafhopper, Nucleotide composition, Genetic diversity, Mitogenome, Phylogenetic relationship, Biotechnology, TP248.13-248.65, Genetics, QH426-470
- Abstract
Background: The phylogenetic position and classification of Athysanini are poorly defined, as it includes a large group of polyphyletic genera that have historically been assigned to it mainly because they still exhibit the most typical deltocephaline genitalic and external body characters but lack the distinctive characteristics that other tribes possess. The bamboo-feeding leafhopper genus Abrus belongs to the tribe Athysanini of subfamily Deltocephalinae; it currently comprises 19 valid described species and is limited to the Oriental and Palaearctic regions in China. Although the taxonomy of Abrus is well updated, comparative mitogenomic data are available for only a single Abrus species. In this study, we sequenced and analyzed the complete mitochondrial genomes (mitogenomes) of Abrus daozhenensis Chen, Yang & Li, 2012 (16,391 bp) and A. yunshanensis Chen, Yang & Li, 2012 (15,768 bp) (Athysanini), and compared them with the published mitogenome sequence of A. expansivus Xing & Li, 2014 (15,904 bp). Results: These Abrus species shared highly conserved mitogenomes with a gene order similar to that of the putative ancestral insect, with 37 typical genes and a non-coding A + T-rich region. The nucleotide composition of these genomes is highly biased toward A + T nucleotides (76.2%, 76.3%, and 74.7%), with positive AT-skews (0.091 to 0.095, and 0.095) and negative GC-skews (−0.138, −0.161, and −0.138); codon usage is similarly biased. All 22 tRNA genes had typical cloverleaf secondary structures, except for trnS1 (AGN), which lacks the dihydrouridine arm, and, distinctively, trnG in the mitogenome of A. expansivus, which lacks the TψC arm. Phylogenetic analyses based on 13 PCGs, 2 rRNA genes, and 22 tRNA genes consistently recovered the monophyletic Opsiini, Penthimiini, Selenocephalini, Scaphoideini, and Athysanini (except Watanabella graminea, previously sequenced as Chlorotettix nigromaculatus) based on the limited available mitogenome sequence data of 37 species.
Conclusion: At present, Abrus belongs to the tribe Athysanini based on both morphological and molecular datasets, which is strongly supported in the present phylogenetic analyses by both BI and ML methods using the six concatenated datasets: amino acid sequences and nucleotides from different combinations of protein-coding genes (PCGs), ribosomal RNAs (rRNAs), and transfer RNAs (tRNAs). Phylogenetic trees reconstructed herein based on the BI and ML analyses consistently recovered a monophyletic Athysanini, except for Watanabella graminea (Athysanini), which was placed in Opsiini with high support values.
- Published
- 2023
- Full Text
- View/download PDF
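The A + T content, AT-skew and GC-skew values quoted in the abstract above are conventionally derived from base counts on one strand, with AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C). A minimal sketch of that calculation follows; the function name `composition_and_skews` and the toy sequence are assumptions for illustration and are not taken from the paper:

```python
from collections import Counter


def composition_and_skews(seq):
    """Return (A+T percentage, AT-skew, GC-skew) for a DNA sequence.

    Only unambiguous A, C, G and T are counted; other symbols are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    at_content = 100.0 * (a + t) / total
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_content, at_skew, gc_skew


# Hypothetical toy sequence, not real mitogenome data
at_pct, at_skew, gc_skew = composition_and_skews("AAATTGCC")
print(f"A+T = {at_pct:.1f}%, AT-skew = {at_skew:.3f}, GC-skew = {gc_skew:.3f}")
```

A positive AT-skew means more A than T on the analysed strand, and a negative GC-skew means more C than G, which is the pattern reported for the three Abrus mitogenomes.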
4. Characterization of the complete mitochondrial genome of a forensically important beetle, Ptomascopus plagiatus (Coleoptera: staphylinidae: silphinae)
- Author
- Wei Han, Dianxing Feng, and Shutong Dai
- Subjects
- Ptomascopus plagiatus (Ménétriés, 1854), mitogenome, nucleotide composition, phylogenetic relationship, Genetics, QH426-470
- Abstract
Ptomascopus plagiatus (Ménétriés, 1854) is a forensically important silphid species. In this study, we report on the mitochondrial genome of P. plagiatus. The complete mitochondrial genome of P. plagiatus is 17556 bp and contains 22 transfer RNA genes, 13 protein-coding genes (PCGs), two ribosomal RNA genes, and a 2953 bp noncoding region. The nucleotide composition of P. plagiatus is biased toward A and T (A + T: 77.46%). Phylogenetic analysis based on mitogenomic data supports that P. plagiatus is closely related to (Nicrophorus nepalensis Hope, 1831 + Nicrophorus vespilloides Herbst, 1783) within the subfamily Silphinae.
- Published
- 2023
- Full Text
- View/download PDF
5. Evidence of gene nucleotide composition favoring replication and growth in a fastidious plant pathogen
- Author
- Castillo, Andreina I and Almeida, Rodrigo PP
- Subjects
- Biotechnology, Genetics, Human Genome, Life Below Water, Nucleotides, Phylogeny, Plants, Replicon, Genomics, Plant Diseases, Xylella fastidiosa, nucleotide composition, GC content, information storage and processing
- Abstract
Nucleotide composition (GC content) varies across bacteria species, genome regions, and specific genes. In Xylella fastidiosa, a vector-borne fastidious plant pathogen infecting multiple crops, GC content ranges between ∼51-52%; however, these values were gathered using limited genomic data. We evaluated GC content variations across X. fastidiosa subspecies fastidiosa (N = 194), subsp. pauca (N = 107), and subsp. multiplex (N = 39). Genomes were classified based on plant host and geographic origin; individual genes within each genome were classified based on gene function, strand, length, ortholog group, core vs accessory, and recombinant vs non-recombinant. GC content was calculated for each gene within each evaluated genome. The effects of genome and gene-level variables were evaluated with a mixed effect ANOVA, and the marginal-GC content was calculated for each gene. Also, the correlation between gene-specific GC content vs natural selection (dN/dS) and recombination/mutation (r/m) was estimated. Our analyses show that intra-genomic changes in nucleotide composition in X. fastidiosa are small and influenced by multiple variables. Higher AT-richness is observed in genes involved in replication and translation, and genes in the leading strand. In addition, we observed a negative correlation between high-AT and dN/dS in subsp. pauca. The relationship between recombination and GC content varied between core and accessory genes. We hypothesize that distinct evolutionary forces and energetic constraints both drive and limit these small variations in nucleotide composition.
- Published
- 2021
6. The divergence of mutation rates and spectra across the Tree of Life.
- Author
- Lynch, Michael, Ali, Farhan, Lin, Tongtong, Wang, Yaohai, Ni, Jiahao, and Long, Hongan
- Abstract
Owing to advances in genome sequencing, genome stability has become one of the most scrutinized cellular traits across the Tree of Life. Despite its centrality to all things biological, the mutation rate (per nucleotide site per generation) ranges over three orders of magnitude among species and several‐fold within individual phylogenetic lineages. Within all major organismal groups, mutation rates scale negatively with the effective population size of a species and with the amount of functional DNA in the genome. This relationship is most parsimoniously explained by the drift‐barrier hypothesis, which postulates that natural selection typically operates to reduce mutation rates until further improvement is thwarted by the power of random genetic drift. Despite this constraint, the molecular mechanisms underlying DNA replication fidelity and repair are free to wander, provided the performance of the entire system is maintained at the prevailing level. The evolutionary flexibility of the mutation rate bears on the resolution of several prior conundrums in phylogenetic and population‐genetic analysis and raises challenges for future applications in these areas. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
7. Subfunctionalisation of paralogous genes and evolution of differential codon usage preferences: The showcase of polypyrimidine tract binding proteins.
- Author
- Bourret, Jérôme, Borvető, Fanni, and Bravo, Ignacio G.
- Subjects
- *DIFFERENTIAL evolution, *CARRIER proteins, *GENE expression, *GENOMICS, *GENES, *GENETIC code
- Abstract
Gene paralogs are copies of an ancestral gene that appear after gene or full genome duplication. When two sister gene copies are maintained in the genome, redundancy may release certain evolutionary pressures, allowing one of them to access novel functions. Here, we focused our study on gene paralogs on the evolutionary history of the three polypyrimidine tract binding protein genes (PTBP) and their concurrent evolution of differential codon usage preferences (CUPrefs) in vertebrate species. PTBP1‐3 show high identity at the amino acid level (up to 80%) but display strongly different nucleotide composition, divergent CUPrefs and, in humans and in many other vertebrates, distinct tissue‐specific expression levels. Our phylogenetic inference results show that the duplication events leading to the three extant PTBP1‐3 lineages predate the basal diversification within vertebrates, and genomic context analysis illustrates that local synteny has been well preserved over time for the three paralogs. We identify a distinct evolutionary pattern towards GC3‐enriching substitutions in PTBP1, concurrent with enrichment in frequently used codons and with a tissue‐wide expression. In contrast, PTBP2s are enriched in AT‐ending, rare codons, and display tissue‐restricted expression. As a result of this substitution trend, CUPrefs sharply differ between mammalian PTBP1s and the rest of PTBPs. Genomic context analysis suggests that GC3‐rich nucleotide composition in PTBP1s is driven by local substitution processes, while the evidence in this direction is thinner for PTBP2‐3. An actual lack of co‐variation between the observed GC composition of PTBP2‐3 and that of the surrounding non‐coding genomic environment would raise an interrogation on the origin of CUPrefs, warranting further research on a putative tissue‐specific translational selection. Finally, we communicate an intriguing trend for the use of the UUG‐Leu codon, which matches the trends of AT‐ending codons. 
Our results are compatible with a scenario in which a combination of directional mutation–selection processes would have differentially shaped CUPrefs of PTBPs in vertebrates: the observed GC‐enrichment of PTBP1 in placental mammals may be linked to genomic location and to the strong and broad tissue‐expression, while AT‐enrichment of PTBP2 and PTBP3 would be associated with rare CUPrefs and thus, possibly to specialized spatio‐temporal expression. Our interpretation is coherent with a gene subfunctionalisation process by differential expression regulation associated with the evolution of specific CUPrefs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
8. Comparative mitogenomics of native European and alien Ponto-Caspian amphipods.
- Author
- Macher, Jan-Niklas, Šidagytė-Copilas, Eglė, and Copilaș-Ciocianu, Denis
- Subjects
- *AMPHIPODA, *GENE rearrangement, *TRANSFER RNA, *SPECIES pools, *INTRODUCED species, *GENOMES, *CRUSTACEA, *MOLECULAR phylogeny
- Abstract
European inland surface waters are home to a rich diversity of native amphipod crustaceans, many of which face threats from invasive Ponto-Caspian counterparts. In this study, we analyse mitochondrial genomes to deduce phylogenetic relationships and compare gene order and nucleotide composition between representative native European and invasive Ponto-Caspian taxa across five families, ten genera and 20 species (with 13 newly sequenced herein). We observe various gene rearrangement patterns in the phylogenetically diverse native species pool. Pallaseopsis quadrispinosa and Synurella ambulans exhibit notable deviations from the typical organisation, featuring extensive translocations of tRNAs and the nad1 gene, as well as a tRNA-F polarity switch in the latter. The monophyletic invasive Ponto-Caspian gammarids display a conserved gene order, primarily differing from native species by a tRNA-E and tRNA-R translocation, which reinforces previous findings. However, Chaetogammarus warpachowskyi shows extensive rearrangement with translocations of six tRNAs. The invasive corophiid, Chelicorophium curvispinum, maintains a highly conserved gene order despite its distant phylogenetic position. We also discover that native species have a significantly higher GC and lower AT content compared to invasive species. The mitogenomic differences observed between native and invasive amphipods warrant further investigation and could provide insights into the mechanisms underlying invasion success. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
9. Effect of Different Types of Sequence Data on Palaeognath Phylogeny.
- Author
- Takezaki, Naoko
- Subjects
- *PHYLOGENY, *KIWIS (Birds), *PARSIMONIOUS models, *LOCUS (Genetics), *CHICKENS, *OSTRICHES
- Abstract
Palaeognathae consists of five groups of extant species: flighted tinamous (1) and four flightless groups: kiwi (2), cassowaries and emu (3), rheas (4), and ostriches (5). Molecular studies supported the groupings of extinct moas with tinamous and elephant birds with kiwi as well as ostriches as the group that diverged first among the five groups. However, phylogenetic relationships among the five groups are still controversial. Previous studies showed extensive heterogeneity in estimated gene tree topologies from conserved nonexonic elements, introns, and ultraconserved elements. Using the noncoding loci together with protein-coding loci, this study investigated the factors that affected gene tree estimation error and the relationships among the five groups. Using closely related ostrich rather than distantly related chicken as the outgroup, concatenated and gene tree–based approaches supported rheas as the group that diverged first among groups (1)–(4). Whereas gene tree estimation error increased using loci with low sequence divergence and short length, topological bias in estimated trees occurred using loci with high sequence divergence and/or nucleotide composition bias and heterogeneity, which occurred more often in trees estimated from coding loci than from noncoding loci. Regarding the relationships of (1)–(4), the site patterns by parsimony criterion appeared less susceptible to the bias than tree construction assuming a stationary time-homogeneous model and suggested the clustering of kiwi with cassowaries and emu as the most likely, with ∼40% support, rather than the clustering of kiwi and rheas or that of kiwi and tinamous, with 30% support each. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Chromatic Differentiation of Functional Mappings of the Composition of Nucleic Acids.
- Author
- Stepanyan, Ivan V. and Lednev, Mihail Y.
- Subjects
- *NUCLEIC acids, *GENETIC algorithms, *DNA, *VISUALIZATION
- Abstract
Color visualization of the DNA of diverse living beings can help in the exploration of the issue of chromatic differentiation of functional mappings of the nucleotide composition of DNA molecules. By "chromatic differentiation", we mean the coloring of these mappings. Algorithms for coloring genetic representations improve the perception of complex genetic information using color. Methodologically, to build the chromatic differentiation of functional mappings of the nucleotide composition of DNA, we employed the system of nucleotide Walsh functions and the Chaos Game Representation (CGR) algorithm. The authors compared these two approaches and proposed a modified CGR algorithm. The work presents various algorithms of chromatic differentiation based on the nucleotide Walsh functions at a specific location of the fragment in the nucleotide chain and on the frequencies of those fragments. The results of the analysis provide examples of chromatic differentiation in a variety of parametric spaces. The paper describes various approaches to coloring and video animation of DNA molecules in their chromatically differentiated spans of physicochemical parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
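The abstract above builds on the Chaos Game Representation (CGR) algorithm. A minimal sketch of the classic, unmodified CGR is given here for orientation; it is not the authors' modified algorithm, and the corner assignment shown is one common convention that should be treated as an assumption. Each nucleotide moves the current point halfway toward that nucleotide's corner of the unit square:

```python
# One common corner assignment for the unit square (an assumption here)
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the list of classic CGR coordinates for a DNA sequence.

    Starts at the centre of the square; symbols other than A, C, G, T
    are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway toward the corner
        points.append((x, y))
    return points


# Hypothetical toy sequence
for point in cgr_points("AGCT"):
    print(point)
```

Plotting these points for a long sequence gives the familiar fractal-like image in which the density of each sub-square reflects the frequency of the corresponding short fragment (k-mer).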
11. Analysis of codon usage patterns in open reading frame 4 of hepatitis E viruses
- Author
- Zoya Shafat, Anwar Ahmed, Mohammad K. Parvez, and Shama Parveen
- Subjects
- Hepatitis E virus (HEV), Open reading frame 4 (ORF4), Nucleotide composition, Synonymous codon usage, Mutational pressure, Natural selection, Medicine (General), R5-920, Science
- Abstract
Background: Hepatitis E virus (HEV) is a member of the family Hepeviridae and causes acute HEV infections resulting in thousands of deaths worldwide. The zoonotic nature of HEV, in addition to its tendency for human-to-human transmission, has led scientists across the globe to work on its different aspects. HEV also accounts for a mortality rate of about 30% in pregnant women. The genome of HEV is organized into three open reading frames (ORFs): ORF1, ORF2, and ORF3. An additional reading frame-encoded protein, ORF4, has recently been discovered, which is exclusive to GT 1 isolates of HEV. ORF4 is suggested to play a crucial role in pregnancy-associated pathology and enhanced replication. Though studies have documented the importance of ORF4, the genetic features of ORF4 protein genes in terms of compositional patterns have not been elucidated. As codon usage plays a critical role in the establishment of the host–pathogen relationship, the present study reports the codon usage analysis (based on nucleotide sequences of HEV ORF4 available in the public database) in three hosts, along with the factors influencing the codon usage patterns of the protein genes of ORF4 of HEV. Results: The nucleotide composition analysis indicated that ORF4 protein genes showed overrepresentation of the C nucleotide, while the A nucleotide was the least represented, with random distribution of G and T(U) nucleotides. The relative synonymous codon usage (RSCU) analysis revealed bias toward C/G-ended codons (over U/A) in all three natural HEV hosts (human, rat and ferret). It was observed that all the ORF4 genes were richly endowed with GC content. Further, our results showed the occurrence of both coincidence and antagonistic codon usage patterns among HEV hosts. The findings further emphasized that both mutational and selection forces influenced the codon usage patterns of ORF4 protein genes.
Conclusions: To the best of our knowledge, this is the first bioinformatics study evaluating codon usage patterns in HEV ORF4 protein genes. The findings from this study are expected to increase our understanding of the significant factors involved in evolutionary changes of ORF4.
- Published
- 2022
- Full Text
- View/download PDF
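Relative synonymous codon usage (RSCU), used in the abstract above, is commonly defined as the observed count of a codon divided by the mean count of all codons that encode the same amino acid, so a value above 1 indicates a preferred codon. The sketch below illustrates that definition under stated assumptions: it uses the standard genetic code (NCBI translation table 1), and the function name `rscu` and the toy sequence are hypothetical rather than taken from the study:

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence.

    Stop codons, single-codon amino acids (Met, Trp) and amino acids
    absent from the sequence are left out of the result.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(codon for codon in codons if codon in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[codon] for codon in family)
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        mean = total / len(family)  # expected count under equal usage
        for codon in family:
            values[codon] = counts[codon] / mean
    return values


# Hypothetical toy sequence: four alanine codons (GCT, GCT, GCC, GCA)
for codon, value in sorted(rscu("GCTGCTGCCGCA").items()):
    print(codon, round(value, 2))
```

Published analyses usually pool counts over many genes before computing RSCU, since single short sequences give noisy values.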
12. Decoding the codon usage patterns in Y-domain region of hepatitis E viruses
- Author
- Zoya Shafat, Anwar Ahmed, Mohammad K. Parvez, and Shama Parveen
- Subjects
- YDR, Nucleotide composition, Codon usage bias, Mutation pressure, Natural selection, Biotechnology, TP248.13-248.65, Genetics, QH426-470
- Abstract
Background: Hepatitis E virus (HEV) is a positive-sense RNA virus belonging to the family Hepeviridae. The genome of HEV is organized into three open-reading frames (ORFs): ORF1, ORF2, and ORF3. The ORF1 non-structural Y-domain region (YDR) has been demonstrated to play an important role in the HEV pathogenesis. The nucleotide composition, synonymous codon usage bias in conjunction with other factors influencing the viral YDR genes of HEV have not been studied. Codon usage represents a significant mechanism in establishing the host-pathogen relationship. The present study for the first time elucidates the detailed codon usage patterns of YDR among HEV and HEV-hosts (Human, Rabbit, Mongoose, Pig, Wild boar, Camel, Monkey). Results: The overall nucleotide composition revealed the abundance of C and U nucleotides in YDR genomes. The relative synonymous codon usage (RSCU) analysis indicated biasness towards C and U over A and G ended codons in HEV across all hosts. Codon frequency comparative analyses among HEV-hosts showed both similarities and discrepancies in usage of preferred codons encoding amino acids, which revealed that HEV codon preference neither completely differed nor completely showed similarity with its hosts. Thus, our results clearly indicated that the synonymous codon usage of HEV is a mixture of the two types of codon usage: coincidence and antagonism. Mutation pressure from virus and natural selection from host seems to be accountable for shaping the codon usage patterns in YDR. The study emphasised that the influence of compositional constraints, codon usage biasness, mutational alongside the selective forces were reflected in the occurrence of YDR codon usage patterns. Conclusions: Our study is the first in its kind to have reported the analysis of codon usage patterns on a total of seven different natural HEV hosts.
Therefore, knowledge of preferred codons obtained from our study will not only augment our understanding towards molecular evolution but is also envisaged to provide insight into the efficient viral expression, viral adaptation, and host effects on the HEV YDR codon usage.
- Published
- 2022
- Full Text
- View/download PDF
13. Mitochondrial cytochrome oxidase 1 reveals genetic diversity of the African Snakehead fish Parachanna obscura, Gunther, 1861 from Nigeria's freshwater environment.
- Author
- Osho, Friday Elijah, Omitoyin, Bamidele Oluwarotimi, Ajani, Emmanuel Kolawole, Azuh, Victor O., and Adediji, Adedapo Olutola
- Subjects
- *NUCLEOTIDES, *CYTOCHROME oxidase, *POLYMERASE chain reaction, *HAPLOTYPES
- Abstract
The study investigated the genetic variation of Parachanna obscura from five rivers (Anambra, Ibbi, Imo, Katsina-Ala and Ogun) in Nigeria using the mitochondrial cytochrome oxidase 1 gene. DNA was extracted from 19, 22, 16, 18 and 21 fin clips per river population, respectively, and subjected to polymerase chain reaction. A total of 96 sequences, each with 671 bp, were obtained with 38 (5.6%) polymorphic, 27 (3.8%) parsimoniously informative and 659 (98.2%) conserved sites. Mean nucleotide composition was C = 28.07%, T = 29.43%, A = 22.18%, G = 20.32%. A total of 40 haplotypes with 38 unique sequences as well as 24 substitutions with 22 transversions and two transitions were obtained. Nucleotide diversity among populations ranged from 0.00184 to 0.00888, representing Ibbi and Imo, respectively, while haplotype diversity ranged from 0.77056 to 0.95000, also from Ibbi and Imo, respectively. Analyses of molecular variance showed that the intra-population variation accounted for 50.05%. Topology from phylogenetic analyses revealed that P. obscura from Imo River was distinctly different from the rest. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
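Haplotype diversity, reported in the abstract above, is usually estimated with Nei's formula h = n/(n − 1) × (1 − Σ pᵢ²), where n is the number of sampled sequences and pᵢ is the frequency of the i-th haplotype. Whether the study used exactly this estimator is an assumption; the function name `haplotype_diversity` and the toy alignment below are likewise illustrative only:

```python
from collections import Counter


def haplotype_diversity(sequences):
    """Return Nei's haplotype (gene) diversity for aligned sequences.

    Each distinct sequence string is treated as one haplotype.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    frequencies = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1 - sum(p * p for p in frequencies))


# Hypothetical toy sample: three haplotypes among four sequences
sample = ["AAT", "AAT", "ACT", "GCT"]
print(round(haplotype_diversity(sample), 4))
```

The value ranges from 0 (all sequences identical) to 1 (every sequence a distinct haplotype).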
14. Parametric Multispectral Mappings and Comparative Genomics.
- Author
- Stepanyan, Ivan V. and Lednev, Michail Y.
- Subjects
- *COMPARATIVE genomics, *COMPARATIVE genetics, *DISCRETE geometry, *GENETIC code, *MULTISPECTRAL imaging, *GENETIC genealogy, *NEURAL codes
- Abstract
This article describes new algorithms that allow for viewing genetic sequences in the form of their multispectral images. We presented examples of the construction of such mappings with a demonstration of the practical problems of comparative genomics. New DNA visualization tools seem promising, thanks to their informativeness and representativeness. The research illustrates how a novel sort of multispectral mapping, based on decomposition in several parametric spaces, can be created for comparative genetics. This appears to be a crucial step in the investigation of the genetic coding phenomenon and in practical activities, such as forensics, genetic testing, genealogical analysis, etc. The article gives examples of multispectral parametric sets for various types of coordinate systems. We build mappings using binary sub-alphabets of purine/pyrimidine and keto/amino. We presented 2D and 3D renderings in different characteristic spaces: structural, integral, cyclic, spherical, and third-order spherical. This research is based on the method previously developed by the author for visualizing genetic information based on new molecular genetic algorithms. One of the types of mappings, namely two-dimensional, is an object of discrete geometry, a symmetrical square matrix of high dimension. The fundamental properties of symmetry, which are traced on these mappings, allow us to speak about the close connection between the phenomenon of genetic coding and symmetry when using the developed mathematical apparatus for representing large volumes of complexly organized molecular genetic information. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
15. Analysis of Compositional Biasness in the Y-domain of Hepatitis E Viruses
- Author
- Shafat, Zoya, Rizvi, Syed Abuzar Raza, Ahmed, Anwar, Parvez, Mohammad K., and Parveen, Shama
- Published
- 2021
- Full Text
- View/download PDF
16. Analysis of codon usage bias of classical swine fever virus
- Author
- Sharanagouda S. Patil, Uma Bharathi Indrabalan, Kuralayanapalya Puttahonnappa Suresh, and Bibek Ranjan Shome
- Subjects
- classical swine fever virus, codon usage bias, india, nucleotide composition, synonymous codons, Animal culture, SF1-1100, Veterinary medicine, SF600-1100
- Abstract
Background and Aim: Classical swine fever (CSF), caused by CSF virus (CSFV), is a highly contagious disease in pigs causing 100% mortality in susceptible adult pigs and piglets. The high mortality rate in pigs causes huge economic loss to pig farmers. CSFV has a positive-sense RNA genome of 12.3 kb in length flanked by untranslated regions at the 5' and 3' ends. The genome codes for a large polyprotein of 3900 amino acids coding for 11 viral proteins. The 1300 codons in the polyprotein are coded by different combinations of three nucleotides which help the infectious agent to evolve itself and adapt to the host environment. This study employed various methods/techniques to estimate the changes occurring in the process of CSFV evolution by analyzing the codon usage pattern. Materials and Methods: The evolution of viruses is widely studied by analyzing their nucleotides and coding regions/codons using various methods. A total of 115 complete coding regions of CSFVs, including one complete genome from our laboratory (MH734359), were included in this study, and analysis was carried out using various methods for estimating codon usage bias and evolution. This study elaborates on the factors that influence the codon usage pattern. Results: The effective number of codons (ENC) and relative synonymous codon usage showed the presence of codon usage bias. The mononucleotide (A) has a higher frequency compared to the other mononucleotides (G, C, and T). The dinucleotides CG and CC are underrepresented and overrepresented, respectively. The codon CGT was underrepresented and AGG was overrepresented. The codon adaptation index value of 0.71 was obtained, indicating that there is a similarity in the codon usage bias. The principal component analysis, ENC-plot, Neutrality plot, and Parity Rule 2 plot produced in this article indicate that the CSFV is influenced by the codon usage bias.
The mutational pressure and natural selection are the important factors that influence the codon usage bias. Conclusion: The study provides useful information on the codon usage analysis of CSFV and may be utilized to understand the host adaptation to virus environment and its evolution. Further, such findings help in new gene discovery, design of primers/probes, design of transgenes, determination of the origin of species, prediction of gene expression level, and gene function of CSFV. To the best of our knowledge, this is the first study on codon usage bias involving such a large number of complete CSFVs including one sequence of CSFV from India.
- Published
- 2021
- Full Text
- View/download PDF
17. Multidimensional Approach for Analysis of Chromosomes Nucleotide Composition
- Author
- Stepanyan, Ivan V., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Hu, Zhengbing, editor, Petoukhov, Sergey, editor, Dychka, Ivan, editor, and He, Matthew, editor
- Published
- 2020
- Full Text
- View/download PDF
18. Silent codon positions in the A-rich HIV RNA genome that do not easily become A: Restrictions imposed by the RNA sequence and structure.
- Author
- Berkhout, Ben and Hemert, Formijn J van
- Subjects
- RNA modification & restriction, NUCLEOTIDE sequence, RNA, HIV, GENOMES
- Abstract
There is a strong evolutionary tendency of the human immunodeficiency virus (HIV) to accumulate A nucleotides in its RNA genome, resulting in a mere 40 per cent A count. This A bias is especially dominant for the so-called silent codon positions where any nucleotide can be present without changing the encoded protein. However, particular silent codon positions in HIV RNA refrain from becoming A, which became apparent upon genome analysis of many virus isolates. We analyzed these 'noA' genome positions to reveal the underlying reason for their inability to facilitate the A nucleotide. We propose that local RNA structure requirements can explain the absence of A at these sites. Thus, noA sites may be prominently involved in the correct folding of the viral RNA. Turning things around, the presence of multiple clustered noA sites may reveal the presence of important sequence and/or structural elements in the HIV RNA genome. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
19. Decoding the codon usage patterns in Y-domain region of hepatitis E viruses.
- Author
- Shafat, Zoya, Ahmed, Anwar, Parvez, Mohammad K., and Parveen, Shama
- Subjects
- WILD boar, HEPATITIS viruses, HEPATITIS E virus, NATURAL selection, HOST-parasite relationships, VIRAL genes, AFRICAN swine fever
- Abstract
Background: Hepatitis E virus (HEV) is a positive-sense RNA virus belonging to the family Hepeviridae. The genome of HEV is organized into three open-reading frames (ORFs): ORF1, ORF2, and ORF3. The ORF1 non-structural Y-domain region (YDR) has been demonstrated to play an important role in HEV pathogenesis. The nucleotide composition and synonymous codon usage bias, in conjunction with other factors influencing the viral YDR genes of HEV, have not been studied. Codon usage represents a significant mechanism in establishing the host-pathogen relationship. The present study for the first time elucidates the detailed codon usage patterns of YDR among HEV and HEV-hosts (Human, Rabbit, Mongoose, Pig, Wild boar, Camel, Monkey). Results: The overall nucleotide composition revealed the abundance of C and U nucleotides in YDR genomes. The relative synonymous codon usage (RSCU) analysis indicated bias towards C- and U-ended over A- and G-ended codons in HEV across all hosts. Codon frequency comparative analyses among HEV-hosts showed both similarities and discrepancies in usage of preferred codons encoding amino acids, which revealed that HEV codon preference neither completely differed nor completely showed similarity with its hosts. Thus, our results clearly indicated that the synonymous codon usage of HEV is a mixture of the two types of codon usage: coincidence and antagonism. Mutation pressure from the virus and natural selection from the host seem to be accountable for shaping the codon usage patterns in YDR. The study emphasised that the influence of compositional constraints, codon usage bias, and mutational alongside selective forces was reflected in the occurrence of YDR codon usage patterns. Conclusions: Our study is the first of its kind to report the analysis of codon usage patterns on a total of seven different natural HEV hosts. Therefore, knowledge of preferred codons obtained from our study will not only augment our understanding of molecular evolution but is also envisaged to provide insight into the efficient viral expression, viral adaptation, and host effects on the HEV YDR codon usage. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
20. Synonymous Codon Pattern of Cowpea Mild Mottle Virus Sheds Light on Its Host Adaptation and Genome Evolution.
- Author
-
Yang, Siqi, Liu, Ye, Wu, Xiaoyun, Cheng, Xiaofei, and Wu, Xiaoxia
- Subjects
COWPEA ,NATURAL selection ,VIRUSES ,GENETIC code ,LEGUMES - Abstract
Cowpea mild mottle virus (CpMMV) is an economically significant virus that causes severe disease on several legume crops. Aside from recombination, other factors driving its rapid evolution are elusive. In this study, the synonymous codon pattern of CpMMV and factors shaping it were analyzed. Phylogeny and nucleotide composition analyses showed that isolates of different geography or hosts had very similar nucleotide compositions. Relative synonymous codon usage (RSCU) and neutrality analyses suggest that CpMMV prefers A/U-ending codons and natural selection is the dominative factor that affects its codon bias. Dinucleotide composition and codon adaptation analyses indicate that the codon pattern of CpMMV is mainly shaped by the requirement of escaping of host dinucleotide-associated antiviral responses and translational efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
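Note: the RSCU statistic named in the abstract above is commonly defined as a codon's observed count divided by the mean count of the codons in its synonymous family. The sketch below illustrates that definition only; it is not the authors' pipeline. The function name `rscu` and the deliberately partial codon table are assumptions made for illustration.

```python
from collections import Counter

# Partial synonymous-codon table (standard genetic code, RNA alphabet).
# Only three amino acids are listed to keep the sketch short (assumption).
SYNONYMOUS = {
    "Phe": ["UUU", "UUC"],
    "Lys": ["AAA", "AAG"],
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU per codon: observed count / mean count in its family."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(codons[c] for c in family)
        if total == 0:
            continue  # amino acid not used in this sequence
        for codon in family:
            values[codon] = codons[codon] * len(family) / total
    return values
```

A value above 1 means the codon is used more often than expected under equal usage within its family; a value below 1 means it is used less often.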
21. Prediction of m5C Modifications in RNA Sequences by Combining Multiple Sequence Features
- Author
-
Lijun Dou, Xiaoling Li, Hui Ding, Lei Xu, and Huaikun Xiang
- Subjects
5-methylcytosine ,position-specific propensity ,nucleotide composition ,electron-ion interaction pseudopotentials of trinucleotide ,PC-PseDNC-general ,support vector machine ,Therapeutics. Pharmacology ,RM1-950 - Abstract
5-Methylcytosine (m5C) is a well-known post-transcriptional modification that plays significant roles in biological processes, such as RNA metabolism, tRNA recognition, and stress responses. Traditional high-throughput techniques for identification of m5C sites are usually time consuming and expensive. In addition, the number of RNA sequences shows explosive growth in the post-genomic era. Thus, machine-learning-based methods are urgently needed to quickly predict RNA m5C modifications with high accuracy. Here, we propose a novel support-vector-machine (SVM)-based tool, called iRNA-m5C_SVM, by combining multiple sequence features to identify m5C sites in Arabidopsis thaliana. Eight kinds of popular feature-extraction methods were first investigated systematically. Then, four well-performing features were incorporated to construct a comprehensive model, including position-specific propensity (PSP) (PSNP, PSDP, and PSTP, associated with frequencies of nucleotides, dinucleotides, and trinucleotides, respectively), nucleotide composition (nucleic acid, di-nucleotide, and tri-nucleotide compositions; NAC, DNC, and TNC, respectively), electron-ion interaction pseudopotentials of trinucleotide (PseEIIPs), and general parallel correlation pseudo-dinucleotide composition (PC-PseDNC-general). Evaluated accuracies over 10-fold cross-validation and independent tests achieved 73.06% and 80.15%, respectively, which showed the best predictive performances in A. thaliana among existing models. It is believed that the proposed model in this work can be a promising alternative for further research on m5C modification sites in plants.
- Published
- 2020
- Full Text
- View/download PDF
22. Horizontal Gene Transfer in Marine Environment: A Technical Perspective on Metagenomics
- Author
-
Nakamura, Yoji, Gojobori, Takashi, editor, Wada, Tokio, editor, Kobayashi, Takanori, editor, and Mineta, Katsuhiko, editor
- Published
- 2019
- Full Text
- View/download PDF
23. B‐DNA Structure and Stability: The Role of Nucleotide Composition and Order
- Author
-
Celine Nieuwland, Dr. Trevor A. Hamlin, Prof. Dr. Célia Fonseca Guerra, Prof. Dr. Giampaolo Barone, and Prof. Dr. F. Matthias Bickelhaupt
- Subjects
activation strain model ,density functional calculations ,diagonal interactions ,DNA structures ,nucleotide composition ,Chemistry ,QD1-999 - Abstract
We have quantum chemically analyzed the influence of nucleotide composition and sequence (that is, order) on the stability of double‐stranded B‐DNA triplets in aqueous solution. To this end, we have investigated the structure and bonding of all 32 possible DNA duplexes with Watson–Crick base pairing, using dispersion‐corrected DFT at the BLYP‐D3(BJ)/TZ2P level and COSMO for simulating aqueous solvation. We find enhanced stabilities for duplexes possessing a higher GC base pair content. Our activation strain analyses unexpectedly identify the loss of stacking interactions within individual strands as a destabilizing factor in the duplex formation, in addition to the better‐known effects of partial desolvation. Furthermore, we show that the sequence‐dependent differences in the interaction energy for duplexes of the same overall base pair composition result from the so‐called “diagonal interactions” or “cross terms”. Whether cross terms are stabilizing or destabilizing depends on the nature of the electrostatic interaction between polar functional groups in the pertinent nucleobases.
- Published
- 2022
- Full Text
- View/download PDF
24. B‐DNA Structure and Stability: The Role of Nucleotide Composition and Order.
- Author
-
Nieuwland, Celine, Hamlin, Trevor A., Fonseca Guerra, Célia, Barone, Giampaolo, and Bickelhaupt, F. Matthias
- Subjects
BASE pairs ,STACKING interactions ,ELECTROSTATIC interaction ,DNA structure ,AQUEOUS solutions ,FUNCTIONAL groups - Abstract
We have quantum chemically analyzed the influence of nucleotide composition and sequence (that is, order) on the stability of double‐stranded B‐DNA triplets in aqueous solution. To this end, we have investigated the structure and bonding of all 32 possible DNA duplexes with Watson–Crick base pairing, using dispersion‐corrected DFT at the BLYP‐D3(BJ)/TZ2P level and COSMO for simulating aqueous solvation. We find enhanced stabilities for duplexes possessing a higher GC base pair content. Our activation strain analyses unexpectedly identify the loss of stacking interactions within individual strands as a destabilizing factor in the duplex formation, in addition to the better‐known effects of partial desolvation. Furthermore, we show that the sequence‐dependent differences in the interaction energy for duplexes of the same overall base pair composition result from the so‐called "diagonal interactions" or "cross terms". Whether cross terms are stabilizing or destabilizing depends on the nature of the electrostatic interaction between polar functional groups in the pertinent nucleobases. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. Compositional features and pattern of codon usage for mitochondrial CO genes among reptiles.
- Author
-
Chakraborty, Supriyo, Basumatary, Priyanka, Nath, Durbba, Paul, Sunanda, and Uddin, Arif
- Subjects
- *
REPTILES , *NATURAL selection , *GENES , *MITOCHONDRIA - Abstract
• Nucleotide composition analysis of mitochondrial CO genes across three reptilian orders showed strong AT bias.
• GC1 content was the highest in three CO genes across reptilian orders.
• SCUO values indicated lower CUB for three CO genes in reptiles.
• Relative synonymous codon usage (RSCU) values revealed mostly A-ending codons were preferred.
• Neutrality plot showed the dominance of natural selection.
The phenomenon of non-random occurrence of synonymous nucleotide triplets (codons) in the coding sequences of genes is the codon usage bias (CUB). In this study, we used a bioinformatic toolkit to analyze the compositional pattern and CUB of mitogenes, namely COI, COII and COIII, across different orders of reptiles. Estimation of overall base composition in the protein-coding sequences of COI, COII and COIII genes of the reptilian orders revealed an uneven usage of nucleotides. The overall count of A nucleotide was found to be the highest while the overall count of G nucleotide was the least. The CO genes across the three reptilian orders were prominently AT biased. Comparison of the GC proportion at each codon position displayed that GC1 percentage ranked the highest in all the three CO genes of the reptilian orders. SCUO values indicated weaker CUB, while considerable variation of SCUO values existed in the three CO genes across the studied reptiles. Relative synonymous codon usage (RSCU) values indicated that mostly the A-ending codons were preferred. Based on the parameters, namely neutrality plot, mutational responsive index and translational selection, we could conclude that natural selection was the major evolutionary force in COI, COII and COIII genes in the studied reptilian orders. However, correspondence analysis, parity plot and correlation studies indicated the existence of mutation pressure as well on the CO genes. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
26. Nucleotide composition of transposable elements likely contributes to AT/GC compositional homogeneity of teleost fish genomes
- Author
-
Radka Symonová and Alexander Suh
- Subjects
Teleost fish ,Transposon ,GC content ,Genome evolution ,Nucleotide composition ,Genetics ,QH426-470 - Abstract
Background: Teleost fish genome size has been repeatedly demonstrated to positively correlate with the proportion of transposable elements (TEs). This finding might have far-reaching implications for our understanding of the evolution of nucleotide composition across vertebrates. Genomes of fish and amphibians are GC homogenous, with non-teleost gars being the single exception identified to date, whereas birds and mammals are AT/GC heterogeneous. The exact reason for this phenomenon remains controversial. Since TEs make up significant proportions of genomes and can quickly accumulate across genomes, they can potentially influence the host genome with their own GC content (GC%). However, the GC% of fish TEs has so far been neglected. Results: The genomic proportion of TEs indeed correlates with genome size, although not as linearly as previously shown with fewer genomes, and GC% negatively correlates with genome size in the 33 fish genome assemblies analysed here (excluding salmonids). GC% of fish TE consensus sequences positively correlates with the corresponding genomic GC% in 29 species tested. Likewise, the GC contents of the entire repetitive vs. non-repetitive genomic fractions correlate positively in 54 fish species in Ensembl. However, among these fish species, there is also a wide variation in GC% between the main groups of TEs. Class II DNA transposons, predominant TEs in fish genomes, are significantly GC-poorer than Class I retrotransposons. The AT/GC heterogeneous gar genome contains fewer Class II TEs, a situation similar to fugu with its extremely compact and also GC-enriched but AT/GC homogenous genome. Conclusion: Our results reveal a previously overlooked correlation between GC% of fish genomes and their TEs. This applies to both TE consensus sequences as well as the entire repetitive genomic fraction. On the other hand, there is a wide variation in GC% across fish TE groups. These results raise the question whether GC% of TEs evolves independently of GC% of the host genome or whether it is driven by TE localization in the host genome. Answering these questions will help to understand how genomic GC% is shaped over time. Long-term accumulation of GC-poor(er) Class II DNA transposons might indeed have influenced AT/GC homogenization of fish genomes and requires further investigation.
- Published
- 2019
- Full Text
- View/download PDF
27. Evidence of gene nucleotide composition favoring replication and growth in a fastidious plant pathogen.
- Author
-
Castillo, Andreina I. and Almeida, Rodrigo P. P.
- Subjects
- *
PHYTOPATHOGENIC microorganisms , *PLANT growth , *XYLELLA fastidiosa , *NATURAL selection , *HOST plants , *SUBSPECIES - Abstract
Nucleotide composition (GC content) varies across bacteria species, genome regions, and specific genes. In Xylella fastidiosa, a vectorborne fastidious plant pathogen infecting multiple crops, GC content ranges between ~51-52%; however, these values were gathered using limited genomic data. We evaluated GC content variations across X. fastidiosa subspecies fastidiosa (N=194), subsp. pauca (N=107), and subsp. multiplex (N=39). Genomes were classified based on plant host and geographic origin; individual genes within each genome were classified based on gene function, strand, length, ortholog group, core vs accessory, and recombinant vs non-recombinant. GC content was calculated for each gene within each evaluated genome. The effects of genome and gene-level variables were evaluated with a mixed effect ANOVA, and the marginal-GC content was calculated for each gene. Also, the correlation between gene-specific GC content vs natural selection (dN/dS) and recombination/mutation (r/m) was estimated. Our analyses show that intra-genomic changes in nucleotide composition in X. fastidiosa are small and influenced by multiple variables. Higher AT-richness is observed in genes involved in replication and translation, and genes in the leading strand. In addition, we observed a negative correlation between high-AT and dN/dS in subsp. pauca. The relationship between recombination and GC content varied between core and accessory genes. We hypothesize that distinct evolutionary forces and energetic constraints both drive and limit these small variations in nucleotide composition. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
28. Codon Usage Bias Analysis of Bluetongue Virus Causing Livestock Infection
- Author
-
Xiaoting Yao, Qinlei Fan, Bo Yao, Ping Lu, Siddiq Ur Rahman, Dekun Chen, and Shiheng Tao
- Subjects
bluetongue virus ,Reoviridae ,Culicoides ,nucleotide composition ,codon usage bias ,evolution ,Microbiology ,QR1-502 - Abstract
Bluetongue virus (BTV) is a double-stranded RNA virus with multiple segments and belongs to the genus Orbivirus within the family Reoviridae. BTV is spread to livestock through its dominant vector, biting midges of genus Culicoides. Although great progress has been made in genomic analyses, it is not fully understood how BTVs adapt to their hosts and evade the host’s immune systems. In this study, we retrieved BTV genome sequences from the National Center for Biotechnology Information (NCBI) database and performed a comprehensive research to explore the codon usage patterns in 50 BTV strains. We used bioinformatic approaches to calculate the relative synonymous codon usage (RSCU), codon adaptation index (CAI), effective number of codons (ENC), and other indices. The results indicated that most of the overpreferred codons had A-endings, which revealed that mutational pressure was the major force shaping codon usage patterns in BTV. However, the influence of natural selection and geographical factors cannot be ignored on viral codon usage bias. Based on the RSCU values, we performed a comparative analysis between BTVs and their hosts, suggesting that BTVs were inclined to evolve their codon usage patterns that were comparable to those of their hosts. Such findings will be conducive to understanding the elements that contribute to viral evolution and adaptation to hosts.
- Published
- 2020
- Full Text
- View/download PDF
29. Comparative analysis of codon usage patterns in Rift Valley fever virus
- Author
-
Hayeon Kim, Myeongji Cho, and Hyeon S. Son
- Subjects
Rift Valley fever virus ,phylogenetic analysis ,nucleotide composition ,codon usage ,Genetics ,QH426-470 - Abstract
Rift Valley fever virus (RVFV) is a vector-borne pathogen and is the most widely known virus in the genus Phlebovirus. Since it was first reported, RVFV has spread to western Africa, Egypt and Madagascar from its traditional endemic region, and infections continue to occur in new areas. In this study, we analyzed genomic patterns according to the infection properties of RVFV. Among the four segments of RVFV, the nucleotide composition, overall GC content and the difference of GC composition in the third position of the codons (%GC3) between groups were the largest in the S (NP) segment, showing that more diverse codons were used than in other segments. Furthermore, the results of CAI analysis of the S (NP) segment showed that viruses isolated from regions where no previous infections had been reported had the highest values, indicating greater adaptability to human hosts compared with other viruses. This result suggests that mutations in the S (NP) segment co-evolve with the infected hosts and may lead to expansion of the geographic range. The distinctive codon usage patterns observed in specific genomic regions of a group with similar infection properties may be related to the increasing likelihood of RVFV infections in new areas.
- Published
- 2020
- Full Text
- View/download PDF
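Note: the %GC3 measure mentioned in the abstract above is the share of G and C bases at the third position of codons. A minimal sketch of that calculation follows, assuming an in-frame coding sequence; the function name `gc_by_codon_position` is hypothetical and this is not the authors' code.

```python
def gc_by_codon_position(cds: str) -> tuple[float, ...]:
    """Return (%GC1, %GC2, %GC3) for an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    result = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every base at this codon position
        gc = sum(base in "GC" for base in bases)
        result.append(100.0 * gc / len(bases) if bases else 0.0)
    return tuple(result)
```

Comparing the third value between groups of sequences gives the kind of %GC3 difference the abstract reports for the S (NP) segment.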
30. Aerobic prokaryotes do not have higher GC contents than anaerobic prokaryotes, but obligate aerobic prokaryotes have
- Author
-
Sidra Aslam, Xin-Ran Lan, Bo-Wen Zhang, Zheng-Lin Chen, Li Wang, and Deng-Ke Niu
- Subjects
Oxygen requirement ,Reactive oxygen species ,Aerobe ,Anaerobe ,Phylogenetically independent ,Nucleotide composition ,Evolution ,QH359-425 - Abstract
Background: Among the four bases, guanine is the most susceptible to damage from oxidative stress. Replication of DNA containing damaged guanines results in G to T mutations. Therefore, the mutations resulting from oxidative DNA damage are generally expected to predominantly consist of G to T (and C to A when the damaged guanine is not in the reference strand) and result in decreased GC content. However, the opposite pattern was reported 16 years ago in a study of prokaryotic genomes. Although that result has been widely cited and confirmed by nine later studies with similar methods, the omission of the effect of shared ancestry requires a re-examination of the reliability of the results. Results: When aerobic and obligate aerobic prokaryotes were mixed together and anaerobic and obligate anaerobic prokaryotes were mixed together, phylogenetically controlled analyses did not detect a significant difference in GC content between aerobic and anaerobic prokaryotes. This result is consistent with two generally neglected studies that had accounted for the phylogenetic relationship. However, when obligate aerobic prokaryotes were compared with aerobic prokaryotes, anaerobic prokaryotes, and obligate anaerobic prokaryotes separately using phylogenetic regression analysis, a significant positive association was observed between aerobiosis and GC content, regardless of whether it was calculated from whole genome sequences or the 4-fold degenerate sites of protein-coding genes. Obligate aerobes have significantly higher GC content than aerobes, anaerobes, and obligate anaerobes. Conclusions: The positive association between aerobiosis and GC content could be attributed to a mutational force resulting from incorporation of damaged deoxyguanosine during DNA replication rather than oxidation of the guanine nucleotides within DNA sequences. Our results indicate a grade in the aerobiosis-associated mutational force, strong in obligate aerobes, moderate in aerobes, weak in anaerobes and obligate anaerobes.
- Published
- 2019
- Full Text
- View/download PDF
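Note: the abstract above refers to GC content at 4-fold degenerate sites. These are third codon positions where any of the four bases encodes the same amino acid. The sketch below assumes the standard genetic code, in which eight codon families identified by their first two bases are fully degenerate. The function name `gc4` is hypothetical, and the authors may have handled alternative genetic codes or ambiguous bases differently.

```python
# First two bases of the eight fully degenerate codon families in the
# standard genetic code: Leu(CT), Val(GT), Ser(TC), Pro(CC), Thr(AC),
# Ala(GC), Arg(CG), Gly(GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds: str) -> float:
    """Percent G+C at 4-fold degenerate third codon positions."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    third_bases = [
        seq[i + 2]
        for i in range(0, usable, 3)
        if seq[i:i + 2] in FOURFOLD_PREFIXES and seq[i + 2] in "ACGT"
    ]
    if not third_bases:
        raise ValueError("no 4-fold degenerate sites found")
    return 100.0 * sum(base in "GC" for base in third_bases) / len(third_bases)
```

Because substitutions at these sites do not change the protein, their GC content is often treated as a closer reflection of mutational bias than whole-genome GC content.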
31. Synonymous Codon Pattern of Cowpea Mild Mottle Virus Sheds Light on Its Host Adaptation and Genome Evolution
- Author
-
Siqi Yang, Ye Liu, Xiaoyun Wu, Xiaofei Cheng, and Xiaoxia Wu
- Subjects
cowpea mild mottle virus ,codon pattern ,dinucleotide bias ,codon adaptation ,nucleotide composition ,Medicine - Abstract
Cowpea mild mottle virus (CpMMV) is an economically significant virus that causes severe disease on several legume crops. Aside from recombination, other factors driving its rapid evolution are elusive. In this study, the synonymous codon pattern of CpMMV and factors shaping it were analyzed. Phylogeny and nucleotide composition analyses showed that isolates of different geography or hosts had very similar nucleotide compositions. Relative synonymous codon usage (RSCU) and neutrality analyses suggest that CpMMV prefers A/U-ending codons and natural selection is the dominative factor that affects its codon bias. Dinucleotide composition and codon adaptation analyses indicate that the codon pattern of CpMMV is mainly shaped by the requirement of escaping of host dinucleotide-associated antiviral responses and translational efficiency.
- Published
- 2022
- Full Text
- View/download PDF
32. Composition of Mitochondrial DNA 16S Nucleotide of Dwarf Snakehead (Channa gachua Hamilton, 1822) from Keji River, Magelang, Central Java
- Author
-
Warisatul Ilmi and Tuty Arisuryanti
- Subjects
dwarf snakehead ,genetic characterization ,mtDNA 16S ,nucleotide composition ,Agriculture (General) ,S1-972 ,Plant culture ,SB1-1110 - Abstract
Indonesia has high marine and freshwater biodiversity, including freshwater fish biodiversity. One freshwater fish commonly consumed by Indonesian people is the dwarf snakehead (Channa gachua Hamilton, 1822). However, the genetic characterization of the dwarf snakehead, especially the composition of its mtDNA 16S nucleotides, is poorly understood. Therefore, the aim of this study was to determine the composition of mtDNA 16S nucleotides of dwarf snakehead as a part of genetic characterization of the fish species taken from Keji River, Magelang, Central Java, which has not been previously examined. This study analyzed 16S mtDNA of two samples of dwarf snakehead from Keji River (KTS-01 and KTS-02). In addition, two sequences of Channa gachua with accession numbers KU986900, KU238074, and HM117234-HM117238 taken from GenBank were used as a comparison. The method used in this research was PCR, and the primers used were 16Sar and 16Sbr. The results revealed that the average nucleotide composition of T, C, A and G of the fish species was 23.04%, 25.13%, 29.06% and 22.77% respectively, whereas the average nucleotide composition of A+T and G+C was 52.10% and 47.90% respectively. The two dwarf snakehead had similar T and C composition but differed in A and G composition. In addition, the G+C content in KTS-01 and KTS-02 had the highest frequency compared to other dwarf snakehead taken from GenBank. From this finding it could be assumed that there is genetic variation between the two dwarf snakehead from Keji River, which is important genetic data for a breeding program of the fish species in the future.
- Published
- 2018
- Full Text
- View/download PDF
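Note: the composition figures in the abstract above are internally consistent (23.04% T + 29.06% A = 52.10% A+T; 25.13% C + 22.77% G = 47.90% G+C). The sketch below shows how such percentages are typically derived from a single sequence. The function name `base_composition` is hypothetical, and ignoring ambiguous bases such as N is an assumption.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Percent T, C, A, G plus A+T and G+C, ignoring non-ACGT symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    comp = {base: 100.0 * counts[base] / total for base in "TCAG"}
    comp["A+T"] = comp["A"] + comp["T"]
    comp["G+C"] = comp["G"] + comp["C"]
    return comp
```

Averaging these per-sequence values across samples would give figures of the kind reported in the abstract.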
33. PseUI: Pseudouridine sites identification based on RNA sequence information
- Author
-
Jingjing He, Ting Fang, Zizheng Zhang, Bei Huang, Xiaolei Zhu, and Yi Xiong
- Subjects
Pseudouridine site ,Position specific nucleotide propensity ,Nucleotide composition ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Background: Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, which significantly affects many cellular processes that are regulated by RNA. Thus, accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefit for understanding these cellular processes. Due to the low efficiency and high cost of current available experimental methods, it is highly desirable to develop computational methods for accurately and efficiently detecting Ψ sites in RNA sequences. However, the predictive accuracy of existing computational methods is not satisfactory and still needs improvement. Results: In this study, we developed a new model, PseUI, for Ψ sites identification in three species, which are H. sapiens, S. cerevisiae, and M. musculus. Firstly, five different kinds of features including nucleotide composition (NC), dinucleotide composition (DC), pseudo dinucleotide composition (pseDNC), position-specific nucleotide propensity (PSNP), and position-specific dinucleotide propensity (PSDP) were generated based on RNA segments. Then, a sequential forward feature selection strategy was used to gain an effective feature subset with a compact representation but discriminative prediction power. Based on the selected feature subsets, we built our model by using a support vector machine (SVM). Finally, the generalization of our model was validated by both the jackknife test and independent validation tests on the benchmark datasets. The experimental results showed that our model is more accurate and stable than the previously published models. We have also provided a user-friendly web server for our model at http://zhulab.ahu.edu.cn/PseUI, and a brief instruction for the web server is provided in this paper. By using this instruction, the academic users can conveniently get their desired results without complicated calculations. Conclusion: In this study, we proposed a new predictor, PseUI, to detect Ψ sites in RNA sequences. It is shown that our model outperformed the existing state-of-the-art models. It is expected that our model, PseUI, will become a useful tool for accurate identification of RNA Ψ sites.
- Published
- 2018
- Full Text
- View/download PDF
34. Codon Usage Bias Analysis of Bluetongue Virus Causing Livestock Infection.
- Author
-
Yao, Xiaoting, Fan, Qinlei, Yao, Bo, Lu, Ping, Rahman, Siddiq Ur, Chen, Dekun, and Tao, Shiheng
- Subjects
BLUETONGUE virus ,VETERINARY virology ,CERATOPOGONIDAE ,CULICOIDES ,DOUBLE-stranded RNA ,NATURAL selection - Abstract
Bluetongue virus (BTV) is a double-stranded RNA virus with multiple segments and belongs to the genus Orbivirus within the family Reoviridae. BTV is spread to livestock through its dominant vector, biting midges of genus Culicoides. Although great progress has been made in genomic analyses, it is not fully understood how BTVs adapt to their hosts and evade the host's immune systems. In this study, we retrieved BTV genome sequences from the National Center for Biotechnology Information (NCBI) database and performed a comprehensive research to explore the codon usage patterns in 50 BTV strains. We used bioinformatic approaches to calculate the relative synonymous codon usage (RSCU), codon adaptation index (CAI), effective number of codons (ENC), and other indices. The results indicated that most of the overpreferred codons had A-endings, which revealed that mutational pressure was the major force shaping codon usage patterns in BTV. However, the influence of natural selection and geographical factors cannot be ignored on viral codon usage bias. Based on the RSCU values, we performed a comparative analysis between BTVs and their hosts, suggesting that BTVs were inclined to evolve their codon usage patterns that were comparable to those of their hosts. Such findings will be conducive to understanding the elements that contribute to viral evolution and adaptation to hosts. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
35. Locate-R: Subcellular localization of long non-coding RNAs using nucleotide compositions.
- Author
-
Ahmad, Ahsan, Lin, Hao, and Shatabda, Swakkhar
- Subjects
- *
NON-coding RNA , *LINCRNA , *WEB-based user interfaces , *FEATURE selection , *SUPPORT vector machines , *CANCER , *MACHINE learning - Abstract
Knowledge of the sub-cellular localization of the most diverse class of transcribed RNA, long non-coding RNAs (lncRNAs), will help identify different types of cancers and other diseases, as lncRNAs play a key role in related cellular functions. With the recent exponential growth of known records, it has become essential to establish new machine-learning-based techniques to identify new ones, because they provide faster and cheaper solutions than laboratory methods. In this paper, we propose Locate-R, a novel method for predicting the sub-cellular location of lncRNAs. We have used only n-gapped l-mer composition and l-mer composition as features and selected the best 655 features to build the model. This model is based on locally deep support vector machines, which significantly enhance the prediction accuracy with respect to existing state-of-the-art methods. Our predictor is readily available for use as a stand-alone web application from: http://locate-r.azurewebsites.net/. • An efficient feature extraction and selection procedure. • An extensible framework/methodology for binary and multi-class classification problems. • A prediction tool available for RNA location prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
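Note: the abstract above names l-mer composition and n-gapped l-mer composition as features but does not define them. The sketch below shows one common reading: normalized frequencies of all length-l substrings, and normalized frequencies of base pairs separated by a fixed gap. Both function names are hypothetical, and the gapped variant is a simplified assumption that may differ from the features Locate-R actually uses.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"


def lmer_composition(seq: str, l: int) -> dict[str, float]:
    """Normalized frequency of every possible l-mer over ACGU."""
    rna = seq.upper().replace("T", "U")
    windows = (rna[i:i + l] for i in range(len(rna) - l + 1))
    counts = Counter(w for w in windows if set(w) <= set(ALPHABET))
    total = sum(counts.values())
    keys = ("".join(p) for p in product(ALPHABET, repeat=l))
    return {k: (counts[k] / total if total else 0.0) for k in keys}


def gapped_pair_composition(seq: str, gap: int) -> dict[str, float]:
    """Normalized frequency of base pairs with `gap` bases between them.

    Simplified stand-in for a gapped feature (assumption, see note above).
    """
    rna = seq.upper().replace("T", "U")
    pairs = (rna[i] + rna[i + gap + 1] for i in range(len(rna) - gap - 1))
    counts = Counter(p for p in pairs if set(p) <= set(ALPHABET))
    total = sum(counts.values())
    keys = ("".join(p) for p in product(ALPHABET, repeat=2))
    return {k: (counts[k] / total if total else 0.0) for k in keys}
```

Concatenating such dictionaries for several values of l and gap yields a fixed-length feature vector from which a selection step could pick a subset, as the abstract describes for its 655 features.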
36. Regulatory Contexts in the 5'-Region of mRNA from Arabidopsis thaliana Plants and Their Role in Translation Efficiency.
- Author
-
Kabardaeva, K. V., Turin, A. A., Kouchoro, F., Mustafaev, O. N., Deineko, I. V., Fadeev, V. S., and Goldenkova-Pavlova, I. V.
- Subjects
- *
ARABIDOPSIS thaliana , *MESSENGER RNA , *NUCLEOTIDE sequence , *ABSOLUTE value , *DINUCLEOTIDES - Abstract
In this study, the polysome profiling method was used for the separation of mRNA depending on their loading by ribosomes into polysomal and monosomal fractions. Separation of pools of such mRNA and analysis of transcripts (mRNA), which are characterized by a constant level of transcription in a wide range of absolute values at all stages of plant ontogenesis and associated with each pool of mRNA due to RNA sequencing, allowed for obtaining an idea about the translational efficiency of individual mRNA. The consequent in silico analysis allowed performing a search for regulatory contexts in the 5'-region of mRNA of Arabidopsis thaliana plants that may be potentially important for efficient mRNA translation. The results of the study revealed that pyrimidine dinucleotides and motifs are characteristic of a 5'-untranslated mRNA region with high translation efficiency, whereas purine dinucleotides and motifs are associated with transcripts with low translational efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2020
37. Characterization of the complete mitochondrial genome of Okenia hiroi (Baba, 1938) (Nudibranchia, Goniodorididae)
- Author
-
Thinh Dinh Do, Dae-Wui Jung, Tae-June Choi, Hyung-Eun An, and Chang-Bae Kim
- Subjects
okenia ,mitogenome ,nucleotide composition ,phylogeny ,Genetics ,QH426-470 - Abstract
Okenia is a speciose genus of the family Goniodorididae with more than 50 valid species. The phylogenetic relationships within the genus are little known. The mitogenome is a good marker to understand the phylogenetic relationships of relative species. This study was performed to sequence the mitogenome of O. hiroi. The mitogenome of O. hiroi was 14,583 bp in size and was composed of 37 genes, including 13 protein-coding genes, two ribosomal RNA genes, and 22 tRNA genes. The nucleotide composition was 30.5% A, 13.6% C, 16.5% G, and 39.4% T. The phylogenetic analysis showed that O. hiroi is sister to Notodoris gardineri (Aegiridae). This study recorded the first mitochondrial genome sequence of the family Goniodorididae.
- Published
- 2021
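The base percentages reported for the O. hiroi mitogenome can be turned into the usual summary statistics with a short calculation. The sketch below uses only the four values stated in the abstract; the AT-skew and GC-skew formulas, (A − T)/(A + T) and (G − C)/(G + C), are the commonly used definitions and are not reported in the abstract itself. The function name is hypothetical.

```python
def composition_summary(a, c, g, t):
    """Summarize base percentages as AT/GC content and strand skews."""
    at = a + t
    gc = g + c
    return {
        "AT": at,
        "GC": gc,
        "AT_skew": (a - t) / at,  # negative when T exceeds A
        "GC_skew": (g - c) / gc,  # positive when G exceeds C
    }


if __name__ == "__main__":
    # Percentages as stated in the abstract: 30.5% A, 13.6% C, 16.5% G, 39.4% T
    summary = composition_summary(a=30.5, c=13.6, g=16.5, t=39.4)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

With these inputs the A+T content is about 69.9% and the G+C content about 30.1%, giving an AT skew of roughly −0.127 and a GC skew of roughly 0.096.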
38. Modeling of Evolving RNA Replicators
- Author
-
Aguirre, Jacobo, Stich, Michael, Formaggia, Luca, Editor-in-chief, Gerbeau, Jean-Frédéric, Series editor, Martinez-Seara Alonso, Tere, Series editor, Parés, Carlos, Series editor, Pareschi, Lorenzo, Series editor, Pedregal, Pablo, Editor-in-chief, Tosin, Andrea, Series editor, Vazquez, Elena, Series editor, Zubelli, Jorge P., Series editor, Zunino, Paolo, Series editor, Carballido-Landeira, Jorge, editor, and Escribano, Bruno, editor
- Published
- 2016
39. HIV-1 tolerates changes in A-count in a small segment of the pol gene
- Author
-
Bep Klaver, Yme van der Velden, Formijn van Hemert, Antoinette C. van der Kuyl, and Ben Berkhout
- Subjects
HIV-1 evolution ,RNA genome ,Nucleotide composition ,Evolution ,A-rich ,Subtypes ,Immunologic diseases. Allergy ,RC581-607 - Abstract
Background: The HIV-1 RNA genome has a biased nucleotide composition with a surplus of As. Several hypotheses have been put forward to explain this striking phenomenon, but the A-count of the HIV-1 genome has thus far not been systematically manipulated. The reason for this reservation is the likelihood that known and unknown sequence motifs will be affected by such a massive mutational approach, thus resulting in replication-impaired virus mutants. We present the first attempt to increase and decrease the A-count in a relatively small polymerase (pol) gene segment of HIV-1 RNA. Results: To minimize the mutational impact, a new mutational approach was developed that is inspired by natural sequence variation as present in HIV-1 isolates. This phylogeny-instructed mutagenesis allowed us to create replication-competent HIV-1 mutants with a significantly increased or decreased local A-count. The local A-count of the wild-type (wt) virus (40.2%) was further increased to 46.9% or reduced to 31.7 and 26.3%. These HIV-1 variants replicate efficiently in vitro, despite the fact that the pol changes cause a quite profound move in HIV–SIV sequence space. Conclusions: Extrapolating these results to the complete 9 kb RNA genome, we may cautiously suggest that the A-rich signature does not have to be maintained. This survey also provided clues that silent codon changes, in particular from G-to-A, determine the subtype-specific sequence signatures.
- Published
- 2017
40. Tree rooting with outgroups when they differ in their nucleotide composition from the ingroup: the Drosophila saltans and willistoni groups, a case study.
- Author
-
Tarrío, R, Rodríguez-Trelles, F, and Ayala, FJ
- Subjects
Animals ,Drosophila ,Xanthine Dehydrogenase ,Evolution ,Molecular ,Phylogeny ,Base Composition ,Genetic Variation ,outgroup and midpoint rooting ,Drosophila saltans and willistoni groups ,Xdh ,nucleotide composition ,GC content ,maximum-likelihood ,among-site rate variation ,Evolutionary Biology ,Genetics ,Zoology - Abstract
Rooting is frequently the most precarious step in any phylogenetic analysis. Outgroups can become useless for rooting if they are too distantly related to the ingroup. Specifically, little attention has been paid to scenarios where outgroups have evolved different nucleotide frequencies from the ingroup. We investigate one empirical example that arose seeking to determine the phylogenetic relationship between the saltans and the willistoni groups of Drosophila (subgenus Sophophora). We have analyzed 2085 coding nucleotides from the xanthine dehydrogenase (Xdh) gene in 14 species, 6 from the saltans group and 8 from the willistoni group. We adopt a two-step strategy: (1) we investigate the phylogeny without outgroups, rooting the network by the midpoint method; (2) we reinvestigate the rooting of this phylogeny using predefined outgroups in both a parsimony- and a model-based maximum-likelihood framework. A satisfactory description of the substitution process along the Xdh region calls for six substitution types and substitution rate variation among codon positions. When the ingroup sequences are considered alone, the phylogeny obtained using this description corroborates the known relationships derived from anatomical criteria. Inclusion of the outgroups makes the root unstable, apparently because of differences between ingroups and outgroups in the substitution processes; these differences are better accounted for by a simplified model of evolution than by more complex, realistic descriptions of the substitution process.
- Published
- 2000
41. Tree Rooting with Outgroups When They Differ in Their Nucleotide Composition from the Ingroup: The Drosophila saltans and willistoni Groups, a Case Study
- Author
-
Tarrío, Rosa, Rodríguez-Trelles, Francisco, and Ayala, Francisco J
- Subjects
Genetics ,Animals ,Base Composition ,Drosophila ,Evolution ,Molecular ,Genetic Variation ,Phylogeny ,Xanthine Dehydrogenase ,outgroup and midpoint rooting ,Drosophila saltans and willistoni groups ,Xdh ,nucleotide composition ,GC content ,maximum-likelihood ,among-site rate variation ,Evolutionary Biology ,Zoology - Abstract
Rooting is frequently the most precarious step in any phylogenetic analysis. Outgroups can become useless for rooting if they are too distantly related to the ingroup. Specifically, little attention has been paid to scenarios where outgroups have evolved different nucleotide frequencies from the ingroup. We investigate one empirical example that arose seeking to determine the phylogenetic relationship between the saltans and the willistoni groups of Drosophila (subgenus Sophophora). We have analyzed 2085 coding nucleotides from the xanthine dehydrogenase (Xdh) gene in 14 species, 6 from the saltans group and 8 from the willistoni group. We adopt a two-step strategy: (1) we investigate the phylogeny without outgroups, rooting the network by the midpoint method; (2) we reinvestigate the rooting of this phylogeny using predefined outgroups in both a parsimony- and a model-based maximum-likelihood framework. A satisfactory description of the substitution process along the Xdh region calls for six substitution types and substitution rate variation among codon positions. When the ingroup sequences are considered alone, the phylogeny obtained using this description corroborates the known relationships derived from anatomical criteria. Inclusion of the outgroups makes the root unstable, apparently because of differences between ingroups and outgroups in the substitution processes; these differences are better accounted for by a simplified model of evolution than by more complex, realistic descriptions of the substitution process.
- Published
- 2000
42. Fluctuating mutation bias and the evolution of base composition in Drosophila.
- Author
-
Rodríguez-Trelles, F, Tarrío, R, and Ayala, FJ
- Subjects
Animals ,Drosophila ,Alcohol Dehydrogenase ,Xanthine Dehydrogenase ,Insect Proteins ,Codon ,Likelihood Functions ,Evolution ,Molecular ,Phylogeny ,GC Rich Sequence ,Mutation ,Genetic Variation ,mutation bias ,nucleotide composition ,nonsynonymous/synonymous rate ratio ,among-site rate heterogeneity ,willistoni group ,saltans group ,Evolutionary Biology ,Biochemistry and Cell Biology ,Genetics - Abstract
The idea that the pattern of point mutation in Drosophila has remained constant during the evolution of the genus has recently been challenged. A study of the nucleotide composition focused on the Drosophila saltans group has evidenced unsuspected nucleotide composition differences among lineages. Compositional differences are associated with an accelerated rate of amino acid replacement in functionally less constrained regions. Here we reassess this issue from a different perspective. Adopting a maximum-likelihood estimation approach, we focus on the different predictions that mutation and selection make about the nonsynonymous-to-synonymous rate ratio. We investigate two gene regions, alcohol dehydrogenase (Adh) and xanthine dehydrogenase (Xdh), using a balanced data set that comprises representatives from the melanogaster, obscura, saltans, and willistoni groups. We also consider representatives of the Hawaiian picture-winged group. These Hawaiian species are known to have experienced repeated bottlenecks and are included as a reference for comparison. Our results confirm patterns previously detected. The branch ancestral to the fast-evolving willistoni/saltans lineage, where most of the change in GC content has occurred, exhibits an excess of synonymous substitutions. The shift in mutation bias has affected the extent of the rate variation among sites in Xdh.
- Published
- 2000
43. Mitochondrial Genomic Landscape: A Portrait of the Mitochondrial Genome 40 Years after the First Complete Sequence
- Author
-
Alessandro Formaggioni, Andrea Luchetti, and Federico Plazzi
- Subjects
mitochondrial genome ,mtDNA architecture ,mtDNA structure ,nucleotide composition ,compositional bias ,strand asymmetry ,Science - Abstract
Notwithstanding the initial claims of general conservation, mitochondrial genomes are a largely heterogeneous set of organellar chromosomes which displays a bewildering diversity in terms of structure, architecture, gene content, and functionality. The mitochondrial genome is typically described as a single chromosome, yet many examples of multipartite genomes have been found (for example, among sponges and diplonemeans); the mitochondrial genome is typically depicted as circular, yet many linear genomes are known (for example, among jellyfish, alveolates, and apicomplexans); the chromosome is normally said to be “small”, yet there is a huge variation between the smallest and the largest known genomes (found, for example, in ctenophores and vascular plants, respectively); even the gene content is highly unconserved, ranging from the 13 oxidative phosphorylation-related enzymatic subunits encoded by animal mitochondria to the wider set of mitochondrial genes found in jakobids. In the present paper, we compile and describe a large database of 27,873 mitochondrial genomes currently available in GenBank, encompassing the whole eukaryotic domain. We discuss the major features of mitochondrial molecular diversity, with special reference to nucleotide composition and compositional biases; moreover, the database is made publicly available for future analyses on the MoZoo Lab GitHub page.
- Published
- 2021
44. Nucleotide composition of transposable elements likely contributes to AT/GC compositional homogeneity of teleost fish genomes.
- Author
-
Symonová, Radka and Suh, Alexander
- Subjects
ANIMAL diversity ,SALMONIDAE ,GENOME size ,FISHES ,SIZE of fishes ,HOMOGENEITY ,TRANSPOSONS - Abstract
Background: Teleost fish genome size has been repeatedly demonstrated to positively correlate with the proportion of transposable elements (TEs). This finding might have far-reaching implications for our understanding of the evolution of nucleotide composition across vertebrates. Genomes of fish and amphibians are GC homogenous, with non-teleost gars being the single exception identified to date, whereas birds and mammals are AT/GC heterogeneous. The exact reason for this phenomenon remains controversial. Since TEs make up significant proportions of genomes and can quickly accumulate across genomes, they can potentially influence the host genome with their own GC content (GC%). However, the GC% of fish TEs has so far been neglected. Results: The genomic proportion of TEs indeed correlates with genome size, although not as linearly as previously shown with fewer genomes, and GC% negatively correlates with genome size in the 33 fish genome assemblies analysed here (excluding salmonids). GC% of fish TE consensus sequences positively correlates with the corresponding genomic GC% in 29 species tested. Likewise, the GC contents of the entire repetitive vs. non-repetitive genomic fractions correlate positively in 54 fish species in Ensembl. However, among these fish species, there is also a wide variation in GC% between the main groups of TEs. Class II DNA transposons, predominant TEs in fish genomes, are significantly GC-poorer than Class I retrotransposons. The AT/GC heterogeneous gar genome contains fewer Class II TEs, a situation similar to fugu with its extremely compact and also GC-enriched but AT/GC homogenous genome. Conclusion: Our results reveal a previously overlooked correlation between GC% of fish genomes and their TEs. This applies to both TE consensus sequences as well as the entire repetitive genomic fraction. On the other hand, there is a wide variation in GC% across fish TE groups. 
These results raise the question whether GC% of TEs evolves independently of GC% of the host genome or whether it is driven by TE localization in the host genome. Answering these questions will help to understand how genomic GC% is shaped over time. Long-term accumulation of GC-poor(er) Class II DNA transposons might indeed have influenced AT/GC homogenization of fish genomes and requires further investigation. [ABSTRACT FROM AUTHOR]
- Published
- 2019
45. COUSIN (COdon Usage Similarity INdex): A Normalized Measure of Codon Usage Preferences.
- Author
-
Bourret, Jérôme, Alizon, Samuel, and Bravo, Ignacio G
- Subjects
- *
NORMALIZED measures , *COUSINS , *MANNERS & customs , *NULL hypothesis , *GENE expression - Abstract
Codon Usage Preferences (CUPrefs) describe the unequal usage of synonymous codons at the gene, chromosome, or genome levels. Numerous indices have been developed to evaluate CUPrefs, either in absolute terms or with respect to a reference. We introduce the normalized index COUSIN (for COdon Usage Similarity INdex), that compares the CUPrefs of a query against those of a reference and normalizes the output over a Null Hypothesis of random codon usage. The added value of COUSIN is to be easily interpreted, both quantitatively and qualitatively. An eponymous software written in Python3 is available for local or online use (http://cousin.ird.fr). This software allows for an easy and complete analysis of CUPrefs via COUSIN, includes seven other indices, and provides additional features such as statistical analyses, clustering, and CUPrefs optimization for gene expression. We illustrate the flexibility of COUSIN and highlight its advantages by analyzing the complete coding sequences of eight divergent genomes. Strikingly, COUSIN captures a bimodal distribution in the CUPrefs of human and chicken genes hitherto unreported with such precision. COUSIN opens new perspectives to uncover CUPrefs specificities in genomes in a practical, informative, and user-friendly way. [ABSTRACT FROM AUTHOR]
- Published
- 2019
46. Reanalysis and Revision of the Complete Mitochondrial Genome of Artemia urmiana Günther, 1899 (Crustacea: Anostraca)
- Author
-
Alireza Asem, Amin Eimanifar, Weidong Li, Chun-Yang Shen, Farnaz Mahmoudi Shikhsarmast, Ya-Ting Dan, Hao Lu, Yang Zhou, You Chen, Pei-Zheng Wang, and Michael Wink
- Subjects
brine shrimp ,Artemia ,mitochondrial genome ,phylogeny ,nucleotide composition ,maternal ancestor ,Biology (General) ,QH301-705.5 - Abstract
In the previously published mitochondrial genome sequence of Artemia urmiana (NC_021382 [JQ975176]), the taxonomic status of the examined Artemia had not been determined, due to parthenogenetic populations coexisting with A. urmiana in Urmia Lake. Additionally, NC_021382 [JQ975176] was obtained from pooled cysts of Artemia (0.25 g of cysts consists of 20,000–25,000 cysts), not a single specimen. With regard to coexisting populations in Urmia Lake, and intra- and inter-specific variations in the pooled samples, NC_021382 [JQ975176] cannot be recommended as a valid sequence and any attempt to attribute it to A. urmiana or a parthenogenetic population is unreasonable. With the aid of next-generation sequencing methods, we characterized and assembled a complete mitochondrial genome of A. urmiana with defined taxonomic status. Our results reveal that in the previously published mitogenome (NC_021382 [JQ975176]), tRNA-Phe has been erroneously attributed to the heavy strand but it is encoded in the light strand. There was a major problem in the position of the ND5. It was extended over the tRNA-Phe, which is biologically incorrect. We have also identified a partial nucleotide sequence of 311 bp that was probably erroneously duplicated in the assembly of the control region of NC_021382 [JQ975176], which enlarges the control region length by 16%. This partial sequence could not be recognized in our assembled mitogenome or in 48 further examined specimens of A. urmiana. Although only COX1 and 16S genes have been widely used for phylogenetic studies in Artemia, our findings reveal substantial differences in the nucleotide composition of some other genes (including ATP8, ATP6, ND3, ND6, ND1 and COX3) among Artemia species. It is suggested that these markers should be included in future phylogenetic studies.
- Published
- 2021
47. Genomic Features and Evolution of the Parapoxvirus during the Past Two Decades
- Author
-
Xiaoting Yao, Ming Pang, Tianxing Wang, Xi Chen, Xidian Tang, Jianjun Chang, Dekun Chen, and Wentao Ma
- Subjects
parapoxvirus ,Poxviridae ,nucleotide composition ,selection pressure ,evolution ,Medicine - Abstract
Parapoxvirus (PPV) has been identified in some mammals and poses a great threat to both the livestock production and public health. However, the prevalence and evolution of this virus are still not fully understood. Here, we performed an in silico analysis to investigate the genomic features and evolution of PPVs. We noticed that although there were significant differences of GC contents between orf virus (ORFV) and other three species of PPVs, all PPVs showed almost identical nucleotide bias, that is GC richness. The structural analysis of PPV genomes showed the divergence of different PPV species, which may be due to the specific adaptation to their natural hosts. Additionally, we estimated the phylogenetic diversity of seven different genes of PPV. According to all available sequences, our results suggested that during 2010–2018, ORFV was the dominant virus species under the selective pressure of the optimal gene patterns. Furthermore, we found the substitution rates ranged from 3.56 × 10⁻⁵ to 4.21 × 10⁻⁴ in different PPV segments, and the PPV VIR gene evolved at the highest substitution rate. In these seven protein-coding regions, purifying selection was the major evolutionary pressure, while the GIF and VIR genes suffered the greatest positive selection pressure. These results may provide useful knowledge on the virus genetic evolution from a new perspective which could help to create prevention and control strategies.
- Published
- 2020
48. Aerobic prokaryotes do not have higher GC contents than anaerobic prokaryotes, but obligate aerobic prokaryotes have.
- Author
-
Aslam, Sidra, Lan, Xin-Ran, Zhang, Bo-Wen, Chen, Zheng-Lin, Wang, Li, and Niu, Deng-Ke
- Subjects
PROKARYOTES ,MICROORGANISMS ,OXIDATIVE stress ,DNA ,DEOXYRIBOSE - Abstract
Background: Among the four bases, guanine is the most susceptible to damage from oxidative stress. Replication of DNA containing damaged guanines results in G to T mutations. Therefore, the mutations resulting from oxidative DNA damage are generally expected to predominantly consist of G to T (and C to A when the damaged guanine is not in the reference strand) and result in decreased GC content. However, the opposite pattern was reported 16 years ago in a study of prokaryotic genomes. Although that result has been widely cited and confirmed by nine later studies with similar methods, the omission of the effect of shared ancestry requires a re-examination of the reliability of the results. Results: When aerobic and obligate aerobic prokaryotes were mixed together and anaerobic and obligate anaerobic prokaryotes were mixed together, phylogenetically controlled analyses did not detect a significant difference in GC content between aerobic and anaerobic prokaryotes. This result is consistent with two generally neglected studies that had accounted for the phylogenetic relationship. However, when obligate aerobic prokaryotes were compared with aerobic prokaryotes, anaerobic prokaryotes, and obligate anaerobic prokaryotes separately using phylogenetic regression analysis, a significant positive association was observed between aerobiosis and GC content, regardless of whether it was calculated from whole genome sequences or the 4-fold degenerate sites of protein-coding genes. Obligate aerobes have significantly higher GC content than aerobes, anaerobes, and obligate anaerobes. Conclusions: The positive association between aerobiosis and GC content could be attributed to a mutational force resulting from incorporation of damaged deoxyguanosine during DNA replication rather than oxidation of the guanine nucleotides within DNA sequences. 
Our results indicate a grade in the aerobiosis-associated mutational force, strong in obligate aerobes, moderate in aerobes, weak in anaerobes and obligate anaerobes. [ABSTRACT FROM AUTHOR]
- Published
- 2019
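The abstract above compares GC content computed from whole genomes with GC content at 4-fold degenerate sites of protein-coding genes. A minimal sketch of the second measure follows, assuming an in-frame coding sequence and the standard genetic code, in which eight codon families (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN, Ala GCN, Arg CGN, Gly GGN) are fourfold degenerate at the third position. The function name `gc4` is hypothetical and this is not the authors' pipeline.

```python
# First two bases of the codon families that are fourfold degenerate
# at the third position under the standard genetic code.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds):
    """GC fraction at fourfold degenerate third codon positions.

    Assumes `cds` is in frame starting at its first base.
    Returns None when the sequence has no fourfold degenerate site.
    """
    cds = cds.upper().replace("U", "T")
    gc = total = 0
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            total += 1
            gc += codon[2] in "GC"
    return gc / total if total else None


if __name__ == "__main__":
    # GCG and GCA are fourfold degenerate (Ala); ATG is not.
    print(gc4("GCGGCAATG"))
```

Because any third-position change at these sites is synonymous, this measure is often used as a proxy for mutational bias that is less affected by selection on protein sequence.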
49. PseUI: Pseudouridine sites identification based on RNA sequence information.
- Author
-
He, Jingjing, Fang, Ting, Zhang, Zizheng, Huang, Bei, Zhu, Xiaolei, and Xiong, Yi
- Subjects
PSEUDOURIDINE ,RNA sequencing ,RNA ,URIDINE ,NUCLEOTIDES ,DINUCLEOTIDES ,SUPPORT vector machines - Abstract
Background: Pseudouridylation is the most prevalent type of posttranscriptional modification in various stable RNAs of all organisms, which significantly affects many cellular processes that are regulated by RNA. Thus, accurate identification of pseudouridine (Ψ) sites in RNA will be of great benefit for understanding these cellular processes. Due to the low efficiency and high cost of current available experimental methods, it is highly desirable to develop computational methods for accurately and efficiently detecting Ψ sites in RNA sequences. However, the predictive accuracy of existing computational methods is not satisfactory and still needs improvement. Results: In this study, we developed a new model, PseUI, for Ψ sites identification in three species, which are H. sapiens, S. cerevisiae, and M. musculus. Firstly, five different kinds of features including nucleotide composition (NC), dinucleotide composition (DC), pseudo dinucleotide composition (pseDNC), position-specific nucleotide propensity (PSNP), and position-specific dinucleotide propensity (PSDP) were generated based on RNA segments. Then, a sequential forward feature selection strategy was used to gain an effective feature subset with a compact representation but discriminative prediction power. Based on the selected feature subsets, we built our model by using a support vector machine (SVM). Finally, the generalization of our model was validated by both the jackknife test and independent validation tests on the benchmark datasets. The experimental results showed that our model is more accurate and stable than the previously published models. We have also provided a user-friendly web server for our model at
http://zhulab.ahu.edu.cn/PseUI, and a brief instruction for the web server is provided in this paper. By using this instruction, the academic users can conveniently get their desired results without complicated calculations. Conclusion: In this study, we proposed a new predictor, PseUI, to detect Ψ sites in RNA sequences. It is shown that our model outperformed the existing state-of-art models. It is expected that our model, PseUI, will become a useful tool for accurate identification of RNA Ψ sites. [ABSTRACT FROM AUTHOR]
- Published
- 2018
50. NucPosPred: Predicting species-specific genomic nucleosome positioning via four different modes of general PseKNC.
- Author
-
Jia, Cangzhi, Yang, Qing, and Zou, Quan
- Subjects
- *
MOLECULAR structure of chromatin , *EUKARYOTIC cell genetics , *DNA replication , *GENOMES , *RNA splicing , *CAENORHABDITIS elegans genetics - Abstract
The nucleosome is the basic structure of chromatin in eukaryotic cells, with essential roles in the regulation of many biological processes, such as DNA transcription, replication and repair, and RNA splicing. Because of the importance of nucleosomes, the factors that determine their positioning within genomes should be investigated. High-resolution nucleosome-positioning maps are now available for organisms including Saccharomyces cerevisiae, Drosophila melanogaster and Caenorhabditis elegans, enabling the identification of nucleosome positioning by application of computational tools. Here, we describe a novel predictor called NucPosPred, which was specifically designed for large-scale identification of nucleosome positioning in C. elegans and D. melanogaster genomes. NucPosPred was separately optimized for each species for four types of DNA sequence feature extraction, with consideration of two classification algorithms (gradient-boosting decision tree and support vector machine). The overall accuracy obtained with NucPosPred was 92.29% for C. elegans and 88.26% for D. melanogaster, outperforming previous methods and demonstrating the potential for species-specific prediction of nucleosome positioning. For the convenience of most experimental scientists, a web-server for the predictor NucPosPred is available at http://121.42.167.206/NucPosPred/index.jsp. [ABSTRACT FROM AUTHOR]
- Published
- 2018