Genome-Wide Analysis to Identify Palindromes, Mirror and Inverted Repeats in SARS-CoV-2, MERS-CoV and SARS-CoV-1

Authors :
Nimisha Ghosh
Indrajit Saha
Dariusz Plewczynski
Source :
IEEE Access, Vol 10, Pp 23708-23715 (2022)
Publication Year :
2022
Publisher :
IEEE, 2022.

Abstract

Research on SARS-CoV-2 is in full swing to understand the origin and evolution of this deadly virus, which in turn can support its rapid detection. To this end, atypical genomic sequences that may be unique to SARS-CoV-2, or to the Coronaviridae family in general, may be investigated. Such sequences in virus genomes may be responsible for target prediction, replication, defence mechanisms and viral packaging. This has motivated us to explore different types of repeats, namely palindromes, mirror repeats and inverted repeats, in SARS-CoV-2, MERS-CoV and SARS-CoV-1. For this purpose, the respective reference sequence of SARS-CoV-2, MERS-CoV and SARS-CoV-1 is divided into descriptors, i.e. subsequences of length k, using the k-mer technique. These descriptors are then represented as a collection of tokens, which are subsequently used to identify palindromes, mirror repeats and inverted repeats in the respective reference sequence. The highest numbers of palindromes, mirror repeats and inverted repeats are identified for descriptor length 10. At this length, the numbers of palindromes are 38, 42 and 33 and the numbers of mirror repeats are 52, 38 and 33 for SARS-CoV-2, MERS-CoV and SARS-CoV-1 respectively. For inverted repeats, with descriptor length 10 and intervening length 5, the values are 59, 56 and 70 respectively. Moreover, the identified repeats are searched for in 108246, 291 and 340 SARS-CoV-2, MERS-CoV and SARS-CoV-1 virus sequences respectively to find the population coverage of such repeats. The coverage surpasses 99% in most cases and reaches 100% for some. Furthermore, the GC content of these repeats, which mostly lies between 20% and 50%, is evaluated in order to understand their binding efficacy.
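
The pipeline summarised in the abstract (k-mer descriptors, repeat identification, population coverage, GC content) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the repeat types, so the sketch assumes the conventional definitions: a palindrome equals its own reverse complement, a mirror repeat reads the same forwards and backwards on one strand, and an inverted repeat is a k-mer followed, after a fixed intervening gap, by its reverse complement. All function names are hypothetical.

```python
from typing import Iterator, List, Tuple

# Assumption: sequences use the DNA alphabet A, C, G, T only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq: str, k: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, k-mer) for every overlapping window of length k."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def reverse_complement(s: str) -> str:
    """Complement each base, then reverse the string."""
    return s.translate(COMPLEMENT)[::-1]


def is_palindrome(s: str) -> bool:
    """Assumed definition: the k-mer equals its reverse complement."""
    return s == reverse_complement(s)


def is_mirror_repeat(s: str) -> bool:
    """Assumed definition: the k-mer equals its own reverse."""
    return s == s[::-1]


def inverted_repeats(seq: str, k: int, gap: int) -> List[Tuple[int, str, str]]:
    """Find k-mers followed, after `gap` bases, by their reverse complement."""
    hits = []
    for i, left in kmers(seq, k):
        j = i + k + gap
        right = seq[j:j + k]
        if len(right) == k and right == reverse_complement(left):
            hits.append((i, left, right))
    return hits


def gc_content(s: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    return 100.0 * sum(c in "GC" for c in s) / len(s)


def coverage(repeat: str, genomes: List[str]) -> float:
    """Percentage of genomes that contain the repeat at least once."""
    return 100.0 * sum(repeat in g for g in genomes) / len(genomes)
```

Under these assumptions, the palindromes of length 10 in a reference sequence `ref` would be collected with `[(i, d) for i, d in kmers(ref, 10) if is_palindrome(d)]`, and the inverted-repeat setting mentioned in the abstract corresponds to `inverted_repeats(ref, 10, 5)`.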

Details

Language :
English
ISSN :
2169-3536
Volume :
10
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.2b86b59a6e464e9c4a278b02dc892e
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2022.3154053