Pharma Focus Asia

Unveiling Hidden Structural Patterns in the SARS-CoV-2 Genome: Computational Insights and Comparative Analysis

Alison Ziesel, Hosna Jabbari.


SARS-CoV-2, the causative agent of COVID-19, is known to exhibit secondary structures in its 5’ and 3’ untranslated regions, along with the frameshifting stimulatory element situated between ORF1a and 1b. To identify additional regions containing conserved structures, we utilized a multiple sequence alignment with related coronaviruses as a starting point. We applied a computational pipeline developed for identifying non-coding RNA elements.


SARS-CoV-2, the virus responsible for COVID-19, is a member of the genus Betacoronavirus and is a positive-sense, single-stranded RNA virus with a genome of 29,903 nucleotides. It has a possible zoonotic origin, with its most recent non-human host possibly a bat species. SARS-CoV-2 RNA is capable of forming secondary structure, the phenomenon in which an RNA molecule base pairs with itself to form a non-linear structure.
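To make the idea of self-base-pairing concrete, the sketch below reads a secondary structure written in dot-bracket notation and lists which positions pair with each other. Dot-bracket notation, the function names, and the ten-nucleotide toy hairpin are assumptions chosen for illustration; none of them come from the study itself.

```python
# Illustrative sketch only. Dot-bracket notation and the toy hairpin below are
# assumptions for demonstration, not data or methods from the article.

# Watson-Crick pairs plus the G-U wobble pair commonly allowed in RNA helices.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def parse_dot_bracket(structure):
    """Return sorted 0-based (i, j) index pairs encoded by a dot-bracket string."""
    open_positions = []  # stack of indices of unmatched '('
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


def non_canonical_pairs(sequence, structure):
    """List paired positions whose bases are neither Watson-Crick nor G-U."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    seq = sequence.upper()
    return [
        (i, j)
        for i, j in parse_dot_bracket(structure)
        if (seq[i], seq[j]) not in CANONICAL_PAIRS
    ]


# Toy hairpin (invented): a three-base-pair stem closing a four-nucleotide loop.
hairpin_sequence = "GGGAAAUCCC"
hairpin_structure = "(((....)))"
hairpin_pairs = parse_dot_bracket(hairpin_structure)
```

Here `hairpin_pairs` holds the three stem pairs, and `non_canonical_pairs(hairpin_sequence, hairpin_structure)` returns an empty list because every pair in the toy stem is G-C.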

Materials and Methods:

The genomes of thirteen viruses, including the reference genome for SARS-CoV-2, were obtained from NCBI’s Nucleotide database.
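One way to retrieve a genome from NCBI's Nucleotide database is through the public E-utilities `efetch` endpoint. The sketch below is a minimal example of that route, not the authors' retrieval procedure. The accession used in the usage note (NC_045512.2, the RefSeq record for the SARS-CoV-2 reference genome) and all function names are assumptions added here; the article does not list the accessions of the thirteen genomes.

```python
# Minimal sketch, assuming retrieval via NCBI E-utilities (efetch).
# Function names are hypothetical; the article does not describe how the
# genomes were downloaded.
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession):
    """Build an efetch URL requesting one nucleotide record as plain-text FASTA."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


def parse_fasta(text):
    """Parse a single-record FASTA string into (header, sequence)."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                raise ValueError("expected a single FASTA record")
            header = line[1:]
        else:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


def fetch_genome(accession, timeout=30):
    """Download one record from NCBI (requires network access) and parse it."""
    with urlopen(build_efetch_url(accession), timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Calling `fetch_genome("NC_045512.2")` should return the reference genome's header and sequence; checking that the sequence length matches the 29,903 nucleotides quoted above is a quick sanity test. NCBI asks heavy users to throttle requests and supply an API key, which this sketch omits.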

Our analyses included viruses belonging to the genus Betacoronavirus, coronaviruses known to infect humans, and closely related non-human host coronaviruses.

Guided by the previously constructed phylogenetic tree, MULTIZ-TBA then produces aligned blocksets of sequence projected against a genome of choice, in this case SARS-CoV-2.


Forty subregions of the SARS-CoV-2 genome were predicted by our pipeline as very likely to contain RNA secondary structure. These structures tend towards the 5’ end and 3’ third of the genome and cover structures already known to exist in the SARS-CoV-2 genome, including portions of the 5’ UTR and FSE, although the structures predicted here do not perfectly recapitulate those putative structures.


AZ would like to thank Morgan Cunningham for his advice regarding appropriate statistical analyses.

Citation: Ziesel A, Jabbari H (2024) Unveiling hidden structural patterns in the SARS-CoV-2 genome: Computational insights and comparative analysis. PLoS ONE 19(4): e0298164.

Editor: Salman Sadullah Usmani, Albert Einstein College of Medicine, UNITED STATES

Received: October 6, 2023; Accepted: January 19, 2024; Published: April 4, 2024.

Copyright: © 2024 Ziesel, Jabbari. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data and scripts employed in this study may be found at

Funding: This study was financially supported by Microsoft Azure AI for Health in the form of an award received by HJ. This study was also financially supported by the Natural Sciences and Engineering Research Council of Canada in the form of an NSERC Discovery grant (RGPIN-2020-04243) received by HJ. This study was also financially supported by the National Research Council of Canada in the form of a DHGA grant (DHGA-110-1) received by HJ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


Competing interests: The authors have declared that no competing interests exist.

