The authors have declared that no competing interests exist.
We apply the new GenomeBits method to uncover underlying genomic features of omicron and delta coronavirus variants. This is a statistical algorithm whose salient feature is to map the nucleotide bases into a finite alternating (±) sum series of distributed terms of binary (0,1) indicators. We show how this method uncovers distinctive signals in the intrinsic data organization of amino acid progressions along their base positions. Results reveal a sort of
Since the coronavirus outbreak in Wuhan, China, in December 2019, the SARS-CoV-2 pandemic has become a major risk to global public health. The impact of the outbreak on access to healthcare services has had important repercussions. Severe effects on the mental health and well-being of medical staff and people around the world have also had many relevant implications [
Understanding the coronavirus pathogen remains a global challenge for scientific research. Similarity studies and fast genomic analysis of this positive-stranded RNA virus, drawing on the complete genome sequences continuously provided and confirmed by laboratories around the world, have shed light on the evolutionary origins of the SARS-CoV-2 lineage [
To this last aim we consider in this work, and apply anew, the GenomeBits method [
Previously by GenomeBits [
The delta and omicron variants share some parts of their structures [
Ongoing SARS-CoV-2 research currently focuses on understanding the essential functions of the constituent proteins of ribonucleic acid (RNA) coronaviruses [
Genome sequence comparisons and the discovery of new signaling pathways can be analyzed using the GenomeBits method [
Similarity between pairs of full-length genome sequences is the standard method for determining whether there are sequence equivalences in terms of shared ancestry between them by using alignment methods [
Upper curves: genetic similarity curves between the query sequence SARS-CoV-2 Wuhan-Hu-1 and representative delta and omicron complete genome sequences. The genomic region encoding the spike (S-protein) is shown in light blue. Lower curves: delta genome sequences used as query against omicron data from Spain and USA. A typical sliding window of 1000 base pairs, moved in steps of 100 nucleotide positions, was used in these calculations.
The Similarity plot shows that omicron variants from Spain and the USA deviate more than delta variants from the first Wuhan-China sequences identified over a year ago (MN908947) [
Conventional Similarity comparisons via “lalign36” alignment provide limited information on the single nucleotide bases A,C,G,T. Determining the best parameters to achieve optimal alignments is difficult, because several user-defined parameters are needed to handle the gaps and mismatches usually found between genome sequences. Furthermore, the computational resources required increase considerably with the length and number of sequences to be aligned.
In the figure, regions where SARS-CoV-2 sequences from the city of Wuhan cluster (< 1%) with the delta lineages from Spain suggest some genetic similarities outside the S-spike gene region (bp 21563–25384, colored in light blue). More divergent genetic similarities (∼ 97%) are found between the first Wuhan-China sequences and omicron strains from Spain and USA, and between the delta sequences from Spain and USA and the omicron variants from Spain and USA, respectively. This is a clear consequence of the coronavirus mutations.
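The sliding-window comparison described for the Similarity plot can be sketched as follows. This is only an illustrative outline, not the “lalign36” procedure: it assumes the two sequences have already been aligned to equal length by some external tool, and the function name `sliding_identity` and its parameters are hypothetical.

```python
def sliding_identity(seq_a, seq_b, window=1000, step=100):
    """Percent identity per window for two pre-aligned, equal-length sequences.

    Assumption: alignment was done beforehand; a gap character paired with
    a base simply counts as a mismatch here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y)
        # Report the 1-based start position of the window and its identity.
        results.append((start + 1, 100.0 * matches / window))
    return results


# Toy example with a tiny window so the result is easy to check by hand.
print(sliding_identity("ACGTACGT", "ACGTACGA", window=4, step=2))
# → [(1, 100.0), (3, 100.0), (5, 75.0)]
```

With `window=1000` and `step=100` the same loop reproduces the window geometry quoted in the figure caption, although the identity values of a real alignment tool depend on its gap and mismatch parameters.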
Our new quantitative method for the examination of distinctive patterns of complete genome sequences considers a certain type of alternating series having terms converted to (0,1) binary values for the nucleotide variables
The mapping into four binary projections of genome sequences follows previous studies on the three-base periodicity characteristic of protein-coding DNA sequences [
Base position k | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | GenomeBits sums |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Sequence (string) | A | G | A | T | C | T | G | T | T | C | T | C | |
(−1)^{k−1} X_{A,k} | +1 | 0 | +1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
(−1)^{k−1} X_{C,k} | 0 | 0 | 0 | 0 | +1 | 0 | 0 | 0 | 0 | -1 | 0 | -1 | -1 |
(−1)^{k−1} X_{G,k} | 0 | -1 | 0 | 0 | 0 | 0 | +1 | 0 | 0 | 0 | 0 | 0 | 0 |
(−1)^{k−1} X_{T,k} | 0 | 0 | 0 | -1 | 0 | -1 | 0 | -1 | +1 | 0 | +1 | 0 | -1 |
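The mapping in the table above can be reproduced with a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function name `genomebits_sums` is hypothetical, and it assumes the sign is +1 at odd base positions and −1 at even ones, which is the pattern the table entries follow.

```python
def genomebits_sums(sequence, bases="ACGT"):
    """Alternating (+/-) sum of binary indicators for each nucleotide base.

    At position k (1-based) the base found there contributes +1 if k is odd
    and -1 if k is even; every other base contributes 0 at that position.
    """
    sums = {base: 0 for base in bases}
    for k, nucleotide in enumerate(sequence.upper(), start=1):
        sign = 1 if k % 2 == 1 else -1
        if nucleotide in sums:  # symbols other than A, C, G, T are skipped
            sums[nucleotide] += sign
    return sums


print(genomebits_sums("AGATCTGTTCTC"))
# → {'A': 2, 'C': -1, 'G': 0, 'T': -1}
```

The printed values agree with the "GenomeBits sums" column of the table for the 12-nucleotide fragment AGATCTGTTCTC.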
The variable
A user-friendly Graphical User Interface (GUI) is available for the present signal analysis method of genome sequences. The GUI runs under the Ubuntu Linux operating system and can be downloaded freely from GitHub [
In brief, with just one click the GenomeBits GUI allows the user to
run the alternating sums in
separate concatenated genome sequences and save them in single FASTA files (for each country);
get into single files, each of the four nucleotide bases represented by the symbols A,C,T,G;
get the alternating sums results in single files for each of the four nucleotide bases A,C,T,G associated with (±) binary values;
plot the alternating sums to compare behavior of pairs A,T and C,G nucleotide bases versus nucleotide bp;
compare in a plot alternating sums curves versus bp for all four nucleotide bases A,C,T,G;
plot the alternating sums curves versus bp for each of the four nucleotide bases A,C,T,G;
plot results for each nucleotide base A,C,T or G for up to six variants/species and up to four FASTA files by country;
compare GenomeBits GUI curves for (up to six) given variants/species and (up to two) selected countries with the results from our original paper in [
We show next that analyzing genomic sequences via the present type of finite alternating sums makes it possible to extract unique features for omicron and delta mutations with little data noise. From the viewpoint of statistics, such series are equivalent to a discrete-valued time series for the statistical identification and characterization of (random) data sets [
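The step of separating concatenated genome sequences into single FASTA files, listed above, could look roughly like the following. This is a sketch under stated assumptions, not the GUI's code: the function name `split_fasta` and the output naming scheme `record_N.fasta` are hypothetical, and records are split at each header line starting with `>` rather than grouped by country.

```python
from pathlib import Path


def split_fasta(multi_fasta_path, out_dir):
    """Write each record of a concatenated FASTA file to its own file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    with open(multi_fasta_path) as source:
        for line in source:
            if line.startswith(">"):
                # A header line starts a new record: close the previous file.
                if handle:
                    handle.close()
                target = out_dir / f"record_{len(written) + 1}.fasta"
                handle = open(target, "w")
                written.append(target)
            if handle:  # lines before the first header are ignored
                handle.write(line)
    if handle:
        handle.close()
    return written
```

Calling `split_fasta("all_sequences.fasta", "single_files")` (both names are placeholders) returns the list of paths written, one per record.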
The GenomeBits representation of coronavirus genome variants, by adding binary values with ± signs following
In
Delta (in blue) and omicron (in green) variant imprints displayed by the nucleotides A,C,G,T according to
It is interesting to note that in the figure there are regions where the curves for the delta variant (in blue) mirror those of the omicron variant (in green). This peculiar behavior becomes clear when both curves are averaged, as shown by the red lines. Regions where the average is almost null (with low data noise) or roughly constant indicate nearly perfect mirroring, which is driven by the ± signs of the alternating series. This reveals coding regions of correspondence between delta and omicron variants.
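The averaging of two curves can be illustrated with toy data. The sketch below assumes that each curve is the running alternating sum for one base as a function of position, and that the red line is the pointwise mean of the two curves; the helper names `running_sums` and `mean_curve` are hypothetical, and the two four-letter strings merely stand in for real genomes.

```python
from itertools import accumulate


def running_sums(sequence, base):
    """Running alternating sum for one base: +1 at odd k, -1 at even k."""
    terms = ((1 if k % 2 else -1) if nucleotide == base else 0
             for k, nucleotide in enumerate(sequence.upper(), start=1))
    return list(accumulate(terms))


def mean_curve(curve_a, curve_b):
    """Pointwise average of two curves, truncated to the shorter one."""
    return [(a + b) / 2 for a, b in zip(curve_a, curve_b)]


first = running_sums("ATAT", "A")    # → [1, 1, 2, 2]
second = running_sums("TATA", "A")   # → [0, -1, -1, -2]
print(mean_curve(first, second))     # → [0.5, 0.0, 0.5, 0.0]
```

In this toy case the second string is the first one shifted by a single position, so every A falls on a position of opposite parity and contributes with the opposite sign. The two running sums then move in opposite directions while their mean stays close to zero.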
The regions of main discrepancies as found in the Similarity identities curves of
To some degree, there are also other distinctive trends especially around the S-Protein. As seen in
We have applied the GenomeBits method
Numerical representations of genome sequences have gained great attention in bioinformatics studies. One advantage of this approach is that large sequence data can be handled statistically to find various characterizations. Additional properties of the genome sequences for mutant pathogens, as derived in this work for an
GenomeBits may shed light on the bioinformatics surveillance behind future infectious diseases. Through a comparison of numerical results, it may also be of some relevance in assisting further developments of synthetic mRNA-based vaccine designs [
We thank all authors and Labs who have kindly deposited and shared genome data on GISAID. We are also indebted to the academic editor and the two reviewers for providing insightful comments on this manuscript.
This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.
PONE-D-22-01925
GenomeBits insight into omicron and delta variants of coronavirus pathogen
PLOS ONE
Dear Dr. Canessa,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript by Jun 11 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at
Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see:
We look forward to receiving your revised manuscript.
Kind regards,
Vladimir Makarenkov
Academic Editor
PLOS ONE
Journal Requirements:
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at
Additional Editor Comments:
It would be important to add a discussion (preferably to the Introduction section) about the origins and impact of the SARS-Cov-2 pandemic. Here you could cite the following works:
Boni, Maciej F., et al. "Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic." Nature microbiology 5.11 (2020): 1408-1417.
Makarenkov, V., Mazoure, B., Rabusseau, G. et al. Horizontal gene transfer and recombination analysis of SARS-CoV-2 genes helps discover its close relatives and shed light on its origin. BMC Ecol Evo 21, 5 (2021).
Domingo JL. What we know and what we need to know about the origin of SARS-CoV-2. Environ Res. 2021;200:111785.
Reviewers' comments:
Reviewer's Responses to Questions
1. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Partly
Reviewer #2: Partly
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: N/A
Reviewer #2: Yes
**********
3. Have the authors made all data underlying the findings in their manuscript fully available?
The
Reviewer #1: Yes
Reviewer #2: Yes
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: No
Reviewer #2: No
**********
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: 1. The write-up of the manuscript seems to be premature and very difficult to understand. There are several sweeping statements without any references. For instance, the statement “The first known confirmed delta variant was detected in India in late 2020 and the B.1.1.529 infection appeared from a specimen collected in South Africa a year later, early November 2021” should have a citation. Several sentences are incomplete or do not have a meaning. For example “Although omicron seems to cause less severe COVID-19 than delta.”
2. Authors should clearly explain the computation of binary scores for a given genomic sequence. Illustration or outline graphic may be required.
3. It is trivial that a mononucleotide composition can reveal similar information. How is GenomeBits information different from a mononucleotide computation with a sliding window?
4. There is no discussion on their results. The authors should compare their analysis with their previous work and also with similar research by others.
Reviewer #2: The manuscript "GenomeBits insight into omicron and delta variants of coronavirus pathogen" is an interesting research article. It is of interest to the reader and fully within the scope of the journal.
I will suggest the manuscript to be accepted for publication after revision.
1. Abstract section looks incomplete. I will suggest the author to focus on following important points on writing the abstract. An abstract summarizes, usually in one paragraph of 300 words or less, the major aspects of the entire paper in a prescribed sequence that includes: 1) the overall purpose of the study and the research problem(s) you investigated; 2) the basic design of the study; 3) major findings or trends found as a result of your analysis; and, 4) a brief summary of your interpretations and conclusions.
2. The English of the manuscript is very poor. Please write your text in good English (American or British usage is accepted, but not a mixture of these). The manuscript may require editing to eliminate possible grammatical or spelling errors and to conform to correct scientific English.
3. There are several sentences in the manuscript which are really hard to understand. I suggest the authors carefully read the manuscript and correct the English throughout.
4. The Introduction section needs to be more elaborated.
5. Provide a general interpretation of the results in the context of other evidence, and implications for future research.
**********
6. PLOS authors have the option to publish the peer review history of their article (
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Reviewer #1:
Reviewer #2:
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool,
See also attached file "Response_to_Reviewers.pdf" for our reply to each specific reviewer and editor comments.
Response to Reviewers of PONE-D-22-01925
“GenomeBits insight into omicron and delta variants of coronavirus pathogen”,
by E. Canessa, L. Tenze
We are indebted to the academic editor and the two reviewers for providing
insightful comments on the manuscript. Our responses to each point raised are
as follows:
Reply to Additional Editor Comments:
The Introduction section was revised and completely rewritten. In particular, we
added a discussion in the Introduction section about the origins of the SARS-CoV-2
pandemic as suggested by the Editor, and cited the three works
suggested: (i) Boni, Maciej F., et al. (ii) Makarenkov, V., Mazoure, B.,
Rabusseau, G. et al., and (iii) Domingo JL. We also briefly mentioned the
impact of the SARS-CoV-2 pandemic and added 11 new references to support
our findings throughout the text. The abstract and conclusions have also been
improved based on the data presented. All sequence data analysed here are
publicly available at GISAID and all custom code used in the manuscript is fully
available at
to present our manuscript in an intelligible fashion in standard English. At
revision, we corrected a few typographical and grammatical errors.
Reply to Reviewer #1:
1. We have made an effort to rewrite our manuscript and present it in an intelligible fashion in standard English. Our statements now carry associated new references, and a few sentences have been completed to give them a clear meaning.
2. We have added a new Table to illustrate the computation of binary scores for a given genomic sequence. This Table helps us explain that, in our case, the mapping of genome sequences into four binary projections carries alternating plus and minus signs, where the + and - signs are chosen sequentially, starting with +1 at k = 1 by default. The Table shows a particular mapping example that converts the brief fragment AGATCTGTTCTC of 12 nucleotides into the alternating binary array. As a principal difference from previous binary representations (see, e.g., new Refs [19, 20]), our mapping (whose terms change sign, i.e., if a term X_k is positive then X_{k+1} is negative and vice versa) displays the occurrence of each base as +1 or -1 and its absence as 0 at a given base pair k.
3. The inclusion of different smoothing sliding window sizes of up to about 500 bp (moving along the target genome sequences and repeating the GenomeBits procedure as described) leads to a reduction of data noise in the curves and preserves the average behavior of the sums displayed in Figures 1 and 2.
4. Our results have been further elaborated. We emphasize that the regions of main discrepancies found in the Similarity identities curves of Fig 1, e.g., around N=10000, are also reflected by the red lines of Fig 2 via GenomeBits. The main difference between the two comparative genomics approaches is that changes via Eq (1) can be analyzed and characterized at the level of each single nucleotide A,C,G,T separately. Besides this comparison with Similarity studies and our previous analysis using Fast Fourier Transforms for coronavirus genomes of other variants, as reported in Ref [6], our present findings are new and have no analogs among those reported by others. We report a kind of 'ordered' (constant) to 'disordered' (peaked) phase transition phenomenon around the NSP5 polymerase within the open reading frame ORF1a region, up to the nucleotide region of the S-Protein. As seen in Fig 2, the black arrows indicate an analogous phase transition point appearing close to the coding region of the S-spike genes.
Reply to Reviewer #2:
We particularly thank this referee for finding our research article interesting
and fully within the scope of the PLOS ONE journal.
1. As suggested, the new Abstract now reads: “We apply the new GenomeBits method to uncover underlying genomic features of omicron and delta coronavirus variants. This is a statistical algorithm whose salient feature is to map the nucleotide bases into a finite alternating (±) sum series of distributed terms of binary (0,1) indicators. We show how by this method, distinctive signals can be uncovered out of the intrinsic data organization of amino acid progressions along their base positions. Results show a sort of 'ordered' (or constant) to 'disordered' (or peaked) transition around the coronavirus S-spike protein region. Together with our previous results for past variants of coronavirus: Alpha, Beta, Gamma, Epsilon and Eta, we conclude that the mapping into GenomeBits strands of omicron and delta variants can help to characterize mutant pathogens.”
2. We have made an effort to present our manuscript in an intelligible fashion in standard English. At revision, we corrected a few typographical and grammatical errors.
3. (Same as point 2.)
4. As also replied to the Academic Editor, the Introduction section was revised and completely rewritten. In particular, we added a discussion in the Introduction section about the origins of the SARS-CoV-2 pandemic and cited several new references. The abstract and conclusions have also been improved based on the data presented.
5. In the Conclusion section we now provide a general reinterpretation of the results in the context of the present study and other published evidence, and also briefly discuss implications for future research.
-oOo-
GenomeBits insight into omicron and delta variants of coronavirus pathogen
PONE-D-22-01925R1
Dear Dr. Canessa,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact
Kind regards,
Vladimir Makarenkov
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Reviewers' comments:
Reviewer's Responses to Questions
1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #2: All comments have been addressed
**********
2. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #2: Yes
**********
3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #2: N/A
**********
4. Have the authors made all data underlying the findings in their manuscript fully available?
The
Reviewer #2: Yes
**********
5. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #2: Yes
**********
6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #2: (No Response)
**********
7. PLOS authors have the option to publish the peer review history of their article (
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Reviewer #2:
**********
PONE-D-22-01925R1
GenomeBits insight into omicron and delta variants of coronavirus pathogen
Dear Dr. Canessa:
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.
If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact
If we can help with anything else, please email us at
Thank you for submitting your work to PLOS ONE and supporting open access.
Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Vladimir Makarenkov
Academic Editor
PLOS ONE