Fig 1.
Design of chimeric spike proteins.
The process of chimeric S1/S2 spike protein design is illustrated for a sample betacoronavirus sequence. The sequence of SARS-CoV-2 (structure rendered in magenta) is spliced with that of Bat Hp-betacoronavirus/Zhejiang2013 (predicted structure rendered in cyan) at the predicted S1/S2 junction. The predicted structure of the chimera is rendered in a mix of cyan and magenta representing the parental sequences for each domain. pLDDT (predicted local distance difference test) scores from AlphaFold for both the parental domains and chimeric sequences are calculated, and these are averaged per residue to yield the relative stability score.
Fig 2.
Relative sequence entropy among analyzed coronavirus spike proteins.
Sequence entropy was calculated for a multiple sequence alignment of all coronavirus sequences used and rendered on the 6VSB structure, where green represents lowest sequence entropy and blue represents highest. As expected, the receptor-binding domain and N-terminal domain have the greatest sequence entropy, and loops tend to have higher entropy than adjacent structural elements.
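The per-position entropy described above can be sketched as follows. This is a minimal illustration, assuming Shannon entropy in bits computed per alignment column with gap characters ignored; the caption does not state the logarithm base or the gap handling, so both are assumptions, and the function names and toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_entropy(column, ignore_gaps=True):
    """Shannon entropy (bits) of one alignment column.

    Assumption: gaps ('-') are dropped before counting residues.
    """
    residues = [c for c in column if not (ignore_gaps and c == "-")]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # sum of p * log2(1/p) over the observed residue types
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def alignment_entropy(aligned_seqs):
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*aligned_seqs)]

# Hypothetical toy alignment, not data from the study.
toy_alignment = ["MKV-L", "MKI-L", "MRVAL", "MQVAL"]
entropies = alignment_entropy(toy_alignment)
```

Fully conserved columns score zero and variable columns score higher, which is the quantity mapped onto the structure as a color scale.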
Fig 3.
Predicted stability of coronavirus spike chimeras.
The relative stability score for each S1/S2 chimera is plotted against the sequence similarity between the parental S1 sequence and SARS-CoV-2. The similarity between the two domain sequences was determined using EMBOSS [49]. Sequences fall into two broad groups: one with high similarity and a gain in relative stability, and one with low similarity and a loss in relative stability. The outliers with low similarity but comparatively high relative stability are of particular interest for immunogen design. Predicted high-stability chimeras selected for simulation are plotted in black, and predicted low relative-stability chimeras used as controls are plotted in magenta. Additional low overall-stability controls are plotted in purple. Data are tabulated in S4 Table.
Fig 4.
Predicted and simulated structures of top-scoring chimeras.
Structures of two of the top-scoring chimeras are rendered as follows. Chimeras from Rousettus bat coronavirus GCCDC1 (panels (a) and (b)) and Hedgehog coronavirus 1 (panels (c) and (d)) are shown before and after 100 ns of molecular dynamics simulation (left column), with the before and after structures also superimposed on renderings of the SARS-CoV-2 6VSB PDB structure (panels (b) and (d), dark green). The chimera from Bat Hp-betacoronavirus Zhejiang2013 is rendered in panel (e), with a zoomed rendering in panel (f) showing the N-terminal and furin-cleavage loops that were initially modeled as extended becoming more compact over the simulation. For each chimera, the structure before simulation is rendered in a lighter shade and after simulation is rendered in a darker shade. The 6VSB SARS-CoV-2 structure has one receptor-binding domain in the “up” conformation, whereas both chimeras rendered are modeled as fully down and remain so throughout the simulation.
Fig 5.
Structural stability of simulated chimeras.
Molecular dynamics simulations were computed for predicted structures, and root-mean-squared deviation (RMSD) from a reference structure is plotted versus time. Plots are given for (a-b) the eight chimeras in the low-sequence-similarity cluster with the highest predicted relative stability and (c-d) five low-stability controls. SARS-CoV-2 is plotted in both panels as an additional control. Panels (a,c) plot RMSD from the starting structure, whereas panels (b,d) plot RMSD from the ending structure. As expected, the low-stability controls showed the greatest RMSD, with one showing gross structural changes. The eight predicted-stable chimeras and SARS-CoV-2 all showed an initial increase in RMSD followed by a stabilization over the course of the 100-ns simulation. This can also be seen in RMSD plots relative to the end of the simulation (S6 Fig).
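The two RMSD views (relative to the starting and to the ending structure) can be illustrated with a short sketch. It assumes that frames have already been superimposed on the reference and that each frame is a list of (x, y, z) coordinates for the same atoms; the caption does not specify the atom selection or alignment procedure, and the function names and toy trajectory are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumption: the two coordinate sets are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (a[k] - b[k]) ** 2
        for a, b in zip(coords_a, coords_b)
        for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))

def rmsd_series(trajectory, reference_index):
    """RMSD of every frame from the frame at reference_index."""
    reference = trajectory[reference_index]
    return [rmsd(frame, reference) for frame in trajectory]

# Hypothetical three-frame, two-atom trajectory drifting along z.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
from_start = rmsd_series(toy_trajectory, 0)   # analogous to panels (a,c)
from_end = rmsd_series(toy_trajectory, -1)    # analogous to panels (b,d)
```

A trajectory that has equilibrated shows a plateau in the first series and a value near zero over the final stretch of the second.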
Fig 6.
Per-residue structural stability of simulated chimeras.
Root-mean-squared fluctuation (RMSF) values are plotted per residue for each of the simulated chimeras. Values were calculated on nanosecond intervals throughout the simulation trajectory. Panel (a) shows the predicted high-stability chimeras, and panel (b) shows the predicted low-stability chimeras. SARS-CoV-2 is included as a comparator. As expected, major loops as well as the C- and N-termini show the greatest fluctuation in the high-stability chimeras, and the low-stability chimeras have globally greater fluctuations.
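For reference, per-residue RMSF can be sketched as the root-mean-squared displacement of each position from its own time-averaged position. The sketch assumes one representative atom per residue, frames sampled at regular intervals, and prior superposition of frames; none of these details are stated in the caption, and the names and toy data are hypothetical.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF over a trajectory.

    trajectory: list of frames, each a list of (x, y, z) tuples with the
    same atom ordering. Assumption: frames are already superimposed.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # time-averaged position of atom i
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # mean squared displacement from that average
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msf))
    return values

# Hypothetical toy data: atom 0 is rigid, atom 1 oscillates along x.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
fluctuations = rmsf(toy_trajectory)
```

Flexible loops and termini correspond to positions like atom 1 here, with large values; rigid core positions behave like atom 0.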
Fig 7.
Expression of chimeric spike proteins on purified pseudoviral particles as assessed by single-particle immunostaining.
Pseudoviral particles were purified, biotinylated, immobilized in a microfluidic flow cell via a biotin-neutravidin-biotin linkage, and then immunoassayed using an anti-S2 antibody. The number of spike-positive particles was then scaled by the total protein in the pseudoviral preparation to give a relative measure of spike expression per pseudovirus, which is plotted here. Eidolon Bat and Bat GCCDC1 showed substantially better expression than native SARS-CoV-2; Bat2006 was comparable to native SARS-CoV-2, while Zhejiang2013 was somewhat lower. BatHKU25 and the two predicted-unstable chimeras, Swine and Sorex T14, showed substantially worse expression, likely indicating spike instability at the stage of expression and pseudoviral budding. Error bars indicate 90% confidence intervals from bootstrap resampling across fields of view.
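The bootstrap error bars can be illustrated with a percentile bootstrap over fields of view. This is a sketch under stated assumptions: the statistic is taken to be the mean spike-positive count per field divided by total protein, and the interval is a simple percentile interval; the caption does not specify either choice. The function name, parameters, and counts below are hypothetical.

```python
import random

def bootstrap_ci(per_field_counts, total_protein, n_boot=2000, level=0.90, seed=0):
    """Percentile bootstrap CI for protein-scaled spike-positive counts.

    per_field_counts: spike-positive particle count in each field of view.
    total_protein: total protein in the preparation (arbitrary units).
    Assumption: the statistic is mean count per field / total protein.
    """
    rng = random.Random(seed)
    n = len(per_field_counts)
    stats = []
    for _ in range(n_boot):
        # resample fields of view with replacement
        sample = [per_field_counts[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n / total_protein)
    stats.sort()
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * n_boot)]
    upper = stats[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    point = sum(per_field_counts) / n / total_protein
    return point, lower, upper

# Hypothetical counts from six fields of view, not data from the study.
point, lower, upper = bootstrap_ci([12, 15, 9, 14, 11, 13], total_protein=2.0)
flat = bootstrap_ci([10, 10, 10], total_protein=2.0)
```

Resampling whole fields of view, rather than individual particles, keeps any field-to-field variation in immobilization density inside the interval.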
Fig 8.
Thermal stability of pseudoviruses expressing chimeric spike proteins.
Plotted are dF/dT values for pseudoviruses expressing SARS-CoV-2, 5 predicted-stable chimeric spikes, 2 predicted-unstable chimeric spikes, and bald pseudovirus produced with no spike. The dF/dT values (first derivative of the 350/330 nm fluorescence ratio with respect to temperature) are plotted in panel a, and the background-subtracted dF/dT values are plotted in panels b-c, using bald pseudovirus as an estimator of non-spike background. Fluorescence ratios were smoothed using locally weighted scatterplot smoothing and showed peaks that were robust to choice of smoothing parameter over the range 40-80, with a smoothing parameter of 50 shown in a-b and 80 shown in c. Eidolon bat chimeras showed stability comparable to native SARS-CoV-2, while Bat GCCDC1 chimeras were less well determined but consistent with thermal stabilization relative to SARS-CoV-2. Bat 2006 chimeras showed lower signal but were likely similar to SARS-CoV-2. Predicted-unstable chimeras had poor expression on pseudovirus produced in cell culture at 37 °C (Fig 7) and were thus likely unstable at the point of expression and pseudoviral budding.
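The processing chain described above (smooth the 350/330 nm ratio, differentiate with respect to temperature, subtract the bald-pseudovirus background) can be sketched as follows. Note that this sketch substitutes a simplified single-pass local linear smoother with tricube weights for full locally weighted scatterplot smoothing (no robustness iterations), and that the window size `k` is a hypothetical stand-in for the smoothing parameter, whose exact definition the caption does not give. The melting curves are synthetic.

```python
import math

def local_linear_smooth(x, y, k):
    """Tricube-weighted local linear fit over the k nearest points.

    Simplified stand-in for LOWESS (single pass, no robustness step).
    """
    k = max(3, min(k, len(x)))
    smoothed = []
    for x0 in x:
        h = sorted(abs(xi - x0) for xi in x)[k - 1]  # local bandwidth
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

def derivative(x, y):
    """Finite-difference dy/dx (central inside, one-sided at the ends)."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((y[hi] - y[lo]) / (x[hi] - x[lo]))
    return out

# Synthetic data: linear drift in both curves, plus a transition at 60 C
# in the spike-bearing sample only. Not data from the study.
temps = [float(t) for t in range(25, 96)]
bald = [0.8 + 0.0005 * (t - 25) for t in temps]
sample = [b + 0.2 / (1 + math.exp(-(t - 60) / 2)) for b, t in zip(bald, temps)]

k = 15  # hypothetical window size
d_sample = derivative(temps, local_linear_smooth(temps, sample, k))
d_bald = derivative(temps, local_linear_smooth(temps, bald, k))
subtracted = [s - b for s, b in zip(d_sample, d_bald)]
peak_temp = temps[subtracted.index(max(subtracted))]
```

Repeating the last few lines with different values of `k` is one way to check, as the caption reports, that the peak position does not depend strongly on the smoothing choice.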