In-silico design of a potential inhibitor of SARS-CoV-2 S protein

The SARS-CoV-2 virus has caused a pandemic and is public health emergency of international concern. As of now, no registered therapies are available for treatment of coronavirus infection. The viral infection depends on the attachment of spike (S) glycoprotein to human cell receptor angiotensin-converting enzyme 2 (ACE2). We have designed a protein inhibitor (ΔABP-D25Y) targeting S protein using computational approach. The inhibitor consists of two α helical peptides homologues to protease domain (PD) of ACE2. Docking studies and molecular dynamic simulation revealed that the inhibitor binds exclusively at the ACE2 binding site of S protein. The computed binding affinity of the inhibitor is higher than the ACE2 and thus will likely out compete ACE2 for binding to S protein. Hence, the proposed inhibitor ΔABP-D25Y could be a potential blocker of S protein and receptor binding domain (RBD) attachment.

The virus has four structural proteins, known as S (spike), E (envelope), M (membrane), and N (nucleocapsid) proteins. An envelope-anchored SARS-CoV-2 spike (S) glycoprotein facilitates coronavirus entry into host cells [6,7]. The S proteins (~1200 aa) are class-I viral fusion proteins and exist as trimers with two of the receptor binding sites (RBDs) facing "up" and the third RBD facing "down". The monomeric S protein consists of a large ectodomain, a a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 single-pass transmembrane anchor, and a short intracellular tail at C-terminus [8,9]. A total of 22 N-glycosylation sites are present in S protein of SARS-CoV and SARS-CoV-2 at similar positions. However, SARS-CoV S protein has an extra glycosylation site at N370 [10][11][12][13]. SARS-CoV-2 spike (S) glycoprotein binds to the cell membrane protein receptor angiotensinconverting enzyme 2 (ACE2) to enter human cells [14,15]. Interestingly, SARS-CoV-2 virus does not use other coronavirus receptors such as aminopeptidase N and dipeptidyl peptidase 4 [1]. Following receptor recognition, the S protein is cleaved into S1 and S2 subunits at furinlike cleavage site [16][17][18]. The receptor binding domain (RBD) in S1 directly binds to the peptidase domain (PD) of ACE2 [19,20]. RBD consist of a core structure and a receptor-binding motif (RBM), which interacts with the claw-like structure of ACE2 [21,22]. Foremost, the Nterminal α1/α2 helices of ACE2 engage with the RBM motif. The S1 undergoes transient hinge-like motions to become either receptor accessible or inaccessible. RBD binding to cell receptor ACE2 induces the S1 to dissociate from ACE2, prompting the S2 for membrane fusion [18][19][20].
ACE2 is a type I membrane protein mainly expressed in lungs, heart, kidneys, and intestine [23][24][25]. Downregulation of ACE2 expression is associated with cardiovascular diseases [26]. The full-length ACE2 consists of an N-terminal PD domain and a collectrin-like domain (CLD) [24]. The CLD domain is followed by a single transmembrane helix and~40 aa long intracellular segment [24,25]. The primary physiological function of ACE2 is maturation of angiotensin (Ang). The catalytic PD domain cleaves Ang I to produce Ang-(1-9), which is converted to Ang-(1-7) by other enzymes. ACE2 also converts Ang II to Ang-(1-7) directly. These are peptide hormone that controls vasoconstriction and blood pressure [23].
A recent study revealed that human HeLa cells expressing ACE2 become susceptible to SARS-CoV-2 infection [1]. Overexpression of ACE2 enhances the disease severity in mice [27]. Injecting SARS-CoV S protein into mice further worsen the lung injury. However, the lung injury was reversed by blocking the renin-angiotensin pathway [28]. Thus, for SARS-CoV pathogenesis, ACE2 is not only the entry receptor of the virus but it also protects from lung damage. Furthermore, isolated SARS-CoV monoclonal antibodies are not able to neutralize SARS-CoV-2 [29]. Thus, despite the high sequence and structural similarity, there are notable differences between the SARS-CoV and SARS-CoV-2 RBDs. The affinity between the viral RBD and host ACE2 during the initial attachment of virus determines the susceptibility of hosts to the SARS-CoV infection [22,30,31]. Surface plasmon resonance (SPR) study showed that SARS-CoV-2 spike (S) protein binds to ACE2 with 10-to 20-fold higher affinity than the other SARS-CoV S proteins [32], a likely reason for SARS-CoV-2 high infectivity. Thus, viral entry into the host cell is a critical step in viral infection and could be exploited for therapeutics developments.
There is a rapid ongoing search for therapeutics against SARS-CoV-2. Computational approaches have been employed to discover the potential drugs against SARS-CoV-2 [3,[33][34][35][36]. Drugs targeting either the S protein or main protease have been screened. These approaches have led to the discovery of small molecules with high binding affinities to the aforementioned proteins. However, only a few of these molecules bind at the interface of the RBD−ACE2 complex. For example, hesperidin is predicted to bind at the RBD-ACE2 interface [37]. A short α-helical peptide from the protease domain (PD) of ACE2 showed highly specific and stable blocking to SARS-CoV-2 in a MD simulation study. These peptides blockers cover the full interface of RBD-ACE2 [38].
Till date, there are no clinically approved drugs or antibodies specific for SARS-CoV-2. Here, we have identified a double helical inhibitor that binds very tightly at RBD. The PROD-IGY server predicted a dissociation constant (K D ) of 0.6 nM [39]. MD simulation studies suggest that bound inhibitor is very stable, and it covers the whole RBD-ACE2 interface.
Potentially, it can block the binding of RBD to ACE2 and hence it can be used as a potential inhibitor.

Structural analysis
All the protein structures were downloaded from Protein Data Bank (PDB) and their codes are mention as they appear in the manuscript [40] . The structures were manually visualised in Coot, Pymol and UCSF Chimera [41][42][43]. Structural comparison was done using Pymol. The structure coordinates consisting of interface residues RBD (473-508) and ACE2 (21-100) extracted from PDB 6M17 [44] were used as input for structural homolog search using DALI server [45]. Multiple sequence alignment was done using Clustal omega server [46]. All the figures were prepared using Pymol [42].

Molecular docking
Molecular docking is a powerful tool that can be used to dock the binding of a peptide or ligand at the preferred site and orientation on a macromolecule. Docking programs sample the conformations of the peptide/ligand and then rank these conformations using a scoring function. The RBD fragment (336-518) of spike protein of SARS-CoV-2 (PDB 6M17) was used as a receptor molecule. The truncated peptide (ABP) was docked using HADDOCK2.4 program [47]. HADDOCK2.4 is a freely available platform for observing and analysing protein-protein interactions. Docking by HADDOCK program is driven by prediction of likely residues (called ambiguous interaction restraints (AIRs)) found at the interface. These residues may be active (interacting residue) or passive (nearby interacting residue). Binding interface of RBD and ABP was predicted using CPORT [48] and BIPSPI [49] servers. Before the docking protocol, the pdbs were "cleaned" by removing water and non-bonded ions. The docking results were cross verified using Cluspro 2.0 [50], pyDockWEB [51], and ZDOCK Server [52].

Molecular Dynamics (MD) simulations
The MD simulation was performed for RBD (aa 436-508) and ΔABP-D25Y complex in Gromacs 2020.2 for 100 ns [53]. Simulation inputs were built using CHARMM-GUI web with CHARMM36 force field [54,55]. The energy minimization and equilibration steps comprising gradual reduction of side chain and backbone restraints was carried out for 250 and 500 ps respectively. Time step was 2fs and the trajectory was saved every 10ps. During production run temperature was maintained at 303 K using velocity rescaling. Bond lengths were constrained with the LINCS algorithm. The pressure was controlled by isotropic coupling using Parrinello-Rahman barostat. A Verlet scheme was used for van der Waals and Particle Mesh Ewald electrostatics interactions within 1.2 nm. Van der Waals interactions were switched above 1.0 nm.

Results and discussion
Interface of the ACE2-RBD complex Conformation of ACE2s. The SARS-CoV-2 genome is about 80% and 96% identical with the genomes of SARS-CoV and bat coronavirus BatCoV RaTG13, respectively [56]. Therefore, the binding pattern between the three viral spikes and human ACE2 is expected to be rather similar. Superimposition of ACE2 from various RBD-ACE2 complexes on unbound ACE2 (pdb 1r42) shows that ACE2 superimposes very well (RMSD 0.4-1.0) (S1A Fig). Interestingly, majority of ACE2s maintain their native state upon binding to RBD, however in NL63 coronavirus (NL63-CoV) (pdb 3kbh), low pH-treated SARS-CoV (pdb 6acg), and SARS-CoV-2 (pdb 6vw1), ACE2 α1/α2 (residues25-102) move "upwards" and come nearer to α4. However, α4 and subsequent structures do not show any conformation shifts ( Fig 1A). In these structures, α1/α2 helices rotate about 8.6-11.0˚around the pivot Asn103. Further, residues 287-370 also rotate in a similar manner ( Fig 1A). Thus, in these structures, ACE2 adopts a more compact conformation compared to unbound ACE2. As cryo-EM structure (6acg) also shows such compact conformation, it is likely not crystal packing artefact but rather reflects a functional state of ACE2. Notably, in case of SARS-CoV-2, only a chimeric RBD consisting of SARS-CoV RBD core and SARS-CoV-2 RBM shows compact conformation [20]. However, it is not clear how the compact conformation of ACE2 will affect the RBD binding.

PLOS ONE
that uses ACE2 as receptor. It does not share any structural homology with SARS-CoV-2 virus (S1C Fig). NL63-CoV RBD binds only at hotspot 353 with unique interacting residues [57].
Conformation of RBD-ACE2 interface. The α1-α3 helices (residues 21-102) of ACE2 and the RBM region (residues 473-490 in SARS-CoV and 487-504 in SARS-CoV-2) superimpose very well (Fig 1C). The RMSD and interface RMSD (i-RMSD) of all aligned structures are less than 1.0 Å and 0.5 Å, respectively. While human ACE2 is similar in all structures in Fig  1C, the sequence identity of interface residues of SARS-CoV and SARS-CoV2 is 61%. Thus, ACE2 stabilizes the binding interface on RBD. The 15 residues long β sheet (also called RBM site) surrounded by two capping loops from the S protein interacts with ACE2. The N-terminal α1 helix of the ACE2 (aa 20-52) aligns in parallel with the β sheet of the RBD (Fig 1C). However, the α2 helix of the ACE2 is situated in a different plane and does not interact with RBD. Thus, adaptations of the viral RBD to the N-terminal helix of host ACE2 are critical for viral infections. Mutating one or few residues in RBD could change the binding affinity between the spike protein and ACE2 and thus transform a previously unsusceptible host to a susceptible one [56]. Shang et al, have suggested two virus-binding hot spots in ACE2-hotspot 31 (HS31) and hotspot 353 (HS353) (Figs 1C and 2A). RBM site at these hot spots exhibit more mutations in different corona viruses [20].
Comparison of RBM site of different corona viruses suggest that amino acids Tyr442, Leu472, Asn479, Asp480, and Thr487 (SARS-CoV numbering) affect viral binding to human ACE2. These residues correspond to Leu455, Phe486, Gln493, Ser494, and Asn501 in SARS--CoV-2 RBD, respectively. At hotspot31, Tyr442 interacts with Lys31 of ACE2 in SARS-CoV. In SARS-CoV-2, the corresponding residue Leu455 interacts with Asp30, Lys31, and His34 of ACE2. Because Tyr442 in SARS-CoV is replaced by a less bulky Leu455 in SARS-CoV-2, the salt bridge between Lys31 and Glu35 cannot form (Fig 2A). The residue Gln493 in SARS--CoV-2 RBM forms strong electrostatic interactions with Lys31 and Glu35 in ACE2. Whereas the corresponding residue Asn479 in SARS-CoV interacts only with His34 (Fig 2A). Asn479 is mutated to Lys and Arg in civet and bat, respectively. Its substitution to Lys lowers the affinity by 30-fold [56]. However, its substitution to Gln493 in SARS-CoV-2 enables it to form stronger interaction with ACE2 because of a shorter uncharged polar side chain compared to Lys (Fig 2A).
In contrast, hotspot Lys353 in SARS-CoV-2 is quite similar to SARS-CoV. RBM residue Asn501 (SARS-CoV-2) supports Lys353 in a similar way as Thr487 (SARS-CoV) does. Thr487 in SARS-CoV and Asn501 in SARS-CoV-2 maintain interactions with Tyr41 of ACE2. However, Thr487 substitution to Ser increases the K D by 20-fold [56]. Thus, a bulky polar amino acid is preferred at this position. Its substitution with the bulkier polar amino Asn in SARS--CoV-2 will increase the affinity compared to Thr in SARS-CoV (Fig 2B).
Similarly, mutation Phe486 increases the interaction with Leu79, Met82, and Tyr83 in SARS-CoV-2, whereas corresponding residue Leu472 in SARS-CoV interacts with only Leu79 and Met82 ( Fig 2C). As discussed above, Phe486 is in the large flexible loop (480-CNGVEGFNC-488) of RBD. Due to this, Phe486 is buried deep in the hydrophobic pocket formed by residues Gln24, Phe28, Leu79, Met82, Tyr83, and Leu97 ( Fig 2C). Lastly, the residue Asp480 in SARS-CoV and the corresponding residue Ser494 in SARS-CoV-2 are not involved in direct interactions with ACE2 (Fig 2A). However, the relatively short and neutral side chain of Ser in SARS-CoV-2 sterically favours the interaction compared to Asp at this position in SARS-CoV.
Thus, the unique amino acids at the specific positions in the SARS-CoV-2 S protein energetically favour the contacts with the ACE2 receptor by stabilizing many interactions. Hence, SARS-CoV-2 S protein shows the highest affinity for human ACE2. Therefore, an inhibitor could be designed to destabilize these interactions to block the association of RBD and ACE2.

Structural comparison with non-ACE2 homologs
A search for structurally similar interfaces in PDB database was performed using the DALI server [45]. Apart from ACE2, the most structurally similar proteins with known function were de novo designed peptides used for drug delivery (Table 1). Interestingly, the designed proteins (pdb 6n9h and 6naf) bind the drug amantadine (a C3 symmetric small molecule) and are called ABP (amantadine-binding protein) [58]. Despite low sequence identity, we observe close alignment of ABP with the α1/α2 helices of ACE2 (S2 Fig). Thus, ABP is structurally very similar to the RBD binding region of ACE2. Hence, ABP might compete with ACE2 for binding with SARS-CoV-2 S protein. The fragment α1/α2 (residues 21-100) from PD domain of ACE2 has been suggested as inhibitor for the RBD protein [38]. MD simulation studies have confirmed a strong binding of ACE2 peptides to RBD of SARS-CoV-2. Thus, using this type of peptides with drugs attached to them could prove to be an efficient treatment strategy against the COVID-19. Such peptides together with bound drug could be attached to the surface of nanoparticles.

Design of peptide inhibitor
Based on the above analysis, we set to design a peptide inhibitor that can interact with the viral RBD and prevents its binding to ACE2. The variants of ABP were tested for their ability to bind the S protein. BIPSPI server [49] predicted the binding interface at the RBM site ( Table 3). The peptides were docked on various docking server to confirm the binding. In all the docking studies ABP binds on the RBD surface where α1 helix of ACE2 is known to bind. However, ABP is slightly longer than ACE2 peptides. So, to complement the RBM surface, we removed 4 residues from N-and 10 residues from C-terminus of ABP (called ΔABP). Bound peptide forms strong interaction with RBD and specially interferes with critical residues involved in binding (Figs 3B and 4). PRODIGY server predicted a dissociation constant (K D ) of 2.8 nM (ΔG -11.7 kcal mol -1 ). A close examination of aligned structure of ΔABP and α1/α2 helices of ACE2 suggests that both have similar side chains except Val10/Lys26 and Asp25/ Tyr41 (ΔABP/ACE2) (S2 Fig). Thus, we sequentially introduce the aforementioned mutations into ΔABP (ΔABP-V10K, ΔABP-D25Y and ΔABP-V10K-D25Y) and docked the mutant peptides into RBD ( Table 2). The interaction restraints were generated using CPORT and BIPSPI servers [48,49]. As expected from structural homology of ΔABP with α1/α2 of ACE2, the interaction restraints were same as CoV-SARS-2 RBD and ACE2 complex interface (Table 3). RBM site residues of S protein and all residues of ΔABP were used as active residues for the HADDOCK computation. Best complex was selected based on cluster size, HADDOCK score, and electrostatic energy. HADDOCK score is computed as a weighted sum of van de Waals, electrostatic, desolvation, and ambiguous interaction restraints energies. Electrostatic and van der Waal forces dominate in RBD and ΔABP interactions. The ΔABP-D25Y showed the best HADDOCK A) The α1 and α2 helices of ACE2 (6m17, grey) are structurally very similar to the de novo designed peptides 5j0h (forest), 6naf (blue), 6msq (red), 6msr (magenta), and 6n9h (or ABP; orange). B) The ΔABP inhibitor binds at the SARS-CoV-2 RBM site. C) Sequence alignment of α1 and α2 with ABP peptide sequence. Similar residues are shown in red, whereas identical sequences are highlighted as white letters on a red background. This diagram was prepared using the program ESPript [67].

Interaction of ΔABP-D25Y
HADDOCK docking analysis suggests that all ΔABPs bind at the RBM site of S protein. The bound ΔABPs cover the whole surface of the RBM site. However, ΔABPs binding does not exactly overlap with the α1/α2 of ACE2 (Figs 3B and 4A). The position of α1 helix of both ACE2 and ABP is very similar. However, the α2 helix of ACE2 and ΔABP are situated in a different plane and do not overlap with each other (Figs 3B and 4A). The α2 helix of ACE2 does not form direct contact with the virus RBD whereas both helices of ΔABP interact with the  RBD. Thus, ΔABP covers a larger surface area on RBD than ACE2. Interestingly, Asp25Tyr mutation in ΔABP-D25Y and ΔABP-V10K-D25Y shifts the peptides laterally by~6.5 Å. However, the bound peptide remains in same plane as ΔABP and ΔABP-V10K (Fig 4B). The Leu455 of SARS-CoV-2 RBD is buried deep in the hydrophobic pocket. The Leu455 forms strong hydrophobic interactions with Asn53, Asn54, and ILE57. (Fig 4C). The N-terminus of the peptide makes direct contact with the RBD loop (residues 470-489). Particularly, Phe486 residue makes cation-π interactions with Arg18 ( Fig 4D). Similarly, the Gln493 interacts hydrophobically with Leu24, Leu27, Glu28, and Leu31 of the peptides. The adjacent Ser494 is also in close contact with Leu31 of the peptide (Fig 4D). The residue Asn501 contacts with Leu43 and Asn47 of ΔABP-D25Y (Fig 4E). The mutated residues Asp25Tyr and Val10Lys in ΔABP-V10K-D25Y interact with Glu484 and Ser477, respectively (Fig 5A and 5B). Thus, the individual mutations Val10Lys and Asp25Tyr improve the affinity. However, the combined effect of mutations Val10Lys and Asp25Tyr is detrimental and decreases the affinity of the double mutant inhibitor to SARS-CoV-2 RBD.
Thus, based on the HADDOCK score, the ΔABP-D25Y interacts very strongly with S protein at the RBM site of viral S protein. The biolayer interferometry and surface plasmon resonance experiment showed the dissociation constant (K D ), 1.2 nM and 4.7 nM respectively, for ACE2 and ectodomain S protein interaction [10,59]. The PRODIGY server predicted K D value of ACE2 binding to RBD is 10 nM (ΔG -10.9 kcal mol -1 ). The slight difference between experimental and calculated K D values may be because full-length S protein was used in experiments. Since full length S protein and ACE2 complex structure is not solved yet, so we could not calculate the K D for the complex. The PRODIGY server predicted the dissociation constant (K D ) of ΔABP-D25Y to RBD, 0.6 nM at 25.0˚C (ΔG -12.6 kcal mol -1 ). Thus, ΔABP-D25Y binds RBD with much higher affinity. Particularly, ΔABP-D25Y directly contacts the unique residues Table 3. BIPSPI server prediction of binding interface of RBD and ΔABP peptide.

PLOS ONE
of RBM involved in interaction with ACE2. Protein-protein interactions (PPIs) have emerged as important targets in medicinal chemistry. PPI inhibitors can bind over larger surface area and are therefore more efficient than small chemical ligands [60]. The peptides HR1P and HR2P have been used to inhibit the MERS-CoV infection particularly the replication and fusion steps in calu-3 and HFL cells [61]. Similarly, three peptides SARS WW-III , SARS WW-IV and MHV WW-IV showed antiviral activity [62]. Mutant mucropin-M1 peptide (LFRLIK-SLIKRLVSAFK) is derived from antimicrobial peptide mucropin ((LFGLIPSLIGGLVSAFK) AMP by replacing several Gly and Pro residues either with Lys or Arg. Mutant mucropin-M1 was active against SARS-CoV inhibiting its replication [63]. The two peptides K12 and K29 obtained from nsp10 of SARS-CoV inhibited the replication of SARS-CoV [64]. Similarly, the peptide OC43-HR2P and its optimized form EK1 are derived from the HR2 domain of human coronavirus (HCoV-OC43). The EK1 showed broad viral membrane and host membrane fusion inhibitory activity against multiple human CoVs. Interestingly, the EK1 forms stable helical complex with HR1 domain of S protein of various coronaviruses [65]. Thus, short α helical peptides have proved effective therapeutics against coronaviruses.

MD simulation
Root mean square deviation (RMSD) measures the spatial differences between a starting and simulated structure. Similarly, root mean square fluctuation (RMSF) measures the average displacement of a particle over time from a reference position. The RMSF is useful to identify the rigid and flexible regions in protein. The RMSD and RMSF were calculated between initially docked structure and simulated structure for all protein atoms (Fig 5C and 5D). The RBD and ΔABP-D25Y complex has overall low RMSD (~0.5 nm). RMSD increases in the beginning up to 18 ns. The plateau after 18 ns indicates that the complex does not fluctuates too much from reference ( Fig 5C). A sudden jump in RMSD at around 70 ns is possibly due to the flexibility of capping loop ( Fig 5C) and end terminal residues. However, the possibility of insufficient sampling due to rough energy landscapes cannot be ruled out [66]. The RBD capping loop (residues 472-488) is flexible during simulation (Fig 5D). The RMSD of interface of RBD and ΔABP-D25Y complex is very low (~0.2 nm). In contrast, the RMSD of capping loop is very close to the overall complex ( Fig 5C). So, the overall RMSD (~0.8 nm) is mainly contributed by the capping loop and similar flexible region of the complex. However, the interface of the complex is stable, and the inhibitor does not dissociate from RBD. It is important for inhibitors to have complementary conformation matching its target to have better affinity. An inhibitor should have selective binding and low RMSD for the critical amino acids. Such peptides can be easily be chemically synthesized or, alternatively, can be produced in large quantity by recombinant overexpression. Affinity can be further improved by attaching many such inhibitors on the surface of nanoparticles, dendrimers, and clusters. Though we have verified our findings by various in silico methods, further experimental verification is warranted.

Conclusion
By using computational methods, a protein inhibitor was designed to target the viral S protein.
The inhibitor was structurally based on the α helical region of ACE2 at the S protein binding interface. Docking and MD simulation studies suggest that the ΔABP-D25Y could acts as a potential blocker of SARS-CoV-2 infection. It covers the entire RBM surface and its shape is complementary to the RBD groove. Thus, it fits nicely at the RBD groove and block S proteins attachment to ACE2. Predicted binding affinity of the inhibitor is higher than the ACE2 and thus will compete with ACE2 for binding with S protein. Thus, proposed inhibitor could be a potential blocker of S protein and RBD association and thereby hinder virus entry to cells.