Molecular dynamics analysis of N-acetyl-D-glucosamine against specific SARS-CoV-2’s pathogenicity factors

The causative agent of the pandemic identified as SARS-CoV-2 leads to a severe respiratory illness similar to SARS and MERS with fever, cough, and shortness of breath symptoms and severe cases that can often be fatal. In our study, we report our findings based on molecular docking analysis which could be the new effective way for controlling the SARS-CoV-2 virus and additionally, another manipulative possibilities involving the mimicking of immune system as occurred during the bacterial cell recognition system. For this purpose, we performed molecular docking using computational biology techniques on several SARS-CoV-2 proteins that are responsible for its pathogenicity against N-acetyl-D-glucosamine. A similar molecular dynamics analysis has been carried out on both SARS-CoV-2 and anti-Staphylococcus aureus neutralizing antibodies to establish the potential of N-acetyl-D-glucosamine which likely induces the immune response against the virus. The results of molecular dynamic analysis have confirmed that SARS-CoV-2 spike receptor-binding domain (PDB: 6M0J), RNA-binding domain of nucleocapsid phosphoprotein (PDB: 6WKP), refusion SARS-CoV-2 S ectodomain trimer (PDB: 6X79), and main protease 3clpro at room temperature (PDB: 7JVZ) could bind with N-acetyl-D-glucosamine that these proteins play an important role in SARS-CoV-2’s infection and evade the immune system. Moreover, our molecular docking analysis has supported a strong protein-ligand interaction of N-acetyl-D-glucosamine with these selected proteins. Furthermore, computational analysis against the D614G mutant of the virus has shown that N-acetyl-D-glucosamine affinity and its binding potential were not affected by the mutations occurring in the virus’ receptor binding domain. The analysis on the affinity of N-acetyl-D-glucosamine towards human antibodies has shown that it could potentially bind to both SARS-CoV-2 proteins and antibodies based on our predictive modelling work. Our results confirmed that N-acetyl-D-glucosamine holds the potential to inhibit several SARS-CoV-2 proteins as well as induce an immune response against the virus in the host.


Introduction
The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has led to the pandemic cases with significant rises in number of patients in the world. The main theory supported by scientific evidence was the outbreak started from a local seafood market in Huanan showed that human-to-human transmission of the virus was not limited [1]. Globally, pandemic has already caused more than 160 million confirmed cases, in nearly 210 other countries (including around 5 million cases in Turkey, 4.9 million cases in Russia and 371.000 cases in the Republic of Belarus), which has become a unique public health event attracting public attention [2].
SARS-CoV-2 is a positive-chain RNA virus that belongs to the beta group of coronaviruses. The SARS-CoV-2 genome consists of approximately 29.700 nucleotides and has 79.5% sequence identity with SARS-CoV. It has a long polyprotein ORF1ab at the 5' end, which encodes 15 or 16 non-structural proteins. The 3' end of the genome encodes 4 major structural proteins, including the spike protein (S), the nucleocapsid protein (N), the membrane protein (M), and the envelope protein (E) [3,4]. SARS-CoV-2 binds to the receptor of angiotensinconverting enzyme 2 (ACE2) on the host cell for virus penetration and subsequent pathogenesis, leading to severe respiratory disease with symptoms of fever, cough, shortness of breath and severe cases that could be fatal [5,6].
Coronaviruses have error-prone RNA-dependent RNA polymerases, mutations, and recombination events occurring frequently that raise the concern regarding its rapidly strengthening and its increased capacity to cause disease [7]. Many mutations detected on the virus genome suggest the formation of the strains with high pathogenicity and contagiousness, which makes the control of the pathogen quite difficult [8]. In our previous study, we have investigated unvarying regions with less mutations than the other parts of SARS-CoV-2 genome obtained from 134 different genome sequences of the GISAID database from distinct parts of the world. The amino acid sequence of the conserved region (ORF1ab region) was obtained, then subjected to homology modeling and introduced N-acetyl-D-glucosamine (D-GlcNAc) as a potential inhibitor for the selected proteins; spike receptor-binding domain bound with ACE2 (PDB 6M0J) and RNA-binding domain of nucleocapsid phosphoprotein (PDB 6WKP) from SARS-CoV-2 [9]. Previous studies have also indicated that the effectiveness of D-GlcNAc against influenza and as a starting material of oseltamivir, a purposed drug for SARS-Cov-2 treatment, D-GlcNAc has been transformed into azido group followed by implantation of a 3-pentoxy group of the desired chemical structure [10,11]. While, oseltamivir has very complex structure which causes side effects involving nausea, vomiting, headaches, kidney, and psychiatric events [12]. In another study, Song and colleagues reported that the O-GlcNAcylation of mitochondrial antiviral-signaling protein (MAVS) which is a key mediator of interferon signaling that plays role in regulation to activate the host innate immunity against RNA viruses [13]. Accordingly, a previous research demonstrated that, azithromycin can be combined with glucosamine early in the course of RNA virus infections, which could aid the control of enhanced type 1 interferon induction [14]. Currently, there are some specific antiviral vaccines or therapies to treat SARS-CoV-2, but their long-lasting effects remain unclear. We should also consider drug repurposing due to its beneficial properties for searching new advantages or purposes of the existing drugs that could reduce the cost and time of drug development and lower the risk of unexpected side effects of the drug [15,16].
In this present study, we focused on the interaction of four different proteins playing major role in SARS-CoV-2 pathogenesis with D-GlcNAc by using molecular docking and further validating the results with molecular dynamics [9].

Preparation of structure files for molecular docking analysis
As for structure preparations, all of the four protein PDB files had their water and heteroatoms (except for ions) removed using UCSF Chimera, then they were loaded into AutoDock 4.2 where only polar hydrogen atoms and Kollman charges were added, the structures were then exported as PDBQT files. As for the ligand, it was prepared using a similar procedure with further geometrical optimization using the MMFF94 forcefield [22][23][24].

Molecular docking analysis
Docking experiments were performed using AutoDock Vina on a Linux cluster [25]. Docking of 6M0J and 6WKP were performed with exhaustiveness of 64 and 24 modes, the grid box calculated for each were 25 x 50 x 20 Å from their center (-12.972, 19.833, 0.667) and 14 x 16 x 12 Å from their center (4.36, -5.444, 22.056) along their X, Y, and Z axes, respectively. Blind docking experiments were performed for 6X79 and 7JVZ with exhaustiveness of 128 and 32 modes, the grid box calculated for each were 122.456 x 130.663 x 149.16 Å and 40.222 x 63.997 x 61.273 Å from the center of the protein along their X, Y, and Z axes, respectively. The best docking pose with the highest affinity (lowest kcal/ mol) was selected.

Preparation of structure files for molecular dynamics analysis
The best docking pose for each structure was produced using UCSF Chimera's ViewDock plugin and later the protein and ligand (D-GlcNAc) were saved in separate files. The ligands poses were submitted to LigParGen for the generation of topology and parameter files using OPLS-AA forcefield [26][27][28]. The protein structure files for each protein were generated using OPLS-AA/M topology for proteins in VMD using the psfgen builder plugin [27,29]. The protein structure file (PSF) and the protein data bank file (PDB) for each protein was merged with its corresponding ligand PSF and PDB files (from LigParGen) into a single protein-ligand complex PSF and PDB files. Each protein-ligand complex was solvated with water in a rectangular box, the box dimensions were determined such that every edge was 5 Å away from the complex. The box was further neutralized with 0.15 mol/ L NaCl such that a distance of 5 Å was maintained between the ions and the solute, and between the ions themselves.

Molecular dynamics analysis
Molecular dynamics simulations were performed using the University of Illinois Nanoscale Molecular Dynamics (NAMD) software [30]. Each simulation was run using the solvated and ionized protein-ligand complex with OPLS-AA/M protein parameter files and their respective D-GlcNAc parameter files from the LigParGen server, as for the water and ions, the topology file from CHARMM-GUI was utilized [31]. All the protein-ligand complexes were minimized (energy minimization) with the steepest descent algorithm under NVT ensemble at 310 K for 50.000 steps except 6M0J which was minimized with NVT ensemble at 310 K for 100.000 steps, the graphs of backbone Root-mean-square deviation (RMSD) against the frames were plotted to check if the systems were successfully minimized. After confirming the minimization, each protein-ligand complex was simulated for 1.000.000 steps (�1 ns), each time step was set to 1 fs and the outputs were written to the trajectory every 50 steps, the simulations were run under Periodic Boundary Conditions (PBC) with space partitioning cutoff set to 10, initial and bath temperature set to 310 K with Langevin dynamics at the same temperature. All the water molecules and the protein-ligand complex were wrapped around the PBC. The graph of backbone RMSD against the frames were plotted for each simulation to analyze the protein-ligand complexes' stability.

D-GlcNAc-antibody affinity analysis
Two neutralizing antibodies, one specific to the SARS-CoV-2 spike protein from PDB: 7JV2 and another anti-Staphylococcus aureus from PDB: 6P9H were extracted from their structures via UCSF Chimera, the structures were prepared for docking with the same configurations in the previous step, and blind docking was performed via AutoDock Vina. The top pose for each antibody-D-GlcNAc was simulated via NAMD with the same procedures and configurations as the previous molecular dynamic simulation. The backbone RMSD of the antibodies and D-GlcNAc was plotted for further analysis.

Molecular docking
All the results are promising as D-GlcNAc has shown a decent affinity to each of the protein structure. Table 1 shows the results of the best-docked pose in the molecular docking experiment using AutoDock Vina. The docking poses corresponding to the values in Table 1

Molecular dynamics and all analyses
The RMSD for the backbone (C, Cα, and N) of each protein-ligand complex against their minimization plot was analyzed to check if the minimization steps have successfully minimized the protein-ligand complex, as shown in Fig 5, all the slopes are plateauing near the end frames hence 100.000 steps for 6M0J and 50.000 steps for 6WKP, 6X79 and 7JVZ have produced a minimized structure for the molecular dynamic simulation. Similarly, the backbone RMSD of the protein in protein-ligand complex against their corresponding frames were extracted and plotted, the RMSD of the ligand within the same complex was also plotted in the same graph (Fig 6). The data used for plotting the graphs of both the minimization and production runs are included in S2 Data.
Both the docking analysis of S2H13 neutralizing antibody Fab fragment and that of anti-Staphylococcus aureus antibody (STAU-281 Fab) with D-GlcNAc has shown an affinity of -6.5 kcal/ mol and -6.2 kcal/ mol which satisfies our hypothesis that D-GlcNAc has a significant affinity towards human antibodies (Fig 7) regardless its specificity hence, it has the potential to induce the immune response by bridging a bond between SARS-CoV-2 and antibodies. The

PLOS ONE
RMSD analysis of both the antibodies tested has shown that D-GlcNAc remains at very close proximity to the antibodies (Figs 7 and 8), supporting our hypothesis further. The detailed logfile from the AutoDock Vina and the data used for the RMSD plot are included in S3 and S4 Datas, respectively.

Discussion
The structural differences considering the active sites of both Mpro proteins cause difficulties in modelling for molecular target. In fact, computational prediction studies carried out involve a massive virtual screening for Mpro inhibitors of SARS-CoV-2 using Deep Docking [33]. The recent studies focused on virtual screening for putative inhibitors of the same main protease of SARS-CoV-2 relied on the clinically approved drugs [34][35][36][37][38][39] and the compounds according to different databases [40][41][42]. Molecular docking is a helpful tool to estimate the binding affinity

PLOS ONE
between protein and ligand by using scoring functions [43]. Molecular dynamics simulation as a computational method provides insights into the interaction and atoms according to laws of physics [44].
It is also important to mention the British, South African, and Brazilian variants of SARS--CoV-2 have circulated in the population (as of the time of this article). These variants have shown higher transmissibility due to the collection of few mutations in its spike protein sequence [45][46][47]. The findings of Cheng and colleagues provided a higher affinity to the human ACE2 receptor, among the mutations, are the N501Y, K417N, and E484K mutations [46]. All these mutations have affected the receptor-binding domain of the spike protein. We assumed that the loading of D-GlcNAc binding on proteins playing role in pathogenicity and the recognition of SARS-CoV-2 entering host cell through ACE2 receptor side could

PLOS ONE
be increased by the immunologic response. Additionally, D-GlcNAc, which is present in the cell wall of bacteria, has been the reason for the recognition of human IgG1 mAb (F598) through a large groove-shaped binding site cause of the entire light-and heavy-chain interface embedded at least five GlcNAc residues receptor affinity, which makes it possible to recognize the bacterial cells [48,49]. Therefore, it can be suggested that D-GlcNAc can shift immune response, which is responsible for combating with the invading bacteria [49,50], to SARS--CoV-2. In a previous study, D-GlcNAc cluster antigens (TGCA) account for the highest immunoreactivity [51], which could be suggested for the activation of the immune system against SARS-CoV-2. In another study on HIV-1, cause of its structures involving along with glycan-array binding with the variation of the oligosaccharide have shown an existing affinity that affects the neutralizing of antibodies targeting the glycan-shielded trimer on the virus [52]. Correspondingly, this structure and its binding with molecular dynamics could affect the neutralizing of antibodies targeting the virus, we can postulate that this process could change in favor of recognition by antibodies when D-GlcNAc is administrated. This D-GlcNAc stimulated mechanism could also increase the defense capacity of the immunologic response to SARS-CoV-2 invasion. Furthermore, N-acetyl-D-glucosamine-coated polyamidoamine structures induced upregulation of antibody formation on the rat recombinant cells [50] that the similar response could be expected in the human cell.

Conclusion
The SARS-CoV-2 epidemic has been a major reason threatening human health and resulted in an important economic recession in the world. Our previous findings indicated that D-GlcNAc is a potential compound suggesting strong interaction with proteins involving the ORF1ab region. The data of our previous study showed a drastically unvarying sequence fragment in whole paired genome sequences that can be used as a target origin for designing an effective drug to SARS-CoV-2 [9]. Our results stressed that D-GlcNAc could be considered as a therapeutic option after testing its different concentrations which would be repositioned against SARS-CoV-2.

PLOS ONE
In the present study, we have carried out molecular dynamics analysis on different proteins of SARS-Cov-2 besides the verification of our previous findings related to binding possibilities with D-GlcNAc and docking analysis. Our results confirmed and highlighted the strong binding of D-GlcNAc to the domain region of the selected proteins on aggressive mutant variants (D614G). We believe that the bacterial cell recognition system could be directed to virus neutralization with the help of D-GlcNAc administration, which results in attaching on both virus and antibodies surface that could increase the epitope binding possibilities via recognition surface, besides the other antibodies. We strongly recommend testing of D-GlcNAc with designed and conducted randomized clinical trials in patients due to its high potential as a repurposed compound.