Potential bioactive glycosylated flavonoids as SARS-CoV-2 main protease inhibitors: A molecular docking and simulation studies

A novel coronavirus responsible of acute respiratory infection closely related to SARS-CoV has recently emerged. So far there is no consensus for drug treatment to stop the spread of the virus. Discovery of a drug that would limit the virus expansion is one of the biggest challenges faced by the humanity in the last decades. In this perspective, to test existing drugs as inhibitors of SARS-CoV-2 main protease is a good approach. Among natural phenolic compounds found in plants, fruit, and vegetables; flavonoids are the most abundant. Flavonoids, especially in their glycosylated forms, display a number of physiological activities, which makes them interesting to investigate as antiviral molecules. The flavonoids chemical structures were downloaded from PubChem and protease structure 6LU7 was from the Protein Data Bank site. Molecular docking study was performed using AutoDock Vina. Among the tested molecules Quercetin-3-O-rhamnoside showed the highest binding affinity (-9,7 kcal/mol). Docking studies showed that glycosylated flavonoids are good inhibitors for the SARS-CoV-2 protease and could be further investigated by in vitro and in vivo experiments for further validation. MD simulations were further performed to evaluate the dynamic behavior and stability of the protein in complex with the three best hits of docking experiments. Our results indicate that the rutin is a potential drug to inhibit the function of Chymotrypsin-like protease (3CL pro) of Coronavirus.

The virus is responsible of an elevated number of deaths as its transfer speed is higher than other coronaviruses. SARS-CoV-2 infection has then become a threat to public health. This disease caused by SARS-CoV-2 is affecting patients of a mild disease in most of them. However, in few cases, patients develop severe acute respiratory distress syndrome [13].
Until today, no specific vaccine has been developed to stop the spread of the virus and the need of a cure is now urgent. Viruses possess the ability to adapt and mutate quickly, leading research to find new therapeutic approaches. The main approach is to block the life cycle of the virus or to interfere in its fusion with the membrane so that the life cycle can be interrupted [14][15][16][17].
Another approach is the development of new drugs based on computational research tools. These tools are now fundamental to gain time and accuracy for developing new pharmacophores [18][19][20]. The process of identifying novel antiviral drugs to face the disease can rely on the screening of existing natural compounds known for their antiviral activity [21].
Flavonols, the most widely spread flavonoids (Fig 1) are characterized by a 3-hydroxy-2-phenylchromen-4-one backbone; and quercetin is the first representative compound and the most extensively investigated. Recent investigations also pointed out the antiviral activity of quercetin against a wide spectrum of influenza virus strains. It interacts with influenza hemagglutinin protein, thereby inhibiting viral-cell fusion [37]. Several other flavonols and derivatives have antivirals activities as a sulfated substituted rutin, showing significant activity against different HIV-1 isolates [38]. Several flavonoids are known by inhibiting virus proliferation either by stopping virus multiplication or by blocking cell entry. Indeed, several natural compounds including baicalin, scutellarin, hesperetin, nicotianamine and glycyrrhizin were predicted to have a capacity to binding ACE2 receptor with potential anti-SARS-CoV-2effects [36]. Moreover, quercetin, daidzein, puerarin, epigallocatechin, epigallocatechingallate, gallocatechingallate and kaempferol were reported to inhibit the proteolytic activity of SARS-CoV 3CL hydrolase [39]. Proteases and spike proteins are targets of choice for inhibition of SARS and MERS, and constant work is made to find novel molecules which can be able to interfere with virus life cycle either by computational methods or by experimental investigation [17,[40][41][42].
The SARS-CoV-2 main protease (Mpro), a chymotrypsine like hydrolase (3CL hydrolase) is an attractive target for anti-CoV drug design due to its responsibility for the maturation of main functional enzymes such as replicase and helicase [43][44][45].
In this work, a screening of natural flavonoids has been made and their potential inhibitory effect against SARS-CoV-2 Mpro by molecular docking and molecular dynamics has been tested. Indeed, our computed data suggest that strong interaction occur between flavonoids with sugar moieties and the main protease of the SARS-CoV-2virus. These results can represent a promising starting point for antiviral therapies that are alternative to the vaccination strategy.

2-1-Ligand and protein preparation
The three-dimensional structure of flavonoids were obtained from PubChem, in.sdf format and the structures were optimized using the Avogadro software and saved as mol2 files. The 3D structure of SARS-CoV2 main protease (Mpro, 3CLpro) (PDB ID: 6LU7) protein was obtained from Protein Data Bank (https://www.rcsb.org/). The resolution of the retrieved structure was 2.16 Å. The Mpro 3D structure was loaded into UCSF Chimera for molecular docking preparation [46].

2-2-Molecular docking
The inhibitory effect of thirty eight flavonoids on SARS-CoV-2 Mpro was studied by docking experiments using Autodock Vina software. The protease crystal structure consists of a 306 residues length chain with an excellent resolution of 2.16 Å co-crystallized with an engineered peptide inhibitor (N3) [47]. UCSF Chimera was used to remove all water molecules, heteroatoms, and co-crystallized solvent. The inhibitor was also abstracted from the PDB file and saved as a PDB file and used as positive control during docking studies. Finally, polar hydrogens and partial charges were added to the structure using UCSF Chimera point for antiviral therapies that are alternative to the vaccination strategy [46]. Molecular docking was finally performed by the AutoDock Vina [48] program in a rectangular box. The retained flavonoids were those with conformations of lower binding energy which are matching the active site defined by the co-crystallized N3 inhibitor.

2-3-Molecular dynamics simulation
Molecular dynamics simulations were performed with the GROMACS 2020 package using CHARMM36 force field installed on Linux system [49,50]. The protein topology was prepared with the pdb2gmx tool. The protein complexes were solvated in a triclinic box.
Energy minimization of the system was performed by using the steepest descent algorithm [51]. Protein was solvated by TIP3 solvent model and 4 Na + ions were introduced into the system to make the whole system neutral. The complexes consisting of the flavonoids and the Mpro were solvated in a water box with a minimal margin of 1.5 Å from any edge to any protein atom. An overall pressure and a temperature equal to 1 bar and 300 K were used with a time gap of 2 fs to stabilize the system. Temperature was kept constant inside the box, the vrescale, an optimized Berendsen thermostat temperature coupling technique, was used. Parrinello-Rahman pressure coupling method was utilized in NPT equilibration. Energy minimization of the system was carried out by the steepest descent algorithm for 50,000 iteration steps and cut-off up to 1000 kJ.mol -1 to reduce the steric clashes. Particle mesh Ewald (PME) was employed to treat long-range electrostatics interactions with a Coulomb cut-off of 1.0 nm. The system was subjected to a final 50 ns molecular dynamic run each step of 2 fs.

2-4-Trajectory analysis and protein-ligand interaction energy analysis
Trajectory analysis was performed using various GROMACS analysis tools. The gmx rms, and gmx rmsf tools were respectively used to calculate the root mean square deviation (RMSD) and the root mean square fluctuations (RMSF) of protein. Hydrogen bonds were analyzed using gmx hbond tool, and the radius of gyration with the gyrate tool. Finally, plots were prepared using Grace Software.

3-1-Molecular docking
The crystallographic structure of 3C like protease of SARS-CoV-2 (PDB id: 6LU7) in complex with a peptide-like inhibitor (N3) was made very recently available in the Protein Data Bank. From the above mentioned structure, we can clearly see a closed binding pocket around the peptide-like inhibitor.
Docking of all flavonoid structures (Fig 2) was done against the active site of Mpro protein. Docking analysis provided several configurations that were scored to determine favorable binding modes. The flavonoid structures with the highest docking scores are summarized in Table 1. The three compounds with highest affinity (Fig 2) for the active site are quercetin 3-rhamonoside, myricetin 3-rutinoside and rutin with binding energies of -9.7, -9,3 and -9.2 kcal.mol -1 respectively. These values are higher than those obtained for the N3 inhibitor (-7.5 kcal.mol -1 ).
Thirty eight flavonoids have been tested in this study by molecular docking against the active site of the SARS-CoV-2Mpro. The results which are summarized in Table 1 show that natural aglycone flavonoids possess high docking scores; with a maximum score achieved by EGCG with -7,9 kcal.mol -1 .
This value is comparable to the score obtained for N3 inhibitor co-crystallized with the protease (6LU7), used here as a positive control (-7,7 kcal/mol). Furthermore, we can clearly see that glycosylated flavonoids display the highest scores; with a maximum value of 9,7 kcal.mol -1 for quercetin 3-rhamnoside. As shown in Fig 3, quercetin 3-rhamnoside is well installed in the active site and the position is matching with the N3 peptide inhibitor.
The best compounds possess sugar moieties in their structure, and the position seems to be important. Compounds substituted at the carbon 3 display higher scores than those substituted at the carbone 7. Genistin, a flavonoid substituted at position 7 displays low score (-7,5 kcal. mol -1 ) compared to datiscin or hyperoside (-8,2; -8,4 kcal.mol -1 ) among others.

PLOS ONE
Glycosylated flavonoids as SARS-CoV-2 Inhibitors: A molecular docking and simulation studies The nature of the sugar moiety is also important to the activity.As we can see, the top 5 compounds are flavonols and they possess either a rhamnose or a rutinoside (a dissaccharide: Glucose-Rhamnose) at position 3. Quercetin 3-Rhamnoside (Fig 3) and myricetin 3-Rutinoside bind with more strength than datiscin which is substituted at position 3 with glucose.
However, low bioavailability of flavonoids is a major inconvenient to their use as natural drugs. The presence of sugar moieties usually leads to enhanced bioavailability of the respective flavonoid aglycone depending on the nature of the substituted sugar. Flavonoids with glucose moieties seems to be absorbed more rapidly than flavonoids with rhamnosides, rhamnoglucosides, rutinosides and neohesperidosides substitutions [52]. It is assumed that bioavailability of glyco-flavonoids is achieved by aglycone part in blood plasma after the cleavage of the glycosidic part by hydrolyzing enzymes in the human gastrointestinal tract [53]. Glucose moieties are easily hydrolyzed in the small intestine epithelial cells. However, the lack of α-L -rhamnosidase or rutinosidase in human cells makes the

PLOS ONE
bioavailability of relative flavonoids largely dependent on their hydrolysis by intestinal bacteria [53]. Furthermore, few intestinal bacterial strains can achieve cleavage of these types of bonds [54].
Discovery studio visualizer was further used to determine hydrogen bonds occurring between amino acids of the active site and flavonoids (Fig 5). The N3 inhibitor was docked to the SARS-CoV-2Mpro to be further compared to docked generated model with Autodock vina. The superimposed structures of the co-crystallized N3 inhibitor (Fig 4) are closely related indicating a good accuracy of the automated tool.
Results (Table 2) show that compounds that undertake bonds with the highest strength usually bind to Thr26, Ty54, His41, Asn142, Gly143, Cys145and Glu166 in the substrate binding subsite S1 Fig for the proteolytic activity [55,56], leading to strong binding due to hydrogen bonds. Mpro consists of three domains and substrate binding site resides in a cleft between domain I and II, while domain III is involved in catalytic function. Dimerization of the protein is required for its catalytic regulation [57]. Glu166 residue is a key amino acid involved in the dimerization of Mpro and creation of substrate binding pocket [45,58]. Cys141 and His41 residue forms a catalytic dyad on the active site of the protein essential for its catalytic function. Similar results were recently obtained by molecular docking with different molecules against the Mpro protein [57,59,60]. Quercetin 3-Rhamnose, a flavonol with a rhamnose at position 3 of the C ring, is involved in several hydrogen bonds contracted with Thr26, Phe140, Leu141 and Gly143 (Fig 5A).
The second and third flavonoids with highest affinity, myrictin 3-Rutinoside and rutin are also flavonols but with a rutinoside (Glucose-Rhamnose) at the 3 position. Fig 5B and 5C show that both compounds display hydrogen bonds with Mpro leading to high affinity binding to the active site. Results show that glycosylated flavonols with a sugar moiety at position 3 with at least a rhamnose in their structure are the compounds that bind with higher affinity to the active site pocket of the Mpro.

3-2-Molecular dynamics simulations
Molecular dynamics simulations are of proven in-silico methods for obtaining dynamic data at atomic spatial resolution [2,51]. The docked Mpro complexes with the three compounds with highest scores: Quercetin -3 Rhamnose, Myricetin 3-rutinoside and Rutin were subjected to MD simulations for 50 ns to analyze the stability of the complexes.
RMSD plot is usually used to understand the stability of the complex while RMSF is used to understand the structural flexibility (Fig 6) The Mpro in complex with rutin (green) showed the most stable RMSD during the 50 ns simulation around 0.2 nm. The M pro with compound Myricetin 3-rutinoside (red) is also stable and constant range of RMSD between 0.15 nm and 0.25 nm. Slight increase was observed after 40 ns simulation. The quercetin 3-rhamnose (black) showed stable RMSD between 5 and 40 ns with values close to 0.3 nm. After 40 ns RMSD value decreased to 0.25 nm (Fig 6). The differences of backbone RMSD in quercetin 3-Rhamnoside and myricetin 3-rutinoside indicate that M pro undergoes conformational changes.
RMSF trajectories provide important information regarding the stability of the complex. High fluctuations in the plot indicate more flexibility and unstable bonds. On the other side, low value or less fluctuation indicates well structured regions in the complex and less distortion. As depicted in Fig 6, all the systems showed almost similar patterns. The overall value of RMSF for all the three complexes was around 0.1 nm as clearly seen in Fig 6. In order to determine the compactness of the system, Rg was plotted against time. Higher Rg values illustrate less compactness with conformational entropy, while low Rg values are The simulation Rg values of the three compounds reported as 2.2-2.3 nm. The fewer changes in the Rg value showed the stability of the protein in the complex suggesting that the binding of these three molecules does not induce structural changes. The Rg range of the three complexes (Fig 7) support their condensed architecture as well as size. Moreover; potential and kinetic energy plots confirm the stability of the complexes (S1 Fig). Hydrogen bonding plays an essential role in determining the binding strength between ligands and protein. Rutin (green) id (black) has constant range of hydrogen bond between 2 to 4 in whole simulation while quercetin 3-rhamnose and myricetin 3-rutinoside showed changes in bonding. More hydrogen(>6) bonds between 10 to 25 ns for myricetin 3-rutinoside, suggesting a conformational change in the binding site during simulation (Fig 7). Over all, observation suggested that Mpro protein in complex with the three compounds is stable during simulation. However, it is clear that rutin complex is more stable and therefore a better candidate as potential drug for inhibiting the Mpro protein.

Conclusion
Until now, the spread of SARS-CoV-2has stimulated research to find an existing drug for a rapid treatment. At that time, the confirmed cases are over 7 million and rising, and the deaths are approaching 400000, and governments fear a second infection wave. Isolation strategy was mandatory to limit the spread of the virus, but solutions have to be found urgently. The main protease taken into account as molecular target for stopping the spread of the virus is an interesting strategy.
Results show that glycosylated flavonoids display a strong inhibitory activity on SARS-CoV-2 Mpro. Glycosylated flavonoids seems to bind with more strength than the N3 co-crystallized inhibitor. The binding of the flavonoids at the protease active site is clearly structure dependant. Compounds binding with more strength are flavonols substituted with mono or disaccharides at position C3. The nature of sugar is also primordial to the activities, as flavonoids with a rhamnose possess the strongest binding affinity. As flavonoids with rhamnose moiety undergo no or little modification in the intestine, they might be a good candidate in the treatment of the SARS-CoV-2coronavirus.