In silico repurposing of a Novobiocin derivative for activity against latency associated Mycobacterium tuberculosis drug target nicotinate-nucleotide adenylyl transferase (Rv2421c)

Nicotinamide-nucleotide adenylyl transferase (Rv2421c) was selected as a potential drug target, because it has been shown, in vitro, to be essential for Mycobacterium tuberculosis growth. It is conserved between mycobacterium species, is up-regulated during dormancy, has a known 3D crystal structure and has no known human homologs. A model of Rv2421c in complex with nicotinic acid adenine dinucleotide and magnesium ion was constructed and subject tovirtual ligand screening against the Prestwick Chemical Library and the ZINC database, which yielded 155 potential hit molecules. Of the 155 compounds identified five were pursued further using an IC50 based 3D-QSAR study. The 3D-QSAR model validated the inhibition properties of the five compounds based on R2 value of 0.895 and Q2 value of 0.944 compared to known inhibitors of Rv2421c. Higher binding affinities was observed for the novel ZINC13544129 and two FDA approved compounds (Novobiocin sodium salt, Sulfasalazine). Similarly, the total interaction energy was found to be the highest for Cromolyn disodium system (-418.88 kJ/mol) followed by Novobiocin (-379.19 kJ/mol) and Sulfasalazine with (-330.13 kJ/mol) compared to substrate DND having (-185.52 kJ/mol). Subsequent in vitro testing of the five compounds identified Novobiocin sodium salt with activity against Mycobacterium tuberculosis at 50 μM, 25μM and weakly at 10μM concentrations. Novobiocin salt interacts with a MG ion and active site residues His20, Thr86, Gly107 and Leu164 similar to substrate DND of Mycobacterium tuberculosis Rv2421c. Additional in silico structural analysis of known Novobiocin sodium salt derivatives against Rv2421c suggest Coumermycin as a promising alternative for the treatment of Mycobacterium tuberculosis based on large number of hydrogen bond interactions with Rv2421c similar in comparison to Novobiocin salt and substrate DND.


Introduction
Globally, Tuberculosis (TB) is the second leading cause of death after Human Immunodeficiency virus and accounts for approximately 1.3 million deaths and 10.4 million new cases per year [1,2]. It is estimated that one-third of the world population is infected with TB. However, only about 10% of infected individuals develop active TB [3]. The remainder of latently infected individuals generate a critical reservoir of TB bacteria in their system which may result in disease reactivation should their immune system be compromised [3]. This poses a significant problem as current drugs require a long course of therapy with patients experiencing various degrees of adverse reactions [2]. Poor adherence due to adverse reactions is one of the main reasons patients default which may lead to further emergence of drug resistance. Furthermore, drug resistant strains of TB are on the rise warranting the urgent development of new drugs to combat this disease before it becomes an epidemic. The sequencing of several strains of Mycobacterium tuberculosis (M. tuberculosis) provides a useful resource to interrogate novel drug targets. The genome of M. tuberculosis is~4.4Mb in size and codes for approximately 4000 proteins [4]. However, not all these proteins are potential drug targets. For a candidate to be considered a promising drug target, it needs to meet certain criteria including being essential with regards to growth, replication and survival and not have any well-conserved homolog within its human host to avoid toxicity. Nicotinamidenucleotide adenylyl transferase (NadD/Rv2421c) is essential for the survival of Staphylococcus aureus, Streptococcus pneumoniae, Escherichia coli (E. coli) and M. tuberculosis as all these organisms harbor both de novo and salvage pathways which is particularly useful during ATP limited conditions [5]. Furthermore, the NadD bacterial enzyme has a different 3D active site conformation compared to the human NadD isoform, providing an opportunity for selective inhibition. A previous study by Sorci and colleagues [5] used in silico screening and in vitro assays to identify structurally diverse compounds that inhibited NadD enzyme activity and cell growth in E. coli and Bacillus anthracis (B. anthracis). In a follow up study, Huang et al. [6] solved crystal structures of three inhibitors bound to NadD in B.anthracis that were identified in the Sorci study [5] and revealed a common binding site near residues Trp117, Tyr112 and Met109.
The enzyme NadD catalyses the transfer of an adenyl group from ATP to NaMN to form NaAD that is then converted to the ubiquitous intermediate nicotinamide adenine dinucleotide (NAD). Metabolic pathway and sequence similarity analyses indicated that Rv2421c is involved in nicotinate and nicotinamide metabolism in M. tuberculosis and has no equivalent human ortholog [7]. It is upregulated during non-growing metabolically active conditions of M. tuberculosis survival and lacks a close homolog in mice making it an attractive drug target [7]. The crystal structure of Rv2421c has been resolved and offers a unique opportunity for structure-based drug design [8]. Since, Rv2421c (PDBID: 4X0E) does not have ligands bound, we used the homologous structure of Nicotinic acid mononucleotide (NaMN) adenylyl transferase from B.anthracis, PDBID: 3E27 in complex with native substrate nicotinic acid adenine dinucleotide (DND) and magnesium ion (MG), to construct the structure of the Rv2421c in complex with DND and MGand then used this information to delineate the binding site of DND which was subsequently used for docking studies. We further used molecular dynamics (MD) simulation studies to validate binding free energy of the protein-ligand complexes to prioritize potential lead compounds. This was followed by experimental validation of the compounds as potential inhibitors of M. tuberculosis growth using in vitro growth assays.

Structure preparation and assessment of Rv2421c (4X0E)
The crystal structure of Rv2421c (4X0E; resolution 2.4 Å, [8]) was superimposed onto the homologous template (the two proteins shared 40% sequence identity) to the structure of NaMN adenylyl transferase (PDBID: 3E27, resolution 2.2 Å, [5]) and the coordinates of substrate DND and MG were extracted from 3E27 to construct a complex of Rv2421c with DND and MG; the structural similarity between Rv2421c and 3E27 was assessed using the root mean square deviation (RMSD) value. Missing residues in the crystal structure of Rv2421c were modelled using the Swissmodel webserver [9]. The resulting complex of Rv2421c in complex with DND and MG was energy minimized using Desmond Schrodinger [10] for 2000 iterations of steepest descent method using the OPLS3 Force Field [11] in a TIP3 waterbox.

Pharmacophore-based virtual screening
The energy minimized Rv2421c-DND-MG complex structure was used to generate a pharmacophore model using the program LIGANDSCOUT [12]. Two pharmacophore models were generated based on the substrate DND. The first model contained three features (H-bonds (carboxylate group), hydrophobic aromatic ring and negative terminal ionic phosphate group) which are conserved for DND interactions and observed in the homologous template 3E27 ( Fig 1A). The second model contained two features (three H-bonds (amine and hydroxyl group) and two hydrophobic aromatic ring contacts also important for DND interactions ( Fig  1B). Subsequently, the generated pharmacophore models were screened against 1278 compounds from the Prestwick Chemical Library (PCL) (http://www.prestwickchemical.frl) to identify novel potential compounds that share similar chemical features when aligned in 3D space. We also generated a pharmacophore model for the Rv2421c-DND-MG complex that spanned the DND active site contact region and included one hydrogen bond and two hydrophobic contacts, using the on-line tool ZINCPharmer [13], which was then used to screen the ZINC database for potential lead compounds. The Pharmer technology implemented in ZINCPharmer is a high-performance search engine that employs novel search methods such as geometric hashing, generalized Hough transforms and Bloom fingerprints to perform an exact pharmacophore match which accelerates the search algorithm [13]. The searches yielded 14 compounds after screening the PCL database, and 141 compounds after screening the ZINC database for purchasable compounds, which were further prioritized using docking studies.

3D Quantitative Structure Activity Relationship (QSAR) studies
The inhibitory efficiency of the obtained inhibitors was evaluated using 3D-QSAR studies. Known-TB NadD inhibitors were identified from the literature [14] to use as a training set to test our pharmacophore model. The structures of inhibitors were constructed and minimized using utilities of Schrodinger Software suite 2020-2. The inhibitors were primarily processed using "LigPrep" module and the IC50 values then converted to corresponding pIC50 values and were selected as the dependent variable. The whole dataset was further divided into the "Training" and "Test" sets using randomized method and a IC50 based 3D-QSAR model was constructed using "Atom-Based QSAR" module of the Schrodinger Software suite.

Molecular docking using AutoDockVina
155 compounds were identified with LIGANDSCOUT and ZINCPharmer, and these were docked to the energy minimized structure of Rv2421c using VINA [15]. The active site region spanning residues Thr8, Thr80, Thr100 and Leu157 was used for docking. Agrid box size of (X = 26, Y = 24 and Z = 26) was sufficient to capture the whole active site region and an exhaustiveness search algorithm setting of 40 was used for the docking simulations. The grid setup is calculated automatically by VINA. VINA was able to reproduce the experimental binding pose as the top scoring pose of the ligand DND when benchmarked using the homologous template 3E27. As a comparative standard we also docked the known inhibitor ZINC58655383 of nicotinate mononucleotide adenylyltransferase (BaNadD for the B.anthracis (strain~CDC 684/NRRL 3496) to Rv2421c. The receptor and ligand DND were prepared using AutoDock tools which adds polar hydrogens, derives gasteiger charges for the atoms and saves the output file in pdbqt format. The 155 compounds and ZINC58655383 were prepared with automated bash and python scripts namely, split_multi_mol2_file.py and prepare_li-gand4.py. After receptor and ligand preparation, the center of mass was calculated for the receptor structure with the ligand DND present. The various parameters for the docking process were stored in configuration files. The configuration file contained input parameters for the docking simulations such as the center of mass coordinates, grid dimensions and exhaustiveness of the search algorithm. The docked complexes were ranked according to their energy scores using the python script developed by the Scripps research institute vina_screen_get_top.py. The compounds that showed comparable and higher binding affinities than substrate DND and/or ZINC58655383 were visually inspected in PyMol and hydrogen bond interactions was calculated using find polar contacts option within PyMol.

Molecular dynamics simulations
The top scoring complexes were subjected to MD simulations using GROMACS 5.1.2 [16], with the topologies of the protein structures generated using the GROMOS96 53a6 force field [17]. The PRODRG server [18] was used to generate GROMOS96 based topologies as well as coordinate files of the inhibitors. The partial charges were corrected using the DFT method implemented in GAUSSIAN which utilized the B3LYP 6-31G (d,p) basis set and the CHELPG program [19].Subsequently, all the docked complexes were solvated using the SPC/E water model [20] and the net charge in each box was neutralized by adding appropriate numbers of sodium (NA) and chloride (CL)counterions. The neutralized systems were energetically minimized by steepest descent and conjugate gradient algorithms.
Molecular dynamics simulations were carried out in the NVT (constant volume) and NPT (constant pressure) ensemble conditions, each for 100 ps during equilibration. The temperature of the system was maintained at 300 K using the Berendsen weak coupling method while the pressure was maintained at 1 bar by utilizing Parrinello-Rahman barostat. The production MD simulations were carried out for 100 ns. The generated trajectories were used to analyze the behavior of each complex for the last 50 ns of the simulation trajectory. The deviations in the distances, H-bonds, RMSD (Root Mean Square Deviations), and Radius of Gyration (Rg) were analyzed between the protein and ligands. The free energy of binding between the protein and ligands were calculated using the Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) protocol implemented in the g_mmpbsa package for the last 50 ns of the trajectory [21]. Bacterial strains and growth conditions. M. tuberculosis H37Rv (ATCC 27294) with (H37Rv:mCHERRY) or without (H37Rv):pCHERRY3 reporter plasmid [22] was cultured in Middlebrook 7H9 liquid media (Becton Dickinson, USA) supplemented with 0.2% (v/v) glycerol (Merck Laboratories, USA), 0.05% Tween 80 (Sigma-Aldrich, Germany) and 10% albumin-dextrose-catalase (ADC) (Becton Dickinson, USA), in filtered screw cap tissue culture flasks (Greiner Bio-one, Germany). Hygromycin B (50μg/ml) was included for plasmid maintenance, where required. The cultures were incubated at 37˚C until an optical density (OD 600 = 1.0) was reached.

Experimental whole cell assays
In vitro bioluminescent reporter assay. Each culture was strained (40μM filter; Becton Dickinson, USA) and diluted till a final OD 600 of 0.02 (100μl) was added to each of the wells containing a final compound concentration of 800, 400, 200, 50, 25 and 10μM. Positive, negative, and compound controls were included on each plate. The positive control contained 100μg/ml rifampicin and H37Rv:pCHERRY3 reporter. The negative controls included wells containing media only to identify contamination and an untreated fluorescence control with DMSO to confirm that the DMSO used to dissolve the compounds did not affect growth. A compound control was included similar to the final concentration of each compound in the absence of M. tuberculosis to confirm no autofluorescence. The background control contained M. tuberculosis H37Rv (without reporter) which is used to subtract the fluorescence measurement to gain the inherent fluorescence of H37Rv. An undiluted H37Rv:pCHERRY3 control was used for the experiment. Plates were incubated at 37˚C and readings were taken at 0hour and 24-hour intervals thereafter for approximately five days using the BMG Labtech POLARstar Omega plate reader (BMG Labtech, Germany, excitation: 587 nm, emission: 610 nm).

Sequence and structural analysis of Rv2421c (PDB: 4XOE)
Sequence-structural alignments between the target 4X0E and homologous protein structures NAMN adenylyl transferase from B.anthracis (3E27) and nicotinate mononucleotide adenylyl transferase from B.anthracis (2QTN) indicate approximately 40% sequence identity and 85% sequence coverage. Most importantly, 100% conservation of active site residues was observed between the proteins, suggesting that recognition of the ligand DND would be dominated by the same forces (S1 Fig). The structural similarity was less than 2 Å between 4X0E and 3E27 (~1.3 Å), and between 4X0E and 2QTN (1.5 Å) (S2A and S2B Fig). The only major difference between the structures are an over-closed 3 10 helix topology in M. tuberculosis Rv2421c which prevents ATP binding and NAMN binding, thus rendering it inactive [8]. The 211 amino acid structure of Rv2421c (4X0E) consists of 10 alpha helices, 5 beta sheets and 15 coils (Fig 2).

Prediction of inhibitory values using 3D-QSARmodel
The non-cross squared correlation coefficient (R 2 ) and cross squared correlation coefficient (Q 2 ) parameter values for the predicted IC50 based 3D-QSAR model was calculated to be around 0.895 and 0.944 respectively, which was indicative of the reliability of the generated 3D-QSAR model. The parameters highlighting the high predictive ability of the generated 3D-QSAR model are listed in Table 1. The activities of five molecules purchased for experimental testing; Carbenoxolone disodium salt (ZINC3977823), novel ZINC13544129, Cromolyn disodium salt (ZINC1530788), Novobiocin sodium salt (ZINC76945632) and Sulfasalazine (ZINC3831490) were computed to be around 7.92 μM, 5.97 μM, 6.93 μM, 5.81 and 4.99 μM respectively using the generated 3D-QSAR model. The outcome generated from this study indicated that these studied molecules have the potential to inhibit TB Rv2421c protein functionality.

Docking using VINA identifies three compounds that bind stronger than the known drug ZINC58655383 (LJZ)
The 155 compounds identified from the virtual screen against the Prestwick Chemical library and ZINC database were further filtered using VINA by docking to the energy minimized conformation of the protein. In total three compounds showed higher binding affinity values compared to the known drug ZINC58655383. These compounds included; novel ZINC13544129 and two FDA approved compounds used for the treatment of other conditions (Novobiocin sodium salt, Sulfasalazine, Table 2). The other two FDA approved compounds selected for this study, Carbenoxolone disodium salt and Cromolyn disodium salt (ZINC1530788), showed lower binding affinities compared to DND and ZINC58655383 (Table 2).

Interaction analysis of the five compounds selected for experimental validation
To further understand the mode of interaction between Rv2421c and DND, known inhibitor ZINC58655383 and the five compounds we performed interaction analysis using PyMol   Table 1 and in the threedimensional interaction figures generated using PyMol. The number of interactions formed between the FDA approved compounds Novobiocin sodium salt, Sulfasalazine, novel ZINC13544129, Carbenoxolone disodium salt, Cromolyn disodium salt and ZINC58655383 and residues of Rv2421cwere 9, 8, 7, 7, 6 and 3, respectively (S3B-S3D, S4A and S4B Figs, Table 2). Interestingly, Novobiocin sodium salt makes the same amount of contacts compared to DND which also makes 9 contacts with Rv2421c residues and MG. The residues that account for the strong interaction between Rv2421c and DND and Novobiocin sodium salt is due to one MG coordination and residues His20, Thr86, Gly107 and Leu164. The reason why the other compounds do not show similar inhibition could possibly be due to a reduced number of contacts and a loss of one MG contact, and other crucial active site residues such as His20, Thr86, Gly107 and Leu164. S5A Fig shows a

Analyses of conformational dynamics of Rv2421c and the five studied inhibitor docked complexes using MD simulations
Six 100 ns MD simulations were performed within a SPC/E immersed water model, minimized and equilibrated docked systems. The protein backbone stabilities of the systems were analyzed using RMSD analysis and these values showed equilibrium is reached after 50 ns in complex with inhibitors (S6A- S6F Fig). The distance between the Rv2421c and each of the five compounds was calculated using gmx_mpi pairdist. The fluctuations in the distances were observed between 0.1 nm-0.2 nm (Table 3). While for DND, Sulfasalazine and Novobiocin the computed distance values may reach below 0.16 nm (Table 3). Furthermore, the complex of DND showed an average number of 7.63 hydrogen bonds with Rv2421c active site residues and MG, followed by sulfasalazine and Cromolyn disodium up to 6.03 and 6.73 average number of hydrogen bonds formed between Rv2421c residues and MG, respectively ( Table 3). The Novobiocin compound showed 4.25 hydrogen bonds in the system while Carbenoxolone and ZINC13544129 complexes contained less than three number of hydrogen bonds ( Table 3). The compactness of the systems was assessed using the radius of gyration curves, which showed relatively similar range of compactness in the studied systems (S7A- S7E Fig), with all converging towards 1.55 nm except for Carbenoxolone and Novobiocin systems each fluctuating around 1.6 nm. The structure of Rv2421c contain an ATP active sub-site at His17, His20, Leu164, Ser167, Thr169, Arg172, while the residues Thr12, Asp14, Ser41, Ser52, Arg57, Thr86 form the NaMN active sub-site. The secondary structure corresponding to these active site residues are L1, A1, L3, A2, A4, A8, L13 and A9 (S8A Fig). Higher fluctuations in Rv2421c constituent residues were observed for Rv2421c-complex systems DND, Carbenoxolone, Cromolyn disodium salt and ZINC13544129 each having RMSF mean and standard deviation values of 0.17 ± 0.07 nm, 0.16 ± 0.07 nm, 0.16 ± 0.08 nm and 0.16 ± 0.07 nm compared to Rv2421c complex systems were Sulfasalazine and Novobiocin sodium salt had the lowest RMSF values of 0.14 ± 0.07 nm and 0.15 ± 0.07 nm, respectively (S8B- S8G Fig). The latter two inhibitors resulted in lower RMSF flexibility values of the protein residues thereby increasing stability of the Rv2421c protein structure. Furthermore, the interaction energies contribution to the binding between the Rv2421c and studied potential inhibitors were analyzed using MMPBSA protocol (Table 3). For Cromolyn disodium system, the highest total energy was calculated to be around -418.88kJ/mol followed by Novobiocin and Sulfasalazine with the energy reaching up to -379.19kJ/mol and -330.13 kJ/mol, while DND showed -185.52kJ/mol ( Table 3). The highest energy contribution were due to the electrostatic interaction energy for the Cromolyn disodium system while for the DND system there was a weak electrostatic contribution (Table 3).
These findings indicate the nature of binding between the Rv2421c and the studied complexes. The generated outcomes emphasized the need for further validation which are discussed in subsequent sections. In summary, Novobiocin showed closer distance to Rv2421c active site residues and higher binding free energy compared to known inhibitor ZINC13544129 and substrate DND making it a stronger competitive inhibitor of Rv2421c. Furthermore, the free energy of binding for each system was predicted using the MMPBSA techniques (Listed in Table 4), showed that the Cromolyn disodium and Novobiocin compounds showed stronger affinity for Rv2421c.

In vitro growth assays verify Novobiocin sodium salt as a potential inhibitor of Mycobacterium tuberculosis growth
We purchased five of the available top binding compounds (Carbenoxolone disodium salt, Novobiocin sodium salt, Sulfasalazine, Cromolyn sodium salt and ZINC13544129) based on higher binding affinity scores and higher number of interactions compared to DND and ZINC58655383. Unfortunately, no vendors were found for ZINC58655383 and DND, and these compounds were not purchased. In this study we aimed to identify inhibitors with MIC of at least 50 μM as a start for further interrogation. Of the five compounds (Fig 3A-3E), only Novobiocin sodium salt was found to inhibit M. tuberculosis growth at 50, 25 μM and weakly at 10 μM concentrations (Fig 3B). Further inspection of the molecular properties and activity of Novobiocin sodium salt showed that it was slightly hydrophilic based on the milogP value of 0.66 and it had antibiotic activity against gram positive bacteria by inhibiting DNA synthesis.

Targeting Rv2421c with Novobiocin
The mechanism of action of Novobiocin includes the inhibition of DNA gyrase B and Topoisomerase IV in gram positive bacteria such as Pseudomonas Aeruginosa, Acinetobacter baumannii and Klebsiella pneumonia [23]. However, mutations in DNA gyrase B have resulted in resistance to Novobiocin specifically, in Halophilic Archaebacteria and M.tuberculosis [24,25].
Previous reports indicate that Novobiocin was withdrawn from the market due to poor pharmacological properties and safety concerns in the treatment of staphylococcus infections and therefore emphasized the need for alternative forms of the compound [23,25]. Several derivatives of Novobiocin have been developed of which four were docked to Rv2421c. The docking results indicated that Chlorobiocin had the highest affinity (-11kcal/mol) followed by Coumermycin, compound 5 and BL-C43 (-10.4, -9.6 and -9.3 kcal/mol respectively). Furthermore, interaction analysis using PyMol find polar contacts option indicated that Coumermycin had the largest amount of contacts of 9 compared to Chlorobiocin, compound5 and BL-C43, each having, 6,5 and 2 contacts. These derivatives could serve as alternatives for TB treatment. We constructed protein models for both Haloferax DNA gyrase B and M. tuberculosis DNA gyrase B using the Swissmodel webserver. Structural superimpositions indicated very low RMSD values (0.67Å) between the two models and the conservation of two mutant residues Ser122 and Arg137 suggesting that M. tuberculosis DNA gyrase B is resistant to Novobiocin. However, amino acid sequence comparisons between Haloferax DNA gyrase B and Rv2421c indicated very low,10% sequence identity and only one conserved residue (Ser 122) associated with Novobiocin resistance in Haloferax. Additionally, Rv2421c is structurally different from M. tuberculosis DNA gyrase B with an RMSD value of more than 19Å. Molecular docking studies of Novobiocin and its four derivatives to M. tuberculosis DNA gyrase B indicated lower binding affinities for compound 5, Coumermycin, Novobiocin, Chlorobiocin and BL-C43 (-5.9, -6.5, -6.7, -6.7 and -6.8 kcal/mol respectively) compared to Rv2421c binding. This study provides data to suggest that Coumermycin can be considered for Rv2421c inhibition due to stronger binding instead of the known gyrase B drug target. Future work will include purchasing Coumermycin to test its activity against M. tuberculosis in whole cell assays, and possibly modifying the structure of Novobiocin sodium salt to improve MIC values.

Conclusions
Rv2421c has been shown to be essential for Mycobacterium tuberculosis growth, shares no homology to known proteins in the human host, is conserved between various Mycobacterium species, is up-regulated during the non-replicative metabolic growth phase, making it an attractive drug target. It has a known 3D structure which has been exploited to screen for putative compounds within the Prestwick chemical library and ZINC database, resulting in the successful identification of 155 candidate compounds. Thereafter 3D-QSAR, molecular docking and molecular dynamics simulation studies were used to prioritize five potential compounds.
Of the five compounds tested in vitro, only one, a Novobiocin disodium salt, showed activity against Mycobacterium tuberculosis at 50, 25 and weakly at 10 μM concentrations. Novobiocin is known to target Mycobacterium tuberculosis DNA gyrase B, but emerging resistance stimulated us to seek derivatives to target Rv2421c as alternatives for the treatment of Mycobacterium tuberculosis. Docking studies and interaction analysis supported the higher binding affinity of one Novobiocin derivative, Coumermycin to Rv2421c compared to DNA gyrase B. Future studies will involve testing Coumermycin for activity against Mycobacterium tuberculosis.