Cheminformatics-aided discovery of small-molecule Protein-Protein Interaction (PPI) dual inhibitors of Tumor Necrosis Factor (TNF) and Receptor Activator of NF-κB Ligand (RANKL)

We present an in silico drug discovery pipeline developed and applied for the identification and virtual screening of small-molecule Protein-Protein Interaction (PPI) compounds that act as dual inhibitors of TNF and RANKL through the trimerization interface. The cheminformatics part of the pipeline was developed by combining structure–based with ligand–based modeling using the largest available set of known TNF inhibitors in the literature (2481 small molecules). To facilitate virtual screening, the consensus predictive model was made freely available at: http://enalos.insilicotox.com/TNFPubChem/. We thus generated a priority list of nine small molecules as candidates for direct TNF function inhibition. In vitro evaluation of these compounds led to the selection of two small molecules that act as potent direct inhibitors of TNF function, with IC50 values comparable to those of a previously-described direct inhibitor (SPD304), but with significantly reduced toxicity. These molecules were also identified as RANKL inhibitors and validated in vitro with respect to this second functionality. Direct binding of the two compounds was confirmed both for TNF and RANKL, as well as their ability to inhibit the biologically-active trimer forms. Molecular dynamics calculations were also carried out for the two small molecules in each protein to offer additional insight into the interactions that govern TNF and RANKL complex formation. To our knowledge, these compounds, namely T8 and T23, constitute the second and third published examples of dual small-molecule direct function inhibitors of TNF and RANKL, and could serve as lead compounds for the development of novel treatments for inflammatory and autoimmune diseases.


Introduction
Tumor Necrosis Factor (TNF) is a pro-inflammatory cytokine [1] that is associated with a variety of important physiological processes and pathological conditions, including rheumatoid arthritis (RA), psoriatic arthritis, Crohn's disease, and multiple sclerosis. [2,3] To control the adverse functions of TNF, efforts have focused on blocking TNF binding to its receptors. [4] TNF is generated as a transmembrane protein (tmTNF), which is proteolytically cleaved by tumor necrosis factor-α-converting enzyme (TACE) to its soluble form (sTNF). [5] sTNF and tmTNF bind to two different receptors, TNFR1 (TNF receptor type 1) and TNFR2 (TNF receptor type 2) with differential capacities, exerting differential functions. [6,7] While TNFR1 is expressed on most cell types, TNFR2 is expressed mainly on immune cells [6] and its complete activation requires the presence of tmTNF. [8] It has been demonstrated that tmTNF and sTNF differ in their physiological functions [9,10] and inhibitors that distinctively target them may result in differential effects. [11] Early studies in our lab provided in vivo evidence that deregulated TNF production is causal to the development of chronic polyarthritis in a transgenic animal model and that anti-TNF treatment is efficacious for treating the modeled disease. [12] These studies were instrumental in mobilizing the interest of the anti-TNF industry, which led to the first successful clinical trials performed initially for RA [13] and then for other chronic inflammatory diseases, such as Crohn's disease, psoriasis, psoriatic arthritis, juvenile idiopathic arthritis, spondylarthritis and Behçet's disease. [14] Thus far, four synthetic antibodies have been approved by the FDA as effective TNF inhibitors, namely infliximab, adalimumab, certolizumab pegol [15] and golimumab [16] as well as the Fc-p75 receptor etanercept. [17] However, their discovery did not limit the everincreasing research efforts towards the development of new anti-TNF drugs, mainly due to impediments, such as unwanted side effects (e.g. high risk of hepatitis B and tuberculosis), insufficient clinical response, the need for intravenous administration, and high cost. Drug development that leads to small-molecule inhibitors may overcome several of the above drawbacks by offering important advantages, such as oral administration, shorter half-lives with lower immunosuppression, easier manufacturing and reduced cost. [14] The development of small-molecule inhibitors for protein-protein interactions (PPIs) is a nontrivial task in drug research. [18][19][20][21] Successful drug design requires the identification of inhibition. 14,400 diverse drug-like compounds were initially virtually-screened [38] from the Maybridge HitFinder database [39] and were docked into the binding site of the TNF dimer (PDB ID: 2AZ5). The 30 compounds with the closest docking score and binding conformation to SPD304 were further filtered through a ligand-based predictive model based on the largest dataset available, containing 2481 known TNF inhibitors, [40] to afford 9 active compounds, which were subsequently tested in vitro to evaluate their potency against TNF. Our ligandbased model developed for the prediction of small-molecule TNF inhibition was made publicly available through the Enalos Cloud platform (http://enalos.insilicotox.com/TNFPubChem/). In vitro evaluation of functional inhibition and direct binding pointed to two compounds, namely T8 and T23, as the most potent TNF inhibitors (Fig 1).
RANKL (Receptor activator of nuclear factor kappa-B ligand), another member of the TNF superfamily, is the master mediator of osteoclast formation and bone resorption. Its specific inhibition with a monoclonal antibody (denosumab) effectively reduces the incidence of fractures in postmenopausal women [41] and emerges as the latest therapeutic achievement against osteoporosis. It has been demonstrated that SPD304 also binds to RANKL and inhibits RANKL-mediated osteoclastogenesis. [42] This motivated us to perform additional computations and biological assays for the T8 and T23 complexes with RANKL. In vitro proof of this second functionality thus established these two compounds as dual inhibitors [43] of TNF and RANKL. MD and free energy calculations for both structures and SPD304 in complex with TNF and RANKL were carried out to offer additional insight into the interactions that govern TNF and RANKL complex formation. Finally, direct and specific binding of the two compounds was confirmed both for TNF and RANKL, as well as their ability to inhibit the biologically-active trimer forms.

Materials and methods
In summary, our strategy for identifying the new dual TNF and RANKL inhibitors, T8 and T23, is shown below (Fig 2): All steps, including structure-and ligand-based modeling, in vitro evaluation and molecular dynamics calculations, are discussed below.

Virtual screening and molecular docking calculations
As a first step in our pipeline, all 14,400 compounds included in the Maybridge database were in silico investigated in a structure-based approach using docking and molecular modeling approaches. All the compounds of the abovementioned database passed the Lipinski "rule of five" for drug likeness and are non-reactive, ensuring fewer false positives and higher quality results. The crystal structure of TNF dimer in complex with inhibitor SPD304 (PDB code: 2AZ5) [27] was used as the molecular template for our investigation. Each compound was docked into the active site of the protein using the Surflex-Dock module of SYBYL 8.0 suite. [44] Based on the docking scores of the ligand poses, we generated a prioritized list of 30 compounds for further consideration. Detailed information on the molecular modeling methodology is presented in the Supporting Information (S1 Text).

Ligand-based predictive model development
A ligand-based model was developed to complement our methods in an effort to identify the most promising compounds among those proposed in the previous structure-based approach. Our model was built based on the KNIME platform with the help of our in-house developed Enalos KNIME nodes. [45] methodologies tested. We built three different models based on the k-Nearest Neighbor, the Nearest Neighbor and Random Forest. For each methodology used, we selected different variable selection methods, including GainRatioAttribute Evaluator combined with Ranker search method, CfsSubset Evaluator combined with Best First search method and InfoGainAttribute Evaluator combined with Best First search method, respectively, for each of the aforementioned models. Details on these methodologies can be found in the literature. [49,50] On top of these models, a consensus model was also built based on majority vote from each of the individual models. This model outperformed all others based on validation metrics that were used to assess the predictive power of each model, namely specificity, sensitivity, accuracy and precision. Confusion matrix is also provided for each methodology tested. Details on these metrics are given in the Supporting Information (S1 Text).
Domain of applicability. Our developed model aspires to emerge as a useful tool for the virtual screening of newly introduced small molecules not included in our original dataset. In this virtual screening framework, [51] the reliability of provided predictions needs to be assessed and thus it is important that the domain of applicability of the model is well defined. When the model's applicability limits are known, predictions for new molecular entries can be highlighted as reliable or not.
For the developed model, in order to define the domain of applicability, Euclidean distances between each compound in the test set and its nearest neighbor in the training set are calculated and then compared to a calculated threshold. For a distance greater than the threshold the prediction is considered unreliable. Details on the domain of applicability calculation are given in the literature. [52][53][54][55][56][57] To implement the domain of applicability calculation, we used the Enalos Domain-Similarity node.

Enalos cloud platform
The Enalos Cloud Platform [58,59] is a platform that hosts several predictive models for drug discovery and risk assessment of small molecules and nanoparticles. [60] In the previous steps, we succeeded to develop a predictive consensus model for TNF inhibition. Since, to the best of our knowledge, this is the only ligand-based model developed from an extensive dataset of TNF inhibitors, we decided to release it as a web service to facilitate the need for the design and virtual screening of novel potent small-molecule inhibitors of TNF, by providing immediate access to the model's results. The model can be easily accessed through http://enalos. insilicotox.com/TNFPubChem. The interested user can initiate a prediction through a user friendly graphical interface following a minimum-step procedure. The user can submit a structure by using one of the following ways: (i) draw a structure using the sketcher provided in the first page, [61] (ii) submit a SMILES notation of a molecule, or (iii) upload an sdf file. In any case, more than one compound can also be submitted. When the "TNFPubChem" workflow is selected, the compounds are submitted and the prediction is generated. The results page includes the predicted classification and an indication based on the domain of applicability on whether the provided prediction can be considered reliable or not. Screenshots of this online tool are shown and discussed in the Results and Discussion Section.
TNF-induced death assay in L929 cells [62] L929 cells were cultured on 96-well plates (3×10 4 cells/well) overnight. Then, 0.25 ng/mL human TNF (PeproTech) and 2 μg/mL actinomycin D (Sigma-Aldrich) were added. TNF with the compounds were pre-incubated for 30 min at room temperature before adding them to cells to assess inhibition. Background death was estimated using an actinomycin D treatedonly control. After approximately 24 h, dead cells were cleared away by washing with PBS. The remaining live cells were then fixed with methanol and stained with crystal violet. After solubilization with acetic acid, staining was quantified spectrophotometrically at 570 nm. Cytotoxicity is expressed with respect to the background death control and also relative to the toxicity of compounds. Experiments were repeated three times.
Measurement of cytotoxicity in L929 cells [62] L929 cells were seeded onto a 96-well plate (3x10 4 cells/well). After 24 h, cells were treated with the compounds at increasing concentrations. DMSO was used in the untreated control. The next day, a PBS-washing step was performed to remove dead cells. The remaining live cells were next fixed with methanol and stained with crystal violet. After solubilization with acetic acid, staining was quantified spectrophotometrically at 570 nm. Survival is expressed with respect to the untreated control. Experiments were performed in triplicate.
TNF/TNFR1 ELISA assay [62] 96-well plates were coated with 0.1 mg/mL recombinant soluble human TNFR1 (PeproTech) in PBS overnight at 4˚C. Next, blocking was performed with 1% BSA in PBS after carrying out four washes with PBS containing 0.05% Tween-20. 0.025 mg/mL recombinant human TNF (PeproTech) in PBS was added and the plates were incubated for 1 h at room temperature. After two consecutive rounds of washes, plates were incubated with a rabbit anti-human-TNF antibody (provided by Prof. W. A. Buurman, University of Maastricht) and an anti-rabbit secondary antibody conjugated with HRP (Vector Laboratories). Both incubations were for 1 h and were performed at room temperature. Following a last round of washes, the signal was finally developed using the TMB Substrate Kit (ThermoFisher Scientific) and was measured spectrophotometrically at 450 nm. Experiments were performed in triplicate.

RANKL-induced osteoclastogenesis assay
Bone marrow cells (BMs) were collected after flushing out of femurs and tibiae, subjected to gradient purification using ficoll-paque (GE Healthcare), plated on 96-well plates at a density of 6 × 10 4 cells per well in αMEM (GIBCO) containing 10% fetal bovine serum supplemented with 50 ng/ml human RANKL, 25 ng/ml M-CSF [42] in the presence or absence of T8 and T23 compounds in the indicated concentrations (1, 2, 5, 10 and 20 μM) for 5 days. To visualize osteoclasts, cell cultures were stained with TRAP (tartrate-resistant acid phosphatase), using an acid phosphatase leukocyte (TRAP) kit (Sigma-Aldrich). Osteoclasts were identified as TRAP-positive multinucleated cells containing more than three nuclei.

Osteoclastic (TRAP) activity assay
To measure osteoclast activity, bone marrow cells were cultured for four days as described above and lysed with 0.2% Triton X-100. TRAP activity was measured by the conversion of pnitrophenyl phosphate to p-nitrophenol in the presence of sodium tartrate. Thus, cell lysates were incubated with 5.5 mM phosphatase substrate diluted in 100 mM citrate buffer and 10 mM sodium tartrate, for 40 minutes at 37˚C. The reaction was stopped by addition of 0.4 N sodium hydroxide and absorbance was measured at 405 nm using a microplate reader (Molecular Devices). TRAP activity was normalized to total protein content as determined by the Bradford assay (Bio-Rad protein assay). TRAP activity per μg of total protein was expressed as a percentage of the positive control treated without compounds. IC 50 values were calculated as mean ± SEM from three independent experiments performed in duplicate.

BM viability
BMs were plated on 96-well plates at a density of 10 5 cells/well and treated with T8 and T23 at various concentrations, in the presence of M-CSF (25 ng/mL) for 48 hours. Cell viability was determined by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) colorimetric assay, which measures the ability of viable cells to reduce a soluble tetrazolium salt to an insoluble purple formazan precipitate. [63] After removal of the medium, each well was incubated with 0.5 mg/mL MTT (Sigma-Aldrich, St. Louis, MO) in serum-free α-MEM at 37˚C for 3 h. At the end of the incubation period, the medium was removed and the intracellular formazan was solubilized with 200 μL DMSO and quantified by reading the absorbance at 550 nm on a microplate reader (Optimax; Molecular Devices, Sunnyvale, CA). Cell viability (%) was expressed as a percentage of the negative control treated without compounds. LC 50 values were calculated as mean ± SEM from three independent experiments.

RANKL homology modeling
The homology structure of RANKL in complex with SPD304 was constructed based on the mouse RANKL crystal structure (PDB code: 1S55) and the TNF dimer with SPD304 (PDB code: 2AZ5) using the jFATCAT pairwise structure alignment algorithm. [64] The derived structure was used for the subsequent MD calculations.

Molecular dynamics calculations
Unrestrained MD simulations in water were performed for TNF and RANKL complexes with compounds T8 and T23 using AMBER 12. [65,66] Water molecules were discarded from the crystal structures and hydrogen atoms were added to the proteins with AMBER. TNF and RANKL missing residues were included in the crystal structures with Modeller 9.10. [67,68] Atomic partial charges, bonded and non-bonded parameters for the proteins were represented by the modified ff99SB force field. [69] Geometry optimization of the compounds was obtained with Gaussian 09 [70] at the HF/6-31G Ã level and the ANTECHAMBER program was employed to derive the RESP atomic partial charges for T8 and T23. Force field parameters for T8 and T23 were represented by the general AMBER force field (GAFF). [71] The complexes were neutralized by adding 4 Na + (TNF complexes) and 7 Cl − (RANKL complexes) counterions. TIP3P water molecules [72] were used to model explicit solvation in a truncated octahedron, with a cutoff 10 Å. Long-range electrostatics were considered by the particle mesh Ewald (PME) approach. [73] A combination of lengthy, multi-step steepest descent and conjugate gradient iterations were performed to achieve thorough energy minimization of the complexes: initially, a 500 kcal mol −1 Å −2 restraint was applied to the solute (ligand−protein) to allow minimization of the positions of water molecules only. Then, minimization continued for two more steps with diminished restraints on the solute (10 and 2 kcal mol −1 Å −2 ), before totally removing the restraint for the last step of the minimization. Each minimization stage was carried out in 5000 cycles with a 20 Å cutoff. Subsequently, the systems were gently heated in the NVT ensemble for 100 ps, from 0 K until the target temperature of 310 K was reached. Hydrogen motion was not included in the calculations by applying the SHAKE algorithm [74], thus allowing a 2 fs time step to be used. The Langevin thermostat [75] (collision frequency = 2.0 ps −1 ) was employed to control the temperature. During heating, the solute was also moderately restrained (force constant 10 kcal mol −1 Å −2 ). This restraint was kept for the subsequent 100 ps of equilibration in constant pressure. A final unrestrained MD equilibration step of 100 ps was carried out for each complex. Constant-pressure MD simulations for 50 ns were produced with the GPU (CUDA) version of PMEMD in AMBER 12. [76] The same procedure was applied for two additional simulations, namely for complexes SPD304-TNF and SPD304-RANKL to compare our results with those of the known TNF inhibitor. Trajectory analysis (RMSD, atomic fluctuations, and hydrogen bond calculations) was performed with the ptraj module of AMBER. Donor−acceptor distance and donor−hydrogen−acceptor angle cutoffs of 3.5 Å and 120˚, respectively, were used to calculate hydrogen bond (HB) interactions.

MM-PBSA free energy calculations
The Molecular Mechanics Poisson-Boltzmann Surface Area (MM−PBSA) method estimates free energies of (bio)molecular systems in solution by performing end-state calculations. [77][78][79][80] The division of the binding free energy into individual components offers additional information regarding complex formation. For an inhibitor-protein complex, the following process directs the binding free energy change (ΔG bind ): Calculations were performed on 5000 frames from each trajectory after all water molecules and counterions were removed. For every snapshot, the binding energy is calculated with the following equation: where ΔG bind is the total binding free energy, G complex , G protein , and G inhibitor are the energies for the complex, the protein (TNF or RANKL), and the inhibitor (T8, T23, SPD304), respectively. The binding free energy can be divided into enthalpy and entropy contributions: The enthalpy is given by where ΔE MM is the interaction energy between the protein and the inhibitor and is approximated with the molecular mechanics (MM) method, while ΔG sol defines the change in the free energy of solvation upon ligand binding. ΔE MM is further separated into: ΔE elec is the electrostatic interaction energy and ΔE vdW is the van der Waals interaction energy; no cutoff was applied for the calculation of these two terms. Moreover, the solvation energy (Eq 3) may be expressed in electrostatic (ΔG PB ) and nonpolar (ΔG NP ) terms: The electrostatic (ΔG PB ) energy is estimated by the Poisson−Boltzmann (PB) approach [81] through the PBSA module of AMBER, and a solvent-accessible surface area (SASA) term is used to express the hydrophobic contribution to solvation (ΔG NP ): For the surface tension (γ) and the offset (β), the default values of 0.005420 kcal mol −1 Å −2 and −1.008000 kcal mol −1 , respectively, were considered. A probe radius of 1.4 Å was applied for the solvent (water) in the SASA calculation. ΔG NP (Eq 6) was computed with the linear combinations of pairwise overlaps (LCPO) method. [82] The values of solvent and solute dielectric constants were fixed at 80.0 and 1.0, respectively. [83] The finite-difference grid spacing was 0.5 Å and the ratio between the longest dimension of the grid and that of the solute was set to 4.0. The ionic strength was considered to be 0.1 M. The contribution of entropy (Eq 2) was calculated with the nmode routine of AMBER 12 over the last 100 snapshots of the trajectories, for computational efficiency.

Cross-linking experiments
100 ng of recombinant human TNF or RANKL (Peprotech) were incubated with the inhibitors for 30 min at room temperature and then subjected to cross-linking, using 4.8 mM BS3 (Ther-moFisher Scientific) for 30 min at room temperature. The reaction was stopped by adding 1/ 10th volume of 1 M Tris-HCl, pH 7.5. Samples were then subjected to SDS-PAGE and western blotting using an anti human-TNF or RANKL antibody (both R&D Systems).

Binding affinity assay
For use in the binding assays, recombinant TNF and RANKL were expressed and purified as previously described. [84][85][86] Fluorescence intensity was measure with a Synergy H1 Multi-Mode Reader (BioTek) in 96-well microplates (black) at 25˚C. Briefly, a series of solutions (0.1 mL) containing either TNF (0.75 μM) or RANKL (0.5 μM) and increasing amounts of the ligands were prepared. During preparation of the solutions the buffer was firstly added to all wells followed by the ligand solution and finally the protein was added. All solutions were mixed by pipetting and equilibrated for 1h at 25˚C before measurement. Differences in fluorescence intensity values (λex = 274 nm; λem = 302 nm) between the various protein/ligand complexes and free protein were fitted on a second order equation as described in the Supporting Information (S2 Text) in order to determine the dissociation constant (K d ) of TNF and RANKL with SPD304, T23 and T8. Experiments were performed for TNF in 10 mM citratephosphate (pH 6.5) and for RANKL in 25 mM Tris-HCl buffer, 100 mM NaCl (pH 7.5); both buffers were supplemented with PEG3350 at the concentrations indicated in the respective figures. K d values were calculated as mean ± SEM from three independent experiments.

Results and discussion
We initiated our cheminformatics-aided workflow for the identification of novel TNF inhibitors with an in silico approach based on the TNF crystal structure in complex with SPD304 and a pool of small molecules included in the Maybridge Hitfinder database. The biologically-active TNF is a trimer of identical subunits and SPD304 has been reported to displace one of the subunits, thus resulting in inactive species; that means that functional inhibition is affected by obstructing trimerization. [22,27] The binding site of TNF dimer in this proteinprotein interaction is characterized as predominantly hydrophobic, consisting of glycine, leucine and tyrosine residues. Reported interactions with small molecules are described as hydrophobic and shape-driven as the molecular structures need to be large enough to interact with both subunits to prevent the binding of the third subunit to the TNF dimer. All the 14400 small molecules included in the Hitfinder database were in silico explored in a structure-based framework using molecular modeling and docking scoring. The crystal structure of TNF dimer with SPD304 (PDB code: 2AZ5) was used as the molecular model for our investigation and the compounds were docked into the enzyme's active site. The molecular docking studies were performed using the Surflex-Dock algorithm of SYBYL. [44] Based on the structure-based results, a prioritized list of 30 compounds was constructed by a combination of high docking scores with optimal placements into the binding cavity that resemble the SPD304 binding arrangement for further in silico screening with a ligand-based developed model. The ranking of these prioritized 30 compounds is given in Table 1 and in Supporting Information (S1 Table).  (29) (14) After the initial structure-based screening of our original datasets, we filtered out less promising compounds by developing and applying a ligand-based predictive model. Our model was built on a set of almost 2500 compounds that are included in the PubChem database and have been identified as TNF inhibitors based on an in vitro HTS assay. [40] The ligand-based model was implemented through the KNIME platform [48] with the aid of our in-house developed Enalos KNIME nodes [45] that execute several tasks that are crucial for model development. In particular, we used the Enalos Mold2 node that calculates hundreds of molecular descriptors to encode structural information of compounds, and the Enalos Domain of applicability that is used to define the area of reliable predictions. Details on the Mold2 descriptors are provided in the Supporting Information (S1 Text and S2 Table).
Since the developed KNIME workflow allows this flexibility, we experimented with a great variety of variable selection and model development methods. Moreover, we experimented with consensus approaches and concluded in the most accurate and predictive model, which is  (4)27)23-20 (28) a consensus model based on the majority vote of the results of three different modeling methodologies, namely k-Nearest Neighbor, Nearest Neighbor and Random Forest. [49] The proposed models were validated both internally and externally in terms of goodnessof-fit, robustness and predictivity and were proven to successfully fulfill the criteria recommended by the Organization for Economic Cooperation and Development (OECD) for model validation. [88] For validation purposes, the dataset was separated into training and test sets. The small molecules included in the latter were kept as a blind set and were not used during the development of the model.
The validation results for each methodology and the consensus model are shown in Tables  2 and 3. Based on these metrics, the consensus model outperformed the individual models that were built and thus was considered as the most accurate and reliable to be used in our virtual screening process.
A crucial aspect, which is often neglected by similar studies presented in the literature, is the dissemination of the results to the wider community. [89] It is of major importance that the developed model is not retained only for developers' use, but is broadly distributed to the scientific community so that it could serve as a direct source of information. Furthermore, as recently highlighted by several initiatives, this is most effectively achieved by providing open source tools that could be easily implemented and adjusted to the distinctive requirements of each project.
To encourage and facilitate the reuse of our predictive model, the consensus model has been made publicly available online via the Enalos Cloud Platform. Our model is hosted under the following url: http://enalos.insilicotox.com/TNFPubChem/ and is easily accessible online from any browser, also supporting mobile devices. The interested user can make their own predictions using the user friendly graphical interface that allows multiple options for submitting a new structure. First, as seen in the screen shot below (Fig 3), a sketcher is available where a new molecule can be drawn and structurally modified. The structure can be either directly submitted to generate a prediction or can be copied as SMILES. The second option includes the SMILES notation submission in the upper right of the web page, where the user can paste one or more SMILES notations for one or a batch of molecules and then submit the whole list for prediction. Finally, with the third option the user can upload an SDF file with multiple entries and submit the file for prediction. When the model "TNFPubChem" is selected and the structures are submitted in either way, the prediction is generated as an html page or a CSV file. The prediction outcome includes a classification for each of the given structures and an indication of whether the prediction can be considered reliable or not, based on the domain of applicability of the model (Fig 4). The web service does not require special computational skills and can be easily used by scientists of different disciplines, including chemists, biologists, physicists, and engineers, involved or interested in the biological evaluation of TNF inhibitors.
The consensus model was finally used to predict TNF inhibition by the 30 compounds proposed in the previous step. Among those, the top 9 commercially-available small molecules  being predicted as active by the ligand-based model within its domain of applicability (Table 1) were selected for experimental validation to quantify their inhibitory potency against TNF. These computationally-identified hits were also filtered for Pan Assay Interference Compounds (PAINS). [87,90] As shown in Table 1, none of the predicted active compounds was identified as PAINS and thus none was excluded from the subsequent experimental validation. Moreover, we tested whether a simpler approach using 2D similarity search based on Tanimoto metric between the known active compound (SPD304) and our identified hits would have yielded comparable results with our proposed methodology. Low similarity between SPD304 and identified hits in the range of 0.15-0.45 (Table 1) confirmed that a simpler approach could not have proposed the structures identified based on our methodology.
Biological screening of the compounds examined their capacity to block TNF function in a modified TNF bioactivity assay. [91] The basis of this test is the death-inducing function of TNF in the murine fibrosarcoma cell line L929 following sensitization by the transcription inhibitor actinomycin D. Functional potency of the compounds would translate in a significant reduction of TNF-induced death. Using this approach, out of the nine shortlisted small molecules, two compounds (T8 and T23) were selected as follows. After an initial screening at a concentration of 20 μM, compounds displaying ! 25% inhibition of TNF-induced cell death were further examined in dose-response tests in order to estimate their IC50 values. In these dose-response experiments, T8 and T23 inhibited TNF-driven toxicity in L929 cells with IC 50 values of 40±2.3 and 17±1.2 μM, respectively (S1 Fig). Given that screening has been based on protection from TNF-induced death, compounds that were highly toxic themselves would not have passed this first level of testing. Indeed, both compounds were found to have low toxicity against L929 cells (S2 Fig). Importantly, NMR and MS data for T8 and T23, as shown in S3-S6 Figs and S2 Text in Supporting Information, confirmed the purity for both compounds to be above 95%.
Having established that the selected compounds obstruct TNF function, and since TNF exerts its functions mainly through interacting with the TNFR1 receptor, a further test was performed to evaluate the effect of inhibition on this protein-receptor interaction. The ELISAbased test revealed that both compounds significantly block this interaction, with measured IC 50 values of 30±2.3 (T8) and 3±0.1 (T23) μM (S7 Fig). Finally, direct binding of the compounds to TNF was demonstrated using a fluorescencebased binding assay (S8 Fig). T23 was found to have the lowest K d value (2.8±0.2 μM), while T8 had a K d of 8.8±0.8 μM, and SPD304 of 5.8±0.5 μM.
In previous work, we had shown that SPD304 is also a potent inhibitor of RANKL [42], another member of the TNF superfamily, mainly involved in the regulation of osteoclast formation and bone resorption. [92] We were hence prompted to explore the two compounds with respect to RANKL also, firstly by in vitro testing. We evaluated the effect of T8 and T23 on RANKL-dependent osteoclast differentiation in a culture system of bone marrow cells stimulated with RANKL and M-CSF for 5 days through evaluation of the tartrate-resistant acid phosphatase (TRAP) activity, an osteoclast-specific enzyme. Using a quantitative assay that measures TRAP activity, both compounds were found to inhibit significantly the formation of osteoclasts in a dose-dependent manner, with an IC 50 of 9.0±0.7 μM (T8) and 2.6±0.2 μM (T23) (S9 Fig). In order to exclude the possibility that the inhibitory effect of the compounds is correlated with high cell toxicity, a viability assay was employed in primary bone marrow cells. Of the two compounds, T8 displayed a low toxicity with an LC 50 of 42.6±4.3 μM, while T23 had a higher toxicity (LC 50 = 5.9±0.3 μM, S10 Fig). It should be noted that the increased toxicity of the two compounds in these cells in comparison to the toxicity in L929 cells could be due to the increased sensitivity of primary cells and that in both cases, the toxicity was lower as compared to SPD304. Direct binding of the compounds to RANKL was validated with K d values of 6.3±0.6 μM (T8) and 7.3±0.4 μM (T23) (S11 Fig). Interestingly, these values were lower than that of SPD304, which was measured to be 13.8±0.7 μM. All in vitro results for both TNF and RANKL are shown in Table 4.
The two proposed molecular scaffolds were also explored in a structure-based scheme to further investigate binding to TNF and RANKL. For this purpose, we used the jFATCAT pairwise structure alignment algorithm [64] to align the RANKL structure (PDB code: 1S55) to the crystal structure of TNF dimer with SPD304 (PDB code: 2AZ5). We then performed molecular dynamics (MD) simulations and binding free energy calculations (MM-PBSA) for the SPD304, T8 and T23 complexes with TNF and RANKL to rationalize the docking calculations in a more rigorous fashion and to complement the experimental results with additional information. The MD analysis indicated the high stability of the complexes, since it was shown that the simulations eventually converged for all protein structures (S12 Fig). Indeed, Cαbased RMSD values for TNF in its complexes with the compounds were particularly stable, with slight fluctuations around an average deviation of approximately 2 Å from the crystal structure, while RANKL deviated more from the conformation of the crystal structure (~3.5 Å); however, it stabilized its structure after 35 ns in all complexes. The flexibility of individual TNF and RANKL residues in each complex is presented in S13 Fig. In general, RANKL residues appear more flexible than TNF residues and the highly flexible RANKL region around terminal residues 155-160 of monomer A in the T8 complex may suggest a less stable structure for the complex (S13 Fig). Additional information on radii of gyration and B-factors for the complexes is shown in S3 Table (Supporting Information).
All-atom RMSD calculations for each compound in either TNF or RANKL revealed differences in the binding modes of the complexes. As expected, SPD304 appears particularly stable in the binding sites of both proteins throughout the simulations (Fig 5). Interestingly, T23 also has very low RMSD values in its complexes (avg.~1Å), thus indicating a high degree of stability. This may be due to enhanced interactions between T23 and the proteins, which suggest efficient T23 binding to TNF and RANKL. On the contrary, T8 displayed significant conformational changes throughout the runs, and especially in complex with TNF. In agreement with the experimental results, the increased flexibility of T8 in TNF may have reduced its binding to the protein.
The hydrogen bond (HB) analysis supported the experimental observations regarding the hydrophobic nature of the interactions that govern SPD304 binding to either protein target. It was observed that SPD304 was not involved in any HBs with TNF, thus further supporting the hypothesis of favorable hydrophobic interactions as the main driving force of SPD304-TNF binding. Similarly, HB interactions were not observed between SDP304 and RANKL.
On the other hand, it was suggested that despite the hydrophobic nature of the TNF and RANKL binding pockets, particular groups on T23 and T8 facilitate the formation of HBs with cavity residues. Therefore, the high stability of T23 in TNF and RANKL may be also attributed to HB networks between the ligand and particular TNF [Tyr119 (A chain), Ser60 (B chain), Gly121 (B chain)] and RANKL [Tyr214 (A chain), Trp192 (B chain), Ala193 (B chain)] residues. Also, the HB analysis on the T8 complexes may justify the significant structural variation of T8 in TNF compared to its relatively stable structure in RANKL, as depicted in Fig 5. Similar to SPD304, T8 did not participate in any HB interactions with TNF, and this may have enabled the molecule to frequently change conformations, whereas its structure is obviously restricted when bound to RANKL (Fig 5), mainly because it forms stable HBs with binding site residues [Asn275 (A chain), Gly277 (A chain), Gly278 (A chain), Asn275 (B chain)]. Representative conformations with main HBs between T8/T23 and TNF/RANKL (as obtained from the respective MD trajectories) are shown in S14 Fig. In agreement with the docking results, the compounds are in close proximity to the Leu120-Gly121-Gly122 β strand of TNF. This is particularly important, since it has been reported that SPD304 blocks the TNF trimerization and thus diminishes the biological activity of the protein by interacting with Gly122. [42] Additionally, the significance of Gly278 in RANKL trimer association has been indicated in several studies; [42,93,94] Douni et al. [42] have described an osteopetrotic mouse model that is based on a missense point mutation introducing a G278R substitution at the interface between the RANKL monomers. The mutated RANKL protein fails to form bioactive trimers, activate the RANK receptor, or induce osteoclastogenesis. [42] Modeling studies on SPD304-RANKL suggested that SPD304 should be located at an optimal position near Gly278 (~4 Å) to inhibit RANKL trimerization. [27] This observation coincides with the SPD304-Gly278 distance as observed from the 50 ns MD trajectory of the RANKL complex. Indeed, the average distance between the closest SPD304 atom and the Cα atom of Gly278 was calculated to be 4.25 ± 0.30 Å. Representative MD snapshots for the TNF and RANKL complexes with SPD304, T8 and T23 are presented in Fig 6. Residues Gly122 (TNF complexes) and Gly278 (RANKL complexes) are shown in red, while other residues (highlighted orange), such as Leu57, Gln61, Tyr59/119/151 (TNF complexes) and Tyr214/216/306, Asn275, Phe310 (RANKL complexes) are also in close proximity to the ligands and are considered crucial for either TNF or RANKL inhibition. For instance, TNF residue Tyr119 was observed to form stable HBs with T23, while RANKL residues (Asn275, Tyr214) show HB interactions with T8 and T23 as described above; other binding site amino acids, such as Tyr59/151, Gln61 and Leu57 have been reported as "contact residues" to SPD304 in the TNF crystal structure. [27] MM-PBSA free energy calculations were carried out for each complex in order to validate the predictions of our models. Indeed, this analysis verified the experimental results, since it demonstrated that T8 and T23 have comparable binding energies to SPD304 (Table 5). Particularly, T8 and T23 binding energy values in TNF are close to the binding energy of SPD304, and they differ from each other by~5 kcal/mol. It is important to highlight that such energy differences lie within the expected error of the method, and the corresponding binding energies are practically indistinguishable. [83,95] The high binding affinity of SPD304 to TNF in the absence of any HB interactions further supports the above claims regarding the hydrophobic nature of TNF binding. It is however predicted that T8 binding to TNF is associated with a high entropy penalty compared with the other compounds and also with its binding to RANKL (Table 5). This difference denotes the reduced hydrophobic character of the T8-TNF interaction and may also rationalize the above structural observation regarding the high conformational flexibility of T8 in TNF (Fig 5). On the other hand, the HB interaction between T8 and Asn275 (S14 Fig) in the RANKL complex may have resulted in a stable structure that is accompanied by a more favorable entropy term (Table 5). T23 appears to bind both protein targets equally strongly, with combined binding energy values that suggest most effective dual inhibition of TNF and RANKL than T8. Interestingly, in agreement with the potency experiments, our predictions ranked the three compounds with decreasing binding affinity to TNF as follows: SPD304 > T23 > T8. Finally, the MM-PBSA analysis revealed that the most favorable contributions to the binding enthalpy in all complexes are attributed to van der Waals interactions, followed by the nonpolar contribution to solvation, while the total electrostatics disfavor either TNF or RANKL complex formation.
As the modeled mechanism of action of T8 and T23 on both TNF and RANKL consists of an obstruction to the trimerization interface, the effect of inhibition at the level of trimerization was examined using chemical cross-linking experiments. As can be seen in Fig 7, both inhibitors can prevent the formation of trimers both for TNF and RANKL, when used at specific molar ratios. These results confirm that not only T8 and T23 are direct inhibitors of TNF and RANKL but also they do so by affecting the biologically-active configuration of these molecules, i.e. the trimeric form. Finally, we addressed aggregation-induced and general non-specific effects in two ways. Firstly, by plotting saturation curves using the binding assay raw data and performing Scatchard analysis. For all protein/ligands tested (TNF/T8, TNF/T23, RANKL/T8 and RANKL/T23), the linearity of the Scatchard plots resulted in an 'n' value of 1, suggesting that there is only one ligand binding site per protein molecule. Representative binding and saturation curves and Scatchard plots are illustrated in S15 and S16 Figs. Secondly, we did not find evidence for non-specific effects due to aggregation as judged by the method proposed by Feng & Stoichet [96] (Table 6). Briefly, we examined the inhibitory effect of T8 and T23, as well as SPD304, on TNF and RANKL, using varying concentrations of the two tested proteins in the presence of varying concentrations of solubility enhancing agent PEG3350. Inhibition remained practically unchanged by fluctuating these two concentration parameters, indicating that both small molecules inhibit TNF and RANKL via specific mechanisms as proposed by our computational model.
Collectively, our computational approach succeeded to identify two compounds that were experimentally evaluated as dual TNF and RANKL inhibitors, with T8 being the one with the lowest toxicity. Within the proposed strategy, the Enalos Cloud Platform emerges as a key component for the evaluation of novel small molecules that have not been experimentally evaluated or even synthesized and can also be expanded for several drug-related properties (e.g., inclusion of toxicity and solubility models). Our proposed methodology and tools can also be expanded and applied to other biological targets that are now gaining attention. Specifically, in Table 6. Physicochemical properties of SPD304, T8 and T23, and determination of their dissociation constant with TNF or RANKL under different assay conditions. PEG3350 was used as co-solvent in order to enhance inhibitors solubility and thus eliminate the possibility of aggregates formation. the context of developing novel treatments for chronic inflammatory diseases, the proposed molecular scaffolds, T8 and T23, could be investigated as lead compounds for future drug design targeting TNF and RANKL inhibition.

Conclusions
In this work, we have identified computationally and validated experimentally two small-molecule compounds that function as direct inhibitors of TNF by blocking the protein-protein interaction (PPI) between the cytokine and its receptor. Both compounds (T8 and T23) were confirmed to be direct inhibitors of TNF function, with IC 50 values comparable to those of the potent inhibitor SPD304, and showed low toxicity even at concentrations above 100 μM. Our computational approach combined structure-based modeling with ligand-based modeling. The predictive ligand-based model was made publicly available [60] through the Enalos Cloud Platform (http://enalos.insilicotox.com/TNFPubChem) and can be used for the predictions of TNF inhibition (specific NF-κB induction) of novel small molecules.
Most importantly, our proposed small molecules, T8 and T23, were validated experimentally to act as dual inhibitors, being able to hinder both TNF and RANKL function at the low micromolar range; thus, they are proposed as the second and third examples of dual PPI inhibitors of TNF and RANKL. T8 and T23 were proven to directly bind to both TNF and RANKL and affect the formation of biologically-active trimers as predicted by our computation model. Furthermore, molecular dynamics calculations provided complementary information regarding the interactions at the molecular level of the two compounds in the TNF/RANKL complexes. The proposed molecular scaffolds could be further optimized in drug design targeting TNF and RANKL, ultimately aiming at the development of novel treatments for a range of inflammatory and autoimmune diseases.