Insights into the Interactions of Fasciola hepatica Cathepsin L3 with a Substrate and Potential Novel Inhibitors through In Silico Approaches

Background Fasciola hepatica is the causative agent of fascioliasis, a disease affecting grazing animals, causing economic losses in global agriculture and currently being an important human zoonosis. Overuse of chemotherapeutics against fascioliasis has increased the populations of drug resistant parasites. F. hepatica cathepsin L3 is a protease that plays important roles during the life cycle of fluke. Due to its particular collagenolytic activity it is considered an attractive target against the infective phase of F. hepatica. Methodology/Principal Findings Starting with a three dimensional model of FhCL3 we performed a structure-based design of novel inhibitors through a computational study that combined virtual screening, molecular dynamics simulations, and binding free energy (ΔGbind) calculations. Virtual screening was carried out by docking inhibitors obtained from the MYBRIDGE-HitFinder database inside FhCL3 and human cathepsin L substrate-binding sites. On the basis of dock-scores, five compounds were predicted as selective inhibitors of FhCL3. Molecular dynamic simulations were performed and, subsequently, an end-point method was employed to predict ΔGbind values. Two compounds with the best ΔGbind values (-10.68 kcal/mol and -7.16 kcal/mol), comparable to that of the positive control (-10.55 kcal/mol), were identified. A similar approach was followed to structurally and energetically characterize the interface of FhCL3 in complex with a peptidic substrate. Finally, through pair-wise and per-residue free energy decomposition we identified residues that are critical for the substrate/ligand binding and for the enzyme specificity. Conclusions/Significance The present study is the first computer-aided drug design approach against F. hepatica cathepsins. Here we predict the principal determinants of binding of FhCL3 in complex with a natural substrate by detailed energetic characterization of protease interaction surface. We also propose novel compounds as FhCL3 inhibitors. Overall, these results will foster the future rational design of new inhibitors against FhCL3, as well as other F. hepatica cathepsins.


Introduction
Fascioliasis or hepatic distomatosis, caused by the food-borne trematodes Fasciola hepatica and Fasciola gigantica, is considered one of the most important parasitic diseases, which constitutes a serious public health problem and has a significant veterinary relevance. Economically important animals affected by this disease include cattle, sheep and goats [1,2]. Fascioliasis symptoms are host-specific, but generally comprise reduced milk and wool yields, weight gains, and fertility [3]. Recently, the global burden of fascioliasis was calculated and it has been estimated that 2.6 million people are infected with Fasciola spp. [4].
Despite the economic losses as well as the negative impact on human health, chemotherapy is currently the only viable parasite control mechanism. Benzimidazoles, in particular triclabendazole, are the most commonly-used drugs. Their targets are both immature and mature forms of the parasite, but their continued use has led to drug resistance [5]. Therefore, the search for new strategies and target molecules for the development of novel fasciolicide drugs is urgently required.
The most abundant molecules found in F. hepatica secretions are papain-like cysteine proteases, termed cathepsins, which are grouped in cathepsin L and B families [6,7]. They are secreted in vesicle packages by gastrodermal cells into parasite gut lumen, and then released into host tissues [8]. In recent decades, the role of these proteases has been widely studied due to their importance as potential targets for the treatment of many parasite infections [9]. Cathepsins are critical for the development and survival of the parasite within the mammalian hosts. They participate in the digestion of host components such as fibronectin, collagen and albumin, which facilitates parasite migration and feeding, and can also degrade immunoglobulins and T cell surface molecules, thereby promoting immune evasion [10][11][12]. These proteases have an active site formed by five subsites, i.e., S3-S2-S1-S1'-S2', the substrate specificity being governed by S2 and S3 subsites [13]. An analysis of the residues comprising the S2 and S3 subsites in several members of the cathepsin L family, reveals the divergence within these subsites, in particular at positions that have the greatest influence on substrate recognition, i.e., 61,67,157,158 and 205 (papain numbering) [6,14,15].
F. hepatica can regulate the differential expression of cathepsins during its life cycle. These expression patterns have been associated with the functional diversity of papain-like proteases [12,16,17]. Previous studies have detected cathepsin B (FhCB) and L3 (FhCL3) secretion in early invasive-stage parasites [18]. The prevalence of cathepsin L-like activity after excystation was observed in in vitro assays [19]. Also, experiments with an RNAi derived from an FhCL1 gene fragment encoding a region conserved across the cathepsin L family, led to the induction of phenotypes with abnormal motility in F. hepatica newly-excysted juveniles (NEJ) and a significant reduction of rat intestinal wall penetration [20]. The predominant cathepsin, found by proteomic analysis in the NEJ excretion/secretion products, is procathepsin L3 (proFhCL3) [21]. The zymogen form of this peptidase progressively changes to the mature enzyme during the first 48h of NEJ development, which is mainly involved in penetration and immune response evasion [18]. Additionally, partial protection against fascioliasis in rats was obtained using a recombinant form of FhCL3 [22]. These findings suggest that FhCL3 could be a potential target for new therapies against early stages of parasite infection.
It is widely accepted that the interaction patterns between enzymes and their natural substrates provide insights for drug design [23,24]. Accordingly, some studies have been conducted to assess the substrate specificity of FhCL3, as well as the role of some enzyme residues (i.e., H63 and W69) in the substrate-binding process [25]. It was also demonstrated the strong preference of this cathepsin for Pro and Gly residues at P2 and P3 sites, respectively, through the usage of positional scanning synthetic combinatorial libraries [25]. The previous finding was linked to the FhCL3 collagenolytic activity, since type I and type II collagens possess repeating Gly-Pro-Xaa motifs [26]. To date some in silico modeling tools have been applied to provide structural insights into the interaction of FhCL3 with peptidic substrates, as well as to predict the affinity of the enzyme for various peptides [27]. However, no previous detailed energetic analysis of the FhCL3-substrate interactions has been performed yet. Therefore, we believe that the latter is required not only to complement previous structural analyses, but also to establish at an atomic level the nature of the interactions present at the complex interfaces, as well as to quantify their energy contribution to the binding process.
Here we carried out a thorough energetic study of the binding site of FhCL3 in complex with a peptidic substrate based on a homology model. Furthermore, an FhCL3 model and HuCatL crystal structure were used for Virtual Screening (VS) studies and compounds with higher selectivity for the former enzyme were subsequently selected according to their Autodock Vina energy-scores (S vina ) [28]. Finally, binding affinities were estimated through MM-GBSA absolute binding free energy (ΔG bind ) calculations [29][30][31] based on thermodynamic ensembles generated with molecular dynamics (MD) simulations.
In order to provide insights into the binding mode of peptidic substrates to FhCL3, a 3D model of this enzyme in complex with a specific peptide, ACE-AGPR#NAA-NME, was also built. The procedure for obtaining the complex structure was the same reported by Robinson et al [27].

Virtual screening
VS against the FhCL3 homology model and the HuCatL structure (PDB: 2YJC) was carried out with AutoDock Vina v4.0 software [28]. Synthetic lead compounds from HitFinder database of the Maybridge British company (http://www.mybridge.com) were selected for this study. The VS was run using software default settings, however, the number of energetically-degenerated poses was set to ten. During docking simulations, all rotatable bonds of each ligand were allowed to freely move around the bond axes, while the protein structure was kept fixed. The grid box used to define the screening was centered on the catalytic cysteine residue, i.e., C25 of FhCL3 and HuCatL, employing AutoDockTools. Box dimensions X, Y and Z were set to 16.5, 21 and 15 Å, respectively.
To identify compounds with higher specificity for FhCL3, hit selection was based on the relative affinity for FhCL3 and HuCatL obtained from ΔS vina values. Furthermore, these hits were submitted to the DrugMint server [41] for the selection of drug-like compounds based on the best probability scores. The most probable pose of each compound within the FhCL3 binding site was selected by visual inspection. Some criteria taken into account for pose selection were (i) the number of hydrogen bonds between the compound and the enzyme residues and (ii) the available information for other compounds containing some similar chemical groups in complex with papain-like proteases [42,43].

Ligand parametrization
The 3D structures of the ligands were obtained from the SDF format using Babel [45]. Then, Avogadro [46] was used for ligand protonation at pH = 7.4 and for subsequent steepest-descents energy minimization using Generalized Amber Force Field (GAFF) parameters [47]. Minimized structures were then optimized at HF/6-31G Ã level using Gaussian 09 package [48]. Electrostatic potentials (ESPs) for the optimized structures were finally generated by singlepoint calculations in Gaussian 09 with HF/6-31G Ã method and Merz-Kollman (MK) scheme [49]. Partial atomic charges were fitted to the ESPs through the Restricted Electrostatic Potential (RESP) method [50] implemented in the Antechamber program [51]. Likewise, ligand atom types, bond and dihedral angles, atomic masses and bond lengths were obtained from GAFF using Antechamber [51].

Energy minimization and molecular dynamics simulations
EM of free FhCL3 and all protease-ligand complexes was performed using GROMACS v4.6.3 [52] with the AMBER99SB-ILDN force field for the enzyme [53] and GAFF for the ligands. Briefly, protonation states of the FhCL3 ionizable residues were determined at pH = 7.4 by using the PDB2PQR server [54]. All systems were solvated with explicit TIP3P water molecules [55] in a dodecahedral box whose edges were placed at a minimum distance of 1 nm from the solute surface, and neutralized by replacing water molecules with Na + counter ions. EM was carry out by 50000 of steepest descents steps with a tolerance of 10 kJ/(molÁnm), to relax high energy interactions and steric clashes.
Subsequently, the equilibration procedure was conducted in two steps: NVT and NPT ensembles keeping the solute heavy atoms restrained. During the 200 ps NVT equilibration, temperature was kept constant at 300 K using the velocity-rescale thermostat [56]. The subsequent 200 ps NPT equilibration was performed at a temperature of 300 K using the same temperature coupling algorithm and at a pressure of 1 bar with the Parrinello-Rahman barostat [57]. A time step of 2 fs was employed to integrate the equation of motion using the Leap-Frog algorithm [58]. Random initial velocities taken from the Maxwell-Boltzmann distribution were assigned to the atoms of each system at 300 K. Cutoff radii of 1.4 nm and 1.0 nm were established for the calculation of van der Waals and short-range electrostatic interactions, respectively. The Particle Mesh Ewald (PME) method [59] was employed to handle long-range electrostatic interactions. Periodic boundary conditions were used at the boundaries of the unit cell. Neighbor lists were defined by a cutoff radius of 1.0 nm and were updated every 10 fs. All bond lengths were constrained with the LINCS method [60].
The productive run time was 130 ns and 100 ns for FhCL3-ligand and FhCL3-substrate complexes respectively. System coordinates and initial velocities were taken from the NPT simulation output. Constant temperature and pressure of 300 K and 1 bar were maintained during the simulations with the velocity-rescale thermostat [56] and the Parrinello-Rahman barostat [57], respectively. The integrator, cutoff radii, constraint algorithm etc. were identical to those used during the equilibration steps.

Binding free energy calculation and decomposition
MM-GBSA and the Molecular Mechanics/Poisson-Boltzmann Surface Area (MM-PBSA) are computationally efficient methods for estimating the binding free energy (ΔG bind ) of protein-ligand complexes [29-31, 61, 62]. In these methods, ΔG bind is expressed through the well-known equation: Each term on the right-hand side of the former equation is expressed as follows in the MM-GBSA and MM-PBSA formalism: E MM, is the energy of the complex, the ligand or the protein in the gas phase, calculated as the sum of the internal energy (E inter ), the van der Waals energy (E vdw ) and the electrostatic energy (E elec ) as expressed below: The second term of Eq 2, ΔG solv , represents the free energy of solvation calculated by means of implicit-solvation models. This term is further decomposed into the sum of the polar solvation (G GB/ PB ) and the non-polar solvation (G SA ) contributions (Eq 4) In the Poisson-Boltzmann model (PB), the polar contribution is computed through the wellknown PB equation. On the other hand, in the case of GB models, the polar solvation component (G GB ) is calculated through Eq 5 proposed by Still et al [63]. Even though the PB model is considered as a more rigorous approach, GB models are less computationally-demanding and often give fairly satisfactory predictions [61,64].
The term ε w is the dielectric constant of the solvent (e.g. water). i and j represent the solute atoms, being r ij the distance between them, q i and q j , their partial charges, and R i and R j , their effective Born radii.
The non-polar solvation contribution is calculated through Eq 6, where SA stands for the solvent-accessible surface area of the solute. Coefficients γ and β are empirical constants with values of 0.0072 kcal/mol and 0, respectively, for the GB models [65].
In this study, MM-GBSA free energy calculations were performed using MMPBSA.py module of AMBER12 [66]. The snapshots of the complex, the receptor and the ligand were extracted from a single desolvated trajectory. GB OBC1 model (igb = 2) with mbondi2 radii [67] was used for estimating ΔG GB . Snapshots belonging to the equilibrated trajectory were used for the calculation of effective binding free energy (ΔG eff ), which comprises all the energy terms in the right-hand member of Eq 2 except for the entropy contribution. Additionally, the snapshots were considered as statistically independent from each other during ΔG eff calculations. Conformational entropy associated with ligand binding was estimated by Normal-Mode Analysis (NMA) [68,69]. For entropy calculations, seventy frames evenly extracted from the productive MD simulations were taken. Prior to normal-mode calculations the complex, the ligand and the receptor were subjected to 50000 cycles of EM using a distance-dependent dielectric constant of 4r (r being the distance between atom pairs) and a dmrs value of 10 -4 kcal/ (mol Å) as the convergence criterion for the root-mean squared gradient. Per-residue effective free energy decomposition (ΔG res ) was carried out in order to determine the more important residues involved in FhCL3-ligand and FhCL3-substrate interactions [70]. Also, pair-wise effective free energy decomposition was performed for the FhCL3-substrate complex [70].

Trajectories analysis and determination of protein-ligands interactions
The trajectories were analyzed with tools provided by GROMACS v4.6.3 package [52]. The Root Mean Square Deviation (RMSD) was calculated during the productive run with respect to both the starting and average structures. Visual Molecular Dynamics (VMD) v1.9.1 [71] was used to visualize trajectories and to convert the GROMACS MD trajectory format (xtc) into AMBER trajectory format (crd). Hydrogen bonds established between each ligand and FhCL3 were calculated employing a donor-acceptor distance cutoff 3.5 Å and a donor-acceptor-hydrogen angle cutoff 30 degrees during the equilibrated productive trajectory. PYMOL v1.6 [72] and LigPlot [73] were used for visualization, and Gnuplot v4.4 for graphic analysis of time profiles.

Results and Discussion
Energetic characterization of the interaction interface of FhCL3 in complex with a peptidic substrate A 3D-model of FhCL3 was generated based on a MSA of twelve papain-like proteases (S1 Fig) and using the crystal structure of proFhCL1 C25G (PDB: 2O6X) [14] as a template. The assessment methods employed here confirmed the high quality of the 3D-model of FhCL3 (S1 Table and S1-S4 Figs), thereby suggesting its suitability for further structure-based analyses.
MD simulations in combination with MM-GBSA per-residue and pair-wise free energy decomposition were performed for the FhCL3-peptide complex. Convergence and stability of the MD simulation was monitored through the inspection of structural and energetic properties. RMSD values showed different time evolution when calculated for the heavy atoms of the whole complex and for those of the peptide (Fig 1A). In this regard, the whole complex showed relatively stable RMSD values, whereas the peptide displayed structural fluctuations during the first 20 ns, indicating a delay on its stabilization into binding site. This difference is a consequence of the slight contribution of the peptide atoms to the global RMSD. Additionally, instantaneous ΔG eff values were calculated (Fig 1B). It is noteworthy that the accumulated mean values of ΔG eff reached relatively stable values during the MD simulation. Overall, these results suggest that 20 ns is a suitable equilibration time and, therefore, the last 80 ns were used to calculate mean ΔG eff values.
Ten residues, i.e., Q19, G23, G25, W26, G67, W69, Y143, T161, H162 and W184, of FhCL3 largely contribute to the substrate binding (ΔG res -1.0 kcal/mol) according to the predictions of the per-residue free energy decomposition protocol (Fig 1C). Our results showed that most peptide-FhCL3 interactions are governed by non-polar contributions (i.e., mainly van der Wals interactions). It is worth noting that some residues widely conserved within the cathepsin L family, i.e., Q19, G23, C25, W26, G67, H162 and W184 (S1 Fig) are included within the previous list. Particularly, the catalytic residues C25 and H162 are hot-spots (residues whose side chain contribute in more than 1 kcal/mol to ΔG eff ) which establish important pair-wise interactions with the peptide residues from the P2 to the P1' site (Table 1). Q19 is another hot-spot with a large electrostatic per-residue contribution located within the oxyanion hole of the enzyme [74]. This residue strongly interacts with the P1' site residue through the formation of a hydrogen bond (S5 Fig). Likewise, G67 was predicted as an important residue for anchoring the substrate through the formation of a hydrogen bond, in this case, with the residue at P2 (S5 Fig). Interestingly, equivalent interactions involving Q19 and G67 have been observed in the 3D structures of other papain-like proteases in complex with peptidomimetic compounds [75,76]. Finally, the side chain of W184 and, to a less extent, that of W26 establish favorable van der Waals interactions with residues at P1'-P3' and P2 sites, respectively (Table 1 and Fig 1C  and 1D). Overall, our predictions are in agreement with the essential roles attributed to some of the previously-mentioned residues within the papain-like family.
On the other hand, some non-conserved FhCL3 residues such as W69, Y143 and T161 have the largest per-residue free energy contributions to the complex formation (Fig 1C and 1D). W69 establishes strong van der Waals interactions with the substrate residues Ala(P4), Gly(P3) and Pro(P2) ( Table 1). Interestingly, our predictions showed that the ring of Pro(P2) adopts a nearly-perpendicular conformation with respect to the indol group of W69 (Fig 1D), which precludes the stabilization through stacking interactions proposed before [21,25]. Probably, a crucial role of Pro at the P2 site, given its bend-inducing capacity, is to promote the appropriate conformation of the substrate backbone within the enzyme binding site, especially that of the residue at P3, which strongly interacts with W69 (Table 1). In addition, the structural analysis showed the occurrence of close contacts between the backbone of residues at P4 and P3 sites with the indol ring of W69. Therefore, the substitution of the previous substrate residues by bulky amino acids could disrupt the interface complementarity thereby reducing the binding affinity as has been proven at least for the P3 site [25]. All these predictions clarify from an energetic point of view the experimental results that established the importance of W69 in FhCL3 Interactions with a Peptide and Potential Inhibitors determining the enzyme specificity for Gly and Pro residues at P3 and P2, respectively [21,25,27]. Of note, the MD simulation of the peptide-FhCL3 complex performed here also confirmed that the most representative rotameric configuration of W69 side chain in the bound state of the enzyme corresponds to that predicted by Corvo et al based on molecular modeling approaches [25]. This particular conformation partially occludes the S2 subsite and favors the interaction with Gly(P3) as predicted through our energetic analysis (Table 1) and also suggested before [25,27]. In the case of Y143, it was predicted the formation of a hydrogen bond comprising its phenol group and the carbonylic oxygen of Asn(P1') (S5 Fig), which, in addition to its van der Waals interaction with Ala(P3'), explains the large energy contribution of this residue (Table 1 and Fig 1D). In general, we believe that the interaction with Y143 might enhance the specificity of ligands toward FhCL3, given that this residue is not conserved within the papain-like family and also bears a hydroxyl group with the capacity of forming specific hydrogen bonds with the substrate.
Additionally, we obtained that the backbone of T161 has the largest energy contribution to the complex formation (Fig 1C and 1D), which mainly arises from the hydrogen bonds involving its carbonyl oxygen (O) and the amidic hydrogen atoms of residues at P1 and P1' sites (S5 Fig). Note that although position 161 is not conserved throughout the papain-like family [21] (S1 Fig) and its energy contribution to the substrate binding is large, at least for the complex analyzed here, its nature seems to be irrelevant. The latter stems from the fact that its interactions with the peptide residues are mediated mostly by its backbone oxygen atom rather than by its side chain hydroxyl group, which extends away from the interface in disagreement with previous suggestions [21]. Remarkably, equivalent interactions have been observed between D158 of papain and the amidic hydrogen atoms at the P2 site of petidomimetic inhibitors (PDBs: 1PPP and 1CVZ) [75,77], thereby reinforcing our previous predictions. Finally, we predicted the low energy contribution of H63 to the substrate binding. Hence, this position is not likely to be involved in ligand binding. This result explains the little impact of the H63N mutation on the specificity substrate profiles of FhCL3 [25].

Virtual screening
The VS protocol used for the identification of potential ligands of FhCL3 was first validated through the non-covalent re-docking of nitrile within the binding site of HuCatL. This inhibitor has the sulfone chemical group (Fig 2A), which is common in many parasitic cysteine protease inhibitors [78,79]. Irreversible inhibition mechanism occurs through covalent bond formation with the thiolate of the catalytic cysteine [80]. The RMSD for the heavy atoms of the re-docked pose having the highest S vina value with respect to the experimental binding mode was only 3.4 Å (Fig 2B). Additionally, hydrogen bonds between nitrile and residues G68 and D162, were also reproduced in the predicted HuCatL-nitrile complex (Fig 2C). Note that only non-covalent interactions were taken into account in the re-docking simulation, therefore, the FhCL3-nitrile pre-complex rather than the actual covalent complex was modelled here. Overall, this procedure provided a reasonable prediction of nitrile experimental binding mode which, in turn, validates our docking protocol.
In order to reduce cross-inhibition between host and parasite targets, we took into account only the potentially-selective ligands for the parasitic enzyme. Through VS calculations, twelve compounds with relative binding free energy values lesser than -1.50 kcal/mol (ΔS vina <-1.50 kcal/mol) between FhCL3 and HuCatL complexes were identified (S2 Table). Moreover, according to the DrugMint scores, only five compounds constitute potential drug-like non-peptidic inhibitors (S2 Table). The names of these selective compounds and their S vina values are listed in Table 2. Non-peptidic inhibitors are considered as the best strategy for in vivo inhibition in order to avoid degradation by proteases. In this sense, structure analysis suggests a common peptidomimetic scaffold among the selected compounds. Besides, they all possess a certain number of aromatic moieties, i.e., phenyl, naphtalene and bencyl groups, which increases their hydrophobicity. Those moieties could establish favorable hydrophobic interactions with the non-polar residues of FhCL3 binding site (Fig 3). Interestingly, several cysteine protease inhibitors reported so far bear aromatic functional groups in their structures [43,81,82]. On the other hand, the heterocyclic rings, i.e., triazole, pyrrol and isoxazole, present in some of the selected compounds, could establish polar interactions with various active site residues (Fig 3).
Some of the selected compounds, i.e., HTS12701 and RH0159, share a thiomethylene-ketone moiety (Fig 3A and 3E), which is present in various reversible inhibitors of cysteine proteases from Plasmodium falciparum and Leishmania donovani parasites [81]. The inhibitory mechanism of these compounds might involve the formation of transition-state-like hemithioacetal complexes with C25 at the protease active site [81]. On the other hand, the hydrazide moiety was observed in compounds HTS11101 and RH01594 (Fig 3D and 3E). This moiety is frequently present in inhibitors of other parasite cysteine proteases [43,81]. Finally, in the case  of BTB03219, it can be highlighted the presence of an aryl_CF3 substituent, which could increase the affinity for FhCL3 through halogen bond formation (Fig 3B), as has been reported for other cysteine proteases like HuCatL [44]. The VS protocol performed in this paper led to the identification of five compounds as possible inhibitors of FhCL3. However, the refinement of docking results is needed to more accurately predict the binding modes of the compounds to the enzyme, as well as the absolute and relative binding free energy values of the complexes. In this sense, the combination of conformational space-exploring techniques and end-point free energy calculation methods such as MD simulations and MM-GBSA, respectively, constitutes a useful approach to assess the stability of protein-ligand complexes [83][84][85].

Binding free energy analysis
Prior to free energy calculations, the stability and convergence of MD simulations were monitored through per-frame ΔG eff time profiles. In this regard, fluctuations for the FhCL3-ligand complexes are shown together with accumulated mean values (Fig 4). ΔG eff values are quite variable for each snapshot, but the accumulated mean values become stable in most of cases. Besides, significant differences in the stabilization time for every complex are clearly observed (Fig 4). The latter may result from the fact that for some FhCL3-ligand complexes the initial structures predicted by docking were farther from their MD average structures than for others (S6 Fig). Accordingly, the subsequent ΔG bind calculations and the analyses of binding determinants and hydrogen bond formation are based on snapshots collected after the equilibration time of productive MD simulations.
To better understand the main energy components contributing to the formation of the different FhCL3-ligand complexes, we analyzed the MM-GBSA free energy components, i.e., van der Waals, electrostatic, polar solvation, non-polar solvation and entropy (TΔS) contributions ( Table 3). The results indicate that non-polar contributions (ΔE vdw +ΔG SA ) dominated the binding process, because the complex formation reduces the Solvent Accessible Surface Area (SASA) and the enzyme binding site has hydrophobic-interacting residues, i.e., W69, V160, V209 and W184, which establish favorable van der Waals interactions with the different ligands. Conversely, the polar (ΔE elec +ΔG GB ) and entropy terms have unfavorable contributions in all cases.
MM-GBSA results show that HTS12701 is the compound with the best ΔG bind value (-10.71 kcal/mol), followed by BTB03219 (-8.16 kcal/mol), both of them similar to the positive control (nitrile, -10.55 kcal/mol). Furthermore, K i values calculated from theoretical ΔG bind values suggest that HTS12701 and BTB03219 bind the enzyme in the sub-micromolar and micromolar concentration ranges (Table 3), respectively, the former being predicted as a tight-binding inhibitor. The theoretical K i values for the rest of the ligands predict their low affinity interactions with the enzyme. Interestingly, the comparison between both the average structure of the productive MD simulation and initial docking pose shows that compounds like HTS12701 and BTB03219 keep bound to FhCL3 active site through specific interactions like nitrile (S6A- S6C  Fig). Particularly, the former compound adopts a conformation during the MD simulation which places the thiomethylene ketone moiety closer to C25, in agreement with the proposed binding mode of this compound series [81]. On the other hand, the remaining compounds, i.e., SPB07884, HTS11101 and RH01594 (S6D- S6F Fig) do not form stable complexes during simulation time, which explains their more unfavorable ΔG bind values.
Finally, by comparing the MM-GBSA and S vina values we can observe changes in the ranking list of the selected compounds. However, the energy values lie within the similar ranges of free energy values (from -11 kcal/mol to -8 kcal/mol) for the best hits (compare Tables 2 and  3). As MM-GBSA is considered to be a more accurate method for estimating relative affinities [61,64,83], its predictions were taken as the final criterion for hit ranking.

Per-residue free-energy decomposition-Insights into FhCL3-ligand interactions
In order to get insights into FhCL3-ligand interactions, the MM-GBSA approach was employed to decompose the binding effective free energy of the high-affinity complexes into perresidue contributions (Fig 5). This allowed us to identify the residues at the enzyme active site with significant energy contributions to the ligand binding. In general, we observed that some  of these residues, e.g. G67, V160 and T161 lie within the S2 and S3 subsites, indicating substrate-like binding modes of the ligands to FhCL3. The subsequent decomposition of ΔG res into backbone and side chain energy contribution led to the identification of six critical (warm/ hot-spot) residues, i.e., Q19, C25, W69, V160, W184 and V209, whose respective side chain energy contributions were larger than those of the backbone. The latter suggests the essential role of these specific residues in the formation of some FhCL3-ligand complexes (Fig 5). It is also worth noting the prevalence of per-residue non-polar energy contribution in all systems, in agreement with ΔG eff decomposition results shown in the previous section. Accordingly, most of the previously-mentioned residues are hydrophobic and, thus, can establish strong van der Waals interactions with compounds containing aromatic groups (Fig 5).
For the FhCL3-nitrile complex, four energetically-relevant residues, i.e., G67, W69, V160, and T161 (Fig 5), were identified. W69 and V160 are likely to interact with hydrophobic moieties (Fig 6A), while T161 forms a stable hydrogen bond with the N2 atom of nitrile, equivalent to that described before for the FhCL3-peptide complex ( Fig 6A). Overall, these results are consistent with a previous work which state that nitrile non-covalent interactions comprise the enzyme active site and, especially, the specificity substrate subsites (S2 and S3) [44].
The complexes of FhCL3 with the best hits, i.e., BTB03219 and HTS12701, and nitrile share a group of common energetically-relevant residues, i.e., G23, C25, W26, G67, V160, T161and H162 (Fig 6). However, some other residues show significant differential energy contributions among the three complexes. For example, W69, which largely contributes to the formation of FhCL3-nitrile and FhCL3-BTB03219 complexes, seems to be irrelevant for HTS12701 binding to the enzyme. A similar behavior was observed for G68. Conversely, HTS12701 establishes strong interactions with Q19 and W184, residues belonging to S1-S1' subsites, not observed in the other two complexes. Both residues are conserved throughout the cathepsin L family, which suggests their essential role in substrate binding, as observed for the FhCL3-peptide complex analyzed before. Therefore, HTS12702 may display low selectivity toward the proteases of this family. On the other hand, BTB03219 preferentially interacts with residues of the S2-S3 subsites, which are believed to control the enzyme specificity. Hence, though less potent than HTS12702, it is likely to be a more selective inhibitor. Note, however, that both compounds are predicted to display a roughly equivalent specificity for FhCL3 with respect to HuCatL, according to their respective ΔS vina values (S2 Table). Further insight into the structural determinants for the interaction of nitrile, BTB03219 and HTS12701 with FhCL3 was obtained through the hydrogen bond and hydrophobic contact analysis at the interfaces of these complexes. For example, W69 establishes hydrophobic interactions with aromatic rings of both nitrile and BTB03219, which is consistent with the large van der Waals energy contribution to the ΔG res values of this residue (Fig 6). Another residue showing favorable hydrophobic interactions with the three compounds is V160, whose side chain interacts with the ligand hydrophobic moieties lying within the S2 subsite. On the other hand, the carbonyl oxygen atom (O) of T161 is involved in hydrogen bond formation with both BTB03219 and nitrile, which suggests the importance of this position in accommodating hydrogen donor groups within the S2 subsites. Interestingly, in this particular position the nature of the residue is irrelevant for protein-ligand interactions, as was also obtained for the FhCL3-peptide complex. Furthermore, Q19 at the S1' subsite forms two alternative hydrogen bonds with acceptor nitrogen atoms of HTS12701. Unlike the previous case, the nature of this residue is important, since the hydrogen bond involves the NE2 atom of its side chain. In fact, Q19 is part of the oxyanion hole of cysteine proteases and stabilizes the tetrahedral intermediate of protease-substrate complexes [74]. Finally, the carbonyl oxygen atoms of Q19 and G66 may form halogen bonds with the fluorine atoms of BTB03219 (Fig 6), thereby contributing to the affinity of this particular compound for FhCL3. Interestingly, halogen bonds have been observed in the crystal structure of HuCatL in complex with nitrile and are believed to contribute to the binding process [44]. Similar analyses were carried out for the compounds with less favorable ΔG bind values (S7 and S8 Figs).
A close inspection of the representative structures of the FhCL3-ligand complexes also revealed that the side chain of W69 can adopt different rotameric conformations depending on the nature of the ligand bound to the enzyme (Fig 5 and S7 Fig). Specifically, we observed that in the FhCL3-nitrile complex the side chain of W69 has a coaxial orientation with respect to the binding site, while in the other complexes it has a perpendicular conformation that partially occludes the S2 subsite, as obtained for the FhCL3-peptide complex analyzed before. It is worth saying that even though both rotameric conformations have been proposed before [25], this is the first time that their occurrence is predicted through MD simulations of FhCL3 in complex with different ligands. Therefore, our results reinforce the importance of W69 side chain rotation for the accommodation of ligands with different shapes within the enzyme binding site, as suggested before [21,25].
Overall, we proposed some crucial interaction patterns between the selected compounds and FhCL3. Finally, as expected, the best hits interact with residues previously characterized for the FhCL3-peptide complex, which indicates their substrate-like binding mode. The list of most favorable residues (C25, W26, G67, V160, T161 and H162) for ligand interaction is roughly similar for most of the compounds analyzed here. Remarkably, there are some distinctive residues of FhCL3 mediating the interactions with the ligands through its side chains, i.e., W69 and V160, which have been identified as substrate-specificity determinants [21].

Conclusions
In the present study, a computational protocol consisting of VS, MD simulations, and binding free energy calculations was used to search for novel and selective inhibitors against FhCL3. Additionally, the results obtained here enhanced our understanding of the binding determinants of this protease with peptidic substrates and organic ligands. Free energy calculation through a more accurate method, i.e., MM-GBSA, proved to be a useful post-docking refinement tool, since a new ranking list different from that of VS, was finally obtained. The further decomposition of the overall binding free energies into individual energy terms indicated that the van der Waals interactions are the dominant force for substrate/ligands binding. Moreover, the decomposition of the binding free energy into per-residue contributions showed that the non-polar side chain of residue W69 establishes critical van der Waals interactions with the substrate and some ligands. This agrees with a previous work that highlights the importance of this residue for FhCL3 specificity [25,27]. Interestingly, we also observed that the side chain of this residue may adopt different conformations to accommodate different ligand groups within enzyme binding site. The previous results suggest that a flexible docking protocol allowing the rotation of the side chain of W69 would lead to the identification of more diverse scaffolds of FhCL3 ligands. Roughly six different residues, i.e., C25, W26, G67, V160, T161 and H162, were predicted as energetically-important for ligand and/or substrate anchoring inside the FhCL3 active site via hydrophobic and hydrogen bond interactions in almost all complexes. However, the nature of T161, one of the residues with very large energy contribution, seems to be irrelevant, since its main interactions involved the backbone oxygen atom, suggesting that the variation of this residue within the S2 subsites of papain-like proteases may not necessarily affect the substrate binding. Overall, we proposed HTS12701 and BTB03219 as promising lead compounds that could be FhCL3 inhibitors. We expect that the structural insights obtained in this study will facilitate the design of novel inhibitors against FhCL3.