Binding-Site Assessment by Virtual Fragment Screening

The accurate prediction of protein druggability (propensity to bind high-affinity drug-like small molecules) would greatly benefit the fields of chemical genomics and drug discovery. We have developed a novel approach to quantitatively assess protein druggability by computationally screening a fragment-like compound library. In analogy to NMR-based fragment screening, we dock ∼11000 fragments against a given binding site and compute a computational hit rate based on the fraction of molecules that exceed an empirically chosen score cutoff. We perform a large-scale evaluation of the approach on four datasets, totaling 152 binding sites. We demonstrate that computed hit rates correlate with hit rates measured experimentally in a previously published NMR-based screening method. Secondly, we show that the in silico fragment screening method can be used to distinguish known druggable and non-druggable targets, including both enzymes and protein-protein interaction sites. Finally, we explore the sensitivity of the results to different receptor conformations, including flexible protein-protein interaction sites. Besides its original aim to assess druggability of different protein targets, this method could be used to identifying druggable conformations of flexible binding site for lead discovery, and suggesting strategies for growing or joining initial fragment hits to obtain more potent inhibitors.


Introduction
Since the completion of the human genome, there has been much interest in the ''druggability'' of new potential drug targets, and what fraction of the proteome is druggable. In this paper we are concerned with protein druggability in the sense defined by Hopkins and Groom [1], i.e., the ability of a protein to bind small, drug-like molecules with high affinity. For many classes of protein binding sites, such as the ATP binding sites in kinases, there is little ambiguity about whether the site is druggable; the challenge in developing inhibitors in such cases is achieving selectivity and other desired properties. However, not all biological targets are druggable since only certain binding sites are complementary to drug-like compounds in terms of physicochemical properties (i.e. size, shape, polar interactions and hydrophobicity) [1,2]. An accurate method for predicting druggability would be particularly valuable for assessing emerging classes of binding sites such as protein-protein interactions (PPI) [3] and allosteric sites [4], which are generally considered more challenging but are attracting increasing interest in both academia and industry as drug targets. For example, while some PPI sites have led to potent small molecule inhibitors, others have not despite substantial effort [5,6].
A first step in evaluating target druggability is to detect the presence of binding pockets of suitable size, shape, and composition to accommodate drug-like molecules. Many such methods have been developed and tested using training sets of ligand binding sites extracted from the Protein Data Bank (PDB). Several in-depth reviews are available that summarize computational methods for protein binding pocket detection [7,8,9], many of which can be classified as geometry-based [10,11,12,13], information-based [14,15] and energy-based algorithms [16,17]. Combinations of these strategies have also been developed [18,19,20,21,22]. In addition, more complex free-energy calculation methods have also been used to predict binding sites and identify energetically favorable binding site residues, including computational solvent mapping [23] and grand canonical Monte Carlo simulations [24].
The presence of a ''suitable'' protein pocket is necessary but not sufficient to guarantee potent binding of drug-like small molecules. A few studies have attempted to more directly predict druggability of binding sites. Several studies have predicted protein druggability on the basis of sequence and structural homology to known drug targets [1,2,8]. However, not all members of the same protein family are equally druggable [25]. More importantly, such methods cannot be used to assess druggability of novel target families. Recently, an alternative approach was described to predict the maximal affinity for a passively absorbed oral drug to a given binding site, by quantitatively approximating the physical forces driving protein-ligand binding. Specifically, hydrophobic surface area and curvature of the binding pocket were used to fit the binding affinities of a training set of protein-ligand binding complexes. Notably, this model was successfully applied to predict the relative druggability of two novel targets before experimental validation [26].
To date, the most extensive experimental assessment of druggability on various targets has been performed by Hajduk and coworkers [27]. The heteronuclear-NMR-based technique was applied to screen fragment-like libraries against a set of 23 protein targets containing 28 different binding sites. This study revealed that small ligands bind almost exclusively to well-defined binding pockets on the protein surface, independent of their affinity. Remarkably, the authors observed a strong correlation between the experimental NMR hit rate and the ability to bind drug-like ligands with high affinity in a particular binding site. Furthermore, they derived a linear regression model to fit the experimentally measured hit rates to physicochemical descriptors of these 28 binding pockets. These results suggested that the druggability of a particular binding site is related to its propensity to bind low-affinity, fragment-size compounds.
We wondered whether in silico fragment screening would also be useful in this regard, with the obvious advantages of speed and cost relative to experimental screening. Here we describe the development and evaluation of such a method, making use of a molecular mechanics-based scoring method for the proteinligand interactions (Method Section) [28,29]. Specifically, we report the results of virtually screening ,11,000 diverse fragment-like compounds against a total of 152 protein binding sites, including the training dataset and external dataset studied by Hajduk and coworkers [27]. We demonstrate that the hit rate calculated from computationally screening a diverse fragment library correlates with the hit rate measured experimentally from the NMR-based screening method, despite the fact that we could not directly replicate the experiment in silico because the fragment libraries used for the NMR screening are proprietary. Secondly, we show that the in silico fragment screening method can be used to distinguish known druggable and non-druggable targets, including both enzymes and protein-protein interaction sites. Finally, we explore the sensitivity of the results to different receptor conformations, including flexible protein-protein interaction sites.

Comparison between NMR-based and Virtual Fragment Screening
The key results of virtual fragment screening against the Hajduk et al. training dataset [27] are summarized in Table 1 and Figure S1. Table 1 summarizes the druggability scores measured from NMR-based screening, predicted by an empirically fitted model by Hajduk et al., and predicted by our virtual fragment screening method. Our calculated druggability scores correlate reasonably well with the NMR-based fragment screening results ( Figure 1) except for three binding sites (Bir3, E2-31 DNA site, and LCK pTyr binding site), for which we compute much lower hit rates. Our primary goal is not to reproduce the NMR screening data per se, but to predict druggability, as we discuss in the next section. Nonetheless, these outliers deserve brief comment. First, the experimentally measured binding affinity of the best fragment hits ranges are rather weak, from 200 to 1,000 mM, for these three outliers, several-fold higher than other targets with similar NMR hit rates (less than 50 mM) [27]. Further, although these sites were classified as druggable or moderately druggable based on the NMR hit rates, no high-affinity druglike binders have been reported to our knowledge. The LCK pTyr binding site and Bir3 have been suggested to be not druggable due to their highly polar or very small binding sites [30]. The reported LCK pTyr site ''druglike'' ligand contains a diphosphonophenylalanine group to target the pTyr site, and a cyclohexane ring inserts deeply into an extra spatially distinct hydrophobic binding pocket (pTyr+3) [31], which further indicates that LCK pTyr site is not a truly druggable site by itself. We cannot rule out the possibility that these outliers reflect a deficiency in our method, but for the reasons discussed above, we have excluded them from further analysis. Figure 1 presents the correlation between the NMR-based druggability score and our calculated druggability score for the remaining 21 binding sites. Encouragingly, a reasonable correlation is achieved (R 2 = 0.69), especially considering that the compounds screened by NMR and by docking are different. The empirical model developed by Hajduk et al. by fitting to this data, using multiple adjustable parameters, gives a correlation of R 2 = 0.79 for these 21 binding sites, and the slopes of the two models are similar in Figure 1 (0.72 and 0.81, respectively).
Hajduk et al. defined binding sites as ''highly druggable'' if they have a log(hit rate).21.0. Based on the correlation in Figure 1, the corresponding value of the computational log(hit rate) is 0.36, and we use this value to classify proteins as druggable or non-druggable in the following sections. Note that, although Hajduk et al. distinguish between ''highly druggable'' and ''moderately druggable'', we use a simple binary classification for simplicity.

Classification of Binding Sites as Druggable/Non-Druggable
Using the druggability score cutoffs derived above, we evaluated the ability of the virtual fragment screening protocol to classify the binding sites in the Hajduk et al. external dataset, which contains 72 targets, including 35 classified as druggable and 37 as nondruggable. Table S1 summarizes the druggability scores calculated using the empirical model of Hajduk et al. and our virtual fragment screening method.
In evaluating the success of our method, we put more emphasis on ''true positives'', i.e., the ability to identify proteins as druggable when they have in fact been shown to bind small drug-like molecules with high affinity. By contrast, the lack of a published, potent small molecule inhibitor does not necessarily prove that a target is not druggable, and thus we put less emphasis on putative ''false positives''. In addition, in practical application, incorrectly predicting a site to be nondruggable when it is in fact druggable arguably would be worse than incorrectly predicting a non-druggable site to be druggable. However, it is clear from the ROC plot that a different hit rate cutoff could also be used to reduce the number of false positives while maintaining a relatively high true positive rate, if desired.
As shown in Figure 2, our method is able to effectively distinguish between druggable and non-druggable sites. Encouragingly, our method, using the default hit rate cutoff, correctly identified almost all the true positives except for only one case, protein kinase C (PKC-delta). Because PKC-ligand binding was shown to be dependent on phospholipid binding [32], it is perhaps not surprising that our method failed to predict PKC-delta as a druggable site since we do not treat the effects of membrane binding. With respect to false positives, we classified several proteins that bind sugars or sugar analogs as moderately druggable, whereas these are annotated as non-druggable targets due to the lack of reported high-affinity druglike binders in the literature.
Based on a survey of recent literature, a few of the putative ''false positives'' are possibly incorrectly classified as non-   (Table S1). For example, fatty acid binding protein (FABP) was annotated as a non-druggable target; however, both ours and the empirical model of Hajduk et al. predicted it to be highly druggable. Small fragment-like ligands were identified by an NMR-based screening method [33], and also a high-affinity drug-like inhibitor was recently reported to be an effective therapeutic agent to prevent and treat metabolic diseases [34]. In contrast to prediction of Hajduk et al., our method ranked three targets (guanylate kinase, guanine nucleotide-binding protein, and deoxynucleoside-monophosphate-kinase) as highly druggable hits, whereas they were annotated as non-druggable targets. Considering the similarity of these targets to well-known kinase targets, we wonder whether these three nucleoside/nucleotide binding proteins may prove to be druggable.

Application to Known Drug Targets with Multiple Conformations
As an additional data set, we assembled a diverse set of wellknown drug targets that have multiple crystal complex structures available, and generally display significant sidechain movement upon binding to different ligands [35] (Table 2). Thus, these targets serve as additional positive controls, and also allow us to investigate the consequences of protein conformational flexibility.
All of these targets are predicted to be druggable based on the cutoff of 0.36 used above. Most of the targets have much higher druggability scores (.1.0), regardless of which crystal structure was used. The lowest druggability scores of 0.45 and 0.52 were calculated for angiotensin-converting enzyme (ACE, 1o86) and neuraminidase (NA, 1a4g), respectively. These binding sites are highly charged. It has been argued elsewhere that such binding sites are in fact less druggable than more hydrophobic binding sites [26]. Nevertheless, it is encouraging that our fragment virtual screening method is capable of assigning a reasonable druggability score to highly charged but druggable binding sites.
The histograms of energy scores ( Figure S2) and the druggability scores calculated from them do change using different crystal structures, as expected, but in most cases the changes are remarkably small. The largest variation in the virtual fragment screening results between different crystal structures is seen for P38 MAPK, which also displays some of the largest conformational changes: the largest movement of a binding site side chain (RMSD max. ) is more than 10 Å , and the average movement of binding site side chains (RMSD ave. ) is ,4 Å . In the other cases, where conformational changes in the binding site are relatively small (RMSD max. ,3 Å ), the druggability scores vary only slightly, including for the cases where ligand-free structures were available (Alr2, CDK2, DHFR and thrombin). As shown below, the results on PPI sites show a much more striking dependence on conformational states. Druggability Assessment for Protein-Protein Interaction Sites Protein-protein interactions are central to many biological processes, and therefore represent an important class of molecular targets for developing therapeutic agents. However, PPIs have been historically considered to be difficult targets for small molecular inhibitors due to the lack of suitable binding pockets to accommodate drug-like molecules. In addition, the binding interfaces in PPIs are generally highly conformationally flexible. Nevertheless, important progress has been made in discovering small molecular inhibitors of several important protein-protein interaction targets, such as MDM2-p53, BCL-X L -BAK and IL2-ILR [3,36]. Therefore, it is desirable to computationally evaluate the druggability of PPIs, and identify specific druggable conformations.
Among the PPI sites studied here, high-affinity small molecule ligands have been found for IL-2, MDM2, BCL-X L , and HPV E2. MDM2 and BCL-X L were correctly predicted to be druggable regardless of which crystal structure was used. By contrast, IL-2 and HPV E2 were correctly predicted to be druggable only when using the co-crystal structure with a small molecule inhibitor, and not when using a crystal structure with a peptide or protein bound. To date, only micromolar binders have been discovered for TNF and ZipA. ZipA was predicted to be non-druggable using both structures, while TNF was predicted to be druggable.
In general, the structural variation among different structures of PPI targets (Table 3) is larger than in receptors or enzymes ( Table 2). For example, as illustrated in Figure 3, a remarkable conformation change occurs upon ligand binding in IL-2. Clearly, the druggability score varies for different protein conformations [37], especially for IL-2, MDM2, and HPV E2 (Table 3 and Figure S3).

Two Case Studies
Here we examine in more depth two case studies, focusing on the chemical composition and binding modes of top-ranked fragments. The first case study is protein tyrosine phosphatase 1B (PTP1B). The PTP1B catalytic site and vicinal non-catalytic site are surface-exposed, highly hydrophilic, and recognize charged phosphotyrosine (pTyr) residue or pTyr mimetics ( Figure 4A). Characterizing the binding properties of such sites is particularly challenging because the binding sites contain numerous polar and charged groups, and ligand binding is a complex tradeoff between forming favorable electrostatic interactions and the cost of desolvation. NMR-based screening results suggested that the catalytic site is highly druggable while the non-catalytic site is nondruggable. The Hajduk et al. empirical model assigned the PTP1B catalytic site as moderately druggable [27], while the computational solvent mapping technique predicted this site to be nondruggable [23]. Encouragingly, our virtual screening method correctly classifies the catalytic and non-catalytic sites as druggable and non-druggable, respectively. It even reproduces the relative hit rate observed in the NMR experiment (,10 times higher for the catalytic site), although this may be a fortuitous result. Appropriately, the high-ranking fragment hits identified for the PTP1B catalytic site are dominated by heterocyclic carboxylic acids, and hits for the non-catalytic site contain many neutral methyl salicylate moieties ( Figure 4B), which is consistent with previously identified fragments from experimental screening studies (chemical structures are shown in Figure S4) [38,39].
The second case study is P38 MAPK, which was chosen due to its significant structural flexibility, where a new allosteric binding site spatially distinct from the ATP catalytic pocket is induced upon binding to a diaryl urea type of inhibitor. We focus on the crystal structure (1kv2) bound with most potent MAPK inhibitor, BIRB796 ( Figure 5A), and the crystal structure (1kv1) bound with a micomolar diaryl urea type of inhibitor ( Figure 5B). BIRB796 binds potently by forming strong interactions to both the ATP binding pocket and the allosteric site, including the crucial hydrogen bonding interaction between the morpholino group and the main chain amide of residue Met109; the hydrophobic interaction between its naphthyl moiety and the lipophilic pocket formed by the side chains of Lys53, Leu75, Ile84, Leu104 and Thr106; the hydrogen bonds between the urea group and the side chain carboxylate group of residue Glu71 and the main chain amide of residue Asp168; and the hydrophobic interactions between its tolyl and t-butyl groups on the pyrazole ring and the allosteric pocket [40]. The top scoring fragment hits occupy different binding regions. For example, pyridinyl-imidazole scaffolds are predicted to target the ATP binding pocket, while urea-like moieties on substituted naphthyl rings establishing interactions with the Glu71 sidechain and the lipophilic pocket, and substituted heterocyclic rings bury deeply into the allosteric binding pocket ( Figure 5C). It is notable that the predicted fragment bound to the ATP binding pocket structurally resembles an aminopyridine type of fragment identified by fragment-based crystallography screening (chemical structures are shown in Figure  S5) [39]. The results using the crystal structure bound with a micomolar diaryl urea type of inhibitor are similar, and highranking fragment hits mimic binding interactions seen for the picomolar inhibitor ( Figure 5D).
In summary, we have developed an in silico fragment screening method, analogous to an NMR-based screening method that was previously shown to be effective at assessing the druggability of the binding sites. This approach does not require fitting any physicochemical parameters derived from protein binding pocket. We performed a large scale assessment on a total of 152 protein binding sites. We demonstrated that the hit rate calculated for 21 binding sites using our approach correlates with the hit rate measured experimentally from NMR-based screening method, and that the method could successfully distinguish known druggable and non-druggable targets. We are investigating extensions of the method to identifying druggable conformations of flexible binding sites for molecular docking, and suggesting strategies for growing or joining initial fragment hits to obtain more potent inhibitors.

Dataset Selection
24 binding sites (Table 1) were chosen from the training dataset of Hajduk et al. [27] to develop our physics-based druggability prediction model. A total of 28 binding sites were experimentally investigated via the NMR-based fragment screening approach. However, structural information for CMPK other , E2-31 other and Survivin other sites are not publicly available, and the crystal structure of MurA is fosfomycin-covalent modified, so these 4 binding sites were excluded in our study. The external dataset defined by Hajduk et al. [27] contains 72 proteins, of which 35 binding sites were assigned as druggable and the remaining 37 sites as non-druggable, based on whether high-affinity druglike binders could be found in the literature. We have used all 72 binding sites (Table S1) to assess our prediction on new targets. For both the training and the external datasets, the same protein structures were used as in the Hajduk et al. study unless it was not reported (i.e. Bcl-X L , CMPK, E2-31, PAK4 and SARS-RNA site).
To supplement these targets, we selected 15 well-known drug targets from the DUD dataset [41] as true positives, and two or more crystal structures were chosen for each target ( Table 2). Many of these structures were also used by Sherman et al. for testing a strategy for docking against flexible binding sites [35]. We use these flexible binding sites to evaluate the consequence of structural flexibility in our druggability calculation. Finally, six protein-protein interaction (PPI) targets [3] (Table 3) were chosen from the recent literature.
To quantitate the flexibility of ligand binding sites, the crystal structures for each target were superimposed using Chimera [42], and root mean squared distance (RMSD) values were calculated for each binding site residue or a combination of several binding site residues. Residues that could not be aligned, as well as those with missing atoms, were ignored during RMSD calculations.

Fragment Library Preparation
The diverse set of fragments was selected from the fragment-like subset of the ZINC database (version 6, December 2005) [43]. This subset contains 49,134 compounds with relatively low molecular weight (MW # 250), few rotatable bonds (RB , 3), low hydrophobicity (22 , log P , 3), and weak hydrogen bonding potentials (HB donor ,3 and HB acceptor ,6). We also eliminated fragments with more than 15 heavy atoms, based on the previous observation that maximal binding free energy increases more slowly for ligands containing more than 15 heavy atoms [44]. This filter reduced the library size to 32,717 molecules. Finally, we performed structural similarity analysis to reduce redundancy. Feature key fingerprints were calculated using CACTVS [45] and the fingerprint-based similarity analysis was performed with a modified version of the program SUBSET [46]. Representative structures were selected for each structural cluster with Tanimoto coefficient (Tc) less than 0.9 to other clusters. This further reduced the library to 11,129 diverse molecules. To assess any potential bias resulting from the diversity-based filtering, 32,717 ZINC fragment-like compounds were used to redo the screening for the Hajduk et al. training dataset. The computed energy distributions are very similar (data not shown).

Computational Druggability Assessment Protocol
The detailed virtual screening protocol was published elsewhere [28,47]; here, a brief overview is presented ( Figure 6). Our scoring method consists of two steps: predicting the binding poses of ligands using a docking program, and then refining and rescoring those protein-ligand complexes using a more computationally intensive molecular-mechanics based energy function. This protocol uses a high-throughput docking program to initially orient and score the ZINC fragment-like compounds in the binding site, and subjects the best single docking pose for each docked compound to a rescoring stage in which the ligand is energy minimized and the binding affinity is estimated using an all-atom molecular mechanics force field combined with an implicit solvent model. Most of the labor-intensive, manual steps during the docking and rescoring stages were automated for large scale application here.
The program DOCK 3.5.54 was used to dock the fragment database into the protein binding site [48,49]. A maximum of 120 matching spheres were used to ensure adequate ligand sampling within large binding surfaces like protein-protein interfaces. Ligand conformations were scored based on the docking total energy (E tot = E ele + E vdw 2 DG lig-solv ), which was the sum of electrostatic (E ele ) and van der Waals (E vdw ) interaction energies corrected by the ligand partial desolvation energy (DG lig-solv ) [49]. Final energies were computed after rigid-body minimization. Then, a single docking pose with the best total energy score was saved for each docked molecule for the next stage of scoring.
The docked protein-ligand complex and ligand were then submitted to multi-scale Truncated Newton energy minimization in all-atom OPLS force field and Generalized Born (GB) solvent using the Protein Local Optimization Program (PLOP) [50,51,52]. The molecular mechanics forces are divided into short-range (bond, angle, torsion, and local non-bonded) and long-range components, with the long-range forces updated only intermittently. The algorithm was also optimized for minimizations with GB solvent that increases the computational expense by only a factor of ,3 relative to the vacuum. Thus, this scoring approach accounts for accurate and efficient calculations of ligand-protein interaction energies, the ligand/receptor desolvation, and to a lesser extent, ligand strain energies. In this work, the protein was kept rigid during ligand-protein complex minimization to reduce the computational expense. The binding energy (E bind = E R*L 2 E L 2 E R ) was calculated by subtracting the energies of the optimized free ligand in solution (E L ) and the free protein in solution (E R ) from the optimized ligand-protein complex's energy in solution (E R*L ). The van der Waals energy component was empirically scaled by a factor of 2 as we suggested previously [28].
To compute a ''hit rate'' for the in silico screening, we chose an energy cutoff value empirically to maximally differentiate druggable and non-druggable binding sites. The ''druggability score'' in this work is defined as log(hit rate). At the present time, docking scoring functions cannot robustly reproduce absolute binding affinities in realistic applications. Although useful for rankordering compounds [47], the molecular-mechanics based scoring function used here also cannot be interpreted in terms of absolute binding affinities, in part because entropy losses are not computed. The cutoff chosen (240 kcal/mol) was based primarily on visual inspection of the energy distributions for the 13 druggable binding sites and 11 non-druggable sites in the Hajduk et al. training data set. The effect of varying the cutoff is explored with respect to differentiating between druggable and non-druggable binding sites ( Figure S6). Interestingly, the correlation between the docking screening hit rates and the NMR screening results is relatively insensitive to the value of the energy cutoff within a certain range (from 240 to 234 kcal/mol). Unless stated otherwise, the results below use an energy cutoff of 240 kcal/mol for computing the in silico ''hit rate''.

Supporting Information
Table S1 Targets, binding sites, available ligand binding information, and hit rate data predicted by two different computational models. Found at: doi:10.1371/journal.pone.0010109.s001 (0.19 MB DOC) Figure S1 Energy histograms from docking 11,129 ZINC fragment-like compounds against 24 binding sites previously studied by NMR-based fragment screening. Color code is defined using the NMR druggability score: druggable (green): log (Hit Rate) .21.0, and non-druggable (red): log (Hit Rate) #21.0.  Found at: doi:10.1371/journal.pone.0010109.s004 (1.03 MB TIF) Figure S4 Chemical structures of a ligand co-crystallized with PTP1B (1ph0), binders identified in experimental screening, and high-ranking fragment hits identified from virtual fragment screening (fragments bound to the catalytic site are colored in green and to the non-catalytic site in magenta). Found at: doi:10.1371/journal.pone.0010109.s005 (1.01 MB TIF) Figure S5 Chemical structures of a ligand co-crystallized with P38 MAPK (1kv2), binders identified in experimental screening, and high-ranking fragment hits identified from virtual fragment screening using two different crystal structures, 1kv2 and 1kv1 (fragments bound to ATP site colored in green, lipophilic pocket colored in cyan, and allosteric site in magenta). Found at: doi:10.1371/journal.pone.0010109.s006 (1.06 MB TIF) Figure S6 The correlation between the virtual fragment screening hit rates and the NMR screening results, using different energy cut-offs for defining the fragment-like compounds as ''hits'' in the virtual screen. Found at: doi:10.1371/journal.pone.0010109.s007 (0.78 MB TIF)