Abstract
With a burgeoning number of artificial intelligence (AI) applications in various fields, biomolecular science has also embraced advanced AI techniques in recent years. In this broad field, scoring a protein-ligand binding structure to output the binding strength is a crucial problem that is closely related to computational drug discovery. Aiming at this problem, we have proposed an efficient scoring framework using deep learning techniques. This framework describes a binding structure as a high-resolution atomic graph, places a focus on the inter-molecular interactions, and learns the graph in a rational way. For a protein-ligand binding complex, the generated atomic graph preserves key information about the atoms (as graph nodes) and focuses on the inter-molecular interactions (as graph edges), which can be identified by applying multiple distance ranges to the atom pairs within the binding area. To provide more confidence in the predicted binding strengths, we have interpreted the deep learning model at the model level and through a post-hoc analysis. The proposed learning framework has demonstrated competitive performance in scoring and screening tasks, which will prospectively further promote the development of related fields.
Author summary
The binding between a small compound (ligand) and a protein plays a crucial role in many biological processes, such as signal transduction and immunoreaction. Particularly, a small-molecule drug can bind to a target protein to modulate its signaling pathways and suppress the progression of the associated disease. Apparently, the binding strength is a key indicator for evaluating how well such small-molecule drugs work, therefore becoming a core topic in computational drug discovery. Nowadays, the binding structure of a ligand and its target protein can be resolved experimentally or modeled computationally, while the accurate scoring of such a binding structure (predicting the binding strength) still remains a challenge. An effort has been put into the development of benchmark databases that provide a variety of protein-ligand binding structures and their experimentally resolved binding strengths, leading to increasing deep learning applications in this field. In this study, we represent a protein-ligand binding structure as a graph, with the atoms as nodes and the inter-molecular interactions as edges. A light but efficient deep learning architecture has been adopted for learning such graphs and outputting the binding strengths. Validated by our experiments, the model performs well in both scoring and screening tasks.
Citation: Wang DD, Huang Y (2025) Scoring protein-ligand binding structures through learning atomic graphs with inter-molecular adjacency. PLoS Comput Biol 21(5): e1013074. https://doi.org/10.1371/journal.pcbi.1013074
Editor: Mohammad Sadegh Taghizadeh, Shiraz University, IRAN, ISLAMIC REPUBLIC OF
Received: April 7, 2024; Accepted: April 21, 2025; Published: May 9, 2025
Copyright: © 2025 Wang, Huang. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All the original data for model construction are available from the PDBbind database (https://www.pdbbind-plus.org.cn/). Specifically, the Version 2020 (PDBbind v2020) was used in this work, and it can be accessed from the Download section of PDBbind+ website. The CASR database was used for evaluating the scoring performance of constructed models (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3753885/). These data have been cleaned, standardized, and stored in Zenodo (https://zenodo.org/records/15023336). The screening power of the constructed models were measured using the data from DUD-E (https://dude.docking.org/). All code files are available from an online GitHub repository at https://github.com/debbydanwang/DL-PLBAP. A Docker container with a trained model pre-installed is available for access on Zenodo (https://zenodo.org/records/15023336).
Funding: This work was supported by Hong Kong Research Grants Council (Project UGC/FDS16/E16/23 to DDW) and Hong Kong Metropolitan University (Project 2023/24 S&T to DDW). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors received salaries from Hong Kong Metropolitan University.
Competing interests: The authors have declared that no competing interests exist.
Introduction
AI for science has attracted considerable attention in the past decade. Quite a number of powerful mathematical algorithms have been developed in this field to meet challenging tasks in dermatology [1], precision medicine [2], molecular science [3] and drug discovery [4].
As a crucial problem in computer-aided drug discovery (CADD), scoring a protein-ligand complex structure to exhibit its binding strength (Fig 1) has long sought breakthroughs from AI developments. Such binding strength, as a key indicator of the efficacy of a drug that attaches to its target protein, can mostly be attributed to various non-covalent interactions (e.g. hydrogen bonds, hydrophobic contacts and π-stacking). Earlier AI-based scoring works leveraged traditional machine-learning algorithms (e.g. random forests) to parse a feature vector, which describes the interactions in a protein-ligand complex structure, and mapped the vector to the binding strength [5–10]. This lasted until the emergence of deep learning, which reached its scientific milestones with the launch of AlphaFold (near-perfect protein-fold predictions) [11] and the GPT series (strong human-like chatbots) [12].
Scoring a protein-ligand complex structure to exhibit the binding strength.
When first introduced to the task of scoring molecular binding strength, deep learning was primarily utilized in the form of convolutional neural networks (CNNs) [13–18]. Accordingly, molecular lattices or grids, with each cell characterized by a collection of atomic properties (e.g. physico-chemical or pharmacophoric), became the de facto feature representations of a protein-ligand complex. The KDEEP model adopts a fixed-size molecular lattice with a set of eight atomic properties (pharmacophoric) to delineate a complex structure, and feeds the lattice into a 3D-CNN for binding-strength prediction [13]. The Pafnucy model compresses 19 atomic properties (both physico-chemical and pharmacophoric) of a complex structure into a molecular lattice, and employs a simple 3D-CNN architecture for learning the lattice [14]. Rezaei et al. developed a light-weight 3D-CNN model for scoring, based on molecular lattices that concern 24 atomic features (11 Arpeggio atom types and the excluded volume for both protein and ligand) [15]. Although it has opened a new avenue for scoring works, deep lattice learning often lacks rotational invariance in the data and is therefore resource-intensive after data augmentation [13, 14].
More recently, molecular graph learning has become a prevalent technique for scoring works. In this context, a protein-ligand complex structure is commonly represented as a 2D atomic graph, which is then decoded by graph neural networks (GNNs) [19–23]. GraphBAR adopts a molecular graph with distance-dependent edges to characterize the binding-site atoms in a complex structure, and employs a spectral graph convolutional network (GCN) to map the graphs to the binding strengths [20]. Shen et al. considered the covalent connections for atoms in the binding area, and leveraged a cascade GCN (with two concatenated modules) for graph learning and binding-strength prediction [21]. GraphscoreDTA represents a complex by a fusion of graphs (a 1D amino-acid graph for the protein, an atomic graph for the ligand, and a hybrid graph for the binding pocket), and predicts the binding strength using a GNN with a bitransport information mechanism and Vina distance terms [23]. Zhang et al. utilized a similar graph representation and developed a multi-objective GNN model for binding-strength scoring [22]. These pioneering works have shed light on modern scoring approaches. Nevertheless, there is still much room for improvement in developing target-oriented graph representations, achieving high screening power and making the models more transparent. Accordingly, in this work we are dedicated to the design of efficient scoring models with informative molecular graphs, decent screening power and reasonable interpretability.
Materials and methods
Atomic-level molecular graphs
A molecular graph can be represented as G = (V, E), where V = {nd_1, nd_2, ..., nd_N} indicates the nodes and E stands for the edges connecting those nodes.
To capture sufficient information in a molecule, treating its atoms as graph nodes is a well-acknowledged strategy. Each node or atom is then characterized by a series of physico-chemical or pharmacophoric properties, leading to a feature matrix F ∈ ℝ^(N×m) for all the nodes in the graph (m is the number of properties). As molecules like proteins are very large in terms of atoms, retaining all the atoms places a heavy burden on the computations, and task-oriented cropping is therefore frequently performed. For scoring a protein-ligand binding structure, the atoms in the binding area are often of interest. This results in a smaller feature matrix F ∈ ℝ^(n×m), where n is the number of nodes in the binding area (n ≪ N).
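As an illustration, the binding-area cropping described above might be sketched with numpy as follows; the cutoff value and the array layouts here are assumptions for the example, not the exact settings of this work:

```python
import numpy as np

def crop_binding_area(protein_xyz, ligand_xyz, protein_feats, ligand_feats, cutoff=4.0):
    """Keep all ligand atoms plus the protein atoms within `cutoff` of any ligand atom.

    Returns the cropped coordinate (n x 3) and feature (n x m) matrices.
    The cutoff value here is an illustrative assumption.
    """
    # Pairwise distances between protein and ligand atoms (n_p x n_l).
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    keep = (dist <= cutoff).any(axis=1)            # protein atoms near the ligand
    xyz = np.vstack([protein_xyz[keep], ligand_xyz])
    feats = np.vstack([protein_feats[keep], ligand_feats])
    return xyz, feats
```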
Generally, a graph in deep learning works shows the connections between nodes by an adjacency matrix A, where A_ij indicates an edge between the i-th and j-th nodes. However, designing task-specific graph edges, especially for tasks involving molecules, is often challenging. Covalent bonds, contacts defined through distance thresholding, or a combination of the two have been regarded as edges in different works [20, 21, 24]. Considering the atoms in the binding area of a protein-ligand complex, Fig 2A shows the covalent adjacency among those atoms. Interactions or contacts between a pair of atoms (nd_i and nd_j) can also be defined by the range in which the atomic distance (d_ij) resides, leading to multi-level distance-dependent adjacencies among atoms. Fig 2B displays two types of such distance-dependent atomic contacts. In Fig 2C, a hybrid type of adjacency (covalent bonds plus distance-dependent contacts) is considered. For scoring tasks, these adjacency definitions either emphasize the covalent bonds or mix the inter- and intra-molecular interactions, resulting in a loss of focus on the inter-molecular interactions. Nevertheless, these inter-molecular interactions play a pivotal role in determining the binding strength between a ligand and its target protein. Accordingly, we focus on the inter-molecular contacts in this work, and define multi-level atomic adjacencies by one-hot encoding of the contacts belonging to different distance ranges (Fig 2D). Such adjacencies can be stored in an adjacency tensor A ∈ {0, 1}^(K×n×n), where each slice A_k marks all the pairs of nodes having distances in the k-th range R_k, as follows.

A_k[i, j] = 1 if d_ij ∈ R_k and (nd_i, nd_j) is a protein-ligand atom pair; A_k[i, j] = 0 otherwise. (Eq 1)
A. Covalent adjacency. B. Distance-dependent contacts. C. A combination of covalent adjacency and distance-dependent contacts. D. Inter-molecular contacts through distance thresholding.
Algorithm 1 shows the procedure for generating such an adjacency tensor for a protein-ligand binding area.
Algorithm 1 Generating an Inter-molecular Adjacency Tensor
Input: Coordinates of the atoms in the binding area (C ∈ ℝ^(n×3)), a list of distance ranges {R_1, ..., R_K}
Output: An inter-molecular adjacency tensor A
Initialize A ← 0 (A ∈ {0, 1}^(K×n×n)).
Calculate the distance matrix D based on C.
for k = 1 to K do
    A_k ← 1(D ∈ R_k)    ▷ 1(·) is an indicator function
    for i = 1 to n do
        for j = i + 1 to n do
            if A_k[i, j] = 1 and nd_i-nd_j is not a protein-ligand atom pair then
                A_k[i, j] ← 0, A_k[j, i] ← 0    ▷ Turn off intra-molecular interactions
            end if
        end for
    end for
end for
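Algorithm 1 can be sketched in numpy as below; the boolean `is_ligand` mask is an assumed input encoding of which binding-area atoms belong to the ligand:

```python
import numpy as np

def intermolecular_adjacency(coords, is_ligand, ranges):
    """Build a K x n x n inter-molecular adjacency tensor (Algorithm 1 sketch).

    coords    : n x 3 array of binding-area atom coordinates
    is_ligand : boolean array of length n (True for ligand atoms)
    ranges    : list of (low, high] distance ranges, e.g. [(2.0, 3.0), (3.0, 4.0)]
    """
    n = len(coords)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))              # pairwise distance matrix D
    # A pair is inter-molecular iff exactly one of its atoms belongs to the ligand.
    inter = is_ligand[:, None] != is_ligand[None, :]
    A = np.zeros((len(ranges), n, n))
    for k, (lo, hi) in enumerate(ranges):
        in_range = (dist > lo) & (dist <= hi)        # indicator of range membership
        A[k] = (in_range & inter).astype(float)      # drop intra-molecular pairs
    return A
```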
Graph-based deep learning
Given a graph with a node-feature matrix F and an adjacency matrix A, message-passing mechanisms are frequently adopted for learning such a graph [25]. These mechanisms, as shown in Eq 2, update the node features iteratively in a local context.

h_i^(l+1) = φ(h_i^(l), ⊕_{j∈N(i)} m_ij^(l)) (Eq 2)

Here, h_i^(l) is the feature vector describing the i-th node in the l-th layer, N(i) indicates the neighborhood of the i-th node (based on the adjacency matrix), m_ij^(l) is the message passed from the j-th node to the i-th node in the l-th layer, ⊕ denotes a permutation-invariant function (e.g. average), and φ is an update function such as a neural network.
Although a wide variety of graph neural networks (GNNs) have been developed, properly learning molecular graphs still remains a challenge. The ChebNet [26], leveraging spectral graph convolutions, is among the well-acknowledged GNNs. It has an efficient form for updating the node features in each iteration, as follows.

F^(l+1) = σ(Â F^(l) W^(l)) (Eq 3)

Here, F^(l) is the feature matrix for all the nodes in the l-th layer, σ is an activation function, W^(l) is the weight matrix, and Â is a normalized adjacency matrix with self-adjacencies (Â = D̃^(-1/2) (A + I) D̃^(-1/2), where D̃ is the diagonal degree matrix of A + I). Such graph-learning operations can be stacked into L layers. From the message-passing perspective, this mechanism can be regarded as a simple average of the normalized information collected from the neighborhood of a node.
ChebNet has previously been questioned for its capability of capturing long-range dependencies among the nodes in a graph. However, scoring protein-ligand binding strength is a task that largely concerns local contexts (e.g. a key hydrogen bond or an important interaction), making the ChebNet mechanism fit well here. When focusing on only the inter-molecular interactions (Fig 2D), we update the features once (Eq 3) to learn the neighborhoods of the binding-site atoms in this work. Higher-order graph convolutions, which would involve intra-molecular interactions and be computationally expensive, are not considered. This strategy places an absolute focus on the inter-molecular interactions (crucial to scoring works) and is highly efficient. Since L = 1 in this scenario, the layer notation l will be omitted for simplicity in what follows.
Instead of using a single adjacency matrix A, an adjacency tensor that covers different adjacency (interaction) types is necessary in a scoring task. As inter-molecular interactions are mostly non-covalent (with atomic distances beyond the covalent-bond range), multiple distance ranges starting from the covalent threshold can be nominated to construct the inter-molecular adjacency tensor. Fig 2D exhibits such a two-slice adjacency tensor A (Eq 5), whose slices correspond to the two nominated distance ranges. We adopt this tensor because its outer cutoff has been verified to be a distance threshold for capturing sufficient inter-molecular interactions in a binding complex [20].
Targeting each type of inter-molecular contacts (slice A_k), the graph nodes can be learned using the message-passing mechanism in ChebNet, as

F_k = σ(Â_k F W_k),

where Â_k = D̃_k^(-1/2) (A_k + I) D̃_k^(-1/2), D̃_k is a diagonal matrix showing the degree of each node in A_k + I, and all the other notations follow Eq 3.

After collecting the messages from the direct neighbors of the graph nodes, we gather the information at the graph level for the scoring purpose. Such an aggregation function is permutation-invariant and similar to that in Eq 2. A simple summation in the following equation serves as an example.

g_k = Σ_{i=1}^{n} F_k[i, :]

The features for the different inter-molecular interaction types (g_k) are then concatenated before being fed into dense layers for the final graph-level predictions.

h = CONCAT(g_1, ..., g_K)

Here, CONCAT indicates a concatenation of features and h stands for the hidden features describing the whole binding-site graph.
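The per-slice convolution, summation pooling, concatenation and final dense mapping described above can be sketched in a small numpy forward pass; the dimensions, random weights and ReLU activation are illustrative assumptions rather than the trained model's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def agima_forward(F, A_tensor, Ws, W_dense):
    """Sketch of the graph-level forward pass:
    per-slice graph convolution -> node summation -> concatenation -> dense layer."""
    pooled = []
    for A_k, W_k in zip(A_tensor, Ws):
        A_tilde = A_k + np.eye(len(A_k))              # add self-adjacencies
        d = A_tilde.sum(axis=1)
        A_hat = A_tilde / np.sqrt(np.outer(d, d))     # D~^(-1/2) (A_k + I) D~^(-1/2)
        F_k = np.maximum(0.0, A_hat @ F @ W_k)        # per-slice convolution
        pooled.append(F_k.sum(axis=0))                # permutation-invariant summation
    h = np.concatenate(pooled)                        # hidden graph-level features
    return float(h @ W_dense)                         # scalar binding strength

# Toy example: 6 binding-area atoms, 4 node features, 2 adjacency slices.
n, m, emb = 6, 4, 8
F = rng.normal(size=(n, m))
A_tensor = (rng.random((2, n, n)) > 0.7).astype(float)
A_tensor = np.triu(A_tensor, 1)
A_tensor += A_tensor.transpose(0, 2, 1)               # symmetrize each slice
Ws = rng.normal(size=(2, m, emb))
W_dense = rng.normal(size=2 * emb)
score = agima_forward(F, A_tensor, Ws, W_dense)
```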
Referring to a well-established architecture (GraphBAR [20]), we developed a light graph-learning architecture that focuses only on inter-molecular interactions and learns the interactions through direct atomic neighborhoods (Fig 3).
The node feature matrix F and inter-molecular adjacency tensor A of a protein-ligand complex are the inputs, and the binding strength is the output. Main components of this architecture include graph convolution layers, node aggregation layers, dense (fully-connected) layers and dropout layers.
Experiments and results
Scoring performance of models
The aforementioned framework scores the binding strengths of protein-ligand complex structures through learning Atomic Graphs with Inter-Molecular Adjacency (AGIMA-based scoring, abbreviated as AGIMA-Score). In an AGIMA-Score model, the binding area of a complex structure is treated as a graph, represented by a node-feature matrix F and an adjacency tensor A. Here the binding area is recognized as all the ligand atoms plus the protein atoms within a cutoff distance of any ligand atom, referring to Son's work [20]. Three sets of node features, referring to Pafnucy (m = 18) [14], KDEEP (m = 8) [13] and GraphBAR (m = 13) [20] respectively, were adopted to construct F (Table 1). The 18-feature set includes generic physico-chemical properties (e.g. atom types and partial charge) and pharmacophoric properties (e.g. aromaticity and hydrogen-bond membership) of the atoms. The 13-feature set is a subset of the 18-feature set that excludes the pharmacophoric properties. The 8-feature set focuses on pharmacophoric properties, with atomic charge and excluded volume also considered.
Three feature sets, with 18 features (from Pafnucy), 8 features (from KDEEP) and 13 features (from GraphBAR) respectively, were considered in this study. The names and data types of these features are listed.
When constructing the inter-molecular adjacency tensor A (Fig 2D), two distance ranges were selected to capture multi-level protein-ligand interactions, as follows. As a pair of atoms separated by a very short distance is mostly connected by a covalent bond, we paid more attention to atom pairs farther apart when characterizing the (non-covalent) inter-molecular interactions. Meanwhile, since the binding area is recognized according to a pairwise atomic distance cutoff, we employed the two distance ranges in Eq 5, spanning from the covalent threshold to this cutoff, to build the adjacency tensor (K = 2) in this work. Combining the three node-feature matrices (with 18, 8 and 13 features) and the adjacency tensor A in the generation of molecular graphs, we constructed three AGIMA-Score models (AGIMA-Score18, AGIMA-Score8 and AGIMA-Score13) based on the graph-learning architecture in Fig 3. To investigate whether a single distance range can cover sufficient adjacency information, we built a single-matrix adjacency tensor (K = 1) to pair up with the three node-feature matrices for each protein-ligand complex. This led to the construction of three new models (AGIMA-Score18SAM, AGIMA-Score8SAM and AGIMA-Score13SAM) for comparison purposes. In addition, the non-redundant features from the three sets were collected and combined with the adjacency tensor A to build the AGIMA-Score21 model (m = 21). The dense layers each have a dimension of 128, and the number of epochs and batch size were tuned when constructing these models.
In order to evaluate the performance of these models comprehensively, several broadly-discussed deep-learning scoring models were implemented as competitors. These include the Atomic Convolutional Neural Network (ACNN) [27], OnionNet [18], KDEEP [13] and GraphBAR [20]. For ACNN, parameters including the pooling filters, number of epochs and batch size were tuned to reach the best model. The number of epochs and batch size were tuned for OnionNet and KDEEP (no data augmentation). As two similar graph-learning approaches, GraphBAR considers both intra- and inter-molecular contacts (Fig 2B) while AGIMA-Score focuses on only the inter-molecular contacts (Fig 2D) in the construction of molecular graphs. To make a fair comparison with the AGIMA-Score models, we constructed two GraphBAR models, GraphBAR2AM and GraphBAR3AM, based on the architecture in Fig 3. GraphBAR2AM takes into account the intra-/inter-molecular contacts within the covalent range and those within the full non-covalent range when building the adjacency tensor, which corresponds to the AGIMA-ScoreSAM models (considering inter-molecular contacts in the single non-covalent range). GraphBAR3AM adopts three distance ranges (the covalent range plus the two non-covalent ranges) to collect the intra-/inter-molecular adjacencies, corresponding to the AGIMA-Score models (considering inter-molecular contacts in the two non-covalent ranges). The number of epochs and batch size were treated as tuning parameters for the two GraphBAR models.
The AGIMA-Score and competing models were constructed based on the benchmark PDBbind database (https://www.pdbbind-plus.org.cn/). The Refined Set and Core Set in this database were employed for training and parameter tuning (validation). Each sample in these two sets is a protein-ligand complex structure (determined mostly by X-ray crystallography or NMR spectroscopy) with the experimentally resolved binding strength (−logKd/i). These structural and binding-strength data are of high quality, as they have gone through rigorous filtering processes [28, 29]. To avoid potential train-validation contamination, each pair of complexes, one from the validation set and the other from the training set, needs to pass a similarity test. This test guarantees that, in each pair of complexes, the similarity of the two protein sequences is below 0.3 or the similarity of the two ligands is below 0.7. Protein sequence similarities were generated using the crossSetSim function from the protr R library, with the default BLOSUM62 substitution matrix. Ligand similarities were calculated using the cmp.similarity function from the ChemmineR library, based on SMILES-transformed descriptors. The complexes violating this rule were removed from the Validation set. Two sets from CSAR [30] were regarded as the final test sets, named Test1 and Test2, to avoid the over-optimistic results yielded by using data sets from the same source. The aforementioned similarity test was also performed on each test set against the training set to prevent potential train-test contamination. After this cleaning, the similarity statistics for pairwise complexes, with one complex from the training set and the other from the Validation, Test1 or Test2 set, are presented in Fig 4. Furthermore, the same protocol was adopted to ensure that there was no contamination among the Validation, Test1 and Test2 sets.
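The filtering rule above can be sketched as follows; `prot_sim` and `lig_sim` are placeholder callables standing in for the protr and ChemmineR similarity computations used in this work:

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Complex:
    protein: str   # protein sequence (or identifier)
    ligand: str    # ligand SMILES (or identifier)

def passes_similarity_test(cand: Complex,
                           training_set: Iterable[Complex],
                           prot_sim: Callable[[str, str], float],
                           lig_sim: Callable[[str, str], float]) -> bool:
    """Keep `cand` only if, against every training complex, the protein-sequence
    similarity is below 0.3 or the ligand similarity is below 0.7."""
    for ref in training_set:
        if prot_sim(cand.protein, ref.protein) >= 0.3 and \
           lig_sim(cand.ligand, ref.ligand) >= 0.7:
            return False   # too similar to a training complex: contamination risk
    return True
```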
Finally, the filtered Training, Validation, Test1 and Test2 sets consist of 5007, 195, 116 and 102 complex structures respectively. The lists of complexes for these sets can be found in Zenodo (https://zenodo.org/records/15023336).
The horizontal axis stands for the similarity between the two protein sequences involved in a complex pair, and the vertical axis indicates the similarity between the two involved ligands. The red dotted line means a sequence similarity of 0.3 and the yellow line shows a ligand similarity of 0.7.
The performance of each model was evaluated according to (1) the Pearson's correlation (PC) between the experimental and predicted binding strengths of the complex structures and (2) the root-mean-square error (RMSE) concerning those binding strengths. The evaluation results are listed in Table 2.
The models were trained on the PDBbind Refined Set (v2020) with parameters tuned via the Core Set (v2020), and tested on two sets from the CSAR source. State-of-the-art deep learning models (ACNN, OnionNet, KDEEP and GraphBAR) for scoring protein-ligand complexes were implemented to comprehensively evaluate the proposed AGIMA-Score models. For GraphBAR, different graph-adjacency schemes (2 or 3 adjacency matrices) were adopted for model construction. For AGIMA-Score, different node features (separately referring to Pafnucy, KDEEP and GraphBAR) and adjacency schemes (2 adjacency matrices or a single adjacency matrix) were considered for model investigation. By default, 2 adjacency matrices (generated from the inter-molecular atomic contacts in the two nominated distance ranges) were adopted in the graph learning by AGIMA-Score. The best performance in terms of PC and RMSE is underlined for the state-of-the-art methods and the proposed AGIMA-Score models.
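For reference, the two evaluation metrics (PC and RMSE) can be sketched in numpy as follows:

```python
import numpy as np

def pearson_corr(y_true, y_pred):
    """Pearson's correlation between experimental and predicted binding strengths."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def rmse(y_true, y_pred):
    """Root-mean-square error of the predicted binding strengths."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))
```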
An ACNN model is easily underfitted while a KDEEP model is often overfitted. Among the earlier models (ACNN, OnionNet, KDEEP and GraphBAR), GraphBAR outperforms the others in terms of Test1-PC, Test1-RMSE and Test2-RMSE, while OnionNet reaches the best PC on the Test2 set. Compared to these earlier models, the AGIMA-Score models perform well on average. AGIMA-Score18 achieves the overall best performance, with the lowest Test1-RMSE, highest Test2-PC and lowest Test2-RMSE. AGIMA-Score8 attains the best performance with respect to Test1-PC. Although using a single adjacency matrix (AGIMA-ScoreSAM) eases the computations in graph learning, it often results in underperformance in terms of PC and RMSE (compared to AGIMA-Score). This shows that using two adjacency matrices captures more information about the connections among atoms, such as the strong and weak hydrogen bonds with different donor-acceptor distance ranges [31]. The outperformance of AGIMA-Score over GraphBAR demonstrates the efficacy of learning the molecular graphs with inter-molecular adjacencies, rather than mixed intra- and inter-molecular adjacencies, in a scoring task. Overall, these results reveal the strong competitiveness of the AGIMA-Score models in scoring the binding strength of a protein-ligand complex. A Docker container with the trained AGIMA-Score18 model pre-installed is available on Zenodo (https://zenodo.org/records/15023336).
Screening performance of models
The scoring performance shows the capability of ranking a list of binding complexes or predicting accurate binding strengths. Beyond that, the screening power is another indicator of interest for further evaluating the prediction models. As a more practical task, virtual screening aims to discover the potential binders for a target protein, in order to mitigate the burden of downstream biochemical experiments (Fig 5). Such a target protein often plays a key role in regulating the progression of some disease, exemplified by the epidermal growth factor receptor (EGFR) protein that mediates the growth of non-small-cell lung cancer (NSCLC). Modeling the binding structure of each protein-ligand pair (docking) and scoring the binding strength based on this structure (scoring) are the two primary subtasks in virtual screening. Current state-of-the-art docking tools (e.g. AUTODOCK [32] and Glide [33]) can provide near-experimental binding structures for a pair of protein and ligand, while accurately scoring such a binding structure has long been a challenge. This work aims mainly at the scoring phase. Once we have the binding structures scored according to a model, whether the true binders for the target protein can be highly ranked is a main indicator of the screening capability of this model. A model with both high scoring and high screening power is always the pursuit of the CADD community.
The task starts from a target protein and a big library of ligands, followed by the modeling of each protein-ligand binding structure (docking tool) and the scoring of the binding structures (scoring model). The highly-ranked ligands (according to the predicted scores) will be regarded as potential binders for further biochemical experiments.
To evaluate the screening power of the AGIMA-Score models and their competitors, we selected a comparably large set from the DUD-E source (https://dude.docking.org). This set concerns the aforementioned EGFR and its potential ligands. A total of 36,273 ligands have been included in this set, with 832 actives (binders) and 35,441 decoys (non-binders) for the EGFR protein. Notably, the models discussed above accept protein-ligand binding structures as inputs, but this EGFR set only contains the structures of the monomers (the EGFR protein and the ligands). Accordingly, we paired the protein with each ligand into a binding structure using the well-acknowledged AUTODOCK Vina docking tool, before feeding the structures into the models. The best binding pose was retained for each protein-ligand pair, based on the default setting in AUTODOCK Vina and a reference structure (PDB:2RGP). The generated 36,273 EGFR-ligand binding structures were then fed into each model (ACNN, OnionNet, KDEEP, GraphBAR2AM, GraphBAR3AM, AGIMA-Score18, AGIMA-Score8 and AGIMA-Score13) to predict the binding strengths.
The enrichment factor (EF) is a widely-used index for evaluating the screening performance of a scoring model. It is defined as EF_X = HR_X / HR_total, where HR_X is the percentage of actives in the top X% ranked ligands and HR_total is the percentage of actives in the whole set. Meanwhile, the total decoy-to-active ratio (rDTA) for this set is approximately 42.6 (35,441/832), indicating a high imbalance between actives and decoys. To provide a more comprehensive evaluation, we composed a series of secondary sets according to varying rDTA values and assessed the corresponding EFs for each model on these sets. For a specific model MDL, an rDTA value r and an X value, this procedure is described as follows.
- Keep all the 832 actives and randomly select 832 × r decoys to constitute a set of size 832 × (1 + r).
- According to MDL, score and rank all the EGFR-ligand complexes in the set generated above, and calculate EF_X.
- Repeat the process 10 times to derive the average EF.
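The EF computation in the procedure above can be sketched as follows (assuming that higher predicted scores indicate stronger binding):

```python
import numpy as np

def enrichment_factor(scores, is_active, top_frac=0.01):
    """EF_X: the fraction of actives among the top X% ranked ligands divided by
    the fraction of actives in the whole library (higher scores rank first)."""
    order = np.argsort(scores)[::-1]                     # descending by predicted score
    n_top = max(1, int(round(top_frac * len(scores))))   # size of the top X% slice
    top_hit_rate = np.asarray(is_active)[order[:n_top]].mean()
    overall_rate = np.mean(is_active)
    return float(top_hit_rate / overall_rate)
```

A random ranking yields an EF of about 1, which is why the EF = 1 line is marked in the screening-performance figures.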
The top-ranked ligands (X = 1% and 2%) were used for evaluating the screening performance of each model in Table 2, and the results are exhibited in Fig 6.
In each scenario, the enrichment factors (EFs) regarding various decoy-to-active ratios (rDTA) were calculated for each model, and plotted in a line. The black dashed lines indicate EF = 1.
Here, the EF worsens as rDTA or X grows, for all the models. However, AGIMA-Score18 performs markedly better than the others. Encouragingly, AGIMA-Score18 even achieves an EF of 26 for the top 1% ranked ligands when all the decoys are included in the assessment. Similar results for other fractions of top-ranked ligands are displayed in S1 Fig. Three more tasks, involving the target proteins HIV protease (HIVPR), ADAM17 protease (ADA17) and tyrosine-protein kinase SRC (SRC), were considered. The HIVPR set covers 37,673 potential ligands (1,395 actives/36,278 decoys) for the HIVPR protein. 30,956 (959 actives/29,997 decoys) and 35,790 (831 actives/34,959 decoys) ligands are included in the ADA17 and SRC sets, respectively. The screening performance of the AGIMA-Score models and their competitors on these three sets, in terms of the top 1% and 2% ranked ligands, is displayed in Fig 7. For the HIVPR set, AGIMA-Score13 performs the best, with an EF of 29 for the top 1% ligands when involving all decoys. AGIMA-Score8 outperforms the others for the ADA17 set and AGIMA-Score18 is the best performer for the SRC set, with EFs of 13 and 17 for the top 1% ligands (all decoys involved) respectively. The results for other fractions of top-ranked ligands are presented in S2 Fig. Such results further validate the AGIMA-Score models.
In each scenario, the enrichment factors (EFs) regarding various decoy-to-active ratios (rDTA) were calculated for each model, and plotted in a line. The black dashed lines indicate EF = 1.
Discussion on model interpretability
Interpretations of deep learning models can build confidence in their predictions, and have therefore attracted more and more attention in recent years. Here, we discuss ways to interpret the AGIMA-Score models at the model level and through post-hoc analysis.
Model-level interpretation.
Due to the black-box nature of deep learning models, explaining the intrinsic structures of these models, which often concern millions or even more parameters, is quite difficult. In this regard, we focus mainly on the learning architecture (Fig 3) of the AGIMA-Score models. This framework first transforms the original node features into an embedding space, and then considers the multi-range, distance-dependent inter-molecular interactions (the two nominated distance ranges) between a protein and its ligand. These imply important local interaction patterns between the two binding molecules. After a further feature-embedding transformation, the framework gathers the information from those interaction patterns (by concatenating the aggregated features of the two interaction types) in the binding area. Then it maps the gathered information into the components of molecular binding strength or interaction energy using another hidden layer, leading to the final prediction of the total binding strength. Hence, the framework can be partly explained from the perspective of molecular interaction energies.
Post-hoc interpretation.
After a model is constructed, investigating the roles of different features in the decision-making process and monitoring the correlations between some hidden features and the outputs are well-acknowledged strategies for decoding the model in a post-hoc way. Specifically, we employed the masking-based feature importance assessment and principal component analysis (PCA) of key feature embeddings in our work.
Masking-based feature importance assessment. To simplify the scenario, the goal here is to ascertain the importance of each node feature in the decision-making process of a given AGIMA-Score model. For such a model MDL, the assessment procedure is as follows.
- Run MDL on predicting the binding strengths of all the complexes in the validation set, obtaining the baseline results PC0 and RMSE0.
- Mask one node feature (the i-th feature) at a time and re-run MDL on the scoring task. Here, masking a feature means replacing its original values with 0s. For each masked feature, this yields a drop in PC and an increase in RMSE relative to the baseline.
- Rank the node features by their PC drops (or RMSE increases); features whose masking causes a large PC drop (or RMSE increase) are more important in the decision-making process.
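The three steps above can be sketched as a short loop. The model interface and toy data below are hypothetical stand-ins, assuming only that the model maps a feature matrix to predicted binding strengths:

```python
import numpy as np

def pearson(a, b):
    return np.corrcoef(a, b)[0, 1]

def masking_importance(predict, X, y):
    """Rank node features by the performance drop caused by zero-masking.

    predict: callable mapping a feature matrix (n_samples, n_features)
    to predicted binding strengths; X, y: validation features and labels.
    """
    base = predict(X)
    pc0 = pearson(base, y)                        # baseline PC0
    rmse0 = np.sqrt(np.mean((base - y) ** 2))     # baseline RMSE0
    scores = []
    for i in range(X.shape[1]):
        Xm = X.copy()
        Xm[:, i] = 0.0                            # mask the i-th feature
        pred = predict(Xm)
        pc_drop = pc0 - pearson(pred, y)
        rmse_inc = np.sqrt(np.mean((pred - y) ** 2)) - rmse0
        scores.append((i, pc_drop, rmse_inc))
    # A larger PC drop (or RMSE increase) marks a more important feature.
    return sorted(scores, key=lambda t: t[1], reverse=True)

# Toy check with a linear stand-in model that relies mostly on feature 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w = np.array([2.0, 0.1, 0.0])
ranking = masking_importance(lambda M: M @ w, X, X @ w)
print(ranking[0][0])  # feature 0 ranks as most important
```

For a graph model such as AGIMA-Score, `X` would instead be the node-feature matrix of each complex, with the masked column zeroed across all atoms before rebuilding the graph inputs.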
The assessment result for the AGIMA-Score18 model is displayed in Fig 8. As shown there, certain pharmacophoric features (e.g. hydrophobicity, hybridization type and ring membership) carry more weight than the atom types (e.g. Carbon, Nitrogen and Oxygen) from the perspective of either PC drop or RMSE increase. This verifies the important role of certain pharmacophoric properties in determining protein-ligand binding, as frequently exploited in pharmacophore-based virtual screening [34]. However, other pharmacophoric properties, like hydrogen-bond donors, are of low importance in this scenario. The AGIMA-Score8 model depends on a feature set that combines pharmacophoric properties, atomic charges and excluded volume (Fig 9). In this scenario, the excluded volume and atomic charges (positive or negative) stand out from the crowd of pharmacophoric features. The AGIMA-Score13 model employs a simplified version of the feature set from AGIMA-Score18 (Fig 10). Here the partial charge, heavy-atom neighbors and hetero-atom neighbors dominate the PC drop, while atom types are more important in terms of RMSE increase. Summarizing the importance plots, atom features such as certain pharmacophoric properties, atomic charges and connectivity play a vital role in revealing protein-ligand binding.
The result was revealed by the masking-based performance drop on the validation set (PDBbind Core Set).
The result was revealed by the masking-based performance drop on the validation set (PDBbind Core Set).
The result was revealed by the masking-based performance drop on the validation set (PDBbind Core Set).
PCA of key feature embeddings. The two feature embeddings in the last-but-two layer in Fig 3 were monitored in this study, because these hidden features represent the important molecular interactions learned by an AGIMA-Score model; the two embeddings correspond to the inter-molecular interactions in the two distance ranges, respectively. PCA was adopted to compress these embeddings and to explore their correlations with the molecular binding strength. For better visualization, the first principal component (PC1) of each embedding was extracted for all the complexes in each set. Examining the correlations between such a PC and the binding strength of a complex can provide useful insights into the logic of AGIMA-Score models. Taking AGIMA-Score8 as an example, the PC1 vs. binding strength plots for the Training, Validation, Test1 and Test2 sets are shown in Fig 11. The linear trend of PC1 vs. binding strength for each of the two embeddings was also captured in this figure, where a marked difference between the two trendlines can be observed. This demonstrates that two different types of interactions are involved in determining the protein-ligand binding strength.
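This analysis can be reproduced along the following lines with scikit-learn. The embedding matrices here are synthetic placeholders standing in for the real hidden features of a trained model:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)

# Synthetic stand-ins: one embedding matrix per distance range,
# shape (n_complexes, embedding_dim), plus measured binding strengths.
n, d = 300, 16
strength = rng.uniform(2.0, 12.0, size=n)
emb_a = np.outer(strength, rng.normal(size=d)) + rng.normal(size=(n, d))
emb_b = np.outer(-strength, rng.normal(size=d)) + rng.normal(size=(n, d))

def pc1_of(emb):
    """First principal component of an embedding matrix, one value
    per complex."""
    return PCA(n_components=1).fit_transform(emb).ravel()

pc1_a, pc1_b = pc1_of(emb_a), pc1_of(emb_b)
for name, pc1 in [("range-1 embedding", pc1_a), ("range-2 embedding", pc1_b)]:
    # Linear trend of PC1 vs. binding strength, as in the figures.
    fit = LinearRegression().fit(pc1.reshape(-1, 1), strength)
    r = np.corrcoef(pc1, strength)[0, 1]
    print(f"{name}: trendline slope={fit.coef_[0]:+.3f}, |r|={abs(r):.3f}")
```

Because the sign of a principal component is arbitrary, only the magnitude of the correlation (and the difference between the two trendlines) is meaningful, which matches how the trends are read in Figs 11 and 12.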
The two feature embeddings in the last-but-two layer of the model architecture were decoded by principal component analysis, and their first principal components were correlated with the binding strength via linear regression.
Focusing on the Validation set, the scatter plots of the two PC1s for this model are displayed in Fig 12, where multiple thresholds for binding strength are also set to reveal the trends. They show that higher values of the interactions (represented by the two PC1s) normally lead to higher binding strengths. A similar analysis for AGIMA-Score13 can be found in S3 and S4 Figs.
The PC1 vs. PC1 plots of the two feature embeddings for the validation set are shown. Different thresholds of binding strength were used to uncover the correlations between the PCs and the binding strength.
Conclusion
The AGIMA-Score framework is introduced in this work. It describes a protein-ligand binding structure as an atomic-level graph, with only the inter-molecular interactions taken into consideration. With its high computational efficiency, this framework places an absolute focus on learning the binding area of a protein-ligand complex. Based on different sets of node features and a neat graph-learning architecture, a number of AGIMA-Score models were constructed. Such models perform well in scoring protein-ligand binding strengths and in screening binders from non-binders for a target protein. Finally, they can be explained reasonably at the model level or in a post-hoc analysis. In the near future, our research will focus on exploring enriched sets of node features and developing more comprehensive approaches to model interpretability.
Supporting information
S1 Fig. The screening performance of each model on the EGFR set, with the top ranked ligands considered.
In each scenario, the enrichment factors (EFs) at various decoy-to-active ratios (rDTA) were calculated for each model and plotted as a line. The black dashed lines indicate EF = 1.
https://doi.org/10.1371/journal.pcbi.1013074.s001
(EPS)
S2 Fig. The screening performance of AGIMA-Score and competing models on the HIVPR, ADA17 and SRC sets, with the top ranked ligands considered.
In each scenario, the enrichment factors (EFs) at various decoy-to-active ratios (rDTA) were calculated for each model and plotted as a line. The black dashed lines indicate EF = 1.
https://doi.org/10.1371/journal.pcbi.1013074.s002
(EPS)
S3 Fig. Investigation of key feature embeddings in the AGIMA-Score13 model.
The two feature embeddings in the last-but-two layer of the model architecture were decoded by principal component analysis, and their first principal components were correlated with the binding strength via linear regression.
https://doi.org/10.1371/journal.pcbi.1013074.s003
(EPS)
S4 Fig. Principal component plots of feature embeddings in the AGIMA-Score13 model.
The PC1 vs. PC1 plots of the two feature embeddings for the validation set are shown. Different thresholds of binding strength were used to uncover the correlations between the PCs and the binding strength.
https://doi.org/10.1371/journal.pcbi.1013074.s004
(EPS)
References
- 1. Du-Harpur X, Watt FM, Luscombe NM, Lynch MD. What is AI? Applications of artificial intelligence to dermatology. Br J Dermatol. 2020;183(3):423–30. pmid:31960407
- 2. Johnson KB, Wei W-Q, Weeraratne D, Frisse ME, Misulis K, Rhee K, et al. Precision medicine, AI, and the future of personalized health care. Clin Transl Sci. 2021;14(1):86–93. pmid:32961010
- 3. Ivanenkov YA, Polykovskiy D, Bezrukov D, Zagribelnyy B, Aladinskiy V, Kamya P, et al. Chemistry42: an AI-driven platform for molecular design and optimization. J Chem Inf Model. 2023;63(3):695–701. pmid:36728505
- 4. Jayatunga MKP, Xie W, Ruder L, Schulze U, Meier C. AI in small-molecule drug discovery: a coming wave?. Nat Rev Drug Discov. 2022;21(3):175–6. pmid:35132242
- 5. Wang DD, Chan M-T. Protein-ligand binding affinity prediction based on profiles of intermolecular contacts. Comput Struct Biotechnol J. 2022;20:1088–96. pmid:35317230
- 6. Wang DD, Xie H, Yan H. Proteo-chemometrics interaction fingerprints of protein-ligand complexes predict binding affinity. Bioinformatics. 2021;37(17):2570–9. pmid:33650636
- 7. Wang DD, Zhu M, Yan H. Computationally predicting binding affinity in protein-ligand complexes: free energy-based simulations and machine learning-based scoring functions. Brief Bioinform. 2021;22(3):bbaa107. pmid:32591817
- 8. Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26(9):1169–75. pmid:20236947
- 9. Liu Q, Kwoh CK, Li J. Binding affinity prediction for protein-ligand complexes based on β contacts and B factor. J Chem Inf Model. 2013;53(11):3076–85. pmid:24191692
- 10. Zilian D, Sotriffer CA. SFCscore(RF): a random forest-based scoring function for improved affinity prediction of protein-ligand complexes. J Chem Inf Model. 2013;53(8):1923–33. pmid:23705795
- 11. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. pmid:34265844
- 12. Brown T, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. Language models are few-shot learners. In: Advances in Neural Information Processing Systems. 2020. p. 1877–901.
- 13. Jiménez J, Škalič M, Martínez-Rosell G, De Fabritiis G. KDEEP: protein-ligand absolute binding affinity prediction via 3D-convolutional neural networks. J Chem Inf Model. 2018;58(2):287–96. pmid:29309725
- 14. Stepniewska-Dziubinska MM, Zielenkiewicz P, Siedlecki P. Development and evaluation of a deep learning model for protein-ligand binding affinity prediction. Bioinformatics. 2018;34(21):3666–74. pmid:29757353
- 15. Rezaei MA, Li Y, Wu D, Li X, Li C. Deep learning in drug design: protein-ligand binding affinity prediction. IEEE/ACM Trans Comput Biol Bioinform. 2022;19(1):407–17. pmid:33360998
- 16. Wang DD, Chan M-T, Yan H. Structure-based protein-ligand interaction fingerprints for binding affinity prediction. Comput Struct Biotechnol J. 2021;19:6291–300. pmid:34900139
- 17. Wang D, Wang R. Scoring protein-ligand complex structures by HybridNet. In: 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC). 2023. p. 4070–5.
- 18. Zheng L, Fan J, Mu Y. OnionNet: a multiple-layer intermolecular-contact-based convolutional neural network for protein-ligand binding affinity prediction. ACS Omega. 2019;4(14):15956–65. pmid:31592466
- 19. Wang DD, Wu W, Wang R. Structure-based, deep-learning models for protein-ligand binding affinity prediction. J Cheminform. 2024;16(1):2. pmid:38173000
- 20. Son J, Kim D. Development of a graph convolutional neural network model for efficient prediction of protein-ligand binding affinities. PLoS One. 2021;16(4):e0249404. pmid:33831016
- 21. Shen H, Zhang Y, Zheng C, Wang B, Chen P. A cascade graph convolutional network for predicting protein-ligand binding affinity. Int J Mol Sci. 2021;22(8):4023. pmid:33919681
- 22. Zhang X, Gao H, Wang H, Chen Z, Zhang Z, Chen X, et al. Planet: a multi-objective graph neural network model for protein–ligand binding affinity prediction. J Chem Inf Model. 2023.
- 23. Wang K, Zhou R, Tang J, Li M. GraphscoreDTA: optimized graph neural network for protein-ligand binding affinity prediction. Bioinformatics. 2023;39(6):btad340. pmid:37225408
- 24. Feinberg EN, Sur D, Wu Z, Husic BE, Mai H, Li Y, et al. PotentialNet for molecular property prediction. ACS Cent Sci. 2018;4(11):1520–30. pmid:30555904
- 25. Gilmer J, Schoenholz S, Riley P, Vinyals O, Dahl G. Neural message passing for quantum chemistry. In: International Conference on Machine Learning. 2017. p. 1263–72.
- 26. Kipf T, Welling M. Semi-supervised classification with graph convolutional networks. arXiv preprint 2016. https://arxiv.org/abs/1609.02907
- 27. Gomes J, Ramsundar B, Feinberg E, Pande V. Atomic convolutional networks for predicting protein-ligand binding affinity. arXiv preprint 2017. https://arxiv.org/abs/1703.10603
- 28. Su M, Yang Q, Du Y, Feng G, Liu Z, Li Y, et al. Comparative assessment of scoring functions: the CASF-2016 update. J Chem Inf Model. 2019;59(2):895–913. pmid:30481020
- 29. Liu Z, Li Y, Han L, Li J, Liu J, Zhao Z, et al. PDB-wide collection of binding data: current status of the PDBbind database. Bioinformatics. 2015;31(3):405–12. pmid:25301850
- 30. Dunbar JB Jr, Smith RD, Damm-Ganamet KL, Ahmed A, Esposito EX, Delproposto J, et al. CSAR data set release 2012: ligands, affinities, complexes, and docking decoys. J Chem Inf Model. 2013;53(8):1842–52. pmid:23617227
- 31. Jeffrey GA. An introduction to hydrogen bonding. New York: Oxford University Press; 1997.
- 32. Huey R, Morris G, Forli S. Using AutoDock 4 and AutoDock Vina with AutoDockTools: a tutorial. Scripps Research Institute Molecular Graphics Lab; 2012.
- 33. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, et al. Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem. 2004;47(7):1739–49. pmid:15027865
- 34. Horvath D. Pharmacophore-based virtual screening. In: Chemoinformatics and Computational Chemical Biology. 2011. p. 261–98.