Figure 1.
The principle of DTI prediction based on gene-expression information.
(A) Ligand binding modifies the biological function of the protein target, which in turn influences a series of target-related downstream genes. We therefore suppose that different ligands binding to the same target should influence some downstream genes in common. This hypothesis is corroborated by the fact that drugs sharing common targets produce similar gene-expression profiles in CMap. (B) Accordingly, we applied CMap expression similarity as a guilt-by-association indicator of potential drug-target interactions. If a compound has no recognized interaction with a given target but shows high expression similarity to the ligands of that target, this may imply an undiscovered drug-target interaction.
Figure 2.
The rationale of batch effect adjustment.
(A) The expression profile (denoted as variable E) in CMap is mainly determined by drug action (component d) and batch effect (component b). While the cell condition may vary from batch to batch, the drug action is relatively consistent. Thus, if batches X and Y both include cell cultures treated with the same drug (e.g. drug B), the two drug B expression profiles reflect homogeneous drug action but heterogeneous cell conditions, so their difference (denoted as Δ) reflects the variation between batch X and batch Y. By adding Δ to the expression profiles in batch Y, the batch variation is adjusted and the two batches are merged into one. (B) Among the batches with 30 instances or more, we found 10 that are linked to one another by various bridge drugs. We first merged these 10 batches into a new one. Other batches sharing bridge drugs with this new batch were then merged into it to form an even bigger batch. This bridging procedure was repeated until all batches were adjusted.
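The adjustment in panel (A) can be illustrated with a minimal sketch. It assumes that Δ is computed gene by gene from a single bridge drug and that profiles are plain lists of expression values; the function names, the toy data, and the handling of a single bridge drug are illustrative assumptions, not details given in the caption.

```python
def batch_offset(bridge_in_x, bridge_in_y):
    """Per-gene difference (delta) between the bridge drug's profile
    in batch X and its profile in batch Y (assumed: X minus Y)."""
    return [x - y for x, y in zip(bridge_in_x, bridge_in_y)]


def adjust_batch(batch, delta):
    """Add delta to every expression profile in the batch."""
    return {
        drug: [value + d for value, d in zip(profile, delta)]
        for drug, profile in batch.items()
    }


# Toy data (hypothetical): three genes, drug_B is the bridge drug.
batch_x = {"drug_A": [1.0, 2.0, 3.0], "drug_B": [2.0, 0.0, 1.0]}
batch_y = {"drug_B": [1.5, 0.5, 0.0], "drug_C": [0.0, 1.0, 2.0]}

delta = batch_offset(batch_x["drug_B"], batch_y["drug_B"])
adjusted_y = adjust_batch(batch_y, delta)

# After adjustment the bridge drug's profile in Y matches the one in X,
# so the two batches can be pooled into one.
merged = {**adjusted_y, **batch_x}
```

With several bridge drugs between two batches, Δ would have to be aggregated in some way (for example by averaging); the caption does not specify this, so the sketch leaves it out.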
Figure 3.
A receiver operating characteristic (ROC) curve is used to evaluate the performance of the BAES score and of unadjusted CMap expression similarity.
For the classification of compound pairs that share a target (positive set) versus those that do not (negative set), the area under the curve is 0.66 for BAES and 0.59 for unadjusted CMap. The advantage of BAES was verified by a bootstrap test with 2000 replicates, using the pROC package for R (http://cran.r-project.org/web/packages/pROC/).
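As a rough illustration of how two scores can be compared by bootstrapped AUC, the sketch below computes the AUC as the fraction of positive/negative pairs ranked correctly and resamples both sets with replacement. It is a simplified stand-in written in plain Python, not the procedure implemented in pROC; the function names and toy scores are hypothetical.

```python
import random


def auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative
    (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def bootstrap_auc_differences(pos_a, neg_a, pos_b, neg_b, replicates=2000, seed=0):
    """Paired bootstrap: index i refers to the same compound pair under
    score A and score B. Returns AUC(A) - AUC(B) for each replicate."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(replicates):
        pos_idx = [rng.randrange(len(pos_a)) for _ in pos_a]
        neg_idx = [rng.randrange(len(neg_a)) for _ in neg_a]
        auc_a = auc([pos_a[i] for i in pos_idx], [neg_a[i] for i in neg_idx])
        auc_b = auc([pos_b[i] for i in pos_idx], [neg_b[i] for i in neg_idx])
        diffs.append(auc_a - auc_b)
    return diffs


# Toy scores (hypothetical) for two similarity measures on the same pairs.
pos_a, neg_a = [0.9, 0.8, 0.7], [0.1, 0.85, 0.3, 0.2]
pos_b, neg_b = [0.6, 0.4, 0.7], [0.5, 0.65, 0.3, 0.2]

diffs = bootstrap_auc_differences(pos_a, neg_a, pos_b, neg_b, replicates=200)
fraction_a_better = sum(d > 0 for d in diffs) / len(diffs)
```

The spread of `diffs` gives an impression of how stable the AUC difference is under resampling; a formal test would turn that spread into a test statistic.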
Figure 4.
The rationale and performance of DTI prediction model.
(A) If a target has its ligand binding well characterized by CMap, we expect potential ligands to show higher BAES to benchmark ligands (red connections) than random compounds do (grey connections), i.e. the LOI of ligands should exceed the overall background of CMap. (B) For the cross validation of PPAR-γ, the area under the ROC curve reaches 0.86, with a 95% confidence interval (the grey shape) ranging from 0.74 to 0.99. The LOI corresponding to 90 percent specificity is set as the threshold to discriminate the positive and negative sets. Thus, only 10 percent of the negative set is above the threshold, i.e. there are 130 false positive (FP) and 1170 true negative (TN) compounds. Meanwhile, 67 percent of the positive set is above the threshold, giving 6 true positive (TP) and 3 false negative (FN) compounds. The statistical significance of this enrichment (odds ratio = 18) is determined by Fisher's exact test (p = 7.31×10⁻⁵).
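The enrichment figures in panel (B) can be retraced from the four counts. The sketch below computes the odds ratio and a one-sided Fisher's exact p-value from the hypergeometric distribution using only the standard library. Whether the caption's p-value is one-sided or two-sided is not stated, so the one-sided form is an assumption, and the function names are illustrative.

```python
from math import comb


def odds_ratio(tp, fn, fp, tn):
    """Odds of being above the threshold for positives vs. negatives."""
    return (tp * tn) / (fn * fp)


def fisher_one_sided(tp, fn, fp, tn):
    """P(at least tp positives above the threshold) under the
    hypergeometric null of no enrichment."""
    n_pos = tp + fn            # size of the positive set
    n_above = tp + fp          # compounds above the threshold
    total = tp + fn + fp + tn  # all compounds
    upper = min(n_pos, n_above)
    favourable = sum(
        comb(n_pos, k) * comb(total - n_pos, n_above - k)
        for k in range(tp, upper + 1)
    )
    return favourable / comb(total, n_above)


# Counts quoted in the caption for PPAR-gamma.
tp, fn, fp, tn = 6, 3, 130, 1170

ratio = odds_ratio(tp, fn, fp, tn)          # (6 * 1170) / (3 * 130)
p_value = fisher_one_sided(tp, fn, fp, tn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
```

The odds ratio follows directly from the counts as (6 × 1170) / (3 × 130) = 18, and the specificity of 1170 / 1300 recovers the 90 percent used to set the threshold.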
Figure 5.
The LOOCV performance suggests that, at the mRNA level, the genomic response to ligand binding differs dramatically from target to target.
Here all targets are grouped into families according to their functional origins. The height of each bar represents the AUC (with its 95% confidence interval), and the color of each bar indicates the significance level of benchmark-ligand enrichment.
Figure 6.
Identification of target-target interactions by gene-expression similarity.
(A) In some cases, the ligands of one target may generally show high LOI to another, distant target. (B) The OPRD1 ligands (positive set) collectively show higher LOI to CACNA1C than other compounds (negative set). Setting the LOI corresponding to 90 percent specificity as the threshold, we find that all OPRM1 ligands are above the threshold (p = 1.08×10⁻⁵).
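The thresholding step used here (the LOI value corresponding to 90 percent specificity) can be sketched as follows. The sketch assumes that a compound is called positive when its LOI is strictly above the threshold and that specificity is given as a whole-number percentage strictly between 0 and 100; the function names and toy LOI values are hypothetical, and ties in the negative set are not handled specially.

```python
def threshold_at_specificity(neg_scores, specificity_pct=90):
    """Smallest observed negative score such that at most
    (100 - specificity_pct) percent of negatives lie strictly above it.
    Assumes 0 < specificity_pct < 100."""
    ordered = sorted(neg_scores)
    n = len(ordered)
    allowed_above = n * (100 - specificity_pct) // 100
    return ordered[n - allowed_above - 1]


def fraction_above(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Toy LOI values (hypothetical).
negative_loi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
positive_loi = [0.95, 0.85, 0.99]

threshold = threshold_at_specificity(negative_loi, 90)
false_positive_rate = fraction_above(negative_loi, threshold)
positives_above = sum(s > threshold for s in positive_loi)
```

The counts of positives and negatives above and below this threshold form the 2×2 table on which an enrichment test such as Fisher's exact test is run.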
Figure 7.
A summary of the highlighted potential target-target interactions.
(A) The profile of target-target interactions is visualized as a matrix in which each row represents a target and each column represents a family of ligands. The color of each cell represents the significance level of the corresponding target-target interaction. If a pair of targets is known to share ligands in DrugBank, their interaction (colored grey) is not considered. (B) Potential interactions are broadly observed between antipsychotic drugs and cardiac ion channels. (C) & (D) The ligands of COX-1 (p = 3.08×10⁻³) and COX-2 (p = 1.02×10⁻³) generally show high LOI to estrogen receptor beta (ERβ).