Towards Structural Systems Pharmacology to Study Complex Diseases and Personalized Medicine

Genome-Wide Association Studies (GWAS), whole genome sequencing, and high-throughput omics techniques have generated vast amounts of genotypic and molecular phenotypic data. However, these data have not yet been fully explored to improve the effectiveness and efficiency of drug discovery, which continues along a one-drug-one-target-one-disease paradigm. As a partial consequence, both the cost to launch a new drug and the attrition rate are increasing. Systems pharmacology and pharmacogenomics are emerging to exploit the available data and potentially reverse this trend, but, as we argue here, more is needed. To understand the impact of genetic, epigenetic, and environmental factors on drug action, we must study the structural energetics and dynamics of molecular interactions in the context of the whole human genome and interactome. Such an approach requires an integrative modeling framework for drug action that leverages advances in data-driven statistical modeling and mechanism-based multiscale modeling and transforms heterogeneous data from GWAS, high-throughput sequencing, structural genomics, functional genomics, and chemical genomics into unified knowledge. This is not a small task, but, as reviewed here, progress is being made towards the final goal of personalized medicines for the treatment of complex diseases.


Introduction
Drug discovery, as broadly practiced, suffers from several shortcomings. First, although the vast amounts of genotypic and molecular phenotypic data generated from Genome-Wide Association Studies (GWAS); whole genome sequencing (WGS) [1]; and high-throughput techniques such as RNA-seq [2], ChIP-seq [3], BS-seq [4], and DNase-seq [5] provide an unprecedented opportunity to understand the etiology of complex diseases and to discover safe and potent personalized medicines, to date these data have not been fully explored to improve the effectiveness and efficiency of drug discovery. Second, modern target-based drug discovery is characterized as a one-drug-one-gene paradigm, and has been of limited success in attacking complex diseases. Third, phenotypic screens and cell-based assays generate a large number of active compounds relevant to disease treatment, but give few hints as to what their molecular targets are [6][7][8][9]. As a result of these shortcomings, the cost to launch a new drug is typically more than US$1 billion, and that cost continues to increase, with only around one-third of drugs in phase III clinical trials reaching the market. The emerging field of systems pharmacology is addressing these shortcomings and beginning to change the way we think about drug action in multigenic, complex diseases [10][11][12][13][14][15].
As illustrated in Figure 1, a drug commonly not only interacts with its intended molecular target (on-target) but also binds to and affects other targets (off-targets) that are often unknown [16]. Each drug-target interaction modifies the conformational dynamics of the target structure and results in the alternation of the functional states (e.g., activation versus inhibition). Consequently, the changing conformational and functional states of both on-targets and off-targets directly or indirectly affects other molecular components and their interactions through the interplay of complex signal transduction, gene regulation, and metabolic networks that collectively mediate the system-level response to the drug, leading to either therapeutic or adverse effects [10]. A variety of genetic, epigenetic, and environmental factors define the initial pathophysiological state of the molecular components and their interactions, which then dynamically evolve when perturbed by a drug. Stated another way, both target-and non-target associated genetic and/or epigenetic alternations could impact the drug response. In addition to inherited genetic and/or epigenetic factors, cellular, tissue, and organism environments may have significant effects on drug efficacy and side effects [17][18][19][20][21]. For example, tumor-stromal interactions play key roles in anticancer drug sensitivity [22].
The underlying hierarchical organization of living organisms makes it essential to model drug actions from DNA to gene, to protein and its molecular ensemble, to cell, to tissue, to organ, to whole organism, and to population. Data-driven, network-based association studies and physical-or mathematical-based multiscale modeling are two pillars of the existing paradigm of systems pharmacology. Network-based association studies provide a promising avenue to realize personalized medicine. The reconstruction and analysis of genome-scale molecular interaction networks, including protein-protein interactions; protein-nucleic acid interactions; epistasis interactions, as found in signal transduction; gene regulation; and metabolic networks, have emerged as a powerful framework to integrate heterogeneous DNA variation and omics data in associating genotypes with phenotypes under various environmental and drug-induced conditions [16]. By taking advantage of the progress in network and systems biology, mechanism-based multiscale modeling that spans different temporal and spatial scales has already been able to predict genotype-phenotype associations in a whole cell model of Mycoplasma genitalium [23] and quantitatively simulate drug actions at the organism level [24].
However, several challenges remain in the application of systems pharmacology. First, data-driven network-based association studies primarily rely on sophisticated statistical techniques. Although great efforts have been made to address the n%p problem, where the number of observations n is much smaller than the number of variables or parameters p, the power of these statistics-based techniques remains limited if sample sizes are small. The ''causal'' relationships inferred from these methods are simply mathematical correlations. They may not provide biological insight into the underlying molecular mechanisms that enable the development of actionable models for understanding the drugresponse phenotype. Second, the existing paradigm of multiscale modeling is to isolate subsystems (or parameters) from a phenotype space. These subsystems are first studied independently and then combined to infer their synergistic behavior. It is noted that the subsystem itself can be considered as a phenotype. Thus, this isolation/combination process could be multilevel and recursive. However, current organism-level physiological models are often too complex to be supported by existing data and computational power. The mathematical model is often nonidentifiable; that is, there is not a unique parameter set to explain the experimental observations [25]. On the other hand, physical models are often too computationally intensive to readily model the global behavior of the physiological system. Fundamentally, isolated parameter space may be not sufficient to ''identify and elucidate the guiding principles of control and communication defining the behavior of an organism'' [11]. Such a guiding principle is fundamental to reliably predict human behavior by scaling up animal models. Third, the enormous investment in molecular libraries and target-, cell-, and organism-based high-throughput compound screening has generated a massive amount of chemical genomics data [26][27][28]. There is no doubt that these data are invaluable in understanding how drugs work at the molecular, cellular, and organismal levels. However, this arguably most important dataset for systems pharmacology has not been fully incorporated into either the network-based association studies or multiscale modeling frameworks, partly due to a lack of computational tools to map bioactive chemical space to its global target and pharmacological space. Lastly, it has been recognized that one of the critical hurdles in multiomics data integration and multiscale modeling is the lack of a common language and standard to annotate, exchange, reuse, and update computational models [29][30][31]. Due to the dynamic, complex, and multiscale nature of datasets and computational models needed to simulate a drug response under diverse genetic, epigenetic, and environmental conditions, an open and reusable conceptual framework that is able to link multilevel biological concepts and relationships is needed to realize the promise of community-driven predictive modeling of human physiology and pathology [32][33][34][35].
Although macromolecular structure is at the foundation of any molecular interaction, adding the structural and associated energetics and dynamics of the interplay between drugs, biomolecular targets, genetic and epigenetic variations, and environmental factors has not been fully exploited in systems pharmacology to date. A global three-dimensional macromolecular structure view of the biological system under study may offer new insights to address the aforementioned challenges. Stated more explicitly, a mechanistic understanding of how individual molecular components work together in a system and how the molecular interactions are affected and adapted to genetic and epigenetic variants and environmental perturbations requires knowledge of the underlying molecular structures and their conformational dynamics [36]. The information derived from the atomic details of molecular interactions, in principle, will enhance the power of statistical inference in data-driven systems biology and alleviate the current inability to fully characterize parameter space in mathematical modeling, revealing the guiding principles of systematic control and communication. Moreover, as a bridge to connect chemical and genomics space, macromolecular structure will allow us to link drugs, targets, and biological pathways, thereby providing a common framework to correlate molecular interactions with cellular functions. Leveraging the vast investment in chemical genomics, functional genomics, structural genomics, and structurebased drug discovery, together with efforts in systems pharmacology, may open a new door to developing personalized medicines for complex diseases. Thus, here we advocate and then justify a new paradigm of structural systems pharmacology. Structural systems pharmacology will model, on a genome scale, the energetic and dynamic modifications of macromolecules (proteins, RNA, DNA) by drugs. The modeling accounts for genetic/epigenetic and environmental factors as well as the subsequent collective effects on the information flow in biological systems.
Some advances have been made in incorporating macromolecular structure modeling into systems pharmacology. We review them in this article. We demonstrate that integrative modeling of drug action-from the structural and energetic basis of genomewide molecular interactions to the clinical outcomes at the organism level-provides new insights into both therapeutic effects and side effects while taking into account genetic differences. In terms of scope, we first propose a hybrid modeling approach to integrate mechanism-based multiscale modeling and Figure 1. A network view of drug action. Dark blue lines represent drug-target interactions. Green arrows are protein-protein interactions or biological reaction pathways. Yellow nodes represent genes affected by genetic variation. These variations will impact drug action by changing the information flow of drug-target interactions in the biological network, even when these genes are not themselves the direct drug targets. doi:10.1371/journal.pcbi.1003554.g001 simulation with a data-driven systems biology approach and suggest that macromolecular structure is an essential component to glue diverse technologies together into a unified framework. We demonstrate the potential of macromolecular structures in enhancing the capability of systems biology through enriching the connectivity of context-specific biological networks and resolving non-identifiable parameter space during network simulation. Then we focus on three aspects of structural systems pharmacology that link genetic events and drug-target interactions to the drug response phenotype. First, we focus on traditional pharmacodynamics, now in the context of structural systems pharmacology. Second, and in a similar fashion, we focus on traditional pharmacokinetics in the context of structural systems pharmacology. Third, we explore the role of structural systems pharmacology to enhance the power of pharmacogenomics and GWAS. Thus, this review complements several recent reviews that focus on a network view of systems pharmacology and its connection to phenotype [10][11][12][13][14][15]. Physical-based multiscale modeling will not be covered in detail, since this is presented elsewhere [37][38][39][40]. Ultimately, we argue, structural systems pharmacology should be incorporated into the modeling and simulation of macromolecular ensembles, tissues, and organisms.

Structural Systems Pharmacology-Structure-Enabled Integrative Modeling of Drug Action
As stated in the introduction, several challenges remain in systems pharmacology, limiting the ability to predictively model complex drug action under the influence of diverse genetic, epigenetic, and environmental factors. To address these challenges, we suggest an integrative structural systems pharmacology approach to understanding and predicting individual-and context-specific drug response phenotypes. In this proposed modeling framework, macromolecular structure is an indispensable component to link chemical space to genomic space, to associate genotypes with phenotypes, and to account for the environmental impact on biological systems. To demonstrate the proposed model, we use the predictive modeling of drug-induced arrhythmia as an example ( Figure 2). Drug-induced arrhythmia is a potentially life-threatening side effect that is a major concern in clinical trials. QT interval prolongation that can be measured by electrocardiogram (ECG) waves has been widely accepted as a biomarker for arrhythmia. Both multiscale physical and mathematical models [41][42] and network-based predictive models [43] have been developed to predict QT interval prolongation with the aim of predicting the drug side effect of arrhythmia at an early stage (blue-colored box in Figure 2). However, several key components are missing in these models. As a result, their prediction power is limited. Given a new or existing drug, we at least need to address the following issues in order to predict whether or not the drug may induce arrhythmia under a specific physiological context for a specific individual ( Figure 2): (1) Identification of genome-wide drug-target interactions. The QT interval prolongation involves not only multiple ion channels (e.g., hERG, Kv7.1, Nav1.5, and Cav1.2) but also multiple other genes that are functionally associated with the ion channel [43]. In addition, the interaction of a drug with metabolizing enzymes and regulatory genes may alter the concentrations of proteins that play roles in arrhythmia and the pharmacokinetic profile of the drug molecule. gating is the primary determinant of the membrane current during the action potential. Both drug binding and unbinding kinetics, as well as amino acid mutations, may impact the conformational change of the ion channel, leading to the change of action potential. The events of conformational dynamics can be modeled by Molecular Dynamics (MD) simulation. As genome-scale MD simulation is not feasible at this time, evolutionary and functional constraints that could be derived from sequencing and multiple omics data will be an invaluable asset to significantly reduce the conformational sampling space of MD simulation [44]. (3) Determination of the in vivo concentrations of relevant drugs and metabolites (e.g., ebastine [45]) that may affect the activity of ion channels. In principle, this could be achieved using a whole-body physiologically based pharmacokinetics (PBPK) model that incorporates tissue-specific, genome-scale metabolic models. However, to date, little attention has been paid to adjusting PBPK models using information from genome-wide drug-target interactions. (4) Identification of individual-and context-specific parameter spaces for PBPK and systems biology models. To predict individual-and context-specific (e.g., normal tissue versus inflamed tissue) drug responses, it is critical to define molecular states, network architectures, and dynamic parameters at the molecular level under the physiological conditions that exist during drug treatment. Although a vast amount of GWAS and multiple types of omics data provide abundant opportunities for this purpose, these data have not been fully explored to define biological networks at molecular resolution. (5) Most importantly, it is necessary to integrate the above information into a coherent computational model across temporal and spatial scales. As these tasks traditionally span multiple disciplines and require different techniques, such as statistical machine learning [46], MD simulation, Ordinary Differential Equation (ODE)-based kinetics simulation, constraint-based modeling, discrete logic models, etc., they may need to be implemented as independent functional modules and subsequently assembled into a complete framework. Consequently, these functional modules need clearly defined interfaces and metadata for bi-directional communication.
For example, the PBPK model determines the fate of drug molecules. In turn, drug-target interactions may regulate the expression level of CYP450 (as detailed later), thus altering the parameter space of the PBPK model. It has been suggested that ontology-driven, rule-based modeling may facilitate the integrative modeling of drug actions [11]. The integration of rule-based semantic modeling and Bayesian statistical modeling [47][48][49], which can establish cause-effect relationships across temporal and spatial scales, could be a useful tool in combining diverse techniques and multiple sequencing, molecular, and omics data. Such a scheme is depicted in Figure 3. It combines information and biological knowledge from DNA variants and their associated genes; drug-target interactions; protein conformational states; biological pathways; cellular networks, such as protein-protein interaction networks; molecular phenotypes, such as gene expression profiles; and different organism phenotypes. Using these individualized drug response phenotypes, the probability of causal mutations, involved targets, conformational states, molecular complexes or functional modules, and biological pathways and their links can be established by a priori knowledge from mechanism-based modeling or estimated using a Bayesian statistical framework.
Although we use the arrhythmia example for illustrative purposes, the framework described in Figure 2 should be generalizable for the understanding and prediction of drug response phenotypes. As detailed in the remaining text, we will show that macromolecular structures play critical roles in reconstructing genome-scale, high-resolution molecular interaction models, simulating the conformational dynamics of drug-target complexes, enabling context-specific pharmacokinetics modeling, resolving nonidentifiable parameters in mathematical modeling, and enhancing the predictive power of network-based association studies. Thus, the structure-enabled integrative modeling of drug actions may facilitate transforming conventional drug discovery process to a new paradigm based on systems pharmacology.
When kinetic parameters are lacking, constraint-based Flux Balance Analysis (FBA) presents an alternative approach to compute the phenotypic properties of whole cells, especially genome-scale metabolic networks [61]. New biological insights have been gained when incorporating protein structural information into metabolic network modeling of bacteria, which cannot be achieved by FBA alone [62][63]. For example, structure-based reconstruction of a genome-wide metabolic network makes it possible to determine bacteria growth in response to temperature Figure 2. A structure-enabled integrative framework to model drug action. Given a set of inputs-a new or existing drug, known bioactive chemical space, the whole human proteome, an individual's genotypic data, and context-specific phenotypic data-it is possible, in principle, to construct a structure-enabled integrative model of drug action. Such a model comprises multiple integrated functional modules (rounded boxes) that span multiple levels of biological organization and can be used to infer drug-induced arrhythmia. Solid and open arrows indicate current workflows and missing links, respectively. Blue boxes represent two existing methods: multiscale ventricular electrophysiological modeling [41][42] and proteinprotein interaction (PPI) network-based predictive modeling [43] for the prediction of drug-induced arrhythmia represented as a pseudo electrocardiography (ECG). The other boxes represent functional modules that are critically important but have not been fully developed or incorporated into the modeling process. doi:10.1371/journal.pcbi.1003554.g002 changes [64]. This opens a new door to understanding the impact of the environment on drug action. It is noted that when protein structures are used, they are often treated as a single chain or as simply forming binary interactions during network analysis. In reality, under physiological conditions, proteins perform their functions through biological assemblies that may consist of multiple proteins. In a recent study, the 3-D structure of biological assemblies has been explicitly considered in the context of the genome-scale metabolic network. Novel drug targets and therapeutics are expected to be identified through such an integrative modeling strategy [65]. Beyond microorganisms, the first reconstruction of a genome-scale human metabolic network (Recon-1) by Duarte et al. provided the foundation for applying FBA to complex disease modeling [66]. By taking advantage of the rapidly increasing omics data, new methods have been developed to model cell-specific [67], tissue-specific [68][69], and contextspecific [70][71] human metabolic networks.
Several recent efforts have made progress in reconstructing genome-scale high-resolution protein-chemical interaction models [72][73][74][75][76]. As shown in Figure 4A, the targets identified from chemical genomics and functional genomics data analysis mainly include existing known drug targets and their homologs, which only cover a small portion (,8%) of the human genome [77][78][79][80][81][82][83][84][85][86][87][88][89][90][91][92][93][94][95]. A large number of proteins whose cognate or designed ligands are less characterized (or unknown) or who have low-affinity bindings to drugs are very likely to be important or even critical to pathophysiological processes under consideration [16]. The target space can be significantly extended to ,50% of human genes using structural genomics data [96]. As illustrated in Figure 4B, when integrating chemical genomics data analysis and molecular modeling on a structural genome scale, it is possible not only to greatly extend the existing target space to ,50% of human and pathogen genomes (a five to 50 times increase over existing targets), but also to construct genome-wide high-resolution protein-chemical interaction models for millions of bioactive compounds [72,97]. Although it remains a challenge to accurately determine, under physiological conditions, the binding affinity and binding/unbinding kinetics of these interaction models, these models provide a basis to simulate the conformational dynamics of protein targets perturbed by a drug (see details in next sections) [44,98]. Consequently, the physiological drug response (therapeutic effect or side effect) can be predicted by mapping the conformational states of drug targets into biological pathways and networks [44,98]. We expect that the integration of chemical genomics, structural genomics, and functional genomics will significantly enhance the capability of systems pharmacology for molecular target identification of bioactive compounds, drug repurposing, polypharmacological drug design, and side effect prediction.
In the context of personalized medicine in the treatment of complex diseases, a critically important but less-addressed problem is the need to reconstruct genome-scale, structure-based gene regulatory networks. In the major groove of DNA, every base pair has a unique hydrogen-bonding signature, and the ''direct readout'' mechanism, in which the formation of a series of amino acid base-specific hydrogen bonds contributes to the protein-DNA binding specificity, has been commonly accepted [99]. Recently, Rohs et al. found that the binding of arginine residues to narrow minor grooves is a widely used mode for protein-DNA recognition [100][101]. Differing from the ''direct readout'' mechanism, their findings indicate that the minor groove can also provide information, such as local variations in DNA shape and electrostatic potential for protein-DNA recognition, and offer new insights into the structural and energetic origins of protein-DNA binding specificity. These studies highlight the importance of the role of macromolecular structures in understanding gene regulation.
The reconstruction of the biological network is only the first step in understanding the dynamic and stochastic nature of cellular processes [102]. +When the network topology and kinetic parameters are defined, time-dependent deterministic functions, including Ordinary Differential Equations (ODEs) and Partial Differential Equations (PDEs) [103], are commonly employed to analyze the dynamics of signaling and metabolic pathways upon drug treatment [104]. For instance, ODEs have been successfully applied to investigate the effects of multitarget inhibition [105][106], to search for optimal target combinations for safe and effective anti-inflammation therapy [107], and to predict the network response to target inhibition [108]. PDEs can be used to model the concentration change of each component as a function of both time and space [103]. Such a technique is important in determining the effective concentration of a drug when it reaches its targets and off-targets in an essentially nonlinear, nonequilibrium cellular and microenvironment. Stochasticity is one of the fundamental properties of cellular processes [102]. Moreover, in vivo drug action involves a series of stochastic . Hierarchical cause-effect semantic modeling to understand and predict drug action across temporal and spatial scales by using diverse techniques and integrating multiple sequencing, molecular, and omics data. Arrowed edges represent cause-effect relationships between biological entities: genetic variation, ligand (allosteric or orthosteric), drug target, conformational state of the drug target, biological pathway, molecular phenotypes from multiple omics data, integrated biological network, and organism phenotype (e.g., disease). The thickness of the arrow indicates the degree of probability. And the + and 2 signs represent positive (or activated) and negative (or inhibited) regulation, respectively. For example, an allosteric ligand may interact with target 1 to induce its active conformation that positively regulates pathway 1. The positively regulated pathway 1 can be derived from an observed molecular phenotype 1 (e.g., gene expression profile). A context-specific biological network can be inferred by integrating multiple molecular phenotypes and be used to understand and predict an organismal phenotype. doi:10.1371/journal.pcbi.1003554.g003 processes such as prodrug transport and efflux [109]. Thus, stochastic models will be important tools in modeling drug action [110].
However, dynamic modeling is hampered by the lack of reliable kinetic parameters. In many cases, the kinetic parameters for enzyme reactions can be estimated from protein structures [111]. The computational techniques required are dependent on the reaction mechanism. Quantum Mechanics (QM) or Quantum Mechanics/Molecular Mechanics (QM/MM) is needed if bondbreaking is the rate-limiting step. Whereas, Brownian dynamics is more efficient if the reaction is fast and the diffusion rate is the determinant. Nevertheless, these computational techniques are time-consuming and cannot be easily extended to a genome scale.
It has been proposed that the electrostatic potential in the active site determines not only the stability of transition states [112] but also the diffusion association rate [113]. Thus, it is proposed that the comparison of the Molecular Interaction Field (MIF), including the electrostatic potentials and other physical characteristics of structurally similar proteins, may assist in the estimation of the kinetic parameters [114]. A similar strategy has been applied to predict the association/disassociation rate of protein-protein interactions [115][116][117], which are essential for the dynamic modeling of signal transduction pathways. Furthermore, it is possible to extend the scope of MIF to a structural proteome scale for structurally unrelated proteins by analyzing and comparing the evolutionary, dynamical, physiochemical, geometric, and Novel drug off-targets could be identified by using the drug-target interaction models from chemical genomics analysis, and followed by searching for entire human or pathogen structural genome. In addition to sequence and global structural comparison, ligand binding site comparison is a valuable method, as it can identify binding promiscuity across fold space [119][120][121][231][232][233][234][235][236][237][238]. After putative off-targets have been identified from structural genomics analysis, sophisticated molecular modeling techniques such as protein-ligand docking and Molecular Dynamics (MD) simulation can be applied to determine high-resolution interaction models and their binding affinity and conformational space. To correlate drug-target interactions with their physiological response, the conformational state of the drug--target complex can be mapped to biological pathways, integrated networks, and physiological models. Several examples are shown in the figure. Semantic-based modeling is able to establish cause-effect from drug to target to pathways and, ultimately, to clinical outcomes [98]. Biological pathway analysis will provide the mechanistic understanding of information flow caused by drug modulation [44]. Critical components and interactions involved in drug modulation can be identified through integrated protein-protein interaction (PPI) network analysis [212]. Here, blue and green nodes represent drug targets and genes with observable changes, respectively. The target inhibition or activation along with genetic perturbations can be simulated using reconstructed physiological models [71]. In turn, the information from pathway and network analysis can be used to verify or falsify the drug-target interaction models and to constrain their conformational space. doi:10.1371/journal.pcbi.1003554.g004 transition-state binding properties of the binding interfaces [118][119][120][121][122]. The use of estimated kinetics parameters is particularly promising when coupled with a kinetic hybrid model [123]. In these models, detailed rate equations are only used to describe essential enzymes. Simplified and approximate rate equations are applied to the majority of enzymes. A recent study has shown that 37% of enzymes in Escherichia coli are promiscuous and evolutionarily retained and catalyze 65% of known metabolic reactions [124]. With higher catalytic capability, specific enzymes tend to be essential and more frequently coupled with gene regulation than promiscuous enzymes. This finding provides additional support for the structure-based hybrid model. It is noted that the kinetic parameters are often experimentally measured under threedimensional conditions, which do not reflect the two-dimensional dynamic processes occurring when a drug binds to a receptor. Wu et al. presented a theoretical multiscale simulation approach that converts three-dimensional affinities to two dimensions, accounting directly for the structure and dynamics of the membranebound molecules [125]. In summary, multiscale mathematical modeling and network-based association studies in systems pharmacology will benefit from the information derived from macromolecular structures in terms of the identification of reliable network connectivity as well as the enrichment of individual-and context-specific kinetic parameters.

Pharmacodynamics in the Era of Structural Systems Pharmacology
A major focus of pharmacodynamics is to quantitatively understand drug-target interactions and their effects on the whole organism. It is clear that drug action cannot be fully understood by a conventional one-drug-one-target paradigm. A systematic view of all proteome drug-target interactions is often necessary [16]. A drug molecule may act more than to inhibit or activate a target in a binary manner. Recent studies in biased agonism (or biased signaling, functional selectivity) [126] and partial agonism add a new dimension to understanding pharmacodynamics. For example, it has been recognized that a G-protein coupled receptor (GPCRs) pleiotropically regulates multiple signaling pathways. An endogenous or designed agonist for a GPCR may selectively activate one of its regulated pathways, leading to therapeutic efficacy or, alternatively, to unwanted side effects. At the molecular level, biased agonism originates from the selection of specific conformational states of the target protein, which are dynamically coupled with ligand binding. Fundamentally, the molecular mechanism of biased agonism may be similar to that of partial agonism observed in nuclear receptors (NRs). The transcriptional activity modulated by NR agonists is not dependent on the binding affinity but rather the ensemble of both protein conformations and ligand orientations [127][128]. Moreover, allosteric interaction can shift the conformational ensembles, thereby modulating the activity of agonist binding [36].
These findings provide both new opportunities and impose challenges in linking in vitro drug binding with associated in vivo activity. In addition to identifying proteome-wide drug binding promiscuity and specificity, it is necessary to sample the conformational ensemble associated with these drug-target interactions and to link their conformational state with biological pathways. Although a number of state-of-the-art computational techniques (e.g., those reviewed in [16] and Figure 4) are able to predict drug binding cross-reactivity, few of them provide a highresolution landscape for the complete conformational space of drug-target interactions. State-of-the-art methods accounting for conformational flexibility are not capable of mapping the conformational ensembles to the signaling pathways that they modulate [129]. New concepts and techniques are needed to include the influence of protein dynamics on functional activity of drug binding in the context of biological networks.
At the molecular level, conventional single-target, single-state virtual screening and quantitative structure-activity relationships (QSAR) should be extended to a multitarget, multiconformation model. A wide array of experimental techniques, such as fluorescence spectroscopy [130], plasmon waveguide resonance spectroscopy [131], bioluminescence resonance energy transfer [132], circular dichroism [133], X-ray crystallography [134], sitedirected mutagenesis [135], and 19F-NMR spectroscopy [136], have been developed to determine active conformations associated with biased and partial agonism. These data provide a foundation for developing multistate models of pharmacodynamics. Structurebased molecular modeling may play a critical role in understanding the biased signaling. For example, dynamic protein-ligand homology modeling coupled with site-directed mutagenesis data is used to determine the dimerization and activation models for GPCRs [137][138].
In the context of personalized medicine, the functional selectivity of drug binding can be modulated by mutations in the protein target, which directly impact orthosteric or allosteric interactions. Thus, it is important to assess the importance of the mutation on the energetics and dynamics of proteome-scale drug-target interactions. Protein structure may provide critical insights into how the mutation alters the drug response. The genetic predisposition to adverse drug reactions (ADR) can be rationalized through the atomic details of the interaction between the drug and its potential off-target using structural modeling. For example, Li et al. have discovered that an R41Q mutation in human cytosolic sialidase (HsNEU2), which is predisposed in a small portion of the Asian population, links ADRs to oseltamivir (Tamiflu) [139]. In addition to mutations that directly involve drug binding, the mutations that disrupt allosteric interactions are some of the major determinants of disease but have been little studied [140]. Several structure-based techniques have been developed to predict the effect of mutations on allosteric regulation. They include correlated mutation analysis [141][142][143][144][145][146], the detection of pairwise dynamic [147][148] or energetic coupling [149] between residues, and analysis of the global topology of protein structures [150][151]. These methods may eventually contribute to the design of allosteric drugs [36,152].
Dynamic simulation of drug unbinding kinetics is another area that may significantly impact personalized medicine. Drug-target interactions in vivo are different from those in vitro. In target or cellbased assays, the concentrations of both drug and target are fixed and the binding affinity is measured by thermodynamic equilibrium constants such as IC 50 values, which reflect binding potency. However, in a living organism, the concentration of the drug, the target, and the other molecules constantly changes with time, rarely reaching equilibrium. Thus, the drug binding affinity is not an appropriate indicator of drug potency in vivo [153]. An increasing body of evidence suggests that drug efficacy correlates more strongly with drug-target residence time than with binding affinity [154][155][156][157]. Long residence time can lead to sustained pharmacological effect and may also alleviate off-target toxicity. The residence time of a drug on its target can be greatly influenced by conformational adaptation [158]. Recent studies suggest that the in vivo duration of drug efficacy not only depends on macroscopic pharmacokinetic properties like plasma half-life and the time needed to equilibrate between the plasma and the effect compartments, but is also influenced by long-lasting target binding and rebinding [159].
Experimental approaches to studying drug binding and unbinding to proteins have limitations in temporal and spatial resolution. It was reported recently that a computational network analysis combined with explicit water MD simulations of the unbinding of small inhibitors from the enzyme FK506 Binding Protein (FKBP) provided a clear picture of the free energy landscape (both thermodynamics and kinetics) of ligand dissociation [160]. The dissociation kinetics were characterized as a simple (i.e., single-exponential) time dependence with multiple dissociation pathways. A computational methodology using trajectory data from multiple Brownian dynamics simulations of ligand diffusion has been developed for characterizing the kinetics of drug-receptor interactions in terms of the encounter complex [161]. A computational approach named metadynamics has been used both for reconstructing the free energy and for accelerating rare events in systems described by complex Hamiltonians, at the classical or at the quantum level [162]. All-atom metadynamics simulations of a peptide substrate interacting with wild-type HIV-1 protease in explicit solvent rendered accurate calculations of binding affinity and kinetics constant compared to the experimental data [163].
Ultimately, drug-target interactions and genetic events should be studied in the context of biological networks. Existing biological network analysis is conformationally stateless. Thus, these networks are not sufficient to model the influence of protein dynamics on the drug response phenotype. The integrative modeling depicted in Figures 2 and 3 provides a possible solution to incorporating conformational dynamics into network modeling. It is worth mentioning that this integrative modeling framework could also be a powerful tool in studying the pharmacodynamics of drug-drug interactions. Due to the robust nature of biological systems, drug combinations are often necessary and proven to be successful in treating complex diseases and combating drug resistance [164][165]. However, the Adverse Drug Reaction (ADR) resulting from drug-drug interaction is a serious problem in developing combination therapies. In addition to the pharmacokinetics (see next section), the pharmacodynamics of the drugdrug interaction may play a critical role in the ADR [166]. Existing state-of-the-art methods for the prediction of drug-drug interactions are data-driven, which mainly establish statistical association from empirical observations but provide little information on the mechanism of drug-drug interaction. Thus, they may not be sufficient for the predictive modeling of drug-drug interactions during the early stages of drug development [167][168]. Using the proposed integrative modeling framework, it may be possible to predict drug-drug interactions de novo.

Pharmacokinetics in the Era of Structural Systems Pharmacology
The Absorption, Distribution, Metabolism, and Excretion (ADME) properties of a drug, i.e., absorption and distribution to its target(s), detoxification by metabolism and excretion of the drug from the human body, are the primary concerns of pharmacokinetics. At the molecular level, ADME properties are strongly influenced by the abundance and activity of transporters and metabolizing enzymes such as CYP450, UDP-glucuronosyltransferases, sulfotransferases, N-acetyltransferases, glutathione S-transferases, and methyltransferases. Much effort has gone into developing computational methods for in silico prediction of ADME properties. These methods initially only addressed the small drug molecule. Based on chemical structures, quantitative structure-activity relationship (QSAR)-based approaches have been extensively used to correlate the physiochemical properties of lead molecules with their ADME profiles [169][170]. Shifting from the ligand to the receptor, structure-based methods have been developed which leverage the ever-increasing number of 3-D structures of ADME related proteins. Similarity searches and traditional pharmacophore approaches are enhanced by more advanced molecular descriptors and 3-D pharmacophores that encode the details of ligand-target binding [171]. Dynamic properties of ligand-target binding could be incorporated into the pharmacophore with conformational sampling techniques. Protein-ligand docking based on virtual screening for millions of compounds can now be accomplished with ease [172]. However, applying molecular docking to ADME related proteins is complicated by the existence of large and flexible binding cavities in CYP450 and phase II metabolizing enzymes, which can accommodate more than one ligand [172]. Consequently, the correlation between the docking score obtained for the best poses with experimentally determined binding free energies is usually poor. Nevertheless, in a recent study, Schlessinger et al. used a homology model of a norepinephrine transporter and molecular docking to successfully predict the prescription drugs which specifically bind to it [173]. With the availability of data from chemical genomics and high-throughput screening [28,174], the combination of multiple flexible docking tools with chemoinformatics may boost the performance of structure-based virtual screening [175]. Although more accurate in deriving binding free energy, a more rigorous thermodynamic approach is unfortunately more computationally demanding and not applicable to largescale virtual screening. Besides molecular docking based virtual screening or molecular dynamics simulation, quantum mechanical and hybrid quantum mechanical/molecular mechanical (QM/ MM) methods have emerged as powerful tools for modeling reaction rates of drug metabolism. The whole reaction profile for benzene hydroxylation by CYP2C9 was studied with such a hybrid approach; a combined docking-MD-QM calculation was used to simulate the activation energy of CYP3A4 [176]. The challenge is to extend this technique to a structural proteome scale, as discussed in the previous sections.
So far, pharmacogenomics prediction of ADME properties has mainly focused on the genotypic variations and polymorphisms in metabolizing enzymes-the overall contribution of pharmacogenomics to personalized medicine remains limited [177]. The pharmacokinetics of a drug is the interplay between the inherent physiochemical properties of the drug and its physiological environment. The expression of metabolizing enzymes and transporters is highly regulated by multiple factors, including genetic polymorphisms, xenobiotics induction, cytokines, hormones, and pathophysiological states, as well as gender and age of families [178]. Multi-allelic genetic polymorphisms depend significantly on ethnicity and imply disparate clinical phenotypes including ADR, drug efficacy, drug resistance, and dose requirement. A mechanistic understanding of these regulators is essential for the predictive modeling of pharmacokinetics. It is well established that CYP genes are directly regulated by nuclear receptors [179]. Multiple genes, such as p53, AP-1, Ras, and APC, are involved in the regulation of multiple drug-resistance transporters (ABC transporters) [180]. If a drug itself, or another drug, interacts with these genes, undesirable pharmacokinetics profiles or drug-drug interactions may arise. Thus, the pharmacokinetics regulatory genes are potential drug off-targets that affect the fate of drug molecules in vivo. We call them ''pharmacokinetics off-targets'' to distinguish them from those related to pharmacodynamics. The fact that the activity of direct pharmacokinetics regulatory genes can be modified by their upstream genes adds a new layer of complexity to the problem. Therefore, to fully understand the molecular mechanisms underlying the ADME properties of a drug, it is necessary to identify the pharmacokinetics off-targets as well as the regulatory network of pharmacokinetics genes on a proteome scale. Figure 5 shows a regulatory pathway of CYP3A, whose substrates include several hundred drugs [181]. LCMT1 is a methyltransferase that methylates the PP2A catalytic subunit and promotes its functional association with the PP2A regulatory subunits. PP2A is a major protein phosphatase that dephosphorylates PRMT1, thus inhibiting PRMT1 enzymatic activity. PRMT1 is essential for the PXR transcription process. PXR dimerises with RXR to induce the gene expression of CYP3A. It is clear that genetic variations or drug perturbations on any one of the genes along this pathway may affect the abundance and activity of CYP3A, thereby leading to a change in the ADME properties. Using a structure-based offtarget identification pipeline, LCMT1 has been identified as the off-target of several antibiotics [182], highlighting the potential power of proteome-scale structural modeling in predicting novel pharmacokinetics profiles and drug-drug interactions.
Noncoding DNA may play important roles in the regulation of transporters and metabolizing enzymes. For example, the CYP family includes 58 pseudogenes that do not encode functional protein [183]. An increasing body of evidence suggests that pseudogenes have diverse functions that influence not only their parent genes but also apparently unrelated genes [184]. For example, one of the CYP450 genes, CYP2A6, has a pseudogene, CYP2A7. CYP2A7 may transfer a fragment of DNA to its parent gene CYP2A6, leading to a change in its sequence. It is observed that individuals who smoke have a mutated gene CYP2A6*1B that is converted from a CYP2A7 polymorphism. CYP2A6*1B stabilizes its mRNA, thereby increasing first its expression level, then its activity in metabolizing nicotine [185]. The functional roles of most pseudogenes still remain elusive. Structural systems biology, as discussed in the next section, may shed new light on the functional relevance of noncoding DNA in pharmacokinetics.
Drug metabolism is strongly dependent on the physiological state (e.g., obesity and diabetes) and environment (e.g., gutmicrobiome). The prediction of inter-individual pharmacokinetics variation requires the coupling of pharmacogenomics and pharmacometabonomics [186]. In principle, mechanism-based modeling of drug interactions with transporters and metabolizing enzymes can be integrated with pharmacogenomics, pharmacometabonomics, and other omics data using the integrative modeling framework proposed in the previous section. Physiolog-ically based pharmacokinetics modeling has been developed in the past four decades. Although it is resource consuming in obtaining input parameters, progress made in in silico technologies has greatly facilitated the prediction of oral absorption and hepatic metabolism as well as mechanistic models of tissue distribution based on pharmacokinetics models. The bottleneck for physiologically based pharmacokinetics modeling resides in the limited data available. As discussed in the previous section, dynamic simulations involving comprehensive metabolic pathways, signaling pathways, and cell physiology are being studied together with multiscale modeling at the cellular, organ, and whole-body levels [187]. The sparse number of available kinetic parameters calls for more structure-based modeling efforts in order to enable further multiscale systems pharmacology analysis.

Pharmacogenomics and GWAS in the Era of Structural Systems Pharmacology
Recent advances in pharmacogenetics and pharmacogenomics have identified genetic variants in several hundreds genes, notably drug metabolizing enzymes (e.g., CYP450), transporters, and drug targets [188]. The knowledge derived from such data has already resulted in individualized therapy. For example, the appropriate initial dose of the anticoagulation drug warfarin can be estimated using a pharmacogenomics algorithm [189]. Similarly, certain mutations can be used to predict alternative responsiveness to drugs in cancer therapy [190][191][192][193][194]. Acknowledgment of the emerging role of pharmacogenomics can be found in the labels of the Food and Drug Administration (FDA)-approved drugs montelukast [195] and cetuximab [196] and in FDA-approved diagnostic tests, for example, the microarray-based Roche AmpliChip for cytochrome P450 polymorphisms and the Invader UGT1A1 Molecular Assay for detecting polymorphisms that increase the risk of neutropenia when using the colon cancer drug irinotecan [15,197].
In spite of these advances, much more is needed for predictive modeling of individual drug responses. It is critical to understand the impact of genetic variation beyond direct drug interactions with on-targets, off-targets, metabolizing enzymes, and transporters. Many nontarget-associated genetic factors may affect the drug response phenotype. As shown in Figure 1, if a critical node, edge, or feedback loop is modified in a drug modulation pathway, the drug efficacy or side effect profile can change accordingly. One example is the mutation of K-RAS located downstream of the EGFR pathway, which causes resistance to the anticancer activity of EGFR inhibitors [198][199]. In another situation, genes aside from the drug target may regulate the same biological pathway where the system-level drug response is the result of their combinatorial control. The mutation or expression changes of such genes may enhance or reduce the drug response. The gain-or loss-of-function of mutations that are associated with drug action may come from genetic variations in both coding and noncoding regions. Recently, the 1,000 Genomes Project has identified 38 million single nucleotide polymorphisms (SNPs), 1.4 million short insertions and deletions, and more than 14,000 larger deletions from 1,092 individuals belonging to 14 ethnic groups. The individual-specific, rare, coding variants are located across a broad array of biological pathways. Moreover, there are hundreds of functionally annotated, rare, noncoding variants for each individual [200]. It is expected that these variants will alter the pharmacokinetics, pharmacodynamics, and responsiveness to drug therapy [15]. The extremely large, multidimensional datasets from these studies present an exciting opportunity to expand the horizon of pharmacogenomics by identifying causal variants and genes, and predicting pathways likely to be involved in drug response. The challenge is how to associate these data with a predicted drug response a priori.
Many disease-associated variants and drug-response cryptic genetic factors remain uncharacterized; hence, new methods are urgently needed to annotate DNA variants, their functional roles, and their associations with drug actions. How to annotate the functional roles of DNA variants, especially for noncoding variants, is a challenge because of the diversity of noncoding functions, the incomplete annotation of regulatory elements, and unknown mechanisms of regulatory control. Several large-scale studies have been performed to annotate the noncoding genome and regulatory elements. These studies integrate the highthroughput functional genomics and comparative genomics datasets [3][4][5] to map the functional noncoding elements on a genome-wide scale. Variants on these elements can result in the different regulation of their target genes. For example, studies from the Encyclopedia of DNA elements (ENCODE) [201] and model organism Encyclopedia of DNA elements (modENCODE) [202][203][204] projects provide comprehensive maps of transcription binding for select cell lines and DNase maps for many primary cells and highlight the importance of noncoding DNAs in the regulation of complex phenotypes. With known functional elements and motifs, methods have been developed to predict the effect of newly observed rare and private mutations by integrating models of sequence motifs, chromatin states, and expression patterns in model organisms and in cultured human cells [205][206][207][208]. For example, quantitative sequence-activity models (QSAMs) [209] are trained based on these data in a massively parallel reporter assay (MPRA) developed to enable systematic dissection and optimization of transcriptional regulatory elements [208].
Since annotated pharmacogenomics biomarkers are incomplete and biased, the conventional ''guilt-by-association'' approach may not be sufficient to identify novel drug-response genetic markers [43]. The power of statistical analysis of pharmacogenomics and GWAS data is often limited by the moderate effect size of samples. As a result, a number of rare variants could be missed. Fundamentally, the genotype-phenotype relations established by statistical or machine learning approaches may be merely mathematical correlations. Macromolecular structures and their interactions may provide critical mechanistic insight into the functional roles of DNA variants and their impact on drug action. The modeling and analysis of macromolecular structures has already made significant contributions to our understanding of how mutations affect the stability, folding, and binding of macromolecules [210][211]. Several recent structure-based studies have provided high-resolution pictures for how variants rewire biological networks through allosteric regulation, protein-protein interaction (PPI), and protein-nucleic acid interaction (PNI) [59,140,[212][213]. Kowarsch et al. showed that co-evolving residues can influence each other through allosteric regulation and are significantly more likely to be disease-associated than expected by chance [140]. By mapping the mutations on the structures, Wang et al. found that in-frame mutations are enriched on the interaction interfaces of proteins associated with the corresponding diseases and that the disease specificity for different mutations can be explained by their location within an interface [59]. Similar findings were observed in David et al.'s work [213] by combining a database of the 3-D structures of human protein/ protein complexes [214] and the humsavar database of nonsynonymous Single Nucleotide Polymorphisms (nsSNPs) [215].
As shown in Figure 1, proteome-wide drug-target interactions and genome-wide genetic variants may collectively affect the functional state of complex biological networks that mediate the system-level response to the drug, leading to both therapeutic and adverse effects. Few computational tools are able to model the collective effects of drug perturbation and genetic variants in the context of the whole human genome and biological network, which is essential for the development of personalized medicines. Pathway analysis such as gene-set enrichment analysis (GSEA) [216] can be applied to pharmacogenomics data to concentrate the genetic perturbation along the annotated biological pathways. Another way to use prior knowledge of gene interrelationships is to incorporate the information into the association study itself through Bayesian techniques [217] or by using boosting to prioritize disease networks [218]. By integrating the atomic details of molecular interactions, co-evolution information, protein-protein interaction networks, transcriptional profiles, and pathway enrichment analysis, Xie et al. developed a structural systems biology approach to identifying the functional role of DNA variants and causal mutations with an extremely small sample size [212]. Through this approach, the driver mutations that confer hypoxia tolerance in Drosophila melanogaster were identified. Furthermore, the functional roles of several nsSNPs, which are predominantly involved in allosteric regulation, protein-protein interaction, and protein-nucleic acid interaction, were determined [212]. The power of variationmediated pathway analysis can be further enhanced by incorporating other regulatory and signaling network components, such as microRNA-target interactions, protein-nucleic acid interactions, and phosphorylation events, etc., and taking advantage of advanced graph mining algorithms [219][220].
The systematic perturbation of sequence variants can be introduced into the dynamics simulation of pathways and genome-scale modeling of biological networks through macromolecular structures. Recently, Cheng et al. developed a computational framework to integrate missense mutation, protein structural modeling, and ODE [221]. They introduced the Systemic Impact Factor (SIF) as a measurement of phenotype changes resulting from the mutation. SIF is a function of the free energy change caused by the mutation and systemic control coefficient. The free energy change in a protein directly leads to the change in its kinetic parameters. The control coefficient quantifies the sensitivity of the phenotype readout to the change in kinetic parameters. They tested their models on two cases: G2-M transition control in yeast and the human mitogen-activated protein kinase (MAPK) pathway. The SIF from the simulation is well correlated with the experimental results. The missense mutations in this study are mainly associated with protein stability. In the future, it will be interesting to quantify the systematic impact of broad types of mutations that modify allosteric regulation, molecular interaction, and gene expression. Chang et al. have integrated a reconstructed kidney model with structural bioinformatics and molecular modeling to predict the side effect profile of cholesterylester transfer protein (CETP) inhibitors and to identify genetic risk factors that cause adverse drug reactions [71]. An interesting finding from this study is the synthetic effect of drug-target interactions and nontarget-associated genetic modifications. Serious side effects are caused by the combination of the drug treatment and the genetic alteration but not by either alone. It is anticipated that the identification of nontarget genetic factors that affect drug action will have a significant impact on personalized medicine.

Conclusion
The holistic, adaptive, and evolving nature of biological systems makes the quest for a simple and elegant mechanistic model to explain and predict biological phenomena less than fruitful. At the same time, the emergence of big data should significantly shift our approach to biomedical research. Given these data and increased computer power, useful predictive statistical models are possible. The question then becomes, how can we discover new knowledge from these statistical models under conditions that are constantly changing?
A complex disease state is not static under drug treatment, but evolves to new states in adapting to the drug-induced environment. When enough data are collected to describe one disease state and a successful model can be built, the disease may already be different from the one used to build the model. In such a temporal situation, a data-driven model is essentially retrospective but not prospective. Another major question is how much data are enough to build an accurate predictive model given the genetic, epigenetic, and clinical heterogeneity of complex diseases? Will the model still work if the individual has an unobserved new mutation? Moreover, genetic and/or epigenetic events and drug actions are rooted in the fundamental principle of physics and chemistry. Indifference to the detailed physical and chemical nature of biological processes in the modeling of big biological data could eventually hinder scientific advances in biomedicine. Discovery of new knowledge requires more than just a query of a big reference table built from data. Macromolecular structure plays an irreplaceable role in linking the physical and chemical origins of genetic events and drug action to the systematic response at the cellular, tissue, and organism levels. Thus, the incorporation of physiochemical-based macromolecular structure modeling with data-driven and mathematical-based pharmacodynamics, pharmacokinetics, pharmacogenomics, and systems pharmacology will not only enhance the power of modeling a predictive personalized drug response but will also shed new light on our understanding of living systems in a broad sense.
One of the barriers in applying macromolecular structure to pharmacogenomics and systems pharmacology is that the structural coverage of macromolecules and their complexes has been limited. Recent progress in both experimental and computational techniques has dramatically improved the structural coverage of the human genome. In addition to continuous efforts in structural genomics [222], breakthroughs in the crystallography of membrane proteins make representative 3-D structures of pharmaceutically important proteins such as G-protein coupled receptors (GPCR) [223] and transporters available [224][225][226]. Moreover, complexes of protein-ligand, protein-protein, and protein-nucleic acid structures, as found in the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB), are increasing rapidly in both number and complexity [227]. Using the increased number of structural templates, and given the availability of high-performance computing, structure prediction has become routine for data from next generation sequencing. Even without structural templates, novel protein folds can be predicted with increasing accuracy [228]. Using co-evolutionary information from amino acid residues, high-quality structural models can now be built for proteins that are not amenable to conventional homology modeling or other methods of ab initio prediction [229]. Integrative modeling using different experimental techniques has contributed significantly in constructing macromolecular ensembles of unprecedented complexity [230]. Similarly, novel computational techniques have been developed to predict proteome-scale, 3-D interaction models of protein-protein interactions from medium to high resolution [50][51][52]58]. Collectively, these advances provide new opportunities to use macromolecular structures in pharmacogenomics and systems pharmacology.