Conceived and designed the experiments: JK SK. Performed the experiments: JK IK J-SY SP. Analyzed the data: JK IK J-SY Y-ES JH SP YSC SK. Wrote the paper: JK IK SK.
The authors have declared that no competing interests exist.
PDZ domain-mediated interactions have greatly expanded during metazoan evolution, becoming important for controlling signal flow via the assembly of multiple signaling components. The evolutionary history of PDZ domain-mediated interactions has never been explored at the molecular level. It is of great interest to understand how PDZ domain-ligand interactions emerged and how they became rewired during evolution. Here, we constructed the first human PDZ domain-ligand interaction network (PDZNet) together with binding motif sequences and interaction strengths of ligands. PDZNet includes 1,213 interactions between 97 human PDZ proteins and 591 ligands that connect most PDZ protein-mediated interactions (98%) in a large single network via shared ligands. We examined the rewiring of PDZ domain-ligand interactions throughout eukaryotic evolution by tracing changes in the C-terminal binding motif sequences of the PDZ ligands. We found that interaction rewiring by sequence mutation frequently occurred throughout evolution, largely contributing to the growth of PDZNet. The rewiring of PDZ domain-ligand interactions provided an effective means of functional innovation in nervous system development. Our findings provide empirical evidence for a network evolution model that highlights the rewiring of interactions as a mechanism for the development of new protein functions. PDZNet will be a valuable resource to further characterize the organization of the PDZ domain-mediated signaling proteome.
Rewiring of interactions is a powerful tool for the evolution of organism complexity. Rewiring among preexisting proteins provides a simple mechanism for the development of new signaling circuits by redirecting information flows without a gain or loss of genes. In particular, interactions mediated by short linear motifs can be easily changed by mutations during evolution, resulting in a rewiring of interactions. However, how the rewiring of linear motif interactions facilitates the emergence of new protein functions during evolution is poorly understood. Here, we systematically investigated the rewiring of interactions mediated by PDZ domains, which are among the most commonly found peptide recognition modules. We found that PDZ domain-ligand interactions are frequently rewired by C-terminal sequence mutations in PDZ ligands during evolution. Notably, rewiring of PDZ domain-ligand interactions was involved in the development of neuronal functions and occurred concurrently with the emergence of vertebrates, suggesting that reorganization of signaling pathways by rewiring PDZ domain-ligand interactions significantly contributed to the evolution of nervous systems in vertebrates. Our findings highlight the rewiring of interactions as an effective means for functional innovation, providing new insight into eukaryotic evolution, which has not been fully explained by the expansion of protein families alone.
PDZ domains are linear motif-mediated protein-protein interaction modules. PDZ domain-ligand interactions have been greatly expanded in metazoans and are widely used to assemble signaling complexes, including those found in neuronal synapses
Systematic analysis of interaction rewiring will provide new insights into eukaryotic evolution, which is not fully explained by the expansion of protein families alone. Recently, it was suggested that rewiring of interactions is an important mechanism for the evolution of biological systems. Network comparison studies showed that protein interactions frequently change after gene duplication
Structural information of interacting cellular components (i.e., structural interactome) would provide a more complete picture of a cell and help elucidate the evolutionary principle of the protein interaction network
In this work, we attempted the first systematic investigation of interaction rewiring in the PDZ domain-ligand interaction network and its role in eukaryotic evolution. We constructed a comprehensive human PDZ domain-ligand interaction network and traced the changes in interaction rewiring during evolution. We developed position weight matrices (PWMs) of human PDZ domains from the experimental data of PDZ domain-ligand interactions. The binding motif information of PDZNet helped to elucidate the changes in PDZ domain-ligand interactions. We found that PDZ domain-ligand interactions are frequently rewired throughout evolution via mutations of C-terminal PDZ ligand sequences. Particularly, interaction rewiring occurred concurrently with emergence of vertebrates whose rewired interactions were largely involved in neuronal signaling, suggesting that nervous system evolution might be achieved by the interaction rewiring of signaling components, such as PDZ protein-ligand interactions. Furthermore, the broad specificity of PDZ domains contributes to interaction rewiring by increasing the chance of acquiring PDZ binding motifs by sequence mutations. Our findings will prompt a new approach for the study of eukaryotic evolution by considering the rewiring of interactions as a major evolutionary process of domain-ligand interactions.
To elucidate how PDZ domain-ligand interactions have evolved, an accurate and detailed understanding of their interactions is essential. Furthermore, a network approach is useful to understand how evolution of PDZ domain-ligand interactions contributed to eukaryotic evolution, because protein functions may not be encoded in an individual protein but rather be encoded in the relationships between proteins in a protein-protein interaction network
(A) Building position weight matrices (PWMs) of human PDZ domains. Experimental data of the PDZ domain and peptide interactions were used to generate PWMs of PDZ domains. (B) Construction of the PDZ domain-ligand interaction network. Human protein interactions were collected by integrating existing PPI databases. (C) Integration of binding strengths into PDZNet.
We developed a quantitative model of PDZ domain binding strengths from the experimental data of PDZ domain-ligand interactions, including interactions between 81 PDZ domains and 217 peptides from a protein array
We found that the binding scores of the PWMs closely reflect the experimental affinities of PDZ domain-ligand interactions.
(A and C) PWMs of the PDZ domains of SNA1 (human) and ERBIN (human). Black bars represent the affinity contribution of the corresponding amino acids to the binding score. Clusters of amino acids with no preference are labeled “others.” (B and D) Scatter plots showing the correlation between binding score and binding affinity.
(A) The PWM of the PSD-95_1 domain. (B) Known interacting partners of the PSD-95_1 domain from three species are shown. (C) Fractions of known PDZ domain-ligand interactions are shown by percentile rank of binding score.
To construct PDZNet with high-confidence interactions, we prioritized the experimentally validated PDZ protein-ligand interactions from the prediction results of the PWMs. It is a challenge to correlate the occurrence of amino acids in a linear motif to the binding specificity of peptide-binding domains
PDZNet is composed of 97 PDZ proteins and 591 partners with 1,213 interactions.
(A) Network representation of PDZNet. Orange and blue circles correspond to PDZ proteins and ligands, respectively. The size of each node is proportional to the number of interacting partners. (B) Construction of the PDZ Protein Network (PPN) and the PDZ Ligand Network (PLN).
We discovered that an interface similarity exists among PDZ domains that share the same ligands. In the PPN, PDZ protein pairs connected by the same ligands tend to have similar pocket residues.
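As an illustration of how a PPN can be derived from PDZ protein-ligand pairs, the following minimal sketch links two PDZ proteins whenever they share at least one ligand. The function name `project_shared_ligands` and the toy identifiers are hypothetical and are not taken from PDZNet; the sketch assumes the network is available as a simple list of (PDZ protein, ligand) pairs.

```python
from collections import defaultdict
from itertools import combinations


def project_shared_ligands(interactions):
    """Project (pdz_protein, ligand) pairs onto a PDZ protein network.

    Returns a dict mapping frozenset({pdz_a, pdz_b}) to the set of
    ligands that both PDZ proteins bind.
    """
    # Group PDZ proteins by the ligand they bind.
    ligand_to_pdz = defaultdict(set)
    for pdz, ligand in interactions:
        ligand_to_pdz[ligand].add(pdz)

    # Every pair of PDZ proteins binding the same ligand gets an edge.
    edges = defaultdict(set)
    for ligand, pdz_set in ligand_to_pdz.items():
        for a, b in combinations(sorted(pdz_set), 2):
            edges[frozenset((a, b))].add(ligand)
    return dict(edges)


# Toy data for illustration only (hypothetical identifiers).
toy_interactions = [
    ("PDZ_A", "lig1"),
    ("PDZ_B", "lig1"),
    ("PDZ_B", "lig2"),
    ("PDZ_C", "lig2"),
    ("PDZ_C", "lig3"),
]
ppn = project_shared_ligands(toy_interactions)
```

Swapping the two elements of each pair before calling the function would give the analogous ligand-ligand projection (the PLN), in which two ligands are linked when they share a PDZ protein.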
We then asked how PDZ domains and ligands obtained multiple partners during evolution. Gene duplication and subsequent diversification events are considered major factors for network growth. Although gene duplication played a significant role in PDZ proteins and ligand evolution
We found that sequence mutations played an important role in the attachment of non-homologous ligands to PDZ domains. On an evolutionary time scale, the compendium of PDZ ligands expands via two processes: (1) the introduction of new PDZ ligands by gene duplication of existing partners, or (2) the
(A) Two evolutionary models describe the expansion of PDZ domain-ligand interactions. In the gene duplication model, a new PDZ domain-ligand interaction is added by duplication of an existing PDZ ligand. In the sequence mutation model, a new interaction is added by mutations of the C-terminal sequence of a non-PDZ ligand. (B) Paralog fractions of PDZ ligands that share the same PDZ proteins.
Next, we examined the sequence evolution of the binding motifs of human PDZ ligands and discovered that a large portion of PDZ ligands acquired their binding motifs via sequence mutations. We examined the C-terminal sequences of PDZ ligands in each PDZ domain-ligand interaction pair across 16 representative species. We found that nearly one-third of human PDZ ligands gained their PDZ domain interactions by C-terminal mutations during evolution.
Interaction rewiring is an effective evolutionary mechanism given that it reconfigures molecular systems without a gain or loss of genes
Invertebrates and vertebrates are colored green and yellow, respectively.
We also examined the types of biological processes that are significantly affected by the rewiring events of PDZ domain-ligand interactions. We found that the PDZ ligands that arose in invertebrates and gained their PDZ-binding motifs in vertebrates participated significantly in the process of neurological system development.
We found that metazoan-specific PDZ proteins adopted their ligands from proteins of premetazoan origin. The phylogenetic profile shows the origin of the PDZNet proteins.
(A) Phylogenetic profiles of EXOC4 and SAP102 are presented. ‘−’ indicates that no ortholog was found in the corresponding species. Four C-terminal residues of EXOC4 orthologs are placed on the right side of the protein. (B) The PWM of SAP102_3. Four C-terminal residues of vertebrate EXOC4 orthologs (ITTV) are presented in red. “Others” indicates amino acids that were not preferred in the binding pockets.
Next, we asked which physiological system was most affected by the mutations of PDZNet proteins. Mutations could affect the binding specificity of PDZ-ligand interactions via the replacement of interfacial residues or the destabilization of PDZ domain and ligand structure. If an interaction gained from the evolution of PDZNet had contributed to the development of a certain physiological system, an alteration of the interaction could be associated with genetic diseases caused by a malfunction of the system.
We investigated the disease associations of the PDZNet components and found that many PDZNet proteins are significantly associated with neurological diseases.
Disease classes with a
In this study, we describe the first PDZ protein-ligand interaction network coupled with quantitative binding strength. Our network approaches elucidated how PDZ domains have diversified their binding partners in the organization of various signaling complexes from receptors to downstream signaling relays. Moreover, we showed that
PDZNet provides information beyond just the state of interaction binding. First, PDZNet provides information regarding the binding interface. High-throughput experiments provided large-scale PPI information; however, the identification of which amino acids were used in the interactions has been difficult. The quantitative model of PDZ domain-ligand interactions provides sequence information on domains and linear motifs, enabling a deeper understanding of the mechanisms involved in their interactions. Second, PDZNet provides the binding strengths of the interactions. The quantitative binding strengths of PDZ domain-ligand interactions enable us to understand the competition among interaction partners for switching between signaling flows.
The multispecificity of PDZ domain-ligand interactions has unique advantages in the evolution of PDZ domain function in the cell signaling network. First, the multispecificity of PDZ domains contributes to the frequent rewiring of PDZ domain-ligand interactions and broadens the extent of recognizable sequences, thus increasing the chance that a protein gains a suitable sequence to interact with its partners. Indeed, we found that PDZ domain pockets prefer multiple amino acids for interactions. We analyzed amino acid preference patterns from the PWMs of human PDZ domains.
We found that almost one-third of human PDZ ligands obtained their PDZ-binding motifs via C-terminal sequence mutations, providing evolutionary advantages to the PDZ domain-mediated interactions. First, the formation of linear motifs is an efficient mechanism to increase the number of interactions. Emergence of short linear motifs rarely disrupts the protein structure and can be accompanied by few amino acid changes
We were also interested in whether new PDZ domain interaction sites were acquired via C-terminal point mutations or DNA insertions. After careful observation of DNA modifications in newly acquired PDZ ligands, we found instances of both. For example, protein PBK of
We found that the rewiring of PDZ domain-ligand interactions most frequently occurred between invertebrates and vertebrates. This massive rewiring may be connected to repeated rounds of whole-genome duplication in ancestral vertebrates. According to Ohno's model
We found that the components of PDZNet are largely associated with neurological diseases. We then asked whether we could identify mutations that affect PDZ-ligand binding and thereby cause genetic diseases. The disruption of the PDZ domain interaction between PICK1 and GluR7 is known to cause seizures, a chronic neurological disease
An important issue of the present biological network study is its incompleteness
Due to the incompleteness of the interactome networks, expansion of network coverage is of significant value. PDZ domain-ligand interactions are relatively difficult to detect using current experimental techniques because transient interactions are often lost during experimental washing steps. Furthermore, a PDZ domain-ligand interaction often depends on phosphorylation
We assembled experimentally confirmed PDZ domain-ligand interactions from various data sources. In detail, we obtained PDZ domain-peptide binding data from a high-throughput binding assay between 81 mouse PDZ domains and 217 peptides derived from genome-encoded receptors by protein array
We collected 563 human PDZ domain sequences from the Pfam repository
We developed a two-step approach to quantify the strength of binding between the PDZ domains and ligands. Using this approach, the binding affinity between each PDZ pocket and its corresponding ligand position was predicted individually based on the idea that the contribution of each ligand position to the binding affinity is additive
In the first step, we designed the selectivity space of each pocket.
In the second step, to build a PWM of a query PDZ domain, we generated an affinity profile that represents the relative affinity contributions of the 20 amino acids to the PDZ domain pocket.
We converted the pocket residue sequences of a PDZ domain into vector representations by replacing all 20 amino acids with 10 physicochemical properties (amino acid indices) that describe the number of hydrogen bond donors
Our goal was to predict the specificities of a PDZ domain without knowledge of its structure. We therefore designed a method to extract pocket residues from the sequences of PDZ domains. To identify the positions of pocket residues within the PDZ domain sequence, a multiple sequence alignment (MSA) was constructed, and the known structure of the PSD-95_1 domain was referenced. We performed a multiple alignment of the PDZ domain sequences using an HMM
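The idea of transferring pocket positions from a reference domain to a query domain through an alignment can be sketched as follows. This is a minimal illustration that assumes two already-aligned rows of equal length with `-` as the gap character; the function name `extract_pocket_residues`, the toy sequences, and the position indices are hypothetical and do not reproduce the actual PSD-95_1 pocket positions or the alignment procedure used in the study.

```python
def extract_pocket_residues(aligned_ref, aligned_query, ref_pocket_positions):
    """Read query residues at the alignment columns of reference pocket residues.

    aligned_ref, aligned_query: rows of the same alignment ('-' marks a gap).
    ref_pocket_positions: 0-based indices into the UNGAPPED reference sequence.
    Returns one character per requested position; '-' means the query has a
    gap at that column.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned rows must have the same length")

    wanted = set(ref_pocket_positions)
    picked = {}
    ref_index = -1  # index into the ungapped reference sequence
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char == "-":
            continue  # column is an insertion relative to the reference
        ref_index += 1
        if ref_index in wanted:
            picked[ref_index] = query_char

    missing = wanted - picked.keys()
    if missing:
        raise ValueError(f"positions outside the reference: {sorted(missing)}")
    return "".join(picked[i] for i in ref_pocket_positions)


# Toy alignment for illustration only.
toy_ref = "GL-GFNI"    # ungapped reference: G L G F N I
toy_query = "GIAG-SV"
pocket = extract_pocket_residues(toy_ref, toy_query, [1, 3, 5])
```

Here the second requested position falls on a column where the toy query has a gap, so the returned string carries a `-` at that place; how such gaps should be handled downstream is a design choice left open by this sketch.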
To estimate the PDZ domain-ligand binding affinity, we adopted an information theory-based PWM method that is widely used to estimate protein-DNA binding affinities
A PWM was used to calculate the binding score of a potential interaction partner with a given sequence by summing, over all ligand positions, the affinity contribution of the amino acid found at each position. The binding score of each peptide was calculated according to the following formula:
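In symbols, the additive summation described above can be written as shown here. The notation is an assumption introduced for illustration and may differ from the authors' own: $p$ is the ligand peptide, $p_i$ is the amino acid at ligand position $i$, $L$ is the number of scored positions, and $W_{i,a}$ is the PWM entry giving the affinity contribution of amino acid $a$ at position $i$.

```latex
S(p) \;=\; \sum_{i=1}^{L} W_{i,\,p_i}
```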
Affinity values of the 5,257 peptides against both the SNA1 and ERBIN PDZ domains were obtained from Wiedemann et al.
To evaluate the performance of our method, we measured the ability to identify the 217 known binding partners of 145 PDZ domains in the PDZBase
We compiled human protein interactions from a total of 22 existing protein interaction databases: the Bio-molecular Interaction Network Database (BIND), the Human Protein Reference Database (HPRD), the Molecular Interaction database (MINT), DIP, IntAct, BioGRID, Reactome, the Protein-Protein Interaction Database (PPID), BioVerse, CCS-HI1, the comprehensive resource of mammalian protein complexes (CORUM), IntNetDB, the Mammalian Protein-Protein Interaction Database (MIPS), the Online Predicted Human Interaction Database (OPHID), Ottowa, PC/Ataxia, Sager, Transcriptome, Complexex, Unilever, protein-protein interaction database for PDZ-domains (PDZBase), and a protein interaction dataset from the literature
We collected all physical interactions mediated by the PDZ proteins from the integrated PPI network. This PDZ protein-mediated interaction set may have some interactions that are mediated by interaction domains other than PDZ domains, because many PDZ proteins have various domains other than PDZ domains. Therefore, we removed such interactions that were connected by domain-domain interactions rather than PDZ domain-ligand interactions. First, we confirmed that PDZ domain-mediated interactions are rarely augmented by other interaction domains. We found that domain-domain interactions are not present in the experimentally confirmed PDZ protein-ligand interactions from the PDZBase
We also removed interactions that could be mediated by other peptide-binding domains, such as SH3 and WW domains, rather than PDZ domains. We searched the known peptide-binding motifs and removed interactions mediated by peptide-binding domains that had low binding scores. The cut-off binding score was set to the lowest binding score of the experimentally confirmed PDZ domain-peptide interactions from the PDZBase
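The cut-off rule described above can be sketched in a few lines. This is a minimal illustration assuming that binding scores have already been computed; the function name `filter_by_cutoff`, the protein pair labels, and the score values are hypothetical, and treating the cut-off as inclusive is an assumption of the sketch.

```python
def filter_by_cutoff(candidate_scores, confirmed_scores):
    """Keep candidate interactions scoring at least as high as the weakest
    experimentally confirmed interaction.

    candidate_scores: dict mapping (pdz_protein, ligand) -> binding score.
    confirmed_scores: binding scores of experimentally confirmed interactions.
    Returns (cutoff, kept) where kept is the filtered dict.
    """
    confirmed_scores = list(confirmed_scores)
    if not confirmed_scores:
        raise ValueError("at least one confirmed score is required")
    cutoff = min(confirmed_scores)  # lowest score among confirmed interactions
    kept = {
        pair: score
        for pair, score in candidate_scores.items()
        if score >= cutoff  # inclusive threshold (assumption)
    }
    return cutoff, kept


# Toy values for illustration only.
toy_confirmed = [3.0, 4.5, 2.5]
toy_candidates = {
    ("PDZ_A", "lig1"): 4.0,
    ("PDZ_A", "lig2"): 1.0,
    ("PDZ_B", "lig3"): 2.5,
}
cutoff, kept = filter_by_cutoff(toy_candidates, toy_confirmed)
```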
Let two species,
To analyze the interactions between orthologous PDZ domains, we calculated the binding scores of the C-terminal sequences of orthologous PDZ ligands and the predicted PWMs with orthologous PDZ domain sequences.
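A minimal sketch of this per-species scoring is given below. It assumes that each PWM is stored as a list of per-position dictionaries ordered from the N-terminal to the C-terminal ligand position, and that amino acids absent from a column contribute zero; both are assumptions of the sketch. The function names, species labels, PWM values, and sequences are hypothetical toy data.

```python
def binding_score(pwm, sequence):
    """Sum the PWM contributions of the last len(pwm) residues of a sequence."""
    if len(sequence) < len(pwm):
        raise ValueError("sequence is shorter than the PWM")
    tail = sequence[-len(pwm):]  # C-terminal residues only
    # Residues not listed in a column contribute 0.0 (assumption).
    return sum(column.get(aa, 0.0) for column, aa in zip(pwm, tail))


def score_orthologs(pwm_by_species, cterm_by_species):
    """Score each species' ligand ortholog against that species' PDZ domain PWM.

    Returns a dict species -> score, with None where no ligand ortholog exists.
    """
    scores = {}
    for species, pwm in pwm_by_species.items():
        cterm = cterm_by_species.get(species)
        scores[species] = None if cterm is None else binding_score(pwm, cterm)
    return scores


# Toy PWM with four ligand positions (hypothetical values).
toy_pwm = [
    {"E": 0.5},
    {"T": 1.0, "S": 0.5},
    {"D": 0.5},
    {"V": 2.0, "L": 1.0},
]
toy_pwms = {"human": toy_pwm, "fly": toy_pwm, "yeast": toy_pwm}
toy_cterms = {"human": "KKETDV", "fly": "KKAAAA"}  # no yeast ortholog
ortholog_scores = score_orthologs(toy_pwms, toy_cterms)
```

Comparing such scores along a species tree against a chosen cut-off is one way to ask in which lineage a ligand ortholog first carries a sequence compatible with binding; the choice of cut-off is outside the scope of this sketch.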
We examined whether particular protein functions were enriched for protein categories that were defined based on the time of protein emergence and PDZ-binding motif acquisition. We systematically classified PDZ ligands into two categories: (1) proteins that arose in invertebrates and acquired PDZ domain interaction sites in vertebrates; and (2) proteins that both arose and acquired PDZ domain interaction sites in invertebrates. We then analyzed the overrepresented functional terms of each group. We used DAVID
Mutations of PDZNet proteins were mapped to genetic diseases using disease-gene association databases from OMIM. The OMIM database lists gene-disease associations between 2,929 disease types defined by Morbid Map (MM) and 1,777 genes associated with particular disease types. Disease types were further categorized into 1,340 distinct diseases by joining disease subtypes into a single disease if similar disease names were used. These disease types were further classified into 20 disease classes based on the physiological system affected
We created a user-friendly web service that provides a PWM and a ranked list of interaction candidates for a given PDZ domain sequence.
Comparisons of the quantitative model- and phage display data-derived PWMs of MAGI1_2, DLG1_2, and PTN13_2.
(PDF)
Distribution of the interaction partners of 97 human PDZ proteins. The maximum number of ligands per PDZ protein is 102. The average number of interaction partners per human PDZ protein is 12.
(PDF)
Network representation of domain-level interactions in PDZNet (best viewed by magnification in a PDF viewer). Domain numbers are presented on the right side of the PDZ protein names with a delimiter (‘_’). The network is composed of 2,643 interactions between 190 PDZ domains and 593 ligands.
(PDF)
Dendrogram of PDZ domains based on the identity of pocket residues. Domain numbers are presented on the right side of the PDZ protein names with a delimiter (‘_’).
(PDF)
Relationship between specificity determining residue (SDR) identity and PWM similarity. Each point represents an orthologous PDZ domain pair.
(PDF)
Phylogenetic profile of human PDZ proteins and ligands across 13 fully sequenced species. The presence (yellow) and absence (black) of orthologs for the 104 PDZ proteins and 554 PDZ ligands are presented.
(PDF)
Multiple sequence alignment (MSA) of EXOC4 orthologs. The MSA was generated using Muscle with default options. C-terminal PDZ binding motifs are shown in bold.
(PDF)
Amino acid preference patterns of human PDZ domain pockets. (A–D) Clustering of amino acid preference profiles of 241 human PDZ domain pockets is shown.
(PDF)
Alternative expression of SAP97 ligands across three human tissues. The protein expression levels of SAP97 PDZ protein and its 13 ligands were compared across brain, bone, and epidermis. Protein expression was measured by quantitative mass spectrometry
(PDF)
Types of DNA modifications that gain PDZ-binding motifs. (A) A point mutation generated a PDZ-binding motif in the C-terminal amino acids of the
(PDF)
Mutation effects of the C-terminal GluR7 sequence. (A) C-terminal sequences and binding scores of wild-type and mutation forms of GluR7. (B) The PWM of the PICK1 PDZ domain. Four C-terminal residues of wild-type GluR7 are highlighted.
(PDF)
Repeated analysis of PDZNet by randomly removing 20% of proteins (trial 1). (A) Network representation of PDZNet. (B) Paralog fractions of PDZ ligands that share the same PDZ proteins.
(PDF)
Repeated analysis of PDZNet by randomly removing 20% of proteins (trial 2). (A) Network representation of PDZNet. (B) Paralog fractions of PDZ ligands that share the same PDZ proteins.
(PDF)
Repeated analysis of PDZNet by randomly removing 20% of proteins (trial 3). (A) Network representation of PDZNet. (B) Paralog fractions of PDZ ligands that share the same PDZ proteins.
(PDF)
Repeated analysis of PDZNet by randomly removing 20% of interactions (trial 4). (A) Network representation of PDZNet. (B) Paralog fractions of PDZ ligands that share the same PDZ proteins.
(PDF)
Repeated analysis of PDZNet by randomly removing 20% of interactions (trial 5). (A) Network representation of PDZNet. (B) Paralog fractions of PDZ ligands that share the same PDZ proteins.
(PDF)
Repeated analysis of PDZNet by randomly removing 20% of interactions (trial 6). (A) Network representation of PDZNet. (B) Paralog fractions of PDZ ligands that share the same PDZ proteins.
(PDF)
Discriminating power of selectivity axes. Each boxplot shows distributions of binders and non-binders of an amino acid, which are presented at the top of the plot. Binders are PDZ domain pockets that prefer the amino acid, and non-binders are those domain pockets that do not prefer the amino acid. The vertical axis corresponds to an axis of a selectivity space. Fisher's score (FS) is presented at the top of each plot, indicating the discriminating power of the selectivity axes.
(PDF)
Procedure for extracting pocket residues. (A) Schematic drawing of the PSD-95_1 domain structure. (B) Position of each pocket residue on the structure. (C) The MSA of three representative PDZ domains was constructed using the hidden Markov model that was optimized for the PDZ domain. By adjusting the secondary structural profile on the MSA, the positions of pocket residues were identified. Gray boxes indicate the positions of pocket residues.
(PDF)
Fraction of domain-domain interactions according to the binding scores of all PDZ protein-mediated interactions. The PDZ protein-mediated interactions were binned based on binding score. The fraction of domain-domain interactions was measured for each bin.
(PDF)
Flow chart of web server and a sample output. The web server takes a query PDZ domain sequence and a species name. The outputs are pocket residues, a PWM of the query PDZ domain, and a genome-wide rank list of proteins from the species chosen by the user.
(PDF)
Position weight matrices (PWMs) for 515 human PDZ domains. For resource purposes, homologous PDZ domains are included in the list.
(XLS)
Validation of PWMs on
(XLS)
PDZ domain-ligand interactions in PDZNet.
(XLS)
C-terminal sequences of human PDZ ligand orthologs.
(XLS)
Experimental evidence of human PDZ domain-ligand interactions that emerged via sequence mutations. ‘−’ indicates the absence of an ortholog
(XLS)
Over-represented gene ontology (GO) terms of PDZNet proteins based on the time point of acquiring PDZ domain interaction sites.
(XLS)
Disease classes associated with mutations of PDZNet components.
(XLS)
A morbid map of PDZNet components with the classification of genetic diseases.
(XLS)