Figure 1.
Overview of integrative analysis of yeast multi-omics data.
New phosphoproteins were identified by LC-MS/MS analysis and unified with the publicly available phosphoproteome datasets of Holt et al. [16] and UniProt [19] (Step 1). A protein–protein interaction (PPI) map was obtained from DIP (Database of Interacting Proteins) [33] (Step 2). Y2H, yeast two-hybrid; IMM, co-immunoprecipitation; TAP, tandem affinity purification. The “phospho-PPI” map was generated by superimposing the phosphoproteome data onto the PPI map (Step 3). Negative controls for the phospho-PPI map were generated by “node label shuffling (NLS)” and “random edge rewiring (RER)” (Step 4). Comparative analyses of the real phospho-PPI and its negative controls were performed with other yeast multi-omics data (Step 5).
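The two negative controls named in Step 4 can be illustrated with a minimal sketch. The caption does not specify the exact procedures, so the code below makes two assumptions: NLS is taken to mean reassigning the phosphoprotein labels to randomly chosen nodes while keeping the number of labelled nodes fixed, and RER is taken to mean degree-preserving double edge swaps on a simple graph. The function names and the `n_swaps` parameter are hypothetical.

```python
import random


def node_label_shuffle(nodes, phospho, rng):
    """Assumed NLS: move the phospho labels to randomly chosen nodes.

    The network topology is untouched; only which nodes carry the
    'phosphoprotein' label changes, and the label count is preserved.
    """
    nodes = list(nodes)
    n_phospho = sum(1 for n in nodes if n in phospho)
    return set(rng.sample(nodes, n_phospho))


def random_edge_rewire(edges, rng, n_swaps):
    """Assumed RER: degree-preserving rewiring by double edge swaps.

    Two edges (a, b) and (c, d) are replaced by (a, d) and (c, b) unless
    that would create a self-loop or a duplicate edge.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # randomise the orientation of the second edge
            c, d = d, c
        if len({a, b, c, d}) < 4:  # shared endpoint: swap could make a self-loop
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:  # would duplicate an edge
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges
```

Under these assumptions, NLS keeps every protein's degree and the full edge set while randomizing phosphorylation status, whereas RER keeps each protein's phosphorylation status and degree while randomizing who interacts with whom.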
Figure 2.
Number counts of phosphoproteins (A) and their phosphosites (B) newly identified in this study and of those obtained from the data of Holt et al. [16] and UniProt [19].
Figure 3.
Node degree distributions of phosphoproteins and nonphosphoproteins in the phospho-PPI data sets of “ALL” (A–C) and “Y2H” (D–F).
(A,D) Cumulative probability distribution of node degrees. For each group of phosphoproteins and nonphosphoproteins, circles represent the proportion of proteins with at least k interacting partners, where k is indicated on the horizontal axis [P≥(k)]. (B,E) P to N (P/N) ratio, where P and N are P≥(k) of phosphoproteins and nonphosphoproteins, respectively. (C,F) O to E (O/E) ratio, where O is P≥(k) of phosphoproteins and E is that expected from negative controls generated by node label shuffling (N = 10,000). For each node degree level, the line graph and bars represent the mean and two-sided 95% confidence intervals, respectively. Background colors and asterisks denote statistical significance relative to the neutral O/E value of 1.0 (*P<0.05, **P<0.01, ***P<0.001).
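The quantities plotted in these panels can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: P≥(k) is read as the fraction of proteins with degree of at least k, the expected value E is taken as the mean of P≥(k) over label-shuffled controls, and the function names, the `n_shuffles` default, and the toy data in the usage note are hypothetical.

```python
import random


def p_at_least(degrees, k):
    """Fraction of proteins whose node degree is >= k."""
    return sum(1 for d in degrees if d >= k) / len(degrees)


def pn_and_oe_ratio(degree, phospho, k, n_shuffles=1000, seed=0):
    """Return (P/N, O/E) at degree threshold k.

    degree  : dict mapping protein -> node degree
    phospho : set of proteins labelled as phosphoproteins
    E is estimated by shuffling the phospho labels over all nodes
    (an assumed reading of node label shuffling).
    """
    nodes = list(degree)
    n_phospho = sum(1 for n in nodes if n in phospho)
    observed = p_at_least([degree[n] for n in nodes if n in phospho], k)
    non = p_at_least([degree[n] for n in nodes if n not in phospho], k)

    rng = random.Random(seed)
    null = [
        p_at_least([degree[n] for n in rng.sample(nodes, n_phospho)], k)
        for _ in range(n_shuffles)
    ]
    expected = sum(null) / n_shuffles

    pn = observed / non if non else float("inf")
    oe = observed / expected if expected else float("inf")
    return pn, oe
```

For a toy network with `degree = {"A": 5, "B": 4, "C": 1, "D": 3, "E": 1, "F": 1}` and `phospho = {"A", "B"}`, calling `pn_and_oe_ratio(degree, phospho, k=3)` gives a P/N ratio of 4.0 and an O/E ratio close to 2, since half of all nodes have a degree of at least 3.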
Figure 4.
Difference between node degree levels of phosphoproteins and nonphosphoproteins at each level of protein abundance (A–F) and protein disorder (G–I).
(A–C) YEPD, protein abundance dataset for cells grown in rich medium; (D–F) SD, protein abundance dataset for cells grown in synthetic complete medium. Protein abundance provided in the original dataset [37] was log-transformed (base 10) as abundance level α. The type of phospho-PPI network is “ALL” (for each analysis, protein nodes for which abundance or disorder levels were not provided, together with their corresponding edges, were eliminated from the phospho-PPI network). Each bin corresponds to the protein abundance level between α (indicated on the horizontal axis) and α+0.5 (A–F) or the protein disorder probability between d (indicated on the horizontal axis) and d+0.2 (G–I). For protein nodes corresponding to each bin, the P to N (P/N) ratio of protein number count (where P and N are number counts of phosphoproteins and nonphosphoproteins) (A,D,G), average node degree levels [Log(k) (base 10)] of phosphoproteins (red line) and nonphosphoproteins (blue line) (B,E,H), and statistical significance [−Log(P value) (base 10)] of differences between Log(k) of phosphoproteins and nonphosphoproteins (C,F,I) are represented. Error bars denote s.e.m. Asterisks denote −Log(P value)>8.0 (i.e. P<10⁻⁸).
Figure 5.
Probabilities that phosphoproteins and nonphosphoproteins will interact with proteins that have phosphoprotein binding domains (PPBDs).
The average normalized interaction probabilities of phosphoproteins (red bars) and nonphosphoproteins (blue bars) in the “ALL” phospho-PPI network with each type of PPBD or with all PPBDs (indicated on the horizontal axis) are shown.
Figure 6.
Number counts of interacting protein pairs of each phosphorylation pattern shown in the phospho-PPI network.
Respective rows of panels correspond to the three phosphorylation patterns of two interacting proteins: “Both” (A,B), “Either” (C,D), and “Neither” (E,F); respective columns correspond to PPI categories of “ALL” (A,C,E) and “Y2H” (B,D,F). In each panel, data are shown for two types of phospho-PPI networks: “whole” (i.e. unfiltered) and “filtered” (see text). Colored bars (purple, blue, and pink) represent number counts of protein interactions in real data sets; gray bars show mean values of those estimated by negative controls generated by random edge rewiring (N = 10,000). Error bars represent s.d. Blue/red asterisks denote significance of values higher/lower than those of negative controls (***P<0.001).
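Counting interactions by phosphorylation pattern and comparing a count with rewired controls can be sketched as below. The caption does not state how P values were derived from the 10,000 controls, so the one-sided empirical P value with a +1 correction used here is an assumption, as are the function names.

```python
from collections import Counter


def phosphorylation_pattern_counts(edges, phospho):
    """Count edges whose endpoints are both, either, or neither phosphoproteins."""
    counts = Counter({"Both": 0, "Either": 0, "Neither": 0})
    for a, b in edges:
        n_phospho = (a in phospho) + (b in phospho)  # 0, 1, or 2
        counts[("Neither", "Either", "Both")[n_phospho]] += 1
    return dict(counts)


def empirical_p_value(observed, null_values, higher=True):
    """One-sided empirical P value against control counts.

    Assumption: P = (number of controls at least as extreme + 1) / (N + 1),
    which avoids reporting P = 0 with a finite number of controls.
    """
    if higher:
        extreme = sum(1 for v in null_values if v >= observed)
    else:
        extreme = sum(1 for v in null_values if v <= observed)
    return (extreme + 1) / (len(null_values) + 1)
```

In this sketch, `null_values` would hold the “Both”, “Either”, or “Neither” count obtained from each rewired control network, and `higher` selects which tail is tested, mirroring the higher/lower distinction marked by the blue/red asterisks.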
Figure 7.
Kinases inferred to yield co-phosphorylation of proteins in the same protein complex.
(A) Conceptual diagram of an interacting kinate module (IKM) motif. (B,C) Kinases revealed to have significantly higher IKM formability than negative controls by data integration of the PPI network with in vitro kinase–substrate relationships (B) and with a literature-based collection of signaling pathways (C). For each kinase, arrows denote number counts of IKMs formed by that kinase and the “whole” (i.e. unfiltered) PPI network, with P values estimated from negative controls of the PPI network generated by RER and NLS (N = 10,000). Expected probability density distributions of number counts of IKMs observed in negative controls generated by node label shuffling and random edge rewiring are shown by gray and white bars, respectively.