Citation: Bertin N, Simonis N, Dupuy D, Cusick ME, Han J-DJ, Fraser HB, et al. (2007) Confirmation of Organized Modularity in the Yeast Interactome. PLoS Biol 5(6): e153. doi:10.1371/journal.pbio.0050153
Published: June 12, 2007
Copyright: © 2007 Bertin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by the Keck Foundation (FPR and MV) and by NIH grants HG0017115 (FPR and MV) and HG003224 (FPR).
Competing interests: The authors have declared that no competing interests exist.
A recent PLoS Biology article [1] rejected the conclusions of two previous publications [2,3] that two categories of highly connected “hub” proteins—“date” and “party” hubs—have distinct properties in the Saccharomyces cerevisiae interactome network. Currently available protein–protein interaction datasets are vastly incomplete, even for yeast . Therefore, it is reasonable to rigorously re-scrutinize global properties of interactome networks as new datasets become available. Here we show that distinctions between date and party hubs , previously shown in a high-quality filtered yeast interactome (FYI) dataset [2,3], are in fact confirmed in an updated literature-curated yeast interactome network.
Two protein–protein interaction datasets were used in : a high-confidence (HC) network obtained from both curated literature and high-throughput sources, and a subgraph of HC that was obtained by linking the nodes of FYI with HC edges (HCfyi). As explained in , it is crucial that high-quality data be used to partition date and party hubs. Therefore, FYI was originally generated as the union of two high-confidence interaction datasets: one curated from small-scale studies published in the literature  and another obtained by stringently requiring support from at least two out of four sources of high-throughput interaction evidence . We use a similar definition here to derive a filtered high-confidence (‘filtered-HC’) dataset containing 2,561 proteins linked by 5,996 interactions (Table S1) from HC. To eliminate false positive interactions that were either reported once but never confirmed or that were obtained through curation error, our analysis included literature-curated interactions only if they were observed in two independent articles (i.e., associated with two or more independent PubMed IDs). Moreover, many interactions in HC were derived from a single experiment reported in multiple publications—e.g., reference  describes an approximate superset of the experiments including those reported in reference . Such publications [6–10] were considered dependent and merged. Thus 2,423 protein pairs were removed from HC. Also, we did not include interactions supported solely by high-throughput yeast two-hybrid screening [11,12] (97 pairs) or supported solely by high-throughput pull-down followed by mass spectrometry screening (742 pairs) [6–10,13] (see Table S1 for a complete list of interactions in filtered-HC).
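The literature-curation rule described above can be illustrated with a minimal sketch. It covers only the "two or more independent PubMed IDs" criterion and the merging of dependent publications; the function name, the record layout, and the example PubMed IDs are hypothetical and are not taken from the HC dataset.

```python
from collections import defaultdict


def filter_literature_pairs(records, dependent_groups=()):
    """Keep protein pairs supported by at least two independent sources.

    records: iterable of (protein_a, protein_b, pubmed_id) tuples.
    dependent_groups: iterable of sets of PubMed IDs that describe the same
        experiment and are therefore counted as a single source.
    """
    # Map every dependent PubMed ID to one representative ID for its group.
    canonical = {}
    for group in dependent_groups:
        representative = min(group)
        for pmid in group:
            canonical[pmid] = representative

    # Collect the set of independent sources for each undirected pair.
    support = defaultdict(set)
    for a, b, pmid in records:
        pair = tuple(sorted((a, b)))
        support[pair].add(canonical.get(pmid, pmid))

    return {pair for pair, sources in support.items() if len(sources) >= 2}


# Hypothetical example: A-C is reported twice, but by dependent publications.
records = [
    ("A", "B", 101), ("B", "A", 102),
    ("A", "C", 201), ("A", "C", 202),
    ("C", "D", 301),
]
kept = filter_literature_pairs(records, dependent_groups=[{201, 202}])
```

In this toy input only the A-B pair survives, because the two reports of A-C collapse into one source once the dependent publications are merged.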
Consistency of Date and Party Hub Classification across Datasets
We identified date and party hubs in both HC and filtered-HC (all analyses were also performed on the HCfyi network; see Figure S1). Since both networks contain many new interactions relative to FYI, and since some erroneous interactions might have been corrected, the proteins originally identified as hubs in FYI cannot and should not be assumed to be identical. For the analyses described here, we therefore defined hubs anew using a degree threshold that includes the top 20% most connected nodes . This corresponds to a degree of 10 or more for HC (19.4% of the proteins) and a degree of 7 or more for filtered-HC (21.7%).
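As an illustration of how such a degree cutoff might be derived, the sketch below picks the integer degree whose "degree at or above cutoff" fraction lies closest to 20%. The closest-fraction rule is an assumption made here because it is consistent with the 19.4% and 21.7% figures quoted above; the text does not state the exact selection rule, and the function names are hypothetical.

```python
def node_degrees(edges):
    """Degree of each node in an undirected network, ignoring self-loops
    and duplicate edges."""
    unique = {tuple(sorted(edge)) for edge in edges if edge[0] != edge[1]}
    degree = {}
    for a, b in unique:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


def hub_degree_cutoff(degree, target_fraction=0.20):
    """Return the cutoff k for which the fraction of nodes with degree >= k
    is closest to target_fraction (ties go to the smaller k)."""
    n = len(degree)
    best_k, best_gap = None, None
    for k in sorted(set(degree.values())):
        fraction = sum(1 for d in degree.values() if d >= k) / n
        gap = abs(fraction - target_fraction)
        if best_gap is None or gap < best_gap:
            best_k, best_gap = k, gap
    return best_k


# Toy network: one well-connected node and four peripheral ones.
edges = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"), ("a", "b")]
degree = node_degrees(edges)
cutoff = hub_degree_cutoff(degree)
hubs = {node for node, d in degree.items() if d >= cutoff}
```

Because degrees are integers, the selected fraction rarely equals 20% exactly, which is why the two networks end up with slightly different hub fractions.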
In the original report of the date/party hub distinction , bimodality was observed in the average Pearson's correlation coefficient (AvgPCC) distribution of hubs for two out of five expression datasets examined . The complete lack of bimodality observed in  may stem from a conservative statistical test that assumed a uniform unimodal null distribution. We emphasize that bimodality was not deemed essential evidence of the party/date hub distinction in the initial report .
Since party and date hubs fall along a continuum, the choice of an AvgPCC threshold that distinguishes them is somewhat arbitrary (although our previous conclusions were robust to this choice ). Therefore, we adopted the PCC threshold of 0.5 for all networks considered here (this is the same threshold applied previously to PCC distributions that did not appear bimodal [1,2]). Thirteen expression datasets [14–31] were considered in addition to the original five independent datasets  (see Table S2). Strikingly, 86% of the FYI-defined hubs found in filtered-HC retained their date/party designation (Figure 1A) (81% for HC). This indicates that assignment to one or the other category is robust across datasets.
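The AvgPCC of a hub is the mean Pearson correlation between its expression profile and those of its interaction partners. A minimal sketch for a single expression dataset follows; treating an AvgPCC at or above 0.5 as "party" (rather than strictly above) is an assumption, as are the function names and the toy expression values. How the eighteen datasets are combined is not addressed here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles.
    Raises ZeroDivisionError if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def avg_pcc(hub, partners, expression):
    """Mean PCC between a hub and each partner that has expression data."""
    values = [pearson(expression[hub], expression[p])
              for p in partners if p in expression]
    return sum(values) / len(values)


def classify_hub(hub, partners, expression, threshold=0.5):
    """Label a hub 'party' if its AvgPCC reaches the threshold, else 'date'."""
    return "party" if avg_pcc(hub, partners, expression) >= threshold else "date"


# Hypothetical expression profiles over four conditions.
expression = {
    "H": [1.0, 2.0, 3.0, 4.0],
    "p1": [2.0, 4.0, 6.0, 8.0],   # rises with H
    "p2": [1.0, 3.0, 5.0, 8.0],   # rises with H
    "q1": [4.0, 3.0, 2.0, 1.0],   # falls as H rises
}
label_coexpressed = classify_hub("H", ["p1", "p2"], expression)
label_mixed = classify_hub("H", ["p1", "q1"], expression)
```

A hub whose partners are all co-expressed with it scores close to 1 and is labelled a party hub, whereas a hub whose partners are expressed under differing conditions averages out near 0 and is labelled a date hub.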
(A) Consistency of the party/date attribution between FYI and filtered-HC. Because the filtered-HC network has many more interactions than FYI, only 162 of the 546 hubs in filtered-HC had been previously found in FYI. Filtered-HC confirmed 86% of the party/date designations in FYI. In addition, 20% of FYI hubs are no longer considered hubs in the new filtered-HC network because of the higher connectivity threshold.
(B) The effect on the characteristic path length (top panels) and main component size (bottom panels) of the networks upon gradual node removal for HC (left panels) and filtered-HC (right panels). Attacks against all hubs (brown curve), party hubs (blue curve), date hubs (red curve), and random nodes (green curve). Insets show an additional control for connectivity differences between categories with the x-axis representing the number of edges removed from the network.
(C) Date hubs participate in more genetic interactions than party hubs or non-hubs , as measured here by mean number of interactions  from a network of curated genetic interactions  for both filtered-HC (right panel) and HC (left panel). Inside each panel, bars show the number of genetic interactions held by date hubs (red), party hubs (blue), and non-hub proteins (yellow). The p-values assessing the difference of the means between date and party hubs (Mann-Whitney U-test) are indicated above the bars.
We suggest that some analyses presented in  (in particular the network tolerance to hub deletion) erred by not taking into account new hubs defined by the increased number of interactions relative to the original FYI. This strategy ignores 46% of the hubs in HCfyi  and thus effectively immunizes them in the attack resistance analysis and eliminates them from the genetic interaction comparison.
Distinct Topological Properties of Date and Party Hubs
When removed from the network, party and date hubs have strikingly distinct effects on the overall topology of HC, filtered-HC, and HCfyi. Removing date hubs dramatically disrupts the characteristic path length (CPL) of the network, whereas removing party hubs has a negligible effect (Figure 1B), as previously observed . Importantly, this difference in behavior is not sensitive to the specific threshold values of degree k and AvgPCC chosen here to define hubs and party hubs, respectively (Figure S2). The CPL of a network measures the mutual closeness of nodes in a network. The claim in  that date and party hub removal has an indistinguishable effect on network topology was based on the analysis of a different topological feature altogether: main component size. This is a poor measure of network clustering in that it does not, for example, discriminate an extended “beads-on-a-string” topology from a completely connected clique. This measure is also highly sensitive to a single spurious interaction that connects two otherwise disconnected subgraphs. By contrast, the dramatic decrease in CPL that we observe for date hubs in HC, filtered-HC, and HCfyi suggests their coordinating role and confirms the original findings .
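The contrast between the two measures can be made concrete with a small sketch. Here CPL is computed as the mean shortest-path length over all pairs of nodes that remain connected, which is an assumption about the definition (unreachable pairs are simply ignored); the function names and toy networks are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected network."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def remove_nodes(graph, removed):
    """Copy of the network without the given nodes and their edges."""
    removed = set(removed)
    return {n: nbrs - removed for n, nbrs in graph.items() if n not in removed}


def bfs_distances(graph, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def characteristic_path_length(graph):
    """Mean shortest-path length over all connected pairs of nodes."""
    total = pairs = 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def main_component_size(graph):
    """Number of nodes in the largest connected component."""
    seen, largest = set(), 0
    for node in graph:
        if node not in seen:
            component = set(bfs_distances(graph, node))
            seen |= component
            largest = max(largest, len(component))
    return largest


# Four nodes arranged as beads on a string versus a fully connected clique.
string = build_graph([("a", "b"), ("b", "c"), ("c", "d")])
clique = build_graph([(x, y) for x in "abcd" for y in "abcd" if x < y])
after_attack = remove_nodes(string, ["b"])
```

Both toy networks have a main component of four nodes, yet their CPLs differ (about 1.67 for the string, 1.0 for the clique), which is the sense in which main component size alone cannot separate the two topologies. A gradual attack can be simulated by calling `remove_nodes` with a growing list of hubs and recomputing both measures at each step.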
In  we showed that date hubs exhibit a higher genetic interaction density than party hubs. Reference  described analysis of two sets of genetic interactions: one from a union of high-throughput studies (HTP-GI), and another from the literature (LC-GI) . Both LC-GI and HTP-GI datasets are potentially subject to bias since gene pairs were selected nonrandomly for testing, but these are the best datasets currently available. While the LC-GI analysis confirmed our original finding, the HTP-GI analysis did not , which we confirmed using date/party hubs defined from FYI. However, examining HTP-GI in the larger HC and filtered-HC networks, we find that date hubs in both HC and filtered-HC exhibit higher genetic interaction density than party hubs or non-hubs (Figure 1C), confirming the original report . This difference remains after controlling for connectivity of hubs in the protein interaction network (Figure S3).
We also confirmed the difference in evolutionary rates  between date and party hubs that was reported previously . Using the filtered-HC network (with hubs defined as above) we found that date hubs evolve significantly faster than party hubs (Wilcoxon p = 0.01). Furthermore, using our expanded expression dataset, the PCC of hubs was negatively correlated with their evolutionary rates (Pearson r = −0.22, p = 1 × 10^−7), even when controlling for protein abundance  in either rich (Pearson partial r = −0.19, p = 3 × 10^−6) or minimal media (Pearson partial r = −0.20, p = 2 × 10^−6). The same result was obtained when considering the HC and HCfyi networks (unpublished data). Moreover, a recent report independently supported evolutionary rate differences between date and party hubs and explained these differences in terms of three-dimensional protein structure .
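For readers unfamiliar with partial correlation, the standard first-order formula for the correlation between x and y after controlling for a single covariate z can be sketched as below. Whether the analysis above was computed with this exact formula is an assumption; the function name is hypothetical, and the abundance correlations in the example are invented purely for illustration (only r = −0.22 comes from the text).

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y controlling for z.

    r_xy, r_xz, r_yz are ordinary Pearson correlations between the three
    variables. Undefined (division by zero) if |r_xz| or |r_yz| equals 1.
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = AvgPCC of a hub, y = evolutionary rate, z = protein abundance.
# The two abundance correlations below are hypothetical values.
r_controlled = partial_correlation(r_xy=-0.22, r_xz=0.20, r_yz=-0.25)

# If abundance were uncorrelated with both variables, nothing would change.
r_unchanged = partial_correlation(r_xy=-0.22, r_xz=0.0, r_yz=0.0)
```

A partial correlation that stays close to the raw correlation, as reported above, indicates that the covariate accounts for little of the observed association.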
We confirmed that date and party hubs have different topological properties, with the coordinating role of date hubs being supported by a greater impact on CPL. We also confirmed that date hubs participate in more genetic interactions and evolve more rapidly than party hubs. These observations, as well as the identity of the nodes considered as date and party, remained largely consistent within all tested networks (HC, filtered-HC, HCfyi), demonstrating the robustness of the results originally observed in . Thus, this updated analysis confirms the validity of the distinction between date and party hubs in the yeast interactome [2,3], and shows that the date and party hub concept and the “stratus-like” network  model are not mutually exclusive.
Figure S1. Hub Deletion and Genetic Interaction Analysis for the HCfyi Interaction Network as Defined in 
(172 KB PDF).
Figure S2. Different Effect of Gradual Date or Party Node Removal on the CPLs of the Networks for Filtered-HC Is Not Dependent on the PCC Threshold Used to Define Party Hubs.
(378 KB PDF).
Figure S3. Genetic Connectivity of Date and Party Hubs
(A) Mean number of genetic interactions reported corrected by the physical connectivity. (B) The mean absolute connectivity for each hub category and the genetic interaction connectivity normalized by the number of protein–protein interactions observed for all three protein–protein interaction datasets using either HTP-GI or LC-GI separately or combined. p-values assessing the difference of the means (Mann-Whitney U-test) are indicated.
(102 KB PDF).
Table S1. Filtered-HC Protein-Protein Interaction Dataset.
(8.5 MB XLS).
Table S2. Filtered-HC Date and Party Hubs Degrees, Clustering Coefficients and AvgPCC Values for Each Microarray Dataset.
(252 KB XLS).
- 1. Batada NN, Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, et al. (2006) Stratus not altocumulus: A new view of the yeast protein interaction network. PLoS Biol 4(10): e317. doi: 10.1371/journal.pbio.0040317.
- 2. Han JD, Bertin N, Hao T, Goldberg DS, Berriz GF, et al. (2004) Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 430: 88–93.
- 3. Fraser HB, Hirsh AE, Steinmetz LM, Scharfe C, Feldman MW (2002) Evolutionary rate in the protein interaction network. Science 296: 750–752.
- 4. Han JD, Dupuy D, Bertin N, Cusick ME, Vidal M (2005) Effect of sampling on topology predictions of protein-protein interaction networks. Nat Biotechnol 23: 839–844.
- 5. Mewes HW, Frishman D, Guldener U, Mannhaupt G, Mayer K, et al. (2002) MIPS: a database for genomes and protein sequences. Nucleic Acids Res 30: 31–34.
- 6. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, et al. (2006) Proteome survey reveals modularity of the yeast cell machinery. Nature 440: 631–636.
- 7. Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, et al. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415: 141–147.
- 8. Grandi P, Rybin V, Bassler J, Petfalski E, Strauss D, et al. (2002) 90S pre-ribosomes include the 35S pre-rRNA, the U3 snoRNP, and 40S subunit processing factors but predominantly lack 60S synthesis factors. Mol Cell 10: 105–115.
- 9. Krogan NJ, Peng WT, Cagney G, Robinson MD, Haw R, et al. (2004) High-definition macromolecular composition of yeast RNA-processing complexes. Mol Cell 13: 225–239.
- 10. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, et al. (2006) Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440: 637–643.
- 11. Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, et al. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci U S A 98: 4569–4574.
- 12. Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403: 623–627.
- 13. Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, et al. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415: 180–183.
- 14. Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, et al. (2000) Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell 11: 4241–4257.
- 15. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, et al. (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 9: 3273–3297.
- 16. Roberts CJ, Nelson B, Marton MJ, Stoughton R, Meyer MR, et al. (2000) Signaling and circuitry of multiple MAPK pathways revealed by a matrix of global gene expression profiles. Science 287: 873–880.
- 17. Travers KJ, Patil CK, Wodicka L, Lockhart DJ, Weissman JS, et al. (2000) Functional and genomic analyses reveal an essential coordination between the unfolded protein response and ER-associated degradation. Cell 101: 249–258.
- 18. Chu S, DeRisi J, Eisen M, Mulholland J, Botstein D, et al. (1998) The transcriptional program of sporulation in budding yeast. Science 282: 699–705.
- 19. Chitikila C, Huisinga KL, Irvin JD, Basehoar AD, Pugh BF (2002) Interplay of TBP inhibitors in global transcriptional control. Mol Cell 10: 871–882.
- 20. Gasch AP, Huang M, Metzner S, Botstein D, Elledge SJ, et al. (2001) Genomic expression responses to DNA-damaging agents and the regulatory role of the yeast ATR homolog Mec1p. Mol Biol Cell 12: 2987–3003.
- 21. Mnaimneh S, Davierwala AP, Haynes J, Moffat J, Peng WT, et al. (2004) Exploration of essential gene functions via titratable promoter alleles. Cell 118: 31–44.
- 22. Segal E, Shapira M, Regev A, Pe'er D, Botstein D, et al. (2003) Module networks: Identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet 34: 166–176.
- 23. Yoshimoto H, Saltsman K, Gasch AP, Li HX, Ogawa N, et al. (2002) Genome-wide analysis of gene expression regulated by the calcineurin/Crz1p signaling pathway in Saccharomyces cerevisiae. J Biol Chem 277: 31079–31088.
- 24. Haugen AC, Kelley R, Collins JB, Tucker CJ, Deng C, et al. (2004) Integrating phenotypic and expression profiles to map arsenic-response networks. Genome Biol 5: R95.
- 25. Lai LC, Kosorukoff AL, Burke PV, Kwast KE (2006) Metabolic-state-dependent remodeling of the transcriptome in response to anoxia and subsequent reoxygenation in Saccharomyces cerevisiae. Eukaryot Cell 5: 1468–1489.
- 26. Roberts GG, Hudson AP (2006) Transcriptome profiling of Saccharomyces cerevisiae during a transition from fermentative to glycerol-based respiratory growth reveals extensive metabolic and structural remodeling. Mol Genet Genomics 276: 170–186.
- 27. Wyrick JJ, Holstege FC, Jennings EG, Causton HC, Shore D, et al. (1999) Chromosomal landscape of nucleosome-dependent gene expression and silencing in yeast. Nature 402: 418–421.
- 28. Lee SE, Pellicioli A, Demeter J, Vaze MP, Gasch AP, et al. (2000) Arrest, adaptation, and recovery following a chromosome double-strand break in Saccharomyces cerevisiae. Cold Spring Harb Symp Quant Biol 65: 303–314.
- 29. Smith JJ, Marelli M, Christmas RH, Vizeacoumar FJ, Dilworth DJ, et al. (2002) Transcriptome profiling to identify genes involved in peroxisome assembly and function. J Cell Biol 158: 259–271.
- 30. Ogawa N, DeRisi J, Brown PO (2000) New components of a system for phosphate accumulation and polyphosphate metabolism in Saccharomyces cerevisiae revealed by genomic expression analysis. Mol Biol Cell 11: 4309–4321.
- 31. Huang J, Zhu H, Haggarty SJ, Spring DR, Hwang H, et al. (2004) Finding new components of the target of rapamycin (TOR) signaling network through chemical genetics and proteome chips. Proc Natl Acad Sci U S A 101: 16594–16599.
- 32. Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hon GC, et al. (2006) Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae. J Biol 5: 11.
- 33. Hirsh AE, Fraser HB, Wall DP (2005) Adjusting for selection on synonymous sites in estimates of evolutionary distance. Mol Biol Evol 22: 174–177.
- 34. Newman JR, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, et al. (2006) Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature 441: 840–846.
- 35. Kim PM, Lu LJ, Xia Y, Gerstein MB (2006) Relating three-dimensional structures to protein networks provides evolutionary insights. Science 314: 1938–1941.