Combinatorial Complexity and Compositional Drift in Protein Interaction Networks

doi:10.1371/journal.pone.0032032

Figure 1.

Binding surfaces and complex formation.

Center: The traditional plain graph representation of a PPI network represents the binding capabilities of a hub protein (red) through several incident edges. The diversity of molecular species generated by these potential interactions depends on the extent to which they compete for binding surfaces (white circles), to which we refer as “sites”. These conflicts are best represented as a “site graph”, derived from a domain-level resolution of protein-protein interactions. We depict two extreme cases. Top: All interaction partners compete for the same site. Bottom: All interactions occur at different sites and are mutually compatible. In the language we deploy to represent processes based on protein-protein interactions, a site denotes a distinct interaction capability. A comparison between the scenarios depicted at the top and the bottom illustrates how combinatorial complexity is affected by binding conflicts.
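The two extreme cases in this caption can be made concrete by counting. The following is a minimal sketch, assuming a single hub with n partners that have no further interactions of their own, and counting only the molecular species that contain the hub. The function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def count_species_shared_site(n_partners: int) -> int:
    """All partners compete for one site: the hub is free or bound to exactly one partner."""
    return 1 + n_partners


def count_species_distinct_sites(n_partners: int) -> int:
    """Each partner has its own site: any subset of partners may be bound at once."""
    return sum(
        1
        for k in range(n_partners + 1)
        for _ in combinations(range(n_partners), k)
    )


# Linear growth (top scenario) versus exponential growth (bottom scenario).
for n in (1, 2, 4, 8):
    print(n, count_species_shared_site(n), count_species_distinct_sites(n))
```

Under these assumptions the shared-site case grows linearly with the number of partners, whereas the distinct-site case doubles with every added partner.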


Figure 2.

The network subject of this paper.

The graph of proteins, sites and interactions found in the cytoplasmic portion of the Structural Interaction Network (cSIN), as compiled by Kim et al. [11]. The cSIN displays interactions at the level of domains or binding surfaces, making explicit which interactions compete for the same binding site. We refer to such a graph as a site graph. Its nodes are proteins (ovals), which are sets of sites (small circles on the ovals). Sites, rather than proteins, anchor the edges of this graph.


Figure 3.

Kappa rules.

A: A rule expresses a local mechanistic statement (of empirical or hypothetical origin) about a protein-protein interaction in terms of a rewrite directive plus a rate constant (not shown). The left-hand side (LHS) of the rule consists of partially specified protein agents and represents the contextual information necessary for identifying reaction instances that proceed according to the rule. The right-hand side (RHS) expresses the actions that may occur when the conditions specified on the LHS are met in a reaction mixture. In this case, the rule specifies a binding action. Site graphs are represented in a simple syntax, explicated in Figure 1 of Supporting Information S1. B: The rule in panel A can match the sample mixture of molecular species shown here in two ways, giving rise to two possible reactions with different outcomes. Because of their local nature, Kappa rules may apply in both unimolecular and bimolecular situations. In general, such rules are given two rate constants (a first-order and a second-order constant), and the simulator automatically generates the appropriate stochastic kinetics. In the present paper, however, global constraints prevent this ambiguity at the outset, so the rules of the cSIN require only one rate constant (bimolecular for association and unimolecular for dissociation).
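The idea that one local binding rule can match a mixture in both a unimolecular and a bimolecular way can be illustrated with a toy matcher. This is a sketch in plain Python, not Kappa syntax and not the simulator's algorithm. The `Agent` class, the `complex_id` bookkeeping and the example mixture are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Agent:
    name: str        # protein type, e.g. "A"
    complex_id: int  # identifier of the complex this agent currently belongs to
    free_sites: set = field(default_factory=set)  # names of unbound sites


def match_binding_rule(mixture, left, right):
    """Find every instance of the rule: left agent (site free) + right agent (site free) -> bound.

    `left` and `right` are (protein type, site name) pairs. Each match is labelled
    unimolecular if both agents already sit in the same complex, else bimolecular.
    """
    (type_a, site_a), (type_b, site_b) = left, right
    matches = []
    for a, b in product(mixture, repeat=2):
        if a is b:
            continue
        if (a.name == type_a and site_a in a.free_sites
                and b.name == type_b and site_b in b.free_sites):
            kind = "unimolecular" if a.complex_id == b.complex_id else "bimolecular"
            matches.append((a, b, kind))
    return matches


# Hypothetical mixture: complex 0 holds A, C and B (A and B both linked to C);
# complex 1 is a free B.
mixture = [
    Agent("A", 0, {"x"}),
    Agent("C", 0, set()),
    Agent("B", 0, {"y"}),
    Agent("B", 1, {"y"}),
]
matches = match_binding_rule(mixture, ("A", "x"), ("B", "y"))
for a, b, kind in matches:
    print(a.name, a.complex_id, b.name, b.complex_id, kind)
```

In this toy mixture the same rule yields one ring-closing (unimolecular) match and one match that joins two complexes (bimolecular), which is the ambiguity that two rate constants would normally resolve.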


Figure 4.

Schematic free energy landscape.

The schematic shows the free energy landscape for a case in which differences in affinities are entirely represented by differences in off-rates. Here we have two different binding reactions: A binds B and C binds D. “A+B” and “C+D” represent the unbound states on the far left of the schematic reaction coordinate; the unbound states in this case have roughly the same free energy. The transition states (represented by “A B” and “C D”) also have approximately the same free energy; the change in free energy from the unbound state to the transition state is identical in both cases (giving identical on-rates). However, the bound states (“AB” and “CD”) exhibit very different free energies, and the difference in free energy change between the transition state and the bound state results in a much higher off-rate for the C–D binding reaction than for the A–B binding reaction.
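The link between barrier heights and rate constants in such a landscape can be sketched numerically. The example below assumes an Arrhenius-type relation (rate proportional to exp(-barrier/RT)) with a shared, arbitrary prefactor, and uses hypothetical barrier heights. None of these numbers come from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15  # kelvin, assumed


def rate_from_barrier(barrier_kcal, prefactor=1.0, temperature=TEMPERATURE):
    """Arrhenius-type rate constant: prefactor * exp(-barrier / RT)."""
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


# Hypothetical barriers in kcal/mol, for illustration only.
assoc_barrier = 5.0       # unbound state -> transition state, identical for both pairs
dissoc_barrier_ab = 15.0  # bound AB -> transition state (deep well)
dissoc_barrier_cd = 10.0  # bound CD -> transition state (shallower well)

k_on_ab = rate_from_barrier(assoc_barrier)
k_on_cd = rate_from_barrier(assoc_barrier)
k_off_ab = rate_from_barrier(dissoc_barrier_ab)
k_off_cd = rate_from_barrier(dissoc_barrier_cd)

# With identical on-rates, the ratio of dissociation constants (k_off / k_on)
# is set entirely by the ratio of off-rates.
kd_ratio = (k_off_cd / k_on_cd) / (k_off_ab / k_on_ab)
print(k_on_ab == k_on_cd, k_off_cd > k_off_ab, kd_ratio)
```

Because the same prefactor is used throughout, it cancels in the ratio. Absolute rate constants would need physically meaningful prefactors with the appropriate units.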


Figure 5.

Combinatorial complexity of the cSIN.

A: The number of unique complexes that the cSIN could produce, as a function of complex size, obtained by brute-force enumeration. As described in the text, complexes that contain more than one copy of a particular protein are discarded, since they could correspond to polymers. Given that the NR constraint allows for multiple copies of a protein to enter a complex in certain situations (see section 7.1 of Supporting Information S1), the numbers displayed here represent a lower bound on the number of unique complexes for the NR constraint. The red line represents an exponential regression of the data, with . B: The estimated combinatorial complexity of cSIN-like acyclic networks as a function of network size, using the procedure described in section 3 of Supporting Information S1. Each point represents an average over 10 independently generated model networks with the same edge density as the cSIN. The red line depicts an exponential regression with .
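An exponential regression of this kind can be sketched as an ordinary least-squares fit on the logarithm of the counts. Whether the authors fitted in log space or used a nonlinear fit is not stated here, so the approach below is an assumption, and the data are synthetic rather than taken from the figure.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by linear least squares on ln(y); returns (a, b)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b


# Synthetic counts that grow exactly exponentially with complex size.
sizes = [1, 2, 3, 4, 5]
counts = [2.0 * math.exp(0.7 * s) for s in sizes]
a, b = fit_exponential(sizes, counts)
print(a, b)
```

Fitting in log space weights relative rather than absolute deviations, which is often a sensible choice when counts span several orders of magnitude.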


Figure 6.

Dynamic diversity of the cSIN in yeast cells.

A: The graph reports the number of unique complexes actually present in a simulated system (“cell”) as a function of time. Each point represents an average over 15 independent simulations. In all panels of this figure, the error bars represent approximately 95% confidence intervals. B: The normalized distance between the complement of complexes (“complexomes”) generated by individual simulations is shown as a function of time. Each point is an average over all unique comparisons between 15 independent simulations. Using the parameters described in the text, the separation between steady states reaches % of the maximal distance. C: The stationary distance between cells is shown as a function of complex size, averaged over all of the unique comparisons between 15 independent simulations. The complexomes of cells are nearly identical with regard to small complexes, due to fewer combinatorial possibilities and the high relative abundance of small complexes (see Figure 7 below). However, complexomes differ dramatically for large complexes. This is the case for all combinations of parameters and ring closure scenarios we have tested (see below and Supporting Information S1). Since other parameter sets do not substantially change the relationship shown here, much of the difference in inter-cell distances for these parameter sets derives from how heavily the dynamics sample large complexes. D: The distance between a cell at time and the same cell at time is shown as a function of . The first time point is taken after cells have reached steady state (in this case, = 2, see panels A and B). The blue line denotes the average inter-cell distance at steady state, taken from the last time point in panel B above. The red curve represents an exponential fit to the relaxation, with .
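Averaging "over all unique comparisons between 15 independent simulations" means averaging over 105 unordered pairs. The sketch below shows that bookkeeping. The distance measure actually used in the paper is not reproduced in this section, so a Jaccard distance on sets of complex labels is used purely as a stand-in, and the complex labels are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Stand-in distance: 1 - |intersection| / |union| (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_distance(complexomes, distance=jaccard_distance):
    """Average a distance over every unordered pair of simulated cells."""
    pairs = list(combinations(complexomes, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Number of unique comparisons among 15 simulations.
n_unique_pairs = len(list(combinations(range(15), 2)))
print(n_unique_pairs)

# Three toy cells: two share their only complex, the third shares nothing with them.
toy_cells = [{"AB"}, {"AB"}, {"CD"}]
print(mean_pairwise_distance(toy_cells))
```

Passing a different `distance` callable would let the same averaging be reused with an abundance-weighted measure.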


Figure 7.

Distribution of complex sizes.

The graph shows the distribution of complex sizes for NR simulations with all dissociation constants set to . This distribution is calculated at the final time point for the simulations represented in Figure 6. The points on the graph represent the average probability of finding a complex of a certain size across 15 independent simulations. The error bars in this case are set to approximate confidence intervals; for large complexes, the error bars exceed the scale for the lower bound. This is because the confidence intervals include 0, which cannot be displayed on the logarithmic scale of the ordinate.
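The remark about error bars that cannot be drawn on a logarithmic ordinate can be illustrated with a small calculation. This is a sketch assuming a Student-t interval over 15 simulations; how the authors computed their intervals is not stated here, and the sample values are invented.

```python
import math
from statistics import mean, stdev

# Two-sided 95% Student-t critical value for 14 degrees of freedom (15 samples).
T_CRIT_DF14 = 2.145


def approx_ci95(samples, t_crit=T_CRIT_DF14):
    """Return (mean, lower, upper) of an approximate 95% confidence interval.

    The default critical value is only appropriate for 15 samples.
    """
    m = mean(samples)
    half_width = t_crit * stdev(samples) / math.sqrt(len(samples))
    return m, m - half_width, m + half_width


def drawable_on_log_axis(lower_bound):
    """A lower bound at or below zero has no position on a logarithmic axis."""
    return lower_bound > 0


# Invented probabilities for a rare large complex: seen in 1 of 15 simulations.
rare = [0.0] * 14 + [0.15]
m, lo, hi = approx_ci95(rare)
print(m, lo, hi, drawable_on_log_axis(lo))
```

For rarely observed large complexes the interval easily extends to zero or below, so only the upper part of the error bar can be plotted.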


Figure 8.

Comparison between network dynamics based on uniform affinities and concentration-based affinities.

A: The number of unique complexes in independent simulations as a function of time: each curve represents the average over 15 independent simulations. In this panel, as with all of the panels in this figure, the error bars represent confidence intervals. Allowing interaction strengths to vary across the network produces more unique complexes at steady state ( for the variable case compared to for the case). B: Comparison of the distribution of complex sizes: the distributions represent the probability of finding a complex of a particular size across the entire population of 15 simulations at the final time point in panel A. The two interaction affinity scenarios produce similar distributions, with the simulations sampling somewhat larger complexes. C: Comparison of the distance between independent simulations over time: each curve represents the average over all unique comparisons between 15 independent simulations using the distance measure defined in equation 5. As in panel B, the two scenarios produce essentially identical curves. D: Comparison of the distance between independent simulations as a function of complex size: each curve represents the average over all unique comparisons between 15 independent simulations at the final time point in panel A. Again, the two parameter scenarios produce essentially the same result.


Figure 9.

Binding free energies and dissociation constants for the cSIN2.

A: A plot of the distribution of free energies for reactions in the cSIN2. The black circles are a histogram of the free energies; the grey line represents a smoothed version of the distribution. The average free energy is kcal/mol, which corresponds to a dissociation constant of nM. B: A comparison of the structure-based dissociation constants for each edge in the cSIN2 (abscissa) and the concentration-based dissociation constants (ordinate). For each interaction in the cSIN2, the concentration-based dissociation constant is obtained using equation 3. Despite the similarity in the average affinity in both cases (corresponding to a dissociation constant of around nM), the two methods produce values that are very different from one another: the linear correlation produces an R of .
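The correspondence between a binding free energy and a dissociation constant can be sketched with the standard thermodynamic relation, assuming a 1 M standard state, a temperature of 298.15 K and the convention that more negative free energies mean tighter binding. The paper's own convention and temperature are not given in this section, and the example value is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15  # kelvin, assumed


def kd_from_free_energy(delta_g_kcal, temperature=TEMPERATURE):
    """Dissociation constant in molar from a binding free energy: Kd = exp(dG / RT)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature))


def free_energy_from_kd(kd_molar, temperature=TEMPERATURE):
    """Binding free energy in kcal/mol from a dissociation constant: dG = RT ln(Kd)."""
    return R_KCAL * temperature * math.log(kd_molar)


# Hypothetical example: a binding free energy of -10 kcal/mol.
kd = kd_from_free_energy(-10.0)
print(kd * 1e9, "nM")
print(free_energy_from_kd(kd), "kcal/mol")
```

Because the relation is exponential, a shift of about 1.4 kcal/mol at room temperature changes the dissociation constant roughly tenfold, which is why similar average free energies can still hide large edge-by-edge differences.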


Figure 10.

Results from NR simulations of the cSIN2.

A: The number of unique complexes in independent simulations as a function of time: this curve represents the average over 15 independent simulations. In this panel, as with all other panels in this figure, the error bars represent confidence intervals. The steady-state number of unique complexes is slightly smaller for the cSIN2 than for the original cSIN using constant nM affinities ( compared with ). B: This plot shows the probability of finding a complex of a particular size across the entire population of 15 simulations at the final time point in panel A. The distribution of sizes is similar to that found for NR simulations of the original cSIN, although the complexes are, on average, somewhat smaller than those obtained from NR simulations of the cSIN at . C: This plot displays the distance between independent simulations over time: the curve represents the average over all unique comparisons between 15 independent simulations using the distance measure defined in equation 5. The distances obtained from the cSIN2 are slightly lower than those obtained from the cSIN at ( vs. ). D: This curve represents the distance between simulations as a function of complex size, averaged over all unique comparisons between 15 independent simulations at the final time point in panel A. The overall shape of this curve is essentially identical to the nM case for the original cSIN as displayed in Figure 6; the main difference is that the simulations based on structure-derived dissociation constants sample somewhat smaller complexes than the original nM case.
