Perturbation Centrality and Turbine: A Novel Centrality Measure Obtained Using a Versatile Network Dynamics Tool

Analysis of network dynamics became a focal point to understand and predict changes of complex systems. Here we introduce Turbine, a generic framework enabling fast simulation of any algorithmically definable dynamics on very large networks. Using a perturbation transmission model inspired by communicating vessels, we define a novel centrality measure: perturbation centrality. Hubs and inter-modular nodes proved to be highly efficient in perturbation propagation. High perturbation centrality nodes of the Met-tRNA synthetase protein structure network were identified as amino acids involved in intra-protein communication by earlier studies. Changes in perturbation centralities of yeast interactome nodes upon various stresses well recapitulated the functional changes of stressed yeast cells. The novelty and usefulness of perturbation centrality was validated in several other model, biological and social networks. The Turbine software and the perturbation centrality measure may provide a large variety of novel options to assess signaling, drug action, environmental and social interventions.

Scale-free, modular benchmark graphs [1] were generated as described in Supplementary Methods. Average perturbation centrality (black squares), average degree (red circles), the reciprocal of average fill time (green triangles, added perturbation: 10,000 units per step) and average "number of visited nodes" (blue diamonds) for a damping value of 0.85 were calculated from 3 randomly generated benchmark graphs with ratios of inter-modular edges ranging from 0.05 to 0.85 with steps of 0.05 as described in Methods of the main text for Turbine (with the change that a 95% threshold was used for the fill time -that is, 95% of the network had to have an energy value larger than 1 -since the benchmark networks were much more homogeneous than real-world networks) and in the Supplementary Methods for ITM-Probe [2]. Values were normalized using the scale function of the R package [3]. Note that the Spearman correlation between the various values was more than 0.95, except for the average degree. Figure S2. Correlation of perturbation centrality, "number of visited nodes" and node degree. Scale-free, modular benchmark graphs [1] were generated as described in Supplementary Methods. Perturbation centrality and "number of visited nodes" measures were calculated from 3 sets of randomly generated benchmark graphs with ratios of inter-modular edges ranging from 0.05 to 0.85 with steps of 0.05 as described in Methods of the main text and in Supplementary Methods for ITM-Probe [2], respectively. Spearman correlations were calculated using the R package [3]. Panel A shows correlation of perturbation centrality versus node degree, Panel B shows the correlation of number of visited nodes versus the node degree, and Panel C shows the correlation of perturbation centrality and number of visited nodes. The data in Panel A reinforces the observation in the main text that the degree becomes more important in the determination of the silencing time as the modules become more and more fuzzy and overlapping. Interestingly, results from ITM-Probe behave in an exactly opposite way: as the communities become more overlapping, the number of visited nodes measure quickly becomes negatively correlated with the degree (possibly because random walks can "turn back"). These two effects taken together resulted in a large correlation between the perturbation centrality and number of visited nodes when there were pronounced modules, and a negative correlation when the modules became fuzzier, which is shown by the data in Panel C. Figure S1. Comparison of perturbation centrality with the "number of visited nodes" measure of ITM-Probe as a function of node degree with different ratios of inter-modular edges of benchmark graphs. Scale-free benchmark graphs [1] with overlapping modules were generated as described in Supplementary Methods. Perturbation centrality (red, Panels A through F) and "number of visited nodes" measures (blue, Panels G through L) were calculated as described in Methods of the main text and in Supplementary Methods for ITM-Probe [2], respectively. For the generation of the benchmark graphs with ratios of inter-modular edges 0.05, 0,2, 0.35, 0.5, 0.65 and 0.80 appearing on Panels A/G, B/H, C/I, D/J, E/K and F/L, respectively, a random seed of 87 was used. The results suggest that nodes in the networks with pronounced modules give similar results using Turbine and ITM-Probe (observe the same striping pattern showing the better perturbation propagation capability of nodes having inter-modular edges). On the contrary, in networks with fuzzy modules, the result is still correlated with the degree in Turbine, but ITM-Probe results do not seem to depend on the degree. These are the same results that can be obtained from Figure S2; this figure serves as an illustration of the possible underlying pattern behind the change in correlation: the number of nodes visited measure seems to have an upper saturation-like limit in ITM-Probe in networks with largely overlapping modules.

Figure S4. A visual representation of the relation among different centrality measures.
We calculated Spearman correlations between different centrality measures for the 10 benchmark and real-world networks shown in Table 2 of the main text. A 7-node graph was created from this data using the different centrality measures as nodes, and the average correlation between pairs of centralities as edge weights. The graph was thereafter laid out using the ForceAtlas 2 layout algorithm (which uses edge weights) of Gephi [4] with default settings except for the "Edge weight influence" option, which was set to 5.0. The layout generated this way can be a good approximation of the relations between different centralities, since the more correlated centrality measures are connected by more powerful "springs"; thereby their final position is closer to each other. It is visible on the figure that the perturbation centrality measure occupies a new position with largest correlations to closeness centrality, community centrality and weighted degree, just as the mean correlations of Table 2 in the main text show.

Figure S5. Distribution of perturbation centralities in 10 benchmark and real-world networks.
Histograms were generated using the same data set for 2 benchmark and 8 real-world networks that were used for Table S1 and S2. Detailed descriptions of the networks are available in Supplementary Methods. Perturbation centralities were calculated according to Methods of the main text. Histograms from the perturbation centralities were generated using the "hist" command of R [3], with default settings. The tested social network (Panel F), the modular benchmark network [1] with pronounced modules (Panel I) and both conformations of Met-tRNA synthetase [5] (Panels G and H) had approximately normal distributions of perturbation centrality values. The Filtered Yeast Interactome [6] (Panel E) and the 2010 release of the Database of Interacting Proteins [7] (Panel D), as well as the modular benchmark network with fuzzy modules (Panel J) seemed to have approximately lognormal distributions for perturbation centrality. The histogram of the 2005 release of the Database of Interacting Proteins looked like a scale-free distribution (Panel C), and finally, the most skewed distributions were the perturbation centralities of the two metabolic networks [8,9], which looked like exponential distributions. However, all distributions failed the Shapiro-Wilk normality test (p=5*10 -16 , 5*10 -22 , 10 -44 , 10 -31 , 3*10 -26 , 0.0003, 0.0094, 2*10 -7 , 3*10 -9 , 3*10 -45 , respectively), so the Wilcoxon rank-sum test had to be used for statistical significance analysis instead of a t-test.  [5] ("Signaling residues"), as well as amino acids with experimentally verified importance [5] ("Experimental residues"). Light blue, dark red and green bars show average perturbation centralities of amino acids in loop, α-helical and β-sheet structures, while gray bars show the global average perturbation centrality calculated for the whole protein. Pink and brown bars of Panels E and F show average perturbation centralities of the Signaling and Experimental residues, respectively. Note that both betweenness and closeness centralities were less successful than perturbation centrality in differentiating between the above amino acid sets (cf. the current Figure with  This figure is a direct parallel to Figure S6 and Figure S8, here using betweenness centrality instead of perturbation and closeness centralities, respectively. Protein structure networks of the substrate-free and substrate-bound forms of the E. coli Met-tRNA synthetase and rabbit cytochrome P450 2B4 proteins were generated as described in Methods of the main text and Supplementary Methods. Assignment of secondary structures for different amino acids was performed by PyMOL. Error bars show standard error of the mean. Different letters mean significantly different groups (α=0.01, Wilcoxon rank-sum test). Panels A and B show data calculated for the substrate-free and substratebound form of Met-tRNA synthetase, respectively. Panels C and D show data calculated for the substrate-free and imidazole-bound forms of cytochrome P450 2B4. Panels E and F show the average perturbation centralities of predicted communication pathways as described by Ghosh and Vishveshwara [5] ("Signaling residues") and other residues with experimentally verified importance [5] ("Experimental residues"). Light blue, dark red and green bars on Panels A through D show average perturbation centralities for residues in loops, α-helices and β-sheets, while gray bars show the global average perturbation centrality calculated for the whole protein. Pink bars on Panels E and F show average perturbation centralities of the Signaling residues, and brown bars show the means of the Experimental residues of Met-tRNA synthetase. Betweenness centrality returned by far the largest deviations of the three tested centralities (i.e. closeness, betweenness and perturbation centralities). Loops still had significantly (p=0.0016, 0.011, 4.3*10 -5 , 6.8*10 -7 for the free and bound conformations of Met-tRNA synthetase and cytochrome P450, respectively; Wilcoxon rank-sum test, α=0.00625 adjusted with Bonferroni correction) lower mean centrality values than the global average in all networks (but the substrate-bound Met-tRNA synthetase). Using betweenness centrality α-helices can no longer be distinguished from the global mean (except for the substrate-bound form of cytochrome P450). Signaling residues could still be distinguished from the global mean (Panels E and F), but the differences in centralities for the Experimental residues were no longer significant.

Figure S8. Average closeness centralities of different residue groups in Met-tRNA synthetase and cytochrome P450 enzymes.
This figure is a direct parallel to Figure S6 and Figure S7, here using closeness centrality instead of perturbation or betweenness centralities, respectively. Protein structure networks of the substrate-free and substrate-bound forms of the E. coli Met-tRNA synthetase and rabbit cytochrome P450 2B4 proteins were generated as described in Methods of the main text and Supplementary Methods. Assignment of secondary structures for different amino acids was performed by PyMOL. Error bars show standard error of the mean. Different letters mean significantly different groups (α=0.01, Wilcoxon rank-sum test). Panels A and B show data calculated for the substrate-free and substratebound form of Met-tRNA synthetase, respectively. Panels C and D show data calculated for the substrate-free and imidazole-bound form of cytochrome P450 2B4. Panels E and F show the average perturbation centralities of predicted communication pathways as described by Ghosh and Vishveshwara [5] ("Signaling residues") and other residues with experimentally verified importance [5] ("Experimental residues"). Light blue, dark red and green bars on Panels A through D show average perturbation centralities for residues in loops, α-helices and β-sheets, while gray bars show the global average perturbation centrality calculated for the whole protein. Pink bars on Panels E and F show average perturbation centralities of the Signaling residues, and brown bars show the means of the Experimental residues of Met-tRNA synthetase. Closeness centrality returned smaller deviations than perturbation centrality. Interestingly, the distinction power of closeness centrality was slightly lower in Met-tRNA synthetase, and exactly the same in cytochrome P450 as the distinction power of perturbation centrality. In particular, the Experimental residues could no longer be distinguished from the global mean (Panels E and F, p=0.034, p=0.04, respectively; 0.007 and 0.006 with perturbation centrality), α-helices did not have significantly higher mean closeness centrality than the global average in the substrate-bound Met-tRNA synthetase (Panel B, p=0.014 vs. 0.00015 with perturbation centrality), and β-sheets were no longer distinguishable from loops in the substrate-free Met-tRNA synthetase (Panel A, p=0.035 vs. 0.0034 with perturbation centrality). P-values were calculated using the Wilcoxon rank-sum test, α=0.0125 adjusted with Bonferroni correction for a FWER of 0.05. Figure S9. Amino acids of Met-tRNA synthetase directly bound to tRNA Met . The underlying protein structure network of Met-tRNA synthetase was calculated and visualized by the Turbine program as described in Methods, and was overlaid on the 3D image of the substratebound form of the protein (and its tRNA Met complex) generated with PyMOL [10] using ray-tracing.
The bottom of the image shows the structure of tRNA Met . The purple molecule in the middle of the protein structure is the substrate Met-AMP marking the active site of the enzyme, the white sphere on the right is the Zn 2+ ion. Blue circles mark those amino acids, which are directly bound to the tRNA Met , evidenced by an atomic distance of less than 4.5Å between any atom of the residue and the tRNA Met , excluding hydrogen atoms.

Figure S10. Results for edgetic perturbations.
We also tested the effects of edge-based perturbations as their importance have previously been stated in the literature [11]. Silencing time for an edge was calculated by propagating a given amount of energy (10,000, 40,000 and 1,000,000 units) from both end-nodes of an edge, simultaneously. We calculated silencing times for all edges for two modular benchmark networks [1], one with pronounced modules (only 5% of the links were inter-modular), and one with fuzzy modules (40% intermodular links), both generated with the random seed 10. Silencing times were measured with a dissipation value of 1, and silencing threshold was also set to 1. Further definition of the silencing time is available in Methods of the main text. A link was termed inter-modular, if it was connecting two different communities. Dark red bars show data obtained on the network with pronounced modules, while light blue bars display data from the network with fuzzy modules. We could observe the very same effects for edge-based perturbations that we have observed for node-based perturbations. Panel A shows the mean perturbation centralities calculated for a large starting perturbation (1,000,000 units on both endpoints of an edge). There was a major (p=0, Wilcoxon rank-sum test) difference between the mean silencing times between fuzzy and pronounced modules in this case, and inter-modular edges were significantly (p=0.0094, α=0.025 with Bonferroni correction, Wilcoxon rank-sum test) better spreaders of perturbation in the network with pronounced modules. The disparity between the edgetic perturbations of networks with fuzzy versus pronounced modules was nearly eliminated, if a starting perturbation of 10,000 units was applied (Panel B). These effects were the same as what we have demonstrated with node-based perturbations on Figures 1 and 2 of the main text. Panel C shows silencing times calculated for 40,000 units of starting energy corresponding to the definition of perturbation centrality measure. (40,000 units of starting energy was used, since the benchmark networks contained 4000 nodes, and perturbation centrality was defined as the reciprocal of the silencing time resulting from applying a perturbation of 10*n units of energy, where n is the number of nodes in the network, see Methods of the main text.) Data of Panel C verifies that the choice of n*10 units as a starting perturbation is a nice compromise between weighing the nodes' mesoscopic position, that is, the modular location (which can be detected using a large starting perturbation) and their local position, that is, the weighted degree of the node and its near neighbors (which can be detected using a small starting perturbation). Tables   Table S1. Correlation between the reciprocal of fill time and closeness centrality a Network descriptions are given in Supplementary Methods of Text S1. b Correlation data show the Spearman's rho correlation of the reciprocal of the fill time versus the standard closeness centrality measure. Correlations between the reciprocal of fill time and closeness centrality are much stronger than those between perturbation centrality and closeness centrality (mean is 0.895 compared to 0.67 in Table 1, p=0.000487, Wilcoxon rank-sum test (correlations with closeness centrality in Table 1 failed the Shapiro normality test with p=0.0019). c Fill time was calculated for each node in each network by applying a perturbation of size 10,000 to the selected node in each time step until more than 80% of the network nodes had an energy level larger than 1. Dissipation was set to 0.

Table S2. Correlation of silencing times calculated for three different-sized starting perturbations in 8 real-world and two benchmark networks
Parameters of the different networks are available in the Methods section of the main article. Only the two benchmark graphs were model networks, generated using the benchmark graph generator tool of Lancichinetti and Fortunato [1]. The other graphs originated from different real-world scenarios. Low-intensity perturbation corresponds to an n unit large Dirac-delta starting perturbation, medium-intensity perturbation corresponds to 10*n units, and high-intensity means that a 100*n unit-sized starting perturbation was applied to a single node when calculating its silencing time, n being the number of nodes in the network. Silencing time was calculated for each node in the network as described in the Methods section of the main article. The columns show Spearman's correlation between silencing times calculated for nodes in the same network with different-sized starting perturbations. Correlations above 0.8 were marked with bold letters. It can be noticed that there are no abrupt changes in the importance in the perturbation dissipation capability of different nodes as the perturbation grows larger except for the benchmark network with disjunct modules, which means that real-world networks may display behavior closer to the benchmark graph with fuzzy modules. The table also underlines the choice of choosing n*10 as the size of the perturbation when calculating perturbation centrality, since it already seems to display module entrapment effects evidenced by the high correlation observed between medium-and high-intensity perturbations in the benchmark network with disjunct modules compared to the substantially lower correlation observed in the same network between lowand medium-intensity perturbations.

Network Correlation between low-and medium-intensity perturbations
Correlation between mediumand high-intensity perturbations Perturbation centralities were calculated with the Turbine software as described in Methods of the main text. Term enrichment analysis was performed with the R plug-in of g:Profiler [12], which returns both the enriched terms, and the proteins connected with the term. A term was stated as statistically significant, if the resulting p-value was strictly less than 0.05 after applying Bonferroni correction. Results show the high importance of cell cycle maintenance and DNA repair in both stressed and unstressed cases.
List of individual perturbation properties of residues predicted to participate in intra-protein signaling [5]

Residue
High importance in the substrate-free conformation The list of amino acids of E. coli Met-tRNA synthetase predicted to participate in the transmission of conformational changes was taken according to the text and Figure 5 of the paper of Ghosh and Vishveshwara [5]. The list of amino acids with experimentally verified importance was taken from the same article [5] listing earlier findings. Protein structure networks of the substrate-free and substratebound forms of E. coli Met-tRNA synthetase were constructed as described in Supplementary Methods. Perturbation centralities were calculated as described in Methods of the main text. Differences of perturbation centralities were obtained by subtracting the rank of the perturbation centrality of the substrate-bound conformation from the rank of the perturbation centrality of the substrate-free conformation. Residues in the top 20% of largest perturbation centrality in either the substrate-free or the substrate-bound conformation, and residues having an increase or decrease of perturbation importance in the top 20% are shown in the respective column marked by letter "Y". Boldface Y-s signify a top 10% level of importance. Starting and ending residues of the predicted communication pathways, Leu-13 and Trp-461 were also marked with bold letters.

Supplementary Results
Connection of the communicating vessels model with perturbation dissipation using shortest paths and having an exponential decay In the communicating vessels model the change of the energy of a node is given by the following differential equation: (1) where S is the energy of the current node, l is the number of edges of the current node, i w is the weight of the i th edge, i S is the current energy of the node on the other end of the i th edge, and 0 D is a parameter defining the amount of energy dissipated by a node in a given time step.
Let us define the case of unobstructed propagation as i S S >> . The consequence of the above definition is that at any given time step, the amount of perturbation transmitted to neighbors is . The amount transferred to the n th neighbor is approximately using the same logic, which results in an exponential decline with distance. If 1 ≥ i w , transfer will stop after a second or less in simulation with energy levels evening out to S/2, S/2 for the two affected nodes, which results in the same exponential decline, if the condition i S S >> holds for the other nodes as well. If there is a connection between nodes at the same distance from the origin, no energy will be transmitted, since i S S = if the edge weight from the origin is the same for both nodes. Generally, if there are no large differences between edge weights, nodes with the same distance from the origin will tend to have similar values. If there are multiple edges at a given network shell (i.e. distance from the node where perturbation started) to the next one, this only changes the amount of energy propagated by a proportional amount. According to this logic, we can hypothesize that these perturbations tend to travel on shortest paths. , which is a simple linear dissipation with time, resulting in no energy transfer with distance. Most real world cases are somewhere between the above two extremes.
Module entrapment occurs when the energy level of a module is too high for the number and weight of inter-modular edges to propagate the perturbation outwards the module, so the energy level inside the module is raised to i j S S = for every i and j node inside the module.

Comparison of the communicating vessels model with the random walk model of ITM-Probe
ITM-Probe is an algorithm modeling information propagation in complex networks using a random walk model [2]. For the comparison of the communicating vessels model used by Turbine with the ITM-Probe method, perturbation centralities obtained on networks with different ratios of inter-modular edges were compared with the "number of visited nodes" measure of the emitting model of the ITM-Probe method as described in the Supplementary Methods section.
The "number of visited nodes" measure used by the emitting model of the ITM-Probe method is theoretically quite similar to the perturbation centrality of the communicating vessels model used by Turbine, since in the ITM-Probe model the random walk has a defined probability (with a default value of 15%) to end at every node visited. This means that the average distance of the last node of the random walk is proportional with the distance from the originating node, so in the end the nodes closer to the origin tend to dissipate more walks in ITM-Probe [2] in the same manner as nodes closer to the origin dissipate more energy in our communicating vessels model.
In the comparison of Turbine with ITM-Probe a total of 51 benchmark graphs [1] were tested, with a ratio of inter-modular edges increasing from 0.05 to 0.85 with steps of 0.05. Three networks were generated with each ratio of inter-modular edges value, with three different random seeds to make the results more robust as described in Supplementary Methods. Perturbation centrality and the "number of visited nodes measure" were calculated for each node (1,000 in each network).
In the first simulation perturbation centrality and the "number of visited nodes" measure were averaged for all the 3 randomly generated networks for each inter-modular edge ratio. Spearman correlation of perturbation centrality and the "number of visited nodes" measure was calculated with the R program package [3]. Results are shown in Figure S1. Spearman correlation of the average number of visited nodes with the average perturbation centrality calculated for low-intensity perturbations was a perfect 1 meaning that the limitation of perturbation propagation imposed by modular boundaries decreased in exactly the same manner, when assessed by either the Turbine or the ITM-Probe method as the modular structure was coalescing by the gradual increase of the number of inter-modular edges.
Despite the above effect, large differences were observed between the tested measures of ITM-Probe and Turbine. Correlations of degrees, number of visited nodes and perturbation centrality were much lower in different simulations depending on the level of modularity as shown on Figure S2. The correlation between perturbation centrality and the "number of visited nodes" measure ( Figure S2C) was high when the modularity of the network was noticeable (the ratio of inter-modular edges was smaller than 0.3). However, when the modules started to merge, the correlation diminished, and even turned to negative at the highest ratios of inter-modular edges. An explanation of this effect lies in the fact that in the ITM-Probe method, the degree of a node is inversely correlated with the "number of visited nodes" measures at ratios of inter-modular edges higher than 0.2 ( Figure S2B). The diminished importance of hubs in information spread is rather counter-intuitive. For low ratios of inter-modular edges, the correlation between the "number of visited nodes" measure and node degree was positive, which shows that high-degree inter-modular nodes are the most important information spreaders in highly modular networks, and in the scale-free benchmark graphs nodes having a higher degree also have a higher chance of gaining inter-modular edges. Correlation of perturbation centrality with node degree varied with a much lower effect size (Δr=0.25), as shown on Figure S2A. The differences observed can be explained by the limiting effect of modular boundaries on perturbation propagation described in the main text. Figure S3 illustrates the individual perturbation centralities and "number of visited nodes" measures for a single scale-free benchmark graph generated using the random seed value of 87. Perturbation centrality was positively correlated with the node degree in all ratios of intermodular edges analyzed, but the limiting effect of modular boundaries on the propagation of perturbations was noticeable at low ratios of inter-modular edges. This effect was evident by the segregation of perturbation centrality data to stripes at the ratio of inter-modular edges of 0.05. At this benchmark graph configuration nodes had only one or two inter-modular edges, if any. The top segregated layer of perturbation centrality data corresponds to nodes having two inter-modular edges, while the middle layer of perturbation centrality values corresponds to nodes having one inter-modular edge. The highest "number of visited nodes" values did not change with growing ratios of inter-modular edges (cf. Panels G through L of Figure S3), only the lowest "number of visited nodes" values -and thus the average -got higher as ratio of inter-modular nodes increased, resulting in a saturation-like effect for the "number of visited nodes" measure possibly explaining the inverse correlation with degree in networks with highly overlapping modules.

Dissipation-free propagation of perturbation is characterized by fill time, which reciprocally correlates with closeness centrality
Fill time was defined as a measure assessing the propagation of a continuous perturbation in the communicating vessels model without dissipation as stated in the main text. The fill time of node i is the time needed to raise the energy level of 80% of the nodes above 1 unit (0.01% of initial perturbation) in a simulation, where a perturbation of 10,000 units was added to node i in each time step, and was propagated without dissipation. Fill time was calculated for all nodes in several benchmark graphs and in real-world networks. Table S1 shows that the reciprocal of the fill time of node i strongly correlated with the closeness centrality of the same node ( , one-sample t-test, Shapiro-Wilk normality test successful with p=0.178). We note that the two metabolic networks, where the correlation with the fill time was the lowest, correspond to a different class of networks according to Guimerà et al. [20] Since the closeness centrality of node i is defined as the mean geodesic distance (mean shortest path) between node i and all other nodes, the high correlation between the reciprocal of fill time and closeness centrality meant that the shortest paths determined most of the dissipation-free propagation of perturbations. This agrees with the expectations [21], and validates the use of the communication vessels model, which has this property as shown in the Supplementary Results of Text S1.

Comparing the effect of degree and modular position on perturbation centrality in realworld networks
In the experiment comparing the silencing times of the Lancichinetti [1] benchmark networks, the ratio of perturbation dissipation efficiencies of two node categories out of the 4, namely: inter-modular non-hubs and intra-modular hubs, had interesting changes with the modular structure. As we noted earlier, modules of real world networks seem to be more overlapping than the pronounced modules of our benchmark graphs. Starting from these notions we compared the mean perturbation centrality of intra-modular hubs against inter-modular nonhubs as the percentage of the mean perturbation centrality of intra-modular non-hubs in multiple real-world networks ( Table S8). Hubs were nodes with the top 10% degree, and inter-modular non-hubs were those with at least 40% inter-modular edges -just as in the previous calculations. Inter-modular non-hubs had only a 15% larger perturbation centrality, while hubs had a 115% larger perturbation centrality than intra-modular non-hubs. The large (87%) difference between the effect of hubs versus the effect of inter-modular non-hubs suggest that from a perturbation perspective real-world networks resemble the benchmark graphs with fuzzy modules more, than the benchmark graphs with pronounced modules. (Note that the same observation was obtained when we compared the low-intensity and highintensity silencing times -see Table S2.)

Amino acids participating in intra-protein signaling have a high perturbation centrality in the protein structure network
We have assessed the perturbation centrality of amino acids forming α-helices, β-sheets and loops in two pairs of protein structure networks corresponding to the substrate-free and substrate-bound conformations of E. coli Met-tRNA synthetase and rabbit cytochrome P450 2B4, respectively. The Wilcoxon rank-sum test (α=0.00625 adjusted with Bonferroni correction) indicated significantly (p=0.00023, 0.00015, 0.00083, 0.0014 for the free and bound conformations of Met-tRNA synthetase and cytochrome P450, respectively) larger perturbation centralities of α-helices compared to the global mean, while the same test indicated significantly (p=3.2*10 -6 , 9.5*10 -6 , 2.3*10 -6 , 0.0001 for the free and bound conformations of Met-tRNA synthetase and cytochrome P450, respectively) smaller perturbation centralities for loops compared to the global mean. (Figure S6A through S6D; α=0.001). The average perturbation centrality for β-sheets showed a larger variation. The same data calculated for betweenness and closeness centralities is shown on Figures S7 and  S8 of Text S1. Betweenness centrality had a much larger deviation than perturbation centrality, and both closeness and betweenness centralities could differentiate less between amino acids in different secondary structures than perturbation centrality. residues (p=0.0072, 0.061 for the free and Met-tRNA-bound conformations, respectively) than average. Data in Figures S7 and S8 of Text S1 show that both closeness and betweenness centralities could differentiate Signaling residues, but neither closeness nor betweenness centralities could differentiate the Experimental residues from the average centrality of the whole protein. 67 or 60% of Signaling or Experimental residues, respectively, were in the top 20% of amino acids having the largest perturbation centrality and/or the largest change of perturbation centrality upon substrate binding (Table S9 of Text S1).

Testing the effects of edgetic perturbations
The definition of perturbation centrality described in the main text: "the reciprocal of silencing time retrieved by using a Dirac delta type starting perturbation of 10*n units, where n is the number of nodes in the network, using a dissipation value of 1 can be extended to describe the perturbation centrality of an edge, where the perturbation centrality of the edge connecting nodes i and j is the reciprocal of the silencing time obtained when the same perturbation was started from nodes i and j at the same time. Testing these edgetic perturbations on benchmark networks revealed that edgetic perturbations show the same basic properties as (single) node-based perturbations (cf. Figure 1 and Figure S10 of Text S1).

Benchmark graphs
Scale-free, modular benchmark graphs were generated using the unweighted, undirected benchmark graph generating tool of Lancichinetti and Fortunato [1]. Double edges were removed from the generated networks. Random seed values of 59, 87 and 88 were used to generate three sets of networks. When 7 sets of networks were generated, the additional seeds 19, 20, 42 and 85 were used. These benchmark graphs had 4,000 nodes and 13,785 ± 421 edges. The ratio of inter-modular edges was set to 0.05 in the case of the networks termed as "networks with pronounced modules", and 0.4 for the networks termed as "networks with fuzzy modules". Networks with no overlapping nodes were used in all cases except for the testing of inter-modular nodes (Figure 2 of the main text), where 200 overlapping nodes were generated, each belonging to two separate modules. A total of 28 networks of this type were created. For the ITM-Probe simulations in this Supplementary Information the number of nodes was 1,000 and the number of edges was 7,825 ± 133 with the ratio of inter-modular edges varying between 0.05 and 0.85 in steps of 0.05 for a total of 51 networks. None of the networks generated for the ITM-Probe comparison contained overlapping nodes. Detailed information about the generated benchmark graphs, as well as the exact commands used for generation is available in Table S7. Benchmark graph data can be downloaded from our website: http://turbine.linkgroup.hu.

Protein structure networks
The protein structure networks of Escherichia coli Met-tRNA synthetase were generated from the starting and equilibrated state of the molecular simulation of the E. coli Met-tRNA synthetase/tRNA Met /Met-AMP complex corresponding to the substrate-free and substratebound forms of the enzyme, respectively. The structure for the substrate-free form of E. coli Met-tRNA synthetase is available (pdb ID: 1QQT) [22]. However, the substrate-bound conformation containing tRNA Met and Met-ATP was not experimentally solved yet, but was created using molecular simulation software using the known substrate-bound structure of Aquifex aeolicus Met-tRNA synthase as a template, as described and kindly shared by Ghosh and Vishveshwara [5]. The protein structure network was obtained from the PDB data with the help of the RINerator software [23], which uses the Probe program [24] for calculating interaction strengths. Probe returns negative values for repulsive interactions; we have taken the absolute value of the interaction strength for all interactions, since both repulsive and attractive interactions transmit perturbations. In contrast with the previous article of Szalay-Bekő et al [19], where multiple edges adversely affect calculations, in Turbine, multiple edges are neutral, so the full network was used with every edge and self-loop intact instead of averaging the strengths of multiple links into a single one. The rationale behind this is to retain as much information as possible from the original file. The protein structure network had 547 nodes, since the first 3 N-terminal amino acids were not participating in the network and all ligands and cofactors, including the tRNA were removed. The final weighted, undirected protein structure networks of the substrate-free and substrate-bound enzyme contained 6,901 and 6,744 edges, respectively.
Protein structure networks of the substrate-free and substrate-bound forms of rabbit cytochrome P450 2B4 protein were also created with the RINerator software [23] by using the 1PO5 and 1SUO pdb files of the Protein Data Bank, and taking the absolute value of the resulting interaction strengths. The surrounding water molecules, the ligands and the cofactors were removed from the networks, resulting in 465 nodes with 6,278 edges in the substratefree (open) conformation and 465 nodes with 6,409 edges in the substrate-bound (closed) conformation, both undirected, weighted networks. Protein structure network data can be downloaded from our web-site: http://turbine.linkgroup.hu.

Yeast protein-protein interaction networks
The Filtered Yeast Interactome (FYI) is a high-fidelity yeast protein-protein interaction dataset containing data consistently obtained using several different methods [6]. The downloadable data set in the Supplement of the article contained 1,302 proteins (nodes) having 2,312 interactions (edges). The giant component of this network had 695 nodes and 1,614 edges.
The "Database of Interacting Proteins yeast interactome (release 2005)" network is the giant component of the unweighted and undirected yeast protein-protein interaction network assembled by Ekman et al. [25] using the 2005 March compilation of the Database of Interacting Proteins [26] consisting of 2,444 nodes and 6,271 edges covering approximately half the proteins of yeast genome. Besides the rather high confidence of its data, we choose this network, because it was used in the identification of party and date hubs, an interesting dynamic feature of protein-protein interaction networks [25] and its properties were assessed in our earlier publications [19].
The "Database of Interacting Proteins yeast interactome (release 2010)" network was created from the 2010 October compilation of the Database of Interacting Proteins [26]. Only highfidelity interactions marked as "core" were included in the network, yielding a giant component of 1,884 nodes and 4,234 edges.
Interaction weights of yeast proteins were obtained from the yeast whole-genome mRNA expression dataset of Holstege et al. [27] containing data of 6,180 yeast genes (5461 with expression levels, and one gene with two different expression levels). Missing data (719 nodes total, less than 12%) were substituted as the ln-transformed average expression level of all other proteins, 0.9205, taking into account that the distribution of expression data is approximately lognormal [28]. For calculating the changes of expression levels in different types of stress, the datasets of Gasch et al. [29] were used, which describe the relative changes of expression levels based on a set of microarray data. Thus, the stressed expression levels were calculated by multiplying the Holstege baseline with the Gasch relative changes according to the previous article [30]. The particular datasets used for the different stresses were the following: "Heat Shock 15 minutes hs-1" (25°C to 37°C heat shock for 15 minutes) for the heat-shocked network; "constant 0.32 mM H 2 O 2 (30 min) redo" (0.32 mM hydrogenperoxide treatment for 30 minutes) for the oxidative stress, and "1 M sorbitol -15 min" (hyper-osmotic shock using 1 M sorbitol for 15 minutes) for the osmotic stress. Detailed experimental data is available in the article describing the dataset [29]. Edge-weights of nonstressed and stressed protein-protein interaction networks were generated from the expression data by multiplying the abundances of the two connected proteins as described earlier [30], since larger concentrations of involved proteins make their interaction more likely. Thus all final interactomes were weighted and undirected.
The use of even larger interactomes such as BioGRID or STRING were also considered, but was dropped due to computational constraints, since there were 6 magnitudes of difference between weighted out-degrees. Adding the fact that the weight distribution was approximately lognormal (so most of the nodes had low weighted outdegrees), and the required constraint that the maximum weighted outdegree should be no more than 1 ( 1 1 that unrealistically high analysis times would have been required to attenuate the perturbations, since enforcing an upper limit for the maximum weighted outdegree results in median weighted outdegrees becoming extremely low, which in turn makes propagation speed disproportionately slow. Protein-protein interaction network data can be downloaded from our web-site: http://turbine.linkgroup.hu.

Metabolic networks
Metabolic networks were created by Balázs Szappanos (Biological Research Centre, Hungarian Academy of Sciences, Szeged, Hungary), and were the same as used in the papers of Mihalik and Csermely [30] and Szalay-Bekő et al. [19]. Escherichia coli metabolic network contained 249 metabolites (nodes) and their 730 transformations (edges), while the Buchnera aphidicola metabolic network contained 190 nodes and 563 edges. These networks were constructed based on the primary data of Feist et al. [9] and Thomas et al. [8], respectively. Frequent cofactors were deleted from the networks, except of those metabolic reactions, where cofactors were considered as main components. For the better comparison of networks, metabolic reactions were taken irreversible, and flux balance analyses (FBA) were performed resulting in weighted networks. All flux quantities were minimized, whereas reactions non-affecting the biomass production were considered having zero flux. Weights were generated as the mean of the appropriate flux quantities in absolute value, except the case when one of the fluxes was zero that automatically resulted in a zero weight [30]. For E. coli, data from rich medium was used to make the metabolic network more similar to the network of B. aphidicola. Final networks were thus weighted and undirected. Metabolic network data can be downloaded from our web-site: http://turbine.linkgroup.hu.

School friendship network
The social network was community-44 from the Add Health survey 2 as described by James Moody [31] and Mark Newman [32]. This network has an approximately equal number of black and white students and 4 well-developed, rather dense communities. The network contains 1,147 students with 6,189 directed edges between them. In our current study directed parallel edges were merged into a single undirected edge with a weight equal to the sum of the original weights, and only the giant component of the network was used. This process resulted in a weighted undirected network consisting of 1,127 nodes and 5,096 edges with weights between 1 and 12. Network data can be downloaded from our web-site: http://turbine.linkgroup.hu.

Network measures and their calculation
The degree of a node is defined as the number of edges of the node. More precisely for the perturbation simulations we have to consider the number of outgoing edges (i.e. the outdegree). However, all networks we used were undirected, so the degree of a node was equivalent to its out-degree in all cases. Weighted degree was calculated by summarizing the edge weights of all edges for a given node. Degree and weighted degree were calculated using the built-in algorithm of Turbine.
PageRank is a random-walk based measure [33], where the outbound edges of a node increase the centrality of those nodes for which the given edge is an inbound edge. The exact value of increase is proportional to the current PageRank value of the edge's outbound node. Multiple iterations of this method yielded a converging result [33]. PageRank values were calculated using the algorithm of the igraph [34] package.
Closeness centrality [35] is defined as the mean geodesic distance (mean shortest path) between a given node and all other nodes reachable from it. Betweenness centrality [36] gives the (relative) number of shortest paths between every two nodes in the network, which include the examined node. Closeness centrality and betweenness centrality were calculated using the Pajek [37] package.
Community centrality is a measure coined for the ModuLand overlapping community structure determination program [19], which measures the centrality of a node separately for different communities. Community centrality was calculated by the ModuLand Cytoscape plug-in [19].

Network modularization
Community structures of networks were determined by the ModuLand Cytoscape plug-in [19] with basic settings. A threshold of 0.9 was used for merging highly correlated modules in all networks. For the school friendship network, modules of the second hierarchical level were used (where meta-nodes of the level represent modules of the original network and metaedges of the level represent the overlap of the modules of the original network as described in [19]) to obtain the 4 densely connected communities of the network [32,38].

Determination of the "number of visited nodes" measure of the ITM-Probe method
To compare the "number of visited nodes" measure of the ITM-Probe method [39] with the perturbation centrality obtained by using the Turbine program, the recently released standalone version of the ITM-Probe method [39] was used, since calculating the ITM-Probe results for all nodes on a network would take a tremendous amount of time using the webbased interface. A plug-in was written for Turbine that converts any complex network in Turbine format (needed for the fast-calculation of the Turbine program) to the format required by ITM-Probe (JSON). This plug-in can be downloaded from the Turbine web-site at http://turbine.linkgroup.hu. The ITM-Probe method was used with its emitting model with a damping factor of 0.85. The main script file of the ITM-Probe method was executed separately for every single node in a certain network, and all output was concatenated into a single text file. The "number of visited nodes" measures were extracted from the resulting ITM Probe text file with an awk script.