Using Effective Subnetworks to Predict Selected Properties of Gene Networks

Background Difficulties associated with implementing gene therapy are caused by the complexity of the underlying regulatory networks. The forms of interactions between the hundreds of genes, proteins, and metabolites in these networks are not known very accurately. An alternative approach is to limit consideration to genes on the network. Steady state measurements of these influence networks can be obtained from DNA microarray experiments. However, since they contain a large number of nodes, the computation of influence networks requires a prohibitively large set of microarray experiments. Furthermore, error estimates of the network make verifiable predictions impossible. Methodology/Principal Findings Here, we propose an alternative approach. Rather than attempting to derive an accurate model of the network, we ask what questions can be addressed using lower dimensional, highly simplified models. More importantly, is it possible to use such robust features in applications? We first identify a small group of genes that can be used to affect changes in other nodes of the network. The reduced effective empirical subnetwork (EES) can be computed using steady state measurements on a small number of genetically perturbed systems. We show that the EES can be used to make predictions on expression profiles of other mutants, and to compute how to implement pre-specified changes in the steady state of the underlying biological process. These assertions are verified in a synthetic influence network. We also use previously published experimental data to compute the EES associated with an oxygen deprivation network of E.coli, and use it to predict gene expression levels on a double mutant. The predictions are significantly different from the experimental results for less than of genes. Conclusions/Significance The constraints imposed by gene expression levels of mutants can be used to address a selected set of questions about a gene network.


Introduction
Living systems are typically able to maintain their physiological state under environmental changes and isolated genetic mutations [1]. This robustness, referred to as homeostasis [2] or canalization [3,4], is achieved through feedback within highly connected regulatory networks of genes, proteins and metabolites [5][6][7][8][9][10]. For example, an action that reduces the expression of one gene may cause coordinate changes in other nodes to leave the physiological state unaffected. If a genetic mutation blocks one pathway, other avenues on the associated network may take its place. Unfortunately, this systemic stability often makes it difficult to eliminate defects in a biological network, as evidenced by the surprising lack of efficacy of many drugs that were designed to act on single molecular targets [11,12]. The coupling can also lead to side effects from medications. For example, anti-inflammatory COX-2 inhibitors (e.g., Vioxx) cause adverse cardiovascular effects due to a concomitant imbalance of the lipids prostacyclin and thromboxane A2, which lie on the same network [13]. Clearly, the most effective and least detrimental changes in a biological process are implemented by altering the system in its entirety. This task requires predictive mathematical models which can be constructed from experimental data. In this paper, we propose an approach for such a construction.
There are hundreds of genes, proteins, and other molecular participants associated with most biological processes. Gene regulatory networks model all interactions between these nodes. However, the forms of these dependencies, as well as kinetic parameters such as reaction rates and diffusion constants are, at best, only known approximately [14]. It is unlikely that gene regulatory networks which are sufficiently accurate to make quantitative predictions on the underlying biological processes will be available in the near future [15,16].
Many approaches to reduce the complexity of regulatory networks have been proposed [5]. Small modules or network motifs [17,18] associated with specific tasks have been identified. Boolean variables [19] can reduce the complexity, although the coarse-graining will limit predictability to qualitative characteris-tics such as bifurcations. In gene influence networks [14,20], a gene, its transcript, and protein are represented by a single node, which is quantified by the expression level of the mRNA. Regulatory interactions between nodes of an influence network include actions mediated by other components in the network.
Consider an influence network containing N genes G~G 1 ,G 2 , . . . ,G N f g ; denote the expression level of the K th gene by X K and the state of the network by X: X 1 ,X 2 , . . . X N f g . The influence network can be modeled by a set of ordinary differential equations F : Steady states of influence networks can be obtained from DNA microarray experiments [5]. However, most influence networks contain hundreds of genes; thus, even if F K (X) are assumed to have a simple (e.g., linear) form [21], a prohibitively large number of microarray experiments will need to be conducted in order to compute F. Moreover, gene expression levels in microarray experiments have large (*10%) error bars; when N is large, the inversions needed to compute F will exacerbate the uncertainty to a level which will make predictions difficult. One possibility is to only extract partial information on these networks through inference algorithms such as Network Identification by Multiple Regression [14], and Mode-of-action by Network Identification [22].
We propose an alternative approach. Rather than attempting to construct an accurate model of a gene network, we ask what questions on the network can be addressed (perhaps approximately) using low-dimensional and highly simplified effective models constructed from empirical data. What data would be needed for the construction? Will issues addressed through the approach be useful in applications?
We first note that genes in an influence network can be partitioned into strongly coupled subgroups or clusters. This partition can be made either using co-expression under genetic perturbations [23][24][25], or through the use of the Gene Ontology (GO) database (http://geneontology.org). Our main assumption is that the behavior of all nodes within a cluster can be controlled by imposing suitable changes in a small, specially chosen, subset of its members. The set could include genes that translate to transcription factors, and would hence influence many other genes [26,27]. It may also include microRNAs within the cluster, each of which affect many genes through post-transcriptional regulation [28,29], even though their fold induction on each gene is small.
Suppose we have partitioned the N genes of the influence network into clusters, and identified a small number of genes/ microRNAs from each cluster that can be used to control the expression levels of the remaining genes. Denote the set of these nodes by S. The number n of nodes in S is much smaller than N. We will represent their expression levels by x~x 1 ,x 2 , . . . ,x n f g , and re-index the variables so that the remaining expression levels are fX nz1 , . . . ,X N g. With the new ordering, we write the state of the network as X~x int ,X ext f g where we will refer to x intx 1 ,x 2 , . . . ,x n f gand X ext~Xnz1 , . . . ,X N f g , as ''internal'' and ''external'' variables respectively.
In this paper, we limit consideration to networks with steady state solutions. When external perturbations are made on genes within S, expression levels of the remaining genes at equilibrium are determined by Eqn. (1). These steady states lie on an ndimensional surface in R N , which we denote by S S . Figure 1(a) shows a schematic (2-dimensional) solution surface for the synthetic network introduced in the Results Section.
We make the following observations on solutions of the system _ X X~F X ð Þ. First, we assume that the unconstrained system has a unique stable steady state which we denote by . The point P 0 representing it lies on S S . Next, consider the single knockout mutant DG m (assumed to be viable) obtained by knocking out the m th gene. Since x m is set externally, the m th equation of (1) is no longer valid. The solution for the expression levels is obtained by solving the remaining (N{1) equations. We denote this equilibrium by P (m)~p(m) int ,P (m) ext n o with p (m) m~0 , and represent it by P m . Since the equilibrium is associated with changes made within the set S, P m lies on S S . Figure 1. Example of an n-dimensional solution surface S S of (1). The example is chosen from the synthetic network introduced in the Results Section. (a) The surface shown represents the expression levels of the external variable X 7 as the internal variables x 1 and x 2 are modified. (b) The point P 0 representing expression levels of the wildtype and points P m , m~1,2 representing expression levels of single knockout mutants DG m lie on this surface. The EES is defined so that its solutions lie on the unique 2-dimensional plane (blue) H S passing through P 0 , and P m , m~1,2. As can be seen, due to restrictions imposed on the EES, the surfaces S S and H S are close. doi:10.1371/journal.pone.0013080.g001 Consequently, P 0 as well as P m for m~1,2, . . . ,n lie on S S , see Figure 1(b).
Our goal is to construct a system, referred to as the ''effective empirical subnetwork'' (EES), that can be computed using the gene expression levels of mutants discussed above, and whose equilibria are close to S S . Observe that P 0 and the n points P m define a unique n-dimensional plane in R N , which we denote by H S . Figure 1(b) shows the surface for the example above. The EES describes the set H S as parametrized by the internal variables. Since both S S and H S contain the points P 0 , and P m , m~1,2, . . . ,n), we expect them to be close in the region of interest.
Observe that the EES : R n ?R N is a linear function determined by P 0 , and P m , (m~1,2, . . . ,n), but is otherwise independent of F. In particular, each X K is a linear function of x 1 , . . . ,x n . Since P 0 lies on H S for each K~(nz1), . . . ,N. The coefficients a Ki can be evaluated by noting that P 1 ,P 2 , . . . ,P n lie on H S . There is one additional complication, that we illustrate using the following example. Suppose we consider a mutant where only x 1 is externally set. The remaining expression levels of the steady state of this mutant are solved using the last (N{1) components of Eqn. (1). In particular, the expression levels x 2 ,x 3 , . . . ,x n of the internal variables in this mutant depend on x 1 . In general, the internal variables themselves depend on the gene expression levels whose values are externally imposed. Thus, we expect there to be relationships between the internal variables as well. As we show in the Methods Section, these dependencies can be assumed to take the form for k~1,2, . . . ,n.
We have thus implemented two significant simplifications. The original system F : R N ?R N contained a large number (N* several hundred) of nonlinearly coupled variables. In contrast, the EES : R n ?R N has a small number (n*10) of internal variables, is linear, and can be constructed using the steady state solutions of the original system (wild-type) and n single knockout mutants. Clearly, the EES is not an accurate representation of the original network. The issue is whether there are questions about F that can be addressed using the EES. As we show below, this is indeed the case due to geometrical constraints imposed on the solution surface. Specifically, the EES can be used to predict, approximately, the expression levels of all nodes in F, when external changes are made within S; e.g., double knockout mutants. The validity of the EES construction can be tested by comparing its predictions with microarray data from such mutants. More significantly, the EES can be used to compute how the equilibrium of the system can be moved from its initial state P 0 to a prespecified set of expression levels defined by a point P aim , see Figure 2. Since P aim will, in general, not lie on the solution surface, it cannot be reached through changes within S. Instead, we can use the EES to compute P lin , which is the closest point to P aim on the plane H S , see Figure 2. Since the surfaces S S and H S are close, changes imposed on the system by the external actions are expected to be close to those computed from the EES. Below, we verify this proximity in a synthetic influence network.

A Synthetic Influence Network
In the model system, F K (X) is a linear combination of sigmoidal Hill functions; specifically, where H(X ; c)~X h = X h zc h À Á is the Hill function and the Hill index h is chosen to be 2. The action of the I th gene on the K th one is characterized by parameters a KI and c KI , which are assumed to be independent of the state X of the system. The action is activating if a KI w0 and inhibiting if a KI v0. The system is constructed so that P (0) is a steady state of Eqns. (1). Numerically, we find that model systems defined by Eqns. (1) and (4), have at most one stable equilibrium. We suspect that this is due to restrictions imposed by the fact that the sign of each partial derivative LF K =LX I is independent of the state of the system.
In order to compute the solutions to the knockout mutant DG k , we set x k~0 , and solve the remaining equations of (1) as a nonlinear least squares problem. When the normalized residue fails to fall below 10 {10 , it is assumed that the corresponding solution does not exist.
We report on a model system containing 21 nodes and shown schematically in Figure 3. We start with the three subnetworks, each of size 7. The vector P (0) for each of these subgroups consists of random entries between 0.5 and 1.5, and the matrix (c KI ) contains random values between 0 and 2. The entries of the Jacobian of the system given by Eqns. (1) and (4) at P (0) are J K I~aK I H 0 (P (0) ; c KI ); thus a KI can be computed for a given set Moving the equilibrium from P 0 to P aim by implementing changes within S. In general this is not possible because interactions between nodes force the equilibrium to remain on S S . However, it is possible to compute P lin , the point closest to P aim that can be reached by the EES. Due to the proximity of S S and H S , the point P sys obtained by projecting P lin to S S is close to P lin . Thus, it is possible to pre-determine if the movement of the equilibrium forced by the changes made in S are acceptable. doi:10.1371/journal.pone.0013080.g002 of J KI 's. Since we require P (0) to be stable, we insist that all eigenvalues of the Jacobian be negative. This is guaranteed by starting with a diagonal matrix with negative entries and performing a (random) orthonormal transformation. Once three such subnetworks are computed, their nodes are coupled by sparse, weak interactions. Each node in a subnetwork is coupled to only one from each of the other subnetworks, and the mean coupling strength is chosen to be 0.1 of the average coupling within subgroups.
The EES is to be constructed using the expression levels of single knockout mutants. As illustrated in Figure 3, mutants DG 1 , DG 3 , DG 8 , DG 10 , DG 15 , DG 18 , DG 20 , and DG 21 in our example are not viable; i.e., when the corresponding X is set to zero, the system (1) does not have a solution. The subset on which to construct the EES can contain any of the other nodes. In the work reported here, S~fG 2 ,G 4 ,G 9 ,G 11 ,G 16 ,G 19 g (genes marked in red in Figure 3). The variables X K , K~7,8, . . . ,21 are re-indexed as described before. The EES : R 6 ?R 21 is computed using the expression levels of all 21 genes at P 0 , P 1 , P 2 , P 3 , P 4 , P 5 and P 6 .
In order to illustrate the proximity of H S to S S , we use the following example, see Figure 1. Since we need to reduce the dimensionality for visualization, we fix the expression levels of (the re-indexed genes) G 3 , G 4 , G 5 , and G 6 at their values at P 0 ; for our model, fx 3 ,x 4 ,x 5 ,x 6 g~f1:1716,0:6279,0:5140,0:5128g. For each pair of values for (x 1 ,x 2 ), we solve the model system (1) for the remaining 15 expression levels. These solutions lie on 2dimensional surface in R 17 . The gray surface of Figure 1 is X 7 x 1 ,x 2 ð Þ. The 2-dimensional plane H S of the EES contains points P 0 , P 1 , and P 2 .
Next, we compare expression levels of double knockout mutants predicted by the EES with the corresponding solutions of the model system (1). The 14 viable double knockout mutants of the system are DG 1 DG 2 , DG 1 DG 3 , DG 1 DG 4 , DG 1 DG 5 , DG 1 DG 6 , DG 2 DG 3 , DG 2 DG 4 , DG 2 DG 5 , DG 2 DG 6 , DG 3 DG 5 , DG 3 DG 6 , DG 4 DG 5 , DG 4 DG 6 , and DG 5 DG 6 . In each case, the expression levels of the 4 remaining nodes in S, and the 15 nodes outside of S are compared. We differentiate between these two groups.
Results for the first group (genes in S) are as follows. Of the 56 comparisons, 46 EES predictions are within 1% of the expression levels computed from (1), 3 others are between 1{5%, and 3 between 5{10%. Results for the second group (genes outside of S) are as follows. Of the 210 expression levels to be compared, 170 EES predictions are within 1% of the expression levels computed from (1), 30 more are between 1{5%, and 7 others are between 5{10%.
We finally demonstrate how the equilibrium of the system can be moved (near) to a pre-specified set of expression levels. The original equilibrium of our example is P We want to find out how the expression levels of genes in S need to be changed so that the system moves to, or as close as possible to, a pre-specified set of expression levels for all genes. As an example, we attempt to change the equilibrium of the system to P aim (see The Euclidean distances between the points are d P 0 ,P aim ð Þ0:55, d P 0 ,P lin ð Þ0:40, d P aim ,P lin ð Þ0:38, and d P lin ,P sys À Á 0:15. Thus, we attempted to move the equilibrium from P 0 to a point P aim that was a distance 0.55 away, but were only able to move it on H S to a point P lin , which is a distance 0.40 away from P aim . However, P lin is only a distance 0.15 from the point P sys , which is the solution of the original system when expression levels of the internal variables are set to p (lin) int . We have found that P lin and P sys are close in studies of several model systems and for many points P aim . Thus, the EES can be used to pre-determine, approximately, the equilibrium of the original network when changes made within S.

Transcriptional Regulatory Network in E.coli
The EES can be constructed using microarray data from the wildtype and single knockout mutants of genes in S. It can then be used to predict gene expression levels of other mutants. This observation is of interest due to the availability of previously published data on an oxygen deprivation network in E.coli [30,31]. Ref. [32] reports gene expression levels in the wildtype and in single knockout mutants of key transcriptional regulators in the oxygen response, namely DarcA, DappY , Dfnr, DoxyR, and DsoxS, as well as in the double knockout mutant DarcADfnr, in aerobic and anaerobic glucose minimal medium conditions. Since the oxygen deprivation network is not fully active under aerobic conditions, we focus on the behavior of E.coli under anaerobic conditions.
It should be noted that gene expression levels in E.coli are unlikely to be in a steady state; rather, the expression levels reported in Ref. [32] are averages from a group of cells in various stages in the cell cycle. The analysis in this Section assumes that the computation of the EES and its predictions are valid for these averages. Preliminary results from our current work on systems exhibiting circadian rhythms validate this assumption.
We construct G as follows. In the Gene Ontology classification assigned by Affymetrix, the five genes arcA, appY , fnr, oxyR, and soxS have a common term ''GO:0006355, Regulation of transcription, DNA-dependent.'' Moreover, this is the only common classification for the five genes. We choose G to be the set of all genes carrying this term. The full list of 299 genes is given in Supporting Information S1.
The data set GSE1121 of the GEO site (www.ncbi.nlm.nih.gov) [32] provides gene expression levels for four replicates of the wildtype and three each for the mutants. The replicates are used to estimate the mean and standard deviation for the expression levels of each gene in G, see Supporting Information S1. Since the EES is linear, we rescale the expression levels of each gene by its (mean) value in the wildtype. Table 1 gives these rescaled expression levels for the internal variables ½arcA, ½appY , ½fnr, ½oxyR, and ½soxS under anaerobic glucose minimal medium conditions. Note that error estimates for the expression levels of several genes is large. This is the reason that a reduced network is essential in order to make useful predictions. Second, as seen from Table 1, reported expression levels of the gene G k in the mutant DG k is nonzero. Presumably, what is measured are non-functional analogs of the corresponding genes. In calculating the EES, we set these expression levels (shown in parentheses in Table 1 The next step is to compute the EES predictions for ½appY , ½oxyR, and ½soxS in the double knockout DarcADfnr. This is done using the matrix (5) and setting ½arcA and ½fnr to zero. Expression levels of the remaining genes in S, predicted using the EES, are ½appY EES~{ 0:23, ½oxyR EES~0 :91, and ½soxS EES0 :78. We need to determine, at a 5% level of confidence, if these predicted values are consistent with those from the replicates of the double mutant. The comparison is made using the t-test (ttest in MATLAB, The Mathworks, Inc.), and it is found that the null hypothesis, that experimental data comes from a (normal) distribution with mean equal to the computed gene expression level, is rejected at the 5% level only for appY.
Next, we implement the analysis for genes outside of S. The null hypothesis cannot be rejected at the 5% level for 213 of the 294 genes. The three experimental values of the expression level of each gene in the double knockout, the corresponding predictions of the EES, and the test statistic t are given in Supporting Information S1. Since the Student's distribution associated with the comparison has two degrees of freedom, the null hypothesis is rejected when tw4:303. The histogram of the test statistic for the 299 genes is shown in Figure 4(a). In Supporting Information S1, we highlight the genes for which the null hypothesis is rejected. We emphasize that, unlike in many prior studies whose assertions are limited to whether genes in mutants are up/down regulated, our predictions are quantitative.
The proximity of the predicted and experimental values is not due to a lack of variability in the expression levels of genes in G. We verify this by computing the differential expression of genes in the mutants. Figure 4(b) shows the histogram of the largest deviations from the wildtype, normalized by the standard deviation (between replicates) in the wildtype. Expression levels of over half the genes deviate by more than 2 standard deviations.

Discussion
An accurate model of the gene regulatory network associated with a hereditary disease can be used to compute the most effective and least detrimental treatment to prevent its onset. Unfortunately these networks contain hundreds of genes, proteins, and other molecules whose interactions are only partially known [5,[8][9][10]. It is unlikely that detailed models of such networks will be available in the near future. The question raised in the paper is whether information needed to move the steady state of a network can be deduced from an analysis of highly simplified, empirically determined models. The data used for analysis is obtained from microarray experiments.
Our approach is as follows. We first identify a (relatively) small set S of n nodes (internal variables) in the influence network which Table 1. Normalized gene expression levels in the wildtype and mutants. can be used to affect the remaining genes. (Mathematically, for each external variable X I , we require one or more of LF I =Lx k , where x k are the internal variables, to be non-vanishing.) Next, we limit consideration only to steady states of the network as internal variables are modified. Finally, this solution surface S S is approximated using the unique n-dimensional plane H S defined by the gene expression levels of the wildtype and the n single knockout mutants in S; the model system whose solutions lie on the plane is the EES. Some genes may be critical in the sense that the organism may not be viable when they are knocked out. There were similar nodes (shown in black in Figure 3) in our synthetic model. In our approach they cannot be used as internal nodes. However, if they need to be utilized, the EES can be computed using heterozygous mutants or those where the expression level is up/down regulated to a value other than zero through, for example, transfection [33,34].
We emphasize that, due to the reduced dimensionality and its linearity, we do not expect the EES to be an accurate model of the original system. However, because of the geometrical constraints, it is possible to use the EES to (approximately) compute answers to a very limited set of questions about the system. Specifically, they are questions on gene expression levels when external changes are made within S. As an example, the EES can predict gene expression levels in double knockout mutants. We tested the predictions using previously published data on a double knockout mutant in an oxygen deprivation network of E.coli. (Here, as in most cases, the underlying network is unknown.) We identified the group of 299 genes to be studied using the Gene Ontology database. The EES was computed using the expression levels of five single knockout mutants, and used to predict their expression levels in the double mutant. The predictions were significantly different from the experimentally obtained expression levels for less than 30% of genes.
Interestingly, the EES can be used to compute how expression levels of genes within S need to be changed so that the equilibrium of the entire network is moved from its initial state P 0 to, or as close as possible to, a pre-specified position P aim . We showed through an example that the solution computed using the EES is close to that of the full network. However, the efficacy of the move depends on the proximity of P aim to the surface H S . If P aim is far from H S , then the set of internal variables need to be expanded in order to find acceptable solutions.
Before concluding, we briefly address a few issues; the first is the observation that, in the parameter range considered, the model system given by Eqns. (1) and (4) have at most one stable steady state. Even though we required P (0) to be stable (by an appropriate choice of eigenvalues of the linearization), non-linear systems can, in general, be expected to have additional solutions. However, our model has a special feature: the signs of the partial derivatives LF K =LX J are independent of the state of the system. The analogous biological statement is that, if nodes J and K are isolated, the action of node J on node K increases in magnitude as X J increases. Is this condition, combined with the choice of eigenvalues, sufficient to guarantee a uniques stable solution? We are currently studying this question. It should be noted that the uniqueness of solutions has been proven for several other classes of monotonic nonlinear systems [35][36][37].
The second issue involves the partitioning of genes into clusters and the choice of internal variables. Internal variables in the oxygen deprivation network of E.coli were already determined from the experiments reported in Ref. [32]. We used the GO classification to identify nodes belonging to the network. Different approaches can be used to partition genes into clusters when biological classifications are not available. For example, one could use topological (e.g., persistent homology [38,39]) or graph theoretic (e.g., spectral clustering [23], community clustering [24]) methods. Integrated genomic analysis, which successfully identified subtypes of gliobastoma [40], can also be used in clustering genes through the use of heat maps [41,42]. The choice of internal variables requires biological input. Mathematically, the requirement is that each node in the cluster can be affected by suitable changes in internal variables. As we mentioned, genes that translate to transcription factors, or microRNAs [28,29] within the cluster, could act as internal nodes.
Third, can one estimate the proximity of H S to the solution surface S S ? Differences in gene expression levels of double knockout mutants are one measure of the proximity. Alternatively, we could use the corresponding differences in heterozygous single knockouts (whose expression levels are roughly half of the wildtype) and the predictions of the EES.
We believe that approaches similar to those outlined here can prove useful in treating complex genetic diseases by helping identify optimal combinations of up/down regulation of genes (or optimal combinations of single target drugs) that have minimal The good agreement in (a) is not due to lack of variation in the gene expression levels between the wildtype and mutants. The histogram shows the largest differential expression level of mutants, normalized by the standard deviation for the wildtype (computed from the four replicates given in the data set GSE1121 of the GEO database [32]). doi:10.1371/journal.pone.0013080.g004 side effects and are most effective in moving the equilibrium of the network in its entirety to a preferred state. We hope our work motivates studies on this issue.

Construction of the EES
As illustrated in Figures 1, the EES is constructed so that, as internal variables are modified, the solutions of the system lie on the n-dimensional plane H S . Thus the external variables are linear in x k 's, and consequently, have the form given by Eqns. (2). We need to compute the coefficients a Ki for K~(nz1),(nz2), . . . ,N and i~1, . . . ,n. This is done by noting that the expression levels p (m) of each of the n mutants DG 1 ,DG 2 , . . . ,DG n satisfies Eqn. (2), thus providing the conditions necessary to compute a Ki 's.
We note, however, that the internal variables themselves are interrelated. For example, in the single knockout mutant DG 1 , all expression levels (other than x 1 ) are determined by solving the last (N{1) equations of (1). Thus, we need to derive relationships between the internal variables. Consider for example, the dependence of x n on the remaining internal variables. In order to find its form, let us reduce the set of internal variables to x 1 ,x 2 , . . . ,x n{1 f g ; x n is now an external variable. Hence, with the approximations used in the paper, x n is a linear combination of the remaining internal variables. Since P (0) is one solution of the system Similar relationships are obtained for the other internal variables.

Supporting Information
Supporting Information S1 Tables