
The authors have declared that no competing interests exist.

Conceived and designed the experiments: GR NJG GMC. Performed the experiments: GR NJG. Analyzed the data: GR NJG. Contributed reagents/materials/analysis tools: GR NJG GMC. Wrote the paper: GR NJG GMC.

Advances in computational metabolic optimization are required to realize the full potential of new

A deeper understanding of biological processes, along with methods in synthetic biology, is driving the frontier of metabolic engineering. In particular, a better representation of cell metabolism will enable the engineering of bacterial strains that can act as factories for valuable biochemical products, from medicines to biofuels. Models which predict the behavior of these complex biological systems enable better engineering design as well as a more comprehensive understanding of fundamental biological principles. Here we develop a new method, called Redirector, for modeling metabolic alterations, and their relationship to cell growth. This method optimizes genetic engineering changes to achieve metabolite production using a new representation of the metabolic impact of genetic manipulation, which is more biologically realistic than existing models. We discover proven and novel engineering targets to improve fatty acid production, correctly predicting how different combinations of genes build upon one another. This work demonstrates that Redirector is a powerful method for designing cell factories and improving our understanding of metabolic systems.

Building a better predictive understanding of genome-scale metabolic networks is critical to fully harnessing the power of bacterial metabolism in general and for designing biofactories in particular. Biofactories have been engineered for a variety of products including pyruvate

While kinetic models have been developed to simulate biological networks

FBA has been used to successfully design production strains by optimizing metabolic alterations such as gene knockouts

Representing genetic alterations to a metabolic system (specifically up- and down-regulation) as direct changes to flux boundaries presents a number of problems. First, representing metabolic alterations in this way is difficult in the common case when a single enzyme affects many reactions. Imposing different flux bounds on reactions controlled by the same enzyme creates a disconnect with experimental implementation. However, applying the same flux constraints to all reactions controlled by one such enzyme will often fail to produce the desired metabolite. Second, a limit on a single flux is sufficient to control a whole pathway, and all upstream reactions, in an FBA model. Third, multiple metabolic alterations will often only be as effective as the tightest limit imposed, so alterations beyond the single strongest will often provide no additional benefit.

To address the above issues and in order for metabolic design optimization to reach its full potential, metabolic alterations must be represented in a more biologically relevant way. Hence, we develop a framework modeling metabolic alterations using the FBA objective to represent the balance of resource allocation between growth and metabolite production. Metabolic alterations (up- and down- regulations) are modeled through incentivizing flux changes by adding reaction fluxes to the existing FBA biomass objective with associated positive or negative coefficients. These coefficients determine the relative strengths and direction of the impact the metabolic alterations have on the reaction fluxes. We can then find a set of metabolic alterations optimized for the production of a metabolite of interest, using a bilevel optimization approach. To make this possible, we develop a method by which an optimal set of reactions, grouped by enzyme, can be included in, or excluded from, the metabolic (inner) objective. We maintain the original flux that makes up the biomass function, along with the added incentives on fluxes towards metabolite production in the inner FBA objective, to ensure that selected metabolic alterations toward metabolite production account for cellular growth.

It is important to enable Redirector to discover a growing set of metabolic alterations that work synergistically to drive flux towards production to better overcome growth driven adaptation and regulatory mechanisms. To achieve this, we improve the iterative local search in GDLS

A. Here are the novel aspects of the Redirector algorithm brought together to depict the algorithmic flow. An iterative local search alternates between a bilevel optimization using objective control and the progressive target discovery. Objective control produces enzyme genetic alterations (+targets) and the associated metabolite production level, while the progressive target discovery increases the progressive growth parameter, or γ (+growth), based on the enzyme optimization targets and metabolite production level, from the previous iteration. B. The objective control method involves an FBA objective that includes the biomass (growth) flux as well as a selected set of enzyme associated reaction fluxes, which are up- or down-regulated. An optimized set of enzymes is included in the objective to drive the production of the metabolite of interest. The dotted lines show that an enzyme appearing in the objective incentivizes changes in the associated reaction fluxes. C. The progressive target discovery method adjusts a coefficient on the biomass term, used in objective control, after each iteration of the optimization. Here we show a decision tree for the adjustment of the progressive growth parameter based on the discovery of new targets, and the metabolite production level from the previous iteration.

In objective control (

Because both the biomass and reactions leading to the production of a metabolite appear in the objective, if a high metabolite production level is achieved, discovery of new targets will only continue after increasing the incentive for growth in the objective. This process incentivizes resources to biomass and necessitates new metabolic alterations to meet the metabolite production goal, dictated by the outer objective of the bilevel optimization. For this reason, we have developed progressive target discovery. After each iteration of bilevel optimization using objective control, the Redirector algorithm checks if there have been new metabolic engineering targets discovered, and also checks the level of production of the metabolite for which we are optimizing.

The optimal set of targets to drive metabolite production is in large part determined by their relative weighting in the objective. Reaction fluxes are included in the objective using weights, called redirection coefficients (β), selected from a set of values we call the redirection coefficient library. In this work we focus on three methods we have developed for constructing the coefficient library, which we term the flat, power series and sensitivity redirection coefficient libraries (discussed in the Redirection Coefficient Library section in the Supporting Information). The power series library uses coefficients (2^{n}, −2^{n}) where n≤0, and the sensitivity library is created by performing sensitivity analysis for every reaction on each of the growth and production objectives. The flat library is the simplest approach, giving each reaction equal weight but not allowing further tuning of the targets. The power series library allows for tuning the impact of metabolic targets but is computationally intensive. Finally, the sensitivity library indicates how a reaction flux influences the metabolite production or growth, with only those reactions that directly affect one of these objectives getting a coefficient. Thus, using the sensitivity redirection coefficient library leads to a smaller pool of enzymes from which to select targets and, therefore, a less computationally intensive optimization.
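As a rough illustration of how the flat and power series libraries differ, the sketch below builds both for a list of reactions. It is a minimal sketch under stated assumptions, not the Redirector implementation: the function names, the `depth` parameter and the reaction identifiers are hypothetical, and the flat values of +1 and −1 are assumed from the "Flat" column of the design table.

```python
def flat_library(reactions):
    """Give every reaction the same pair of coefficients, +1 and -1 (assumed values)."""
    return {rxn: [1.0, -1.0] for rxn in reactions}


def power_series_library(reactions, depth):
    """Give every reaction coefficients +/-2**n for n = 0, -1, ..., -(depth - 1).

    `depth` is a hypothetical knob that controls how many powers are offered.
    """
    coeffs = []
    for k in range(depth):
        n = -k  # n <= 0, as in the power series library
        coeffs.extend([2.0 ** n, -(2.0 ** n)])
    return {rxn: list(coeffs) for rxn in reactions}


reactions = ["R1", "R2"]  # hypothetical reaction identifiers
flat = flat_library(reactions)
power = power_series_library(reactions, depth=2)
print(flat["R1"])
print(power["R1"])
```

The power series library offers several magnitudes per reaction, which is what permits tuning but also enlarges the search space relative to the flat library.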

To demonstrate metabolic engineering designs produced by the Redirector framework, we focus on the production of the fatty acids, in particular myristoyl-CoA (C14:0-CoA), using the iAF1260 genome-scale

To provide insight into the progress of design construction using the Redirector Framework, we present

To give an overview of the performance of the Redirector method we present

Product | Unique Targets | Largest Design |

Malonyl-CoA | 89 | 24 |

Myristoyl-CoA (14:0) | 132 | 32 |

Myristoleoyl-CoA (14:1) | 120 | 33 |

Palmitoyl-CoA (16:0) | 144 | 25 |

Palmitoleoyl-CoA (16:1) | 131 | 28 |

Stearoyl-CoA (18:0) | 103 | 28 |

Oleoyl-CoA (18:1) | 96 | 21 |

To show an illustrative example of a Redirector design, we optimize for the production of myristoyl-CoA, and choose the design from the optimization results found after 15 iterations using the largest search size

This optimization was run at neighborhood size 6 with flat coefficient library. Blue arrows indicate increased enzymes while the red arrows are decreased. The orange box indicates the production objective. A. Pentose phosphate pathway B. Glycolysis C. Fatty acid biosynthesis and β-oxidation.

Enzyme Name | Flat | Power Series | Total Flux Change | Flux Change Count |

Biomass | 155.25 | 186.41 | −0.59 | (−): 1, no change: 0, (+): 0 |

fabF or fabB | 1 | 1.5 | 9.87 | (−): 3, no change: 2, (+): 7 |

fabZ or fabA | 1 | 0.5 | 9.39 | (−): 3, no change: 2, (+): 7 |

fabG | 1 | 1.5 | 9.34 | (−): 3, no change: 2, (+): 7 |

fabD and acpP | 1 | 1.5 | 9.34 | (−): 0, no change: 0, (+): 1 |

accABCD | 1 | 0.5 | 9.34 | (−): 0, no change: 0, (+): 1 |

gpmA or gpmI or gpmB | 1 | 0.5 | 2.11 | (−): 0, no change: 0, (+): 1 |

aspC | −1 | −0.5 | 1.72 | (−): 0, no change: 0, (+): 1 |

fabK or fadD | 1 | 0.5 | 1.54 | (−): 0, no change: 9, (+): 1 |

gapA | 1 | 0.5 | 1.09 | (−): 0, no change: 0, (+): 1 |

pgk | 1 | 0.5 | 1.09 | (−): 0, no change: 0, (+): 1 |

tktB or tktA | 1 | 0.5 | 1.09 | (−): 0, no change: 0, (+): 2 |

ppk | 1 | 1.5 | 0.30 | (−): 0, no change: 1, (+): 1 |

rpiA or rpiB | −1 | −1.5 | 0.12 | (−): 0, no change: 0, (+): 1 |

fadE | −1 | −1 | 0.00 | (−): 0, no change: 8, (+): 0 |

acs | −1 | −0.5 | 0.00 | (−): 0, no change: 1, (+): 0 |

idi | −1 | −0.5 | −0.001 | (−): 1, no change: 0, (+): 0 |

fabB | −1 | −0.5 | −0.27 | (−): 3, no change: 1, (+): 0 |

folD | −1 | −0.5 | −1.20 | (−): 2, no change: 0, (+): 0 |

gdhA | −1 | −0.5 | −5.04 | (−): 1, no change: 0, (+): 0 |

acnB or acnA | −1 | −1.5 | −8.72 | (−): 2, no change: 0, (+): 0 |

In

The metabolic engineering targets shown in

Fluxes through fatty acids with chains longer than C14:0 do not increase, because we have added an export reaction for myristoyl-CoA to the model, such that myristoyl-CoA can be exported when it is overproduced. This selectivity for the carbon chain size of fatty acids and biofuel product export is biologically relevant. It has been shown that, with up-regulation of

The order in which genetic manipulations should be targeted is important information since the efficacy of some genetic alterations can depend on other genetic changes being made beforehand. Genetic manipulation is often carried out serially with a selection, and if there is an order of efficacy for these targets it is important to understand that order for the selection to work. This effect is illustrated when trying to produce fatty alcohols in

Boxes indicate those enzymes that can work alone while the ovals are those enzymes that require one other enzyme to increase myristoyl-CoA production. Those enzymes in blue are increased while those in red are decreased. Darkened lines indicate the dependency groups which produce at least 90% of maximum output of myristoyl-CoA. A. Enzyme group dependencies for optimization using a flat redirection coefficient library. B. Enzyme group dependencies using a sensitivity redirection coefficient library.

The sensitivity redirection coefficient library dependency targets, while sparser, also center on fatty acid biosynthesis and degradation targets with the exception of

In order to give a more complete picture of the dependency sets we present

Dependency Set | Dependency Size | Sensitivity/Flat | Production |

fadA or fadI, fabH | 2 | Sensitivity | 1.54 |

fadA or fadI, fabB or fabF | 2 | Both | 1.54 |

fabB or fabF, fadE | 2 | Both | 1.54 |

acpP and fabH, fadA or fadI | 2 | Both | 1.54 |

fabI, fadE, fabB | 3 | Sensitivity | 1.54 |

fabH, fadE, fabB | 3 | Sensitivity | 1.54 |

fabA, fabI, fadE | 3 | Sensitivity | 1.54 |

fabA, fabH, fadE | 3 | Sensitivity | 1.54 |

fabA, fadA or fadI, fabI | 3 | Sensitivity | 1.54 |

acpP and fabH, fadE, fabB | 3 | Sensitivity | 1.54 |

acpP and fabH, fabA, fadE | 3 | Sensitivity | 1.54 |

fadE | 1 | Both | 1.54 |

fadA or fadI, fabI | 2 | Sensitivity | 1.51 |

fabG, fadA or fadI | 2 | Flat | 1.51 |

fadA or fadI, fabA or fabZ | 2 | Flat | 1.51 |

trpC, pssA, pgk | 3 | Sensitivity | 1.15 |

trpC, pgk, psd | 3 | Sensitivity | 1.15 |

trpC, pgk | 2 | Sensitivity | 1.13 |

gltA, aceEF and lpd, pgk | 3 | Flat | 1.02 |

aceEF and lpd, pgk, acnAB | 3 | Flat | 1.02 |

We further illustrate the importance of the Redirector approach, using objective control, by contrasting it with the limitations of using flux boundaries to model metabolic alterations. Further details beyond those presented here are included in the Supporting Information

This method proved to be problematic in two ways (

The Redirector framework provides a new capability in modeling metabolic alterations, using the FBA objective. This capability is harnessed to develop designs incorporating many metabolic alterations, which work in concert to drive flux in new directions and result in high-production cellular metabolic factories. The objective control approach provides a more biologically relevant model of metabolic alterations, avoiding the unrealistically unlimited impact of changing flux boundaries. The Redirector framework is able to successfully develop designs for pathways where multiple chemical reactions are catalyzed by single enzymes, such as those that have elongation cycles, or complex branching or alternative pathways such as fatty acid metabolism. These designs find experimentally proven combinations of engineering targets along with novel targets in intuitive as well as distant pathways. By analyzing orthogonal and overlapping designs discovered by the framework, target dependency network mapping elucidates the relative importance and relationships of metabolic targets, in order to guide metabolic engineering. Altogether, these methods form an effective, flexible and widely applicable framework for developing metabolic engineering designs for high-production strains.

To demonstrate its capacity to optimize challenging pathways, we have applied Redirector to the production of myristoyl-CoA, examining the highest growth parameter design as well as the dependency analysis of multiple designs. The single high growth parameter design rediscovers experimentally proven combinations of targets including up-regulation of

Looking at the single myristoyl-CoA design in

Using objective control, Redirector pushes fluxes in new directions, finding ever higher impact metabolic engineering designs. In contrast, OptForce, a method with goals similar to those of Redirector, iteratively constrains the system to find minimal sets of reactions that force more flux into the desired product. We have shown that using flux-bound-based models of metabolic alterations proves challenging for fatty acid production for three main reasons. First, if flux bounds are not sufficiently strict, no viable constraint sets will be found. Second, the flux bounds discovered can lead to unrealistic constraints when mapping reactions to causal enzymes. Third, limits do not work additively, which restricts the number of targets that can work together in any one design and causes experimentally proven targets for fatty acid production to be missed. To further compare these methods, we applied the Redirector framework to the production of malonyl-CoA, which has been optimized as a precursor for the production of the flavonoid naringenin using OptForce

The breadth of malonyl-CoA production target combinations also allows us to further compare our dependency network mapping to experimental results for engineering target interdependencies, elucidating those targets which are required for others to be effective. Looking at the target dependency network mapping (Supporting Information

Redirector makes it possible to model metabolic alterations in a manner more closely representing reality, as changes to the catalytic landscape of the metabolic system that result in a new balance between synthetically created and naturally existing cellular drives. Redirector is able to represent the impact of metabolic alterations as redirection coefficients, which can be associated with each enzyme target and possible metabolic alteration. This model of balanced and interacting impacts is critical to discovering experimentally proven combinations of engineering targets. We have also shown that this objective control model of metabolic alterations is critical for enabling optimization of enzyme targets when enzymes catalyze multiple reactions. We observed that, when enhancing fatty acid biosynthesis, fluxes incentivized by the same enzyme necessarily had varying responses. These included reactions controlled by the same enzyme changing in opposite directions: some fluxes increase, while others are limited by network topology or are reduced in response to changes in biomass production.

The Redirector method complements experimental techniques in strain design and development. In particular high throughput genetic manipulation techniques such as MAGE

Here we develop a novel and more biologically relevant model for representing alterations to the metabolic system. Rather than modeling metabolic alterations as directly changing reaction flux values or boundaries, the Redirector framework uses changes in an objective function, in which both engineered enzyme targets and natural biological objectives are represented. Such an objective describes an organism that has to allocate its resources to achieve a compromise between its natural cellular programming and the alterations imposed on it by human engineering, aiming to generate a desired product. This combined objective of the system is represented as:

The system objective Z^{system} reflects the combination of the growth function Z^{growth} and the metabolic engineering changes modeled in the redirection function Z^{redirection}. In this paper the growth function is the standard FBA biomass production flux, v_{biomass}. We refer to γ as the progressive growth parameter, and it is used to tune the relative contribution of the growth function in the system objective, which becomes important in the progressive target discovery. J is the set of all metabolic reactions
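Read together with the definitions above, the system objective can be written out as follows. This is a reconstruction from the surrounding text rather than a quotation of the original equation; the symbol y_{j}^{l} for the binary objective inclusion variable and the index set L_{j} of coefficients available to reaction j are notational assumptions.

```latex
Z^{system} = \gamma \, Z^{growth} + Z^{redirection},
\qquad
Z^{growth} = v_{biomass},
\qquad
Z^{redirection} = \sum_{j \in J} \sum_{l \in L_{j}} \beta_{j}^{l} \, y_{j}^{l} \, v_{j}
```

With every y_{j}^{l} set to zero and γ = 1, this reduces to the standard FBA biomass objective.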

The impact of metabolic engineering alterations is represented by fluxes included in the redirection function. Each flux v_{j} in the redirection function is weighted by a redirection coefficient β_{j}^{l}. One of the design goals of Redirector is to represent engineering changes as modifications that cause resources to be diverted away from growth to desired end products. Yet, using metabolic engineering, the magnitude of diversion cannot be forced between specific bounds. Furthermore, the flux through any particular reaction also depends on the balance of fluxes throughout the metabolic network. Redirection coefficients support this design goal because they operate only as incentives to increase or decrease target fluxes rather than hard constraints. Redirection coefficients can be thought of as ‘impact factors’ for the engineered changes, as stronger metabolic alterations can be represented as larger redirection coefficients, which will generate greater contributions to the total redirection function, in turn providing stronger incentives for a metabolic network to direct flux through the corresponding reaction.

To represent the effects of metabolic engineering changes of different magnitude and direction, each reaction j can have multiple redirection coefficients β_{j}^{l}, each uniquely identified by l. Though each reaction can have multiple redirection coefficients, they are added together to form a single level that will be suggested as the metabolic alteration instruction for that reaction, as described below. Reaction fluxes are included in the redirection function through the use of the objective inclusion variable y_{j}^{l}. When y_{j}^{l} = 1, the flux v_{j} is included in the redirection function with coefficient β_{j}^{l}; when y_{j}^{l} = 0, the flux v_{j} is not included with that coefficient. The set of all β_{j}^{l} allowed for a particular optimization is referred to as the “redirection coefficient library”.

Using this formulation Redirector can also allow for the selection of multiple redirection coefficients for the same reaction. The final contribution of this reaction to the redirection function is then equal to the sum of the associated redirection coefficients. The sum of the associated coefficients is then considered the one singular suggested genetic manipulation for that reaction in the final target solution set with a relative strength equal to the summed value. Using multiple coefficients for the same reactions allows tuning of the level of up-regulation or down-regulation suggested for one reaction during the optimization.
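The summation of coefficients per reaction can be illustrated with a small sketch. The input format (a list of reaction and coefficient pairs), the function names and the flux values are hypothetical and chosen only to show the arithmetic; this is not code from the Redirector package.

```python
from collections import defaultdict


def suggested_alterations(selected):
    """Sum the selected redirection coefficients for each reaction.

    `selected` holds one (reaction, coefficient) pair per coefficient chosen
    by the optimization (hypothetical input format).
    """
    totals = defaultdict(float)
    for reaction, beta in selected:
        totals[reaction] += beta
    return dict(totals)


def redirection_value(totals, fluxes):
    """Contribution of the summed coefficients to the redirection function."""
    return sum(beta * fluxes[rxn] for rxn, beta in totals.items())


# Two coefficients chosen for R1, one for R2 (illustrative values only).
selected = [("R1", 1.0), ("R1", 0.5), ("R2", -0.5)]
totals = suggested_alterations(selected)
fluxes = {"R1": 4.0, "R2": 2.0}
value = redirection_value(totals, fluxes)
print(totals, value)
```

Here R1 ends up with a single suggested up-regulation of relative strength 1.5, and R2 with a down-regulation of strength 0.5.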

This formulation of the system objective allows us to create and control which reactions can be included in the redirection function, as well as their possible contribution to the redirection function. Specifically these factors are determined by choosing the contents of the redirection coefficient library. The number of reactions considered for inclusion in the redirection function can be narrowed by limiting the reactions that get redirection coefficients or broadened by allowing multiple redirection coefficients for each flux to be included in the redirection function. The number of coefficients for each reaction flux is determined using a coefficient tuning variable (s) described in the Supporting Information

The Redirector framework selects the optimal enzymes and metabolic changes and includes the related reaction fluxes in the redirection function by choosing from a list of possible redirector coefficients as set out in the redirection coefficient library. The inclusion of a positive redirection coefficient for a flux in the redirection function incentivizes increase of that flux (this would also penalize a decrease in flux) and a negative redirection coefficient has the opposite effect. Discovery of the optimal magnitude of redirection coefficients informs the relative strength of metabolic engineering changes. The optimal set of enzyme targets is determined as much by the reactions affected, as it is by the possible redirection coefficients in the redirection coefficient library. Further details of these redirection coefficient libraries can be found in the Supporting Information

We extend the bilevel implementation of objective control to find progressively higher impact sets of interacting targets. This is achieved by harnessing the competition between the two parts of the system objective, in which the growth function directs resources to the biomass, while the redirection function directs resources to the metabolite production. Once an optimal set of metabolite production targets has been discovered and the production objective is at its optimal possible level, no more targets will need to be discovered. However, increasing the relative strength of the growth function will necessitate selecting more enzyme targets to once again achieve high metabolite production. To this end, we expand the iterative local search method to include the adjustment of the progressive growth parameter (γ). At each iteration, γ is increased to a value where more targets must be selected to increase the strength of the redirection function and satisfy the production objective. This allows the Redirector framework to build upon the set of enzyme targets at each iteration. The combined effect of these targets results in an ever-increasing set of incentives to drive flux towards the production objective until the maximum potential redirection function is reached.
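The competition between the growth and redirection terms can be seen in a deliberately tiny toy problem, sketched below. It assumes a hypothetical two-flux network in which a fixed substrate uptake is split between biomass and product; real genome-scale models are far larger and would be solved with an LP solver. The function name and all numbers are illustrative assumptions.

```python
def inner_fba(gamma, betas, uptake=10.0):
    """Solve max gamma*v_bio + sum(betas)*v_prod subject to v_bio + v_prod = uptake.

    With one linear constraint and non-negative fluxes the optimum lies at a
    vertex, so comparing the two extreme flux distributions is sufficient.
    Ties are resolved in favor of growth.
    """
    weight = sum(betas)
    vertices = [(uptake, 0.0), (0.0, uptake)]  # (v_bio, v_prod)
    return max(vertices, key=lambda v: gamma * v[0] + weight * v[1])


# One target with coefficient 1.5 outweighs growth at gamma = 1.
first = inner_fba(gamma=1.0, betas=[1.5])
# Raising gamma to 2 lets growth dominate again with the same target.
second = inner_fba(gamma=2.0, betas=[1.5])
# Adding a second target (coefficient 1.0) restores production.
third = inner_fba(gamma=2.0, betas=[1.5, 1.0])
print(first, second, third)
```

Each increase in γ hands the advantage back to growth, so the search has to recruit additional targets to recover production, which is the mechanism the progressive search exploits.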

To drive the discovery of new targets we seek to increase the value of the progressive growth parameter (γ) such that the growth term will dominate the system objective. This is achieved by finding a new value of the progressive growth parameter, γ^{new}, that will result in an effective growth function value which is at least slightly larger than the contribution of the current redirection function to the system objective. The first term in the equation is the current strength of the redirection function. This term is calculated as the included redirection coefficients β_{j}^{l} multiplied by the difference between the current flux v_{j} and the flux when growth is maximized, v_{j}^{max growth}, for each included reaction j. The variable δ^{progress} is a small number used to ensure that γ^{new} is slightly larger than the current strength of the redirection function. This new progressive growth parameter will increase the effective strength of the growth function in the system objective such that overcoming it requires the inclusion of new reaction fluxes in the redirection function as a result of new enzyme targets being selected. The logical flow of the Redirector iterative local search algorithm, which incorporates the progressive growth parameter, is detailed in the figure describing the algorithmic flow, and the δ^{progress} and γ parameters are discussed in the Supporting Information.
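One possible reading of this update is sketched below: the current strength of the redirection function is summed over the included reactions and δ^{progress} is added on top. This is an assumption based on the verbal description only; the original equation may include additional normalization (for example by biomass flux) that is not recoverable from the text, and the function name, argument names and numbers are hypothetical.

```python
def new_growth_parameter(included, fluxes, fluxes_max_growth, delta_progress=1e-3):
    """Return a candidate gamma_new under the assumed form of the update.

    included          -- list of (reaction, beta) pairs currently in the redirection function
    fluxes            -- current flux v_j for each reaction
    fluxes_max_growth -- flux v_j^{max growth} for each reaction when growth is maximized
    delta_progress    -- small positive increment (delta^{progress})
    """
    strength = sum(
        beta * (fluxes[rxn] - fluxes_max_growth[rxn])
        for rxn, beta in included
    )
    return strength + delta_progress


included = [("R1", 1.0), ("R2", -0.5)]          # illustrative targets
fluxes = {"R1": 5.0, "R2": 1.0}                 # current state
fluxes_max_growth = {"R1": 2.0, "R2": 3.0}      # growth-optimal state
gamma_new = new_growth_parameter(included, fluxes, fluxes_max_growth)
print(gamma_new)
```

In this example both terms are positive: the up-regulated flux rose above its growth-optimal value and the down-regulated flux fell below it, so both alterations add to the strength that the new γ has to exceed.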

Many enzyme targets generated during the progressive target discovery depend on the inclusion of other core targets before they can contribute to an increase in the production objective. To determine the order in which targets should be engineered as well as their interdependency we develop a dependency network mapping method.

To carry out the process of dependency network mapping all targets for one production objective from separate neighborhood sizes and redirection coefficient libraries are pooled in any relevant combination. Then all subsets of this target pool up to size N are searched in separate optimizations, by forming the relevant system objective and performing a single-level FBA with this objective. The resulting flux states are checked to discover if the target combinations result in at least 20 percent of the maximum possible production of the metabolite of interest and, importantly, if they result in higher production than their component target sets. Through this analysis we discover which targets work as singles, doubles etc. In this way, dependency network mapping shows which enzymes form the core of metabolic production designs. Currently we focus our dependency network mapping on the discovery of dependency target sets needed to produce the metabolic product when γ = 0.02; thus, we only require that each engineering design needs to overcome a very weak growth function. As a result, the large sets of simultaneous targets found in progressive growth driven target discovery, which require larger values of γ, are not rediscovered.
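The subset search described above can be sketched as follows. This is a minimal illustration, not the published implementation: `production` stands in for a single-level FBA run with the relevant system objective, "component target sets" is interpreted here as all proper subsets, and the target names and production values are invented for the example.

```python
from itertools import combinations


def dependency_sets(targets, production, max_size, max_production, threshold=0.2):
    """Find target subsets that reach the threshold and beat every proper subset."""
    found = []
    cache = {}
    for size in range(1, max_size + 1):
        for subset in combinations(sorted(targets), size):
            value = production(frozenset(subset))
            cache[frozenset(subset)] = value
            # Best production achieved by any smaller component set.
            best_component = max(
                (cache[frozenset(c)]
                 for k in range(1, size)
                 for c in combinations(subset, k)),
                default=0.0,
            )
            if value >= threshold * max_production and value > best_component:
                found.append((subset, value))
    return found


def toy_production(subset):
    """Hypothetical stand-in for an FBA evaluation of a target set."""
    if "A" in subset:
        return 1.5          # A works alone
    if {"B", "C"} <= subset:
        return 1.1          # B and C only work together
    return 0.0


result = dependency_sets(["A", "B", "C"], toy_production, max_size=2, max_production=1.5)
print(result)
```

In this toy case the search reports A as a single and the pair B, C as a double, while pairs that merely add a passenger to A are discarded because they do not improve on A alone.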

The Redirector framework is built using free and, whenever possible, open software in a flexible, lightweight solution. The core is built with Python and currently uses the GLPK and SCIP solvers. LP optimizations were largely carried out in GLPK because of the ability to directly access GLPK functions from Python, while MILP optimizations were carried out in SCIP for faster solving speed. Computation was carried out on the Broad Institute computational cluster. The Redirector Package, including operational software code and metabolic network model files used for this publication, is available at


δ^{progress}. Shown here are the targets discovered by the Redirector method for the production of myristoyl-CoA (C14:0-CoA), using a search size of 4 metabolic alterations (k = 4) during iterations 3 and 4 (i = 3, i = 4). The leftmost column indicates the gene id of the targets. Redirection coefficients for the selected targets and the sum of fluxes through the reactions associated with those gene ids are given in the other columns. The table shows that the targets discovered and the fluxes through the associated reactions are completely unchanged as δ^{progress} is varied over a range of four orders of magnitude.
