Multi-Target Analysis and Design of Mitochondrial Metabolism

Analyzing and optimizing biological models is often identified as a research priority in biomedical engineering. An important feature of a model should be the ability to find the best condition in which an organism has to be grown in order to reach specific optimal output values chosen by the researcher. In this work, we take into account a mitochondrial model analyzed with flux-balance analysis. The optimal design and assessment of these models is achieved through single- and/or multi-objective optimization techniques driven by epsilon-dominance and identifiability analysis. Our optimization algorithm searches for the values of the flux rates that optimize multiple cellular functions simultaneously. The optimization of the fluxes of the metabolic network includes not only input fluxes, but also internal fluxes. A faster convergence process with robust candidate solutions is permitted by a relaxed Pareto dominance, regulating the granularity of the approximation of the desired Pareto front. We find that the maximum ATP production is linked to a total consumption of NADH, and reaching the maximum amount of NADH leads to an increasing request of NADH from the external environment. Furthermore, the identifiability analysis characterizes the type and the stage of three monogenic diseases. Finally, we propose a new methodology to extend any constraint-based model using protein abundances.

. Comparison of the Pareto fronts obtained for the algal metabolism [1] when maximizing ATP and NADH production. The seven optimization strategies are described in Section 1, while the optBioCAD optimization algorithm is described in Section 4.2 of the main manuscript. (a) (b) Figure S2. -dominance analysis in the algal mitochondria model [1] for the maximization of ATP and NADH, with the gene sets as Boolean decision variables (a), and the fluxes as real decision variables (b). The red points are the non-dominated solutions, while the other points are suboptimal solutions found by varying epsilon. In (a), the -dominance with = 10 −3 and = 10 −5 revealed no additional points.

-dominance analysis
We consider the optimization of ATP and NADH relating to the search of genetic strategies (ten allowed knockouts), and strategies of input fluxes (with bound equal to 10 mmolh −1 gDW −1 ). The corresponding Pareto front is shown in Figure S1, purple and green points respectively. They are also shown in Figure  S2 (red points). In fact, if i = 0, ∀i = 1, . . . , r, the solutions found by the -dominance analysis are non-dominated, i.e., Pareto-optimal points. Figures S9 and S10 (red line) show the number of optimal solutions found. They are 10 and 16, respectively. Subsequently, the analysis is repeated by varying from 10 −10 to 1. To adapt to each objective function f i we use * i = · f max i where f max i is the best value obtained by f i . In the first case ( Figure S2a and S9) no suboptimal points are obtained for from 10 −10 to 10 −2 . Therefore, in Figure  SI1-S9 the number of solutions remains equal to 10. For = 10 −1 we have 59 points, so 49 suboptimal solutions are found. Finally, for = 1, we find more than 7000 points. In Figure S2a we show these extra solutions (at different ) and their position with respect to the Pareto front. In the second case ( Figure  S2b and SI1-S10), no suboptimal points are obtained for from 10 −10 to 10 −7 . The number of solutions remains equal to 16 ( Figure SI1-S10). For from 10 −6 to 10 −1 , the number of total solutions increases from 20 to more than 90000 (only ten points are optimal, while the others are suboptimal); in Figure  S2b we show their position.

Pathway-oriented sensitivity analysis
PoSA (Pathway-oriented Sensitivity Analysis) is able to rank the pathways of the metabolic network by perturbing genes in terms of knockout. All the genes in a pathway b s , s = 1, . . . , p, where p is the number of metabolic pathways of the network, are perturbed randomly. The output(s) of the model is compared with the output without the input perturbation. We consider as output the vector of the fluxes after performing flux balance analysis and perturbing input(s). PoSA performs combinatorial perturbation, since gene knockout are represented by means of binary variables.
For this problem, we define the "elementary effect" for the input b s as   Figure S3. PoSA applied to the algal metabolism of C. reinhardtii [1]. In the key, only the most sensitive pathways are reported.
whereb s is the mutation on the input b s , and consists of the flip of bits chosen randomly in b s : if a bit is equal to 0 (or 1), the permutation turns it in 1 (or 0). ∆ s is a scale factor defined as The distribution of effects EE s is obtained permuting y by randomly sampling points from Ω. The estimation of the mean µ* and standard deviation σ* is used as indicator of which inputs should be considered important. Figure S3 highlights that the pathway related to the "transport mitochondria" (all the reactions of transport from and to the mitochondrion) is the most sensitive. Other very sensitive pathways are the "transport chloroplast" (all the reactions of transport from and to the chloroplast), pyruvate metabolism, pyrimidine metabolism, transport glyoxysome and glycerolipid metabolism. This result underlines that the mitochondrion plays a key role in the algal metabolism.
In two additional spreadsheets, we also report the lists of genes knocked out as a result of the ATP-NADH optimization in the algal model. The two lists have been obtained with a maximum of 10 and 50 knockouts allowed in the metabolic network.   Figure S4. We seek the optimal amounts of internal fluxes in order to maximize ATP and minimize NADH (a), and to maximize ATP and NADH (b). For this optimization, we take into account the internal fluxes of the metabolic network. In this experiment we search for the optimal value of 229 metabolic fluxes. We perturb a random number (between 0 and 5) of internal fluxes by using the mutation genetic operator, and we evaluate if the configuration of the network remains feasible, i.e. satisfies all the constraints of the metabolic network. If the constraints are satisfied, we calculate the objective functions, otherwise we repeat the random mutation until a maximum number of 10 trials. Beyond 10 trials, the mutation is not performed and the current solution is maintained. Red points represent the Pareto front, blue points feasible solutions and in black the wild type condition, i.e., mitochondria before optimization.     Figure S7. Pareto front obtained for the algal metabolism [1] when optimizing NAD and NADH production. We take into account the knockout status of the genes as decision variables, simulating the presence or the absence of the corresponding reactions in the metabolic network. We plot the output points corresponding to the Pareto optimal genetic strategies found with a bound of 10 and 50 knockouts allowed.  Figure S8. Feasible points associated with the Pareto front obtained in Figure S7.   indicates that the data are scattered (practical non-identifiability). "n.a." stands for "not available", indicating that the response is not significantly estimated by the predictors. An asterisk is added when r 2 > 0.9 and cv > 0.1, while another asterisk is added if the same functional group with r 2 > 0.9 and cv > 0.1 has been detected even if the role of response and predictors is switched, thus highlighting a strong interdependence between the variables involved.         Figure S31. α-ketoglutarate dehydrogenase deficiency -pathological stage. Optimal transformations β (y axis) found for the three fluxes R01214MM, R07599MM, and R07600MM (x axis) [µmol min −1 gDW −1 ] in the mitochondrial FBA model [2]. The order in which we observe variations in the optimal transformation is R01214MM > R07600MM > R07599MM. in the mitochondrial FBA model [2].