Providing new insights on the byphasic lifestyle of the predatory bacterium Bdellovibrio bacteriovorus through genome-scale metabolic modeling

In this study we analyze the growth-phase dependent metabolic states of Bdellovibrio bacteriovorus by constructing a fully compartmented, mass and charge-balanced genome-scale metabolic model of this predatory bacterium (iCH457). Considering the differences between life cycle phases driving the growth of this predator, growth-phase condition-specific models have been generated allowing the systematic study of its metabolic capabilities. Using these unprecedented computational tools, we have been able to analyze, from a system level, the dynamic metabolism of the predatory bacteria as the life cycle progresses. We provide solid computational evidences supporting potential axenic growth of B. bacteriovorus’s in a rich medium based its encoded metabolic capabilities. Our systems-level analysis confirms the presence of “energy-saving” mechanisms in this predator as well as an abrupt metabolic shift between the attack and intraperiplasmic growth phases. Our results strongly suggest that predatory bacteria’s metabolic networks have low robustness, likely hampering their ability to tackle drastic environmental fluctuations, thus being confined to stable and predictable habitats. Overall, we present here a valuable computational testbed based on predatory bacteria activity for rational design of novel and controlled biocatalysts in biotechnological/clinical applications. AUTHOR SUMMARY Bacterial predation is an interspecific relationship widely extended in nature. Among other predators, Bdellovibrio and like organism (BALOs) have received recently a great attention due to the high potential applications such as biocontrol agent in medicine, agriculture, aquaculture and water treatment. Despite the increasing interest in of this predatory bacterium its complex lifestyle and growth conditions hamper the full exploitation of their biotechnological properties. In order to overcome these important shortcomings, we provide here the first genome-scale model of a predator bacterium constructed so far. By using the model as a computational testbed, we provide solid evidences of the metabolic autonomy of this interesting bacterium in term of growth, as well as its dynamic metabolism powering the biphasic life style of this predator. We found a low metabolic robustness thus suggesting that Bdellovibrio is more niche-specific than previously thought and that the environmental conditions governing predation may be relatively uniform. Overall, we provide here a valuable computational tool largely facilitating rational design of novel and controlled predator-based biocatalysts in biotechnological/clinical applications.

and irreversible, the predator enters in the prey`s periplasm, where it grows and replicates DNA

76
during the GP using the cytoplasm of the prey cell as a source of nutrients and biomass building 77 blocks. When the prey is exhausted, B. bacteriovorus, grown as a filament, septs into several 78 daughter cells and, through its large array of hydrolytic enzymes, lyses the ghost-prey's outer 79 cell membrane and releases into the medium [9]. Interestingly, host-independent (HI) mutants 80 of Bdellovibrio strains have been developed under laboratory conditions. These HI predators are 81 able to grow axenically (without prey) in a rich-nutrient medium mimicking the dimorphic 82 pattern of elongated growth, division and differentiation [10]. It is worth noticing that the axenic 83 growth of these mutant strains is given by a mutation in the host interaction (hit) locus, which 84 has been described as being involved in regulatory and/or scaffold elements [11]. This argues in 85 favor of this mutation having no metabolic (enzymatic) impact. In fact, the main metabolism of 86 these HI derivatives should not have suffered changes with respect to the wild type Bdellovibrio 87 strains. 151 bacteriovorus HD100 and the automatic model Seed. Manual curation is required to accurately fine-tune the information contained in the metabolic model and several steps of network validation and analysis are required to finally obtain the metabolic model iCH457. B) Generation of condition-specific models: iCHAP and iCHGP. The general model iCH457 was 155 constrained based on nutrient availability (minimal and rich in silico media), biological role 156 (ATP production or biomass generation) and transcriptomic available data (Karunker et al.,157 2013)*. GIM 3 E algorithm was used to construct the condition-specific models.
158 159 iCH457 includes 457 ORFs, which represent 13 % of the coding genes in the genome, whose physiological evidence provided by Nguyen and col. and Muller and col. [27,28]

216
Supporting this computational analysis, radiotracer studies showed that strain 109J mainly 217 utilized host nucleosides monophosphate during intraperiplasmic growth, however it was also 218 able to synthesize its own pool of nucleotides [35,36]. This phenomenon has been traditionally 219 explained in the context of an "energy-saving" mechanism. Similarly, this mechanism has also 220 been reported, and validated by our in silico analysis, for phospholipid assimilation and the but not of amino acids or vitamins whose availability depends exclusively on the prey.

241
Likewise, detailed analysis of transport systems included in the model suggests B.

242
bacteriovorus' ability to obtain oligopeptides through prey proteins cleavage and use them as its 243 main source of carbon, nitrogen and energy during GP. 249 bacteriovoru and the complex environment provided by the prey, in terms of nutrients, prove 250 challenging when using classical validation workflows based on single nutrient sources.

251
Therefore, for iCH457 validation, we took advantage of spontaneous HI Bdellovibrio strains 252 developed under laboratory conditions. Such HI strains exhibit a similar lifecycle when growing 253 in a rich medium to the wild type strain growing inside the intraperiplasmic space of the prey

254
[40]. Indeed, because the HI phenotype has been attributed to putative regulatory and/or 255 scaffold mechanisms rather than to metabolic genes (enzymes) [41,42], these HI strains are 256 supposed to possess metabolic capabilities identical to those of the parental strains. Thus, for the 257 GEMs validation process, including potential carbon sources and biomass generation rates, we 258 decided to use data from HI Bdellovibrio strains for iCH457 validation.

259
Specifically, we validated the predictive capabilities of the iCH457 by comparing in silico 260 results with experimentally determined biomass production and growth rates of the HI strain B.

261
bacteriovorus 109 Davis [43]. The in silico growth rates were calculated using minimal medium 262 supplemented with selected carbon sources (S1 Text). iCH457 was very precise predicting growth rate on five different carbon sources, with an accuracy close to 100% in the case of glutamate, glutamine and succinate, and 70% for pyruvate and lactate ( Fig 4A).

285
Beyond the availability to predict growth rates, it is valuable to assess the model's ability to 286 predict the maximum amount of biomass produced from known concentrations of given carbon 287 and energy sources. Similar high accuracy was found regarding the predictability of biomass 288 production between in silico and experimental data (Kendall`s coefficient τ = 0.88) (Fig 4B). It 289 is noteworthy that the in silico analysis provided in these evaluations largely confirmed the 290 prey-independent metabolic states, thus shedding light on the predator's potentially autonomous metabolism. These results are in good agreement with the large amount of HI derivative strains 292 isolated previously [47] and the recent description of the metabolic response of AP cells in NB 293 medium to synthesize and secrete proteases [48]. Therefore, the obligate predatory lifestyle of 294 B. bacteriovorus should be questioned, at least from a metabolic point of view.

295
Overall, the high accuracy exhibited by iCH457 encourages us to use the model to characterize 296 and better understand the metabolic states that underline the biphasic growth cycle of B.

299
It is well-known that the environmental conditions and natural habitat of a given bacterium 300 largely influence its evolutionary traits, including processes of genome expansion/reduction.

301
Therefore, and taking advantage of iCH457, it would be interesting to address from a 302 computational perspective whether the genome content of the predator has been influenced by 303 its complex lifestyle. We identified a set of essential reactions in iCH457. The network 304 reaction(s) associated with each gene was individually "deleted" by setting the flux to 0 and 305 optimizing for the biomass function. A reaction was defined as essential if after constrained it 306 the growth rate decreased to less than 10% of wild type model. To properly contextualize the 307 reaction essentiality analysis, we compared our results with those from some free-living 308 organisms such as P. putida KT2440 (iJN1411) [46], E. coli strain K-12 MG1655 (iJO1366)

309
and Geobacter metallireducens , as well as with other bacteria that also possess 310 intracellular stages during their growth cycles, such as Yersinia pestis CO92 (iPC815), essential reactions and the size of the metabolic network or the microorganism's lifestyle. The

315
number of essential reactions found ranged from 214 to 419, with Y. pestis and P. putida being number of these essential reactions for the δ-proteobacteria, B. bacteriovorus and G.

319
This rate could be related with the lack of a secondary metabolism in this bacterial group, which 320 should be explored in depth in order to increase the computational value of the results.

328
Potentially, the 38 shared reactions would be part of the hypothetical essential metabolic 329 network. Overall, the reactions found in the shared essentiality group are related with cell 330 envelope, nucleotide, and cofactors (Table S6).

345
[20], changes that must be strongly determined by the microenvironment.

346
B. bacteriovorus AP cells are exposed to extracellular environment with highly diluted 347 concentration of nutrients, but during GP the predator finds a very rich environment inside of 348 the prey. As a consequence, it is reasonable to assume that the predator could hardly find aim of this phase is to grow, which implies a highly active metabolism (catabolism and 359 anabolism) supporting fast biomass generation. In fact, recent transcriptomic analyses have models were constructed by constraining iCH457 in terms of: i) nutrient availability, ii) 364 biological objective, and iii) gene expression profile. As a first step, based on the environmental 365 conditions, we defined two different in silico media: e.g., minimal and rich medium for AP and 366 GP, respectively (Text S1). Secondly, focusing on biological role, we used different biological 367 objectives for simulating AP and GP phases. Thus, under AP and GP, ATP production and 368 biomass production were selected as differential objective functions, respectively. Finally, in 369 order to constrain even more the solution space in each model, data from RNA-seq analyses 370 collected during AP and GP [49] were integrated into the metabolic model by using GIM3E

371
[54]. GIM3E is an algorithm which minimizes the use of reactions whose encoding gene 372 expression levels are under a certain threshold and finds a flux distribution consistent with the 373 target function (biomass generation for GP or ATP production for AP). Following this 374 workflow, we constructed two new models (iCHAP and iCHGP), mimicking AP and GP growth 375 phases, respectively (Fig 2B).

376
The number of reactions of each specific-condition model was significantly reduced (from 1001 377 to 810 and 841 in AP and GP, respectively). This significant reduction involves reduced 378 solution spaces, and thus likely more accurate predictions. As could be inferred given the 379 difference in biological objectives in each phase, the condition-specific models were 380 significantly different regarding the specific metabolic content (Fig 2B and Table S7). For  ( Table S7). In other 385 words, we found that while the unique enzymes present during AP were mainly involved in 386 energy production and cell survival, during GP the reactions were largely involved in anabolic 387 pathways including biosynthesis of biomass building blocks. potential differences in the metabolic states between AP and GP. Subsequently, we assessed the 391 most probable carbon flux distribution between the two condition-specific models to reveal 392 integrated information about the predator's metabolism (Fig 6). Thereby the behavior during AP 393 seems to follow a balanced oxidative metabolism aimed at energy production, including intense 394 flux across TCA and oxidative phosphorylation. On the contrary, no significant fluxes were 395 predicted across anaplerotic and biosynthetic pathways including gluconeogenesis, pentose 396 phosphate, and lipid biosynthesis, which suggests negligible participation of these metabolic 397 hubs during AP. Interestingly, a completely inverse metabolic scenario was predicted under GP.

398
Firstly, this specific model predicted key energetic metabolic pathways being partially inactive

499
The genome-scale metabolic model of B. bacterivorous HD100 (iCH457) was constructed using 500 standardized protocols for metabolic reconstruction [22,59], and is detailed in Fig 2A. Tables S1 and   518 S2. FBA is by far the most popular approach for analyzing constraint-based models and it is used in 521 many applications of GEMs. FBA uses optimization of an objective function to find a subset of 522 optimal states in the large solution space of possible states that is shaped by the mass balance 523 and capacity constraints. In FBA, the solution space is constrained by the statement of a steady-524 state, under which each internal metabolite is consumed at the same rate as it is produced [45].

525
The conversion into a mathematical format can be done automatically by parsing the 526 stoichiometric coefficients from the network reaction list e.g. using the COBRA toolbox [55].

548
[46] was used as a template for the biomass function of iCH457. However, data from B.

549
bacteriovorus were added when available (e.g. nucleotide composition -from genome 550 sequence). The detailed calculation of biomass composition is provided in Table S3.

551
Generation of growth phase-specific models: iCHAP and iCHGP 552 A given metabolic reconstruction is defined by the metabolic content contained in the genome 553 and thus is unique for the target organism. However, it is possible to construct different 554 condition-specific models by applying additional constraints such as condition-specific data 555 (including physiological), gen/protein expression and flux data, etc.

556
To construct condition-specific metabolic models we incorporated these additional constraints 557 to the model by means of a stepwise procedure including condition specific: i) biomass, ii) 558 nutrient availability and iii) gene expression data ( Fig 2B). Firstly, the objective function was 559 adjusted to the biological role of AP and GP. ATP maintenance and biomass equations were 560 selected as objective functions for AP and GP, respectively. In addition, different in silico media 561 were designed for each phase, simulating the availability of nutrients in each growth phase (S1 constrain even further the solution space us GIM3E [70]. GIM3E builds reduced models by 564 removing the reactions not available in the expression dataset while preserving model 565 functionality. It should be noted that GIM3E considers which genes are expressed or not, but 566 not the modifications in mRNA levels under different experimental conditions. A given gene 567 was considered expressed when its RNA levels in the RNA-seq analysis fell within the first 568 quartile, which is ≥ 10 RPKM using the available dataset [49].

569
The distribution of possible fluxes in the specific-condition models was calculated using the reference flux value. In order to determine the effect of a single reaction deletion, all the reactions associated with 574 each gene in iCH457 were individually suppressed from the matrix S. FBA was used to predict 575 the mutation growth phenotype. The singleReactionDeletion function implemented in the 576 COBRA Toolbox [55]

584
The associations between essential reactions and each bacterium were represented building a 585 bipartite network. For visualization we use Gephi software (0.9.2). These essential reactions and 586 the bacterial models were clustered in a heatmap using the pheatmap (v. 1.0.12) package within 587 the R environment (http://www.R-project.org).