A gap-filling algorithm for prediction of metabolic interactions in microbial communities

The study of microbial communities and their interactions has attracted the interest of the scientific community, because of their potential for applications in biotechnology, ecology and medicine. The complexity of interspecies interactions, which are key for the macroscopic behavior of microbial communities, cannot be studied easily experimentally. For this reason, the modeling of microbial communities has begun to leverage the knowledge of established constraint-based methods, which have long been used for studying and analyzing the microbial metabolism of individual species based on genome-scale metabolic reconstructions of microorganisms. A main problem of genome-scale metabolic reconstructions is that they usually contain metabolic gaps due to genome misannotations and unknown enzyme functions. This problem is traditionally solved by using gap-filling algorithms that add biochemical reactions from external databases to the metabolic reconstruction, in order to restore model growth. However, gap-filling algorithms could evolve by taking into account metabolic interactions among species that coexist in microbial communities. In this work, a gap-filling method that resolves metabolic gaps at the community level was developed. The efficacy of the algorithm was tested by analyzing its ability to resolve metabolic gaps on a synthetic community of auxotrophic Escherichia coli strains. Subsequently, the algorithm was applied to resolve metabolic gaps and predict metabolic interactions in a community of Bifidobacterium adolescentis and Faecalibacterium prausnitzii, two species present in the human gut microbiota, and in an experimentally studied community of Dehalobacter and Bacteroidales species of the ACT-3 community. The community gap-filling method can facilitate the improvement of metabolic models and the identification of metabolic interactions that are difficult to identify experimentally in microbial communities.


Introduction
Microorganisms form the most abundant group of living organisms on our planet. In nature microorganisms do not live in isolation, but in close association with one another, forming microbial communities. The value of microbial communities for biotechnology, the environment and human health has been profound due to their potential for use in industrial bioprocesses for the production of valuable chemicals [1], their role in biogeochemical cycles [2], and their effects on human health through the human microbiome [3].
The study of microbiomes has been extensively limited to the taxonomic classification of species and their correlation with different phenotypes. However, such correlations do not offer any mechanistic explanation on how the interactions among microbes and their environment form the observed phenotypes of the ecosystem [4]. A way to elucidate metabolic interactions in microbial communities comes from the use of a set of mathematical and computational techniques, called constraint-based methods, that make use of genome-scale metabolic models [5,6]. Constraint-based methods have been used extensively for the study of the metabolic functions of individual microorganisms as well as for strain design in metabolic engineering [7]. In the context of microbial communities, methods like SteadyCom [8], Opt-Com [9], d-OptCom [10], DMMM (Dynamic Multispecies Metabolic Modeling) [11], and COMETS (Computation Of Microbial Ecosystems in Time and Space) [12] give the opportunity to evaluate growth rates and metabolic interactions of community members under various conditions. Genome-scale metabolic models (GSMMs) can be reconstructed automatically from isolated genomes or pangenomes of specific organisms with a variety of tools [13], like Model-SEED [14] and Kbase [15], that create gene-protein-reaction (GPR) associations. However, genomes are often fragmented and contain misannotated genes [16], while the databases with information about enzyme functions and biochemical reactions used in the reconstruction process are not highly curated [13]. These problems lead to the creation of models with metabolic gaps, mass and charge imbalances and thermodynamic infeasibilities. Metabolic gaps are solved with the use of gap-filling methods, which are an indispensable part of the reconstruction process of organism-specific metabolic models [17]. The first published gap-filling algorithm was GapFill [18], which was formulated as a Mixed Integer Linear Programming (MILP) problem that identified dead-end metabolites and added reactions from the MetaCyc database [19] in the metabolic network. Based on this algorithm more efficient and userfriendly gap-filling tools have been developed [20,21]. Some platforms for metabolic network reconstruction and curation, like gapseq [22] and AMMEDEUS [23], make use of more computationally efficient gap-filling algorithms formulated as Linear Programming (LP) problems.
Other methods [24][25][26], including gapseq [22] and CarveMe [27], take into account genomic or taxonomic information in order to decide which biochemical reactions to add to the metabolic network. Other notable methods are OMNI [28] and GrowMatch [29], that try to maximize the consistency of the model with experimentally observed fluxes and growth rates respectively, while OptFill [30] is a method formulated to simultaneously solve metabolic gaps and thermodynamically infeasible cycles. Overall, most of the gap-filling methods to date are constraint-based methods that resolve metabolic gaps by adding biochemical reactions from a reference database, like ModelSEED [31], MetaCyc [19], KEGG [32], or BiGG [33], to the metabolic reconstruction of a specific organism.
Curated GSMMs of individual organisms can be combined in order to create a community model. However, GSMMs of organisms that naturally live in complex microbial communities, cannot be easily curated individually, and they often do not realistically represent the organism's metabolic potential after the gap-filling process is completed. This problem is mainly created because of the restricted amount of physiological information that can be collected experimentally for members of complex microbial communities. Microorganisms cannot be easily cultivated individually due to their complex metabolic interdependencies with other community members. This lack of physiological information is especially acute in the case of metagenomes from the environment or from enrichments. In this paper, we propose a gap-filling algorithm that resolves metabolic gaps in microbial communities, while considering metabolic interactions in the community. The proposed community gap-filling method combines incomplete metabolic reconstructions of microorganisms that are known to coexist in microbial communities and permits them to interact metabolically during the gap-filling process. Therefore, community gap-filling can be a useful method not only for resolving metabolic gaps of GSMMs, but also for predicting non-intuitive metabolic interdependencies in microbial communities. The accuracy of alternative community gap-filling strategies has already been explored with the help of metabolic models of a bacterial photoautotroph-heterotroph consortium composed of a Thermosynechococcus elongatus and a Meiothermus ruber strain, reconstructed using the KBase platform [34]. Here, we are offering an explicit formulation for an algorithm that builds compartmentalized metabolic models of microbial communities from GSMMs of individual microorganisms, and ensures decreased solution times for the community gap-filling problem. Considering the unexplored metabolic potential of many natural microbial communities, as well as the difficulty to produce highly-curated metabolic reconstructions for microorganisms, we believe that this method will benefit the study of complex microbial consortia.
The described community gap-filling method was applied to three independent case studies that demonstrate its ability to restore growth in metabolic models and predict both cooperative and competitive metabolic interactions between them by adding the minimum possible number of biochemical reactions from a reference database to the models. The three case studies involve a community of two Escherichia coli strains created for testing purposes, and the more complex communities of Bifidobacterium adolescentis with Faecalibacterium prausnitzii, and Dehalobacter with Bacteroidales species, which have interesting applications.
First, we demonstrate that our method successfully restores growth in a synthetic community comprised of two auxotrophic Escherichia coli strains: an obligatory glucose consumer and an obligatory acetate consumer. This community represents the well-known phenomenon of acetate cross-feeding that emerges among E. coli strains which grow in homogeneous environments containing glucose as the sole carbon source [35,36].
The community gap-filling method was also used to study the codependent growth of Bifidobacterium adolescentis and Faecalibacterium prausnitzii, two well-known bacterial members of the human gut microbiome, the most diverse group of microorganisms found in the human body [37,38]. One of the main functions of the intestinal microbiota is the anaerobic saccharolytic fermentation of complex carbohydrates such as dietary fibers [39], which leads to the production of short chain fatty acids (SCFAs) like acetate, propionate and butyrate. SCFAs are known for their various beneficial effects on the mammalian gut and metabolism [40,41], and in particular butyrate is known for its ability to function as an energy source for the epithelial cells of the gut and works against oxidative stress and inflammation [42]. A major butyrate producer in the gut of healthy adults is the bacterium Faecalibacterium prausnitzii [43,44]. The presence of F. prausnitzii has been connected with anti-inflammatory effects in mice [45], while its abundance is significantly decreased in some diseases of the gastrointestinal tract [46][47][48]. For these reasons, it is believed to have the ability to function as a probiotic and as a marker for inflammatory bowel diseases and colon cancer [49]. F. prausnitzii has been identified as a commensal, strictly anaerobic bacterium of the human gut that performs saccharolytic fermentation. It can utilize a variety of carbon sources from glucose, fructose, and fructo-oligosaccharides to complex molecules such as pectin and N-acetylglucosamine [50], and its typical fermentation products are butyrate, formate, lactate, and acetate [51]. F. prausnitzii is known for developing both competitive and syntrophic interactions with other species that live in the gut, as it competes for common carbon sources, but also has the ability to consume acetate produced by other species and convert it to butyrate. These interactions have been observed in cocultures of F. prausnitzii with bacterial species of the genus Bifidobacterium [52,53]. Bifidobacterial species belong to the phylum Actinobacteria, and are commensal, obligate anaerobes of the human gut, that perform saccharolytic fermentation. They can metabolize a variety of carbohydrates [54], and their most common fermentation products are acetate, formate, lactate, succinate, and ethanol [55][56][57]. The cogrowth of the species Faecalibacterium prausnitzii and Bifidobacterium adolescentis has been studied both experimentally [58,59] and computationally [60], and it has been shown that their metabolic interaction has the potential to enhance butyrate production [58]. For the community of F. prausnitzii and B. adolescentis, the community gap-filling method managed to predict metabolic interactions and SCFA production that were previously reported [58].
Finally, the community gap-filling method was used to replicate experimental results for the interaction between Dehalobacter sp. CF and Bacteroidales sp. CF50, which are the two main members of the ACT-3 community. The ACT-3 community is a microbial community found in soils and underground waters that are usually contaminated with chlorinated compounds. The study of this community has garnered scientific interest in recent years, as it has the ability to degrade chlorinated environmental pollutants [61][62][63]. The ACT-3 community is mainly composed of bacterial species that belong in the genera Dehalobacter, with relative abundance up to 80%, Bacteroides, and Clostridium [64]. The dechlorinating ability of the community mainly comes from the ability of Dehalobacter species to perform dehalogenation, i.e., anaerobic respiration with the use of chloroform or other chlorinated compounds as electron acceptors [65][66][67][68][69]. In this study, we use our metabolic reconstructions for Dehalobacter sp. CF and Bacteroidales sp. CF50, in order to simulate experimental results for the cogrowth of the two most abundant species of the ACT-3 community [64] and improve our models.

The community gap-filling method
The general principle behind the proposed method is demonstrated in Fig 1. Our algorithm creates a compartment for each organism of the simulated microbial community, by combining the biochemical reactions of the organism's metabolic model with those that come from a reference database. In this study, we created a curated database using the biochemical reactions available in BiGG [33]. While continuing to use the BiGG nomenclature, we removed all the biomass equations, reactions found only in eukaryotic organisms, and reactions with mass imbalances. Then, the organism compartments were allowed to exchange metabolites with one another and with their environment through a common metabolite pool where extracellular compounds can accumulate or deplete. In this way, each organism can synthesize biomass only by using its intracellular metabolites, and metabolites defined in the growth medium or produced by the other organisms of the community.
As we can see in Fig 1, our community model consists of the organism compartments and an exchange compartment which is the common metabolite pool. This community model is used to formulate an MILP problem that predicts the minimum number of reactions that need to be added from the reference database to the community model in order to achieve a userdetermined growth rate for the organisms that make up the community. The mathematical The community gap-filling algorithm is an MILP problem with the objective to add the minimum number of database reactions to the community model in order to restore biomass production in the individual metabolic models, while it satisfies some constrains for the reaction fluxes. In each compartment of the community model, the metabolite pools are assumed to be in steady state. The addition of each database reaction is controlled by a binary variable, which takes the value of 1 if the database reaction is added to the community model, and 0 otherwise.
where I n and J n represent the number of metabolites and reactions, respectively, in the n th microorganism compartment, while I c represents the exchanged metabolites in the common metabolite pool. Eq (1) is the objective function of the MILP problem which minimizes the total number of biochemical reactions that are added from the database to the metabolic models of the organisms composing the community. Eqs (2)-(7) are the constraints of the optimization problem. Specifically, Eqs (2) and (7) show that the metabolite pools belonging to each organism compartment and to the common exchange compartment respectively, are at steady state. Eq (3) shows the constrained fluxes of the reactions originally belonging to the organism models. The lower and upper bounds for the model reactions are selected in order to represent the growth of the organism under the desired conditions. Eq (5) represents the constraints for the fluxes through the biomass reactions of the organism models. The minimum threshold for the growth rate of each organism is defined based on existing information for the growth of the microorganism under the simulated conditions. Eq (4) implies that the fluxes of the reactions coming from the database are constrained between a lower and an upper bound, but a binary variable controls whether each database reaction will be added to the corresponding model. The binary variables, which are also used in the objective function, are defined in Eq (6). Each binary variable takes the value 1 if the database reaction is added to the community model, and the value 0 otherwise. It is noted that all the reactions from the database are initially considered reversible. Then, the minimum and maximum possible fluxes for these reactions are calculated from Flux Variability Analysis (FVA) [70] when the organism compartment is made. The FVA results are used as constraints for the database reaction fluxes, which also gives the opportunity to remove the reactions with both minimum and maximum flux zero. Using this approach, the solution space and time of the final gap-filling problem is reduced.
It is noted that the reaction fluxes in each one of the N organism compartments are reported per gram dry weight of the cells of the respective organism. In order to correct the mass balance in the community, we are discussing in S3 Appendix an alternative community gap-filling formulation which takes into account the relative abundances of the organisms in the microbial community.
The simulations were performed on a machine with two 22-core Intel(R) Xeon(R) Gold 6238 @ 2.10GHz and 768GB of RAM, running CentOS 7. For the generation of the results presented in this paper, the MILP problem was formulated and solved in MATLAB 2017b with CPLEX Interactive Optimizer 12.8 and CobraToolbox 3, and alternative solutions for the problem were calculated with the populate procedure of CPLEX and an upper time limit of two hours. The alternative solutions are used in each of the presented case studies in order to identify reactions added to the metabolic models of the participating microorganisms as well as metabolic interactions that appear more frequently in the solutions.

Sources of metabolic models
Toy E. coli community. S1 Appendix explains how we used the E. coli core model from BiGG [71] for the creation of two E. coli strains, one with the ability to use glucose as a substrate and another with the ability to use only acetate excreted from the first strain. S1 and S2 Tables contain information about reactions that were deleted from our core model and constraints for the exchange reactions that were implemented in order to create the models of the E. coli glucose utilizer and the E. coli acetate utilizer respectively. In this study, the selected lower bounds for the growth rates of the E. coli glucose utilizer and the E. coli acetate utilizer, were 0.9 and 0.09 h -1 , respectively, based on existing information about acetate cross-feeding between E. coli strains [35,72,73]. We also created an E. coli strain with the ability to consume glycerol by deleting and constraining reactions from our E. coli core model as shown in S3 Table. Then, we made a microbial community of three organisms with the models of the E. coli glucose utilizer, which was allowed to import or export glycerol, the E. coli acetate utilizer, and the E. coli glycerol utilizer, whose lower growth rate bound was set to 0.09 h -1 . This community was used for measuring the solution time of our community gap-filling algorithm as described in the Results section.
Bifidobacterium adolescentis and Faecalibacterium prausnitzii community. For our simulations, we downloaded the metabolic models for the strains Bifidobacterium adolescentis ATCC 15703 and Faecalibacterium prausnitzii A2-165 from the Virtual Metabolic Human Database [74]. The models are part of the AGORA version 1.03 [75], a collection of metabolic reconstructions of 773 members of the human gut microbiota. Both models are known to contain gaps in the pathways for the biosynthesis of some amino acids and vitamin B, as well as in their central carbon metabolism [74]. In this study we simulate the ability of the two models to use glucose as a substrate and exchange acetate and amino acids under anaerobic conditions. The applied constraints for the exchange reactions of the two models can be seen in S4 and S5 Tables respectively. The lower bounds for the growth rates of the B. adolescentis ATCC 15703 and the F. prausnitzii A2-165, were set to 0.6 and 0.4 h -1 , respectively. The constraints used for the exchange reactions of the organic acids, the amino acids, and the growth rates are based on experimental observations of B. adolescentis and F. prausnitzii strains cogrowth [58], and the study of the metabolic potential of the strain F. prausnitzii A2-165 [51].
ACT-3 community. For this case study, we reconstructed the metabolic models of the two most abundant species of the ACT-3 community, Dehalobacter sp. CF and Bacteroidales sp. CF50. S2 Appendix contains information about the reconstruction process for the two models. We allowed Dehalobacter sp. CF to consume chloroform and Bacteroidales sp. CF50 to consume lactate, while the two models grow anaerobically in the same medium and are allowed to exchange organic acids and amino acids. Based on the experimental study for the coculture [64], after 18 days, the community consumes 0.5 mM of chloroform and 0.75 mM of lactate, while the total cell density is around 6�10 7 cells�mL -1 , and the relative abundance of the Dehalobacter in the community is 66% on average. This information was used in order to calculate the uptake rates of chloroform and lactate by the community as shown in S6 Table. The reaction constraints for the two models are shown in S7 and S8 Tables. For our simulations, we used 0.1 and 0.01 d -1 as lower bounds for the growth rates of the Dehalobacter sp. CF and the Bacteroidales sp. CF50, respectively. All the constraints applied for the exchange reactions and the growth rates are representing experimental data from the cogrowth of the Dehalobacter and Bacteroidales species [64].

Results
We used our community gap-filling method in order to add biochemical reactions to metabolic models of organisms that contain incomplete pathways. Since the method simulates the cogrowth of organisms that are known to coexist in microbial communities, it is used for identifying both expected and unexpected metabolic interactions in the community. In this paper, we tested the potential of the method on an artificial E. coli community, and then, we applied it to two communities with significance for the health of the human gut and for the bioremediation of soil and water, respectively. For each microbial community, we discuss the best solution calculated by the algorithm, i.e. the solution that adds the minimum number of biochemical reactions to the models, as well as the patterns emerging from the ten best alternative solutions suggested by the algorithm. The MATLAB files with all the solutions calculated by the community gap-filling for each one of the studied microbial communities can be found in S1-S4 Files. The files of all the used models and the used community media are available on our GitHub.
A summary of the characteristics of the used metabolic models before and after the application of the community gap-filling can be seen in Table 1. Table 2 compares the number of reactions that are added to the metabolic models by the community gap-filling method and by gap-filling methods for individual GSMMs. For each metabolic model the individual-organism gap-filling methods add more reactions than the community gap-filling method, with the exception of the E. coli glucose utilizer model which was able to synthesize biomass before the gap-filling process, as we can see from Table 1. The function fastGapFill [20] from CobraToolbox 3 [21] adds the most reactions to all the models, since it tries to resolve all the blocked reactions and dead-end metabolites in a metabolic network. The reactions that are added to each one of the metabolic models by the individual-organism gap-filing methods are reported in S9-S14 Tables. Table 1. Characteristics of the metabolic models used in the study before and after the application of the community gap-filling method. The algorithm simulated the aerobic growth of a community of an E. coli glucose utilizer and an E. coli acetate utilizer on glucose, the anaerobic growth of a community of the strains B. adolescentis ATCC 15703 and F. prausnitzii A2-165 on glucose, and the anaerobic growth of a community of the species Dehalobacter sp. CF and Bacteroidales sp. CF50 on media with lactate and chloroform.

Model
Before

Toy E. coli community
In the beginning, we tested if our community gap-filling method can restore growth in an artificial microbial community of an E. coli glucose utilizer and an E. coli acetate utilizer strain, that is allowed to grow aerobically on glucose. Our community was made after the deletion of 12 reactions from the original models of E. coli core metabolism (Fig 2) as discussed in S1 Appendix. As expected, the community gap-filling algorithm, instead of adding back all the knockouts that we made, suggested the addition of the minimum possible number of reactions in the community in order to restore its function. The first solution calculated by the algorithm suggested the addition of one reaction to the E. coli glucose utilizer model and two reactions to the E. coli acetate utilizer model (Fig 2). More specifically, the pyruvate oxidase (POX) reaction was added to the model of the E. coli glucose utilizer to produce acetate from pyruvate. The additions to the model of the E. coli acetate utilizer include citrate synthase (CS), which was one of the initial knockouts, and fructose-6-phosphate utilizing phosphoketolase (PKETF), which converts acetyl-phosphate to fructose-6-phosphate. Apart from the reactions that were added from the database to the models, the community gap-filling algorithm suggested the activation of reactions that were already present in the models, but did not carry flux in the FBA solution of the original core model. More specifically, in order to sustain growth in the glucose utilizer the reactions of 6-phosphogluconate dehydratase (EDD) and 2-dehydro-3-deoxy-phosphogluconate aldolase (EDA) were activated to produce pyruvate from gluconate 6-phosphate, while pyruvate dehydrogenase (PDH) was also activated to convert pyruvate to acetyl-CoA. In the acetate utilizer the preexisting reactions of acetyl-CoA synthetase (ACS) and acetate kinase (ACKr) were activated in order to convert acetate to acetyl-CoA and acetylphosphate respectively, and malic enzyme (ME1) converted malate to pyruvate. The fluxes for all the reactions of the community, as well as a summative table of the reactions that were added from the database to the community can be found in S15 Table. A post gap-filling FBA for the two models shows that all the biomass precursors are replenished in both models. However, both of the models grow at suboptimal rates compared to the original core model (S1 Appendix and Table 1). For the glucose utilizer, the suboptimal growth rate of 0.9 h -1 was due to the export of acetate that led to a reduction in resources directed towards growth, whereas for the acetate utilizer the growth rate of 0.09 h -1 was caused by the use of reduced acetate as the only substrate. The fluxes from the ten best solutions calculated by the community gap-filling algorithm for the exchange reactions of the community (S16 Table) show that all the solutions predicted the ability of the E. coli glucose utilizer model to uptake glucose and produce acetate that is used by the E. coli acetate utilizer model for aerobic or anaerobic growth. We can see that the acetate utilizing strain either shares oxygen with the glucose utilizing strain or ends up growing anaerobically due to oxygen depletion by the glucose utilizing strain in different solutions. Moreover, all the reactions that were added from the database to the community in the ten best solutions (S17 Table), carry realistic fluxes with values ranging from -10 to 15 mmol�gDW -1 �h -1 .
We further used our artificial E. coli community in order to estimate the computational efficiency of our method. One major innovation of our community gap-filling method is the use of FVA before the solution of the main MILP problem. FVA is performed in order to identify the minimum and maximum fluxes that are possible for every database reaction in each organism compartment of the community. The fluxes calculated by FVA are used as constraints for the database reactions before building the community model. As a result, the solution space of the community gap-filling MILP problem is smaller and therefore, it can be solved faster. Many of the gap-filling methods to date solve MILP problems, but have not made an effort to reduce the solution space. From a computational perspective, the search of optimality can be tedious for MILP problems, and it is known that smaller feasible solution spaces can accelerate the solution of the optimization problem. Regarding the community gap-filling algorithm, the solving time increases when bigger databases are used and more importantly when the metabolic models of more organisms are added to the problem. As we can see from S1 Fig, the solving time of the MILP problem after the application of FVA is increasingly smaller as more organisms are added in the community, compared to the problem that was formulated immediately without the use of FVA. These results suggest that the FVA preprocessing step can reduce the solution time up to 84% (from 28 min to 4.5 min for a microbial community of two organisms, and from 83 min to 18 min for a microbial community of three organisms). It is noted that in order to measure the solution time of our MILP problem for a microbial community of three organisms, we added an E. coli glycerol utilizer strain to our artificial E. coli community (see Materials and methods section). The community gap-filling algorithm restored growth in this three-member community, and predicted that the E. coli glucose utilizer model provide acetate and glycerol to the models of the E. coli acetate utilizer and the E. coli glycerol utilizer respectively. The fluxes of the community reactions according to the first solution calculated by the algorithm are reported analytically in S18 Table.

Bifidobacterium adolescentis and Faecalibacterium prausnitzii community
In this case study, we applied the community gap-filling method to a community made from the models of the strains Bifidobacterium adolescentis ATCC 15703 and Faecalibacterium prausnitzii A2-165. The interaction between these two species has been studied experimentally and is considered beneficial since it promotes the production of the SCFA butyrate, which has antinflammatory effects in the gut [58]. Here, we simulated the anaerobic cogrowth of B. adolescentis and F. prausnitzii with glucose as the sole carbon source. None of the models can synthesize biomass in the simulated conditions, and the two models are allowed to produce SCFAs and exchange amino acids. The model of F. prausnitzii was permitted to either uptake or excrete acetate.
The community gap-filling algorithm restored community growth by predicting both competitive and cooperative metabolic interactions and by adding database reactions to the models. The first calculated solution added five reactions to the model of B. adolescentis ATCC 15703 and eight reactions to the model of F. prausnitzii A2-165 (Table 3 and S19 Table). The reactions added to the models of B. adolescentis and F. prausnitzii participated in the biosynthesis of cofactors and amino acids necessary for growth. The first solution predicted that B. adolescentis and F. prausnitzii share the available glucose from the community medium, while F. prausnitzii uses part of the acetate exported from B. adolescentis (Fig 3). Moreover, both models export CO 2 , B. adolescentis exports lactate and formate in addition to acetate, and F. prausnitzii exports butyrate, lactate and formate (Fig 3). This correlates with experimental observations [58]. Interestingly the model of B. adolescentis ATCC 15703 uses the bifid shunt pathway, which is centered around the key enzyme frutose-6-phosphate erythrose-4-phosphate-lyase (F6PE4PL) in our model. The bifid shunt pathway is unique in Bifidobacteria species that commonly use it for the efficient conversion of carbohydrates, like glucose, to acetate and lactate [55,56]. The biosynthesis of butyrate from the model of F. prausnitzii A2-165 happens through the reaction of butyryl-CoA:acetate CoA-transferase (BTCOAACCOAT) that converts acetate to butyrate. Beyond the expected behavior of the community regarding the use of carbon and energy sources, and the production of SCFAs, the best solution also predicts the exchange of numerous amino acids between the two community members (Fig 3). After adding the reactions suggested by the community gap-filling algorithm, FBA for the two models shows that the growth is restored, with growth rates of 0.6 h -1 and 0.4 h -1 under the simulated conditions, for the models of B. adolescentis and F. prausnitzii respectively.
We also evaluated the ten best alternative solutions calculated by the community gap-filling algorithm and found several interesting interactions. First, in almost all the solutions (S20 Table), the models of B. adolescentis and F. prausnitzii shared the glucose provided in the community medium. Also, four out of ten solutions predicted that F. prausnitzii consumes the acetate produced by B. adolescentis, while the rest of the solutions predicted that F. prausnitzii consumes glucose from the media and produces acetate (S2 Fig). This behavior is consistent with previous studies that show the ability of F. prausnitzii A2-165 to function both as a consumer and a producer of acetate depending on the conditions [51]. However, to our current knowledge, acetate production from F. prausnitzii in the presence of B. adolescentis has not been observed experimentally. For this reason, more detailed research into the conditions that could affect the acetate production in cocultures of F. prausnitzii with B. adolescentis could yield interesting results.
We can see that in most of the solutions in which F. prausnitzii consumes acetate, it also produces CO 2 (S20 Table and S2 Fig, Solutions 1, 3, 5), while it does not produce CO 2 when it produces acetate (S20 Table and S2 Fig, Solutions 4, 6, 8, 9, 10). This observation is in agreement with existing information about the acetate consumption and production from F. prausnitzii A2-165 [51]. It is noted that in most of the solutions, CO 2 production from F. prausnitzii is accompanied by CO 2 consumption from B. adolescentis, while in the rest of the cases the reverse CO 2 crossfeeding is observed. The exchange of CO 2 between the two models could be an indication that F. prausnitzii and B. adolescentis can use CO 2 fixation to cover their carbon needs especially in environments with limited amounts of carbohydrates. This possibility is supported by an experimental study that shows that the presence of CO 2 in the medium promotes the anoxic growth of B. adolescentis species [76]. However, it is unclear if CO 2 fixation that leads to acetogenesis can be performed through the Wood-Ljungdahl pathway in B. adolescentis and in F. prausnitzii, as it happens in other species of the gut microbiome [77,78]. Furthermore, the production of SCFAs from the models complies with what is expected from literature [58] being that for most of the alternative solutions B. adolescentis produces acetate, lactate, ethanol and formate, while F. prausnitzii produces butyrate, fromate and succinate (S20 Table and S2 Fig). Finally, we observed patterns in the amino acid exchanges with alanine, leucine, serine and tryptophan transferred from B. adolescentis to F. prausnitzii, and aspartate and cysteine transferred from F. prausnitzii to B. adolescentis in the majority of the alternative solutions (S20 Table and S3 Fig).
Apart from the patterns that emerged regarding the exchanged metabolites in the community, the alternative solutions to the problem also indicate a consistency in the reactions that are added from the database to the model of F. prausnitzii A2-165 (S21 Table). However, this is not the case for B. adolescentis, since the reactions added from the database to the model of B. adolescentis ATCC 15703 are not the same through different solutions (S21 Table). Even though most of the added reactions carry realistic fluxes, some of the reactions added to B. adolescentis carry unrealistically high fluxes (in the order of 10 2 mmol�gDW -1 �h -1 ) and participate in thermodynamic infeasible cycles (S21 Table, Solutions 1 and 4, reaction PDBL_3).

ACT-3 community
In this case study, we simulated the cogrowth of Dehalobacter sp. CF and Bacteroidales sp. CF50 that represent the most abundant bacterial species of the dehalogenating ACT-3 community [64]. Our simulation is set up in order to represent the conditions of a previous experimental study on the interaction between Dehalobacter and Bacteroidales [64]. More specifically, our community is allowed to grow anaerobically on chloroform, that can be used by the model of Dehalobacter sp. CF as an energy source, and lactate, that can be used by the model of Bacteroidales sp. CF50 as a carbon and energy source. The two models are allowed to exchange organic acids and amino acids. We show that the growth in the community of Dehalobacter sp. CF and Bacteroidales sp. CF50 can be restored by the community gap-filling algorithm.
The first solution calculated by the community gap-filling algorithm suggested the addition of one reaction to the model of Dehalobacter sp. CF and two reactions to the model of Bacteroidales sp. CF50 (Table 4 and S22 Table). More specifically, the reaction of calcium transport via the ABC system (CA2abc) was added in the model of Dehalobacter sp. CF. The model uses the export of Ca 2+ through the ABC system as a proton exchange pump to generate ATP. The additions to the model of Bacteroidales sp. CF50 include the reactions of ammonia transport (NH4ti) and anthranilate synthase (ANS2) that are used in the model in order to import ammonia and use it to produce pyruvate from chorismate. Moreover, the first solution also predicts the ability of Bacteroidales sp. CF50 to uptake lactate from the medium and produce hydrogen, CO 2 , acetate, and malate that are used by Dehalobacter sp. CF. In parallel, Dehalobacter sp. CF consumes the hydrogen produced by Bacteroidales sp. CF50 and chloroform from the medium, and produces chlorine and dichloromethane as it performs dechlorination. In addition, Dehalobacter sp. CF provides pyruvate to Bacteroidales sp. CF50. Overall, we see that the gap-filling algorithm was able to predict the experimentally observed behavior of the community [64], and it also predicted the exchange of amino acids between the two community members, which remained elusive in the experimental study. The exchanges between the two models are depicted in Fig 4. A post gap-filling analysis shows that all biomass precursors are replenished in both models, and FBA calculated a growth rate of 0.1 d -1 for Dehalobacter sp. CF, and 0.01 d -1 for Bacteroidales sp. CF50.
An analysis of the ten best solutions calculated by the community gap-filling algorithm shows that the model of Bacteroidales sp. CF50 consistently uptakes lactate, the only carbon and energy source in the medium, and ferments it into CO 2 , hydrogen, acetate, malate, fumarate and succinate, while the model of Dehalobacter sp. CF consistently performs dechlorination by reducing chloroform and producing dichloromethane and chlorine (S23 Table). Moreover, all the alternative solutions predicted the ability of Dehalobacter sp. CF to consume part of the acetate and all the malate and hydrogen produced by Bacteroidales sp. CF50, while providing Bacteroidales sp. CF50 with pyruvate (S23 Table and S4 Fig).
In most of the solutions Dehalobacter sp. CF consumes part of the CO 2 produced by Bacteroidales sp. CF50 (S23 Table and S4 Fig, Solutions 1-5, 8, 10). This exchange could indicate that

Reaction Formula
Reaction (BiGG ID)

PLOS COMPUTATIONAL BIOLOGY
Dehalobacter sp. CF performs CO 2 fixation, which is possible since the species is known to possess the Wood-Ljungdahl pathway [64]. Furthermore, we observed patterns in the amino acid exchanges between the two community members, with arginine, asparagine, isoleucine, lysine, phenylalanine, threonine, and tyrosine transferred from Bacteroidales sp. CF50 to Dehalobacter sp. CF, and alanine and glutamate transferred from Dehalobacter sp. CF to Bacteroidales sp. CF50 in the majority of the alternative solutions (S23 Table and S5 Fig).
More information about the quality of the solutions calculated by the community gap-filling algorithm can be retrieved by closer inspection of the reactions that were added from the database to the models in the ten best solutions. We can see that in most of the solutions (S24 Table, Solutions 1-6, 8, 10), even though the reactions added to the model of Bacteroidales sp. CF50 have realistic fluxes, those added to the model of Dehalobacter sp. CF carry unrealistically high fluxes (close to the infinite reaction bounds). For example, the already discussed reaction of calcium transport via the ABC system (CA2abc), which is used for ATP production in the model, has unrealistically high fluxes and forms a cycle with the reaction calcium transport in/ out via proton antiporter (CAt4) that already exists in the Dehalobacter sp. CF model (S24 Table, Solutions 1, 2, 3).
In order to further investigate the ability of our community gap-filling method to properly simulate the cogrowth of Dehalobacter sp. CF and Bacteroidales sp. CF50, we repeated the simulation of the community after opening the constraints for all the exchange reactions of the two models. It is noted that if we set infinite lower and upper bounds for all the exchange reactions of the two models, then the community gap-filling algorithm calculates irrationally high fluxes for some of the exchange reactions and does not predict with high accuracy the experimentally identified metabolic interactions in the community. However, if we set a proper order of magnitude for constraining the exchanges of organic acids and amino acids in the community, as shown in S25 Table, our algorithm results to more realistic solutions. In this case, the first solution of the community gap-filling algorithm adds one reaction to the model of Dehalobacter sp. CF, which stimulates a proton pump used for ATP production, and only one reaction to the model of Bacteroidales sp. CF50, which participates in the amino acid metabolism of the model (S26 Table). A quick view of the ten best solutions calculated by the community gap-filling algorithm (S27 Table) shows that the model of Bacteroidales sp. CF50 ferments lactate to CO 2 , hydrogen, acetate, and malate, which are consumed by the model of Dehalobacter sp. CF that performs dechlorination and provides pyruvate to the Bacteroidales sp. CF50, as expected. The model of Dehalobacter sp. CF also demonstrates the ability to produce fumarate (S27 Table, Solutions 1-3, 6-9) which is consumed by the model of Bacteroidales sp. CF50. Regarding the amino acid exchanges, some slightly modified patterns are observed with isoleucine, phenylalanine, proline, threonine, and tyrosine transferred from Bacteroidales sp. CF50 to Dehalobacter sp. CF, and alanine, glutamate, and tryptophan transferred from Dehalobacter sp. CF to Bacteroidales sp. CF50 (S27 Table). From the ten best solutions of the community gap-filling algorithm (S28 Table, Solutions 1,2,4,5,7,8,9), we can see once again that the reactions added to the model of Dehalobacter sp. CF carry unrealistically high fluxes.

Discussion
Our results demonstrate the ability of the community gap-filling method to resolve metabolic gaps and restore growth in metabolic models by adding the minimum possible number of biochemical reactions from a reference database to the models and activating already existing reactions of the models, while the models are allowed to interact metabolically. In the first test case study, the algorithm restored community growth and predicted the exchange of acetate between the two E. coli strains. For the community of B. adolescentis and F. prausnitzii in the second case study, the algorithm calculated the competitive consumption of glucose present in the common medium, the syntrophic exchange of acetate, CO 2 and amino acids, and the production of SCFAs. The algorithm also identified the exchange of organic acids, amino acids, CO 2 and H 2 for the community of Dehalobacter and Bacteroidales. In all three case studies, the metabolic exchanges were not strictly forced with constraints, but they emerged from the algorithm. Moreover, we proved that the community gap-filling algorithm adds less reactions to the metabolic models used in our three case studies compared to individual-organism gap-filling methods and therefore, it is less prone to false positive predictions about the metabolic functions of GSMMs. We also showed that the implementation of FVA in our algorithm resulted in significantly decreased solution time of the MILP community gap-filling problem.
In general, the community gap-filling method could facilitate the study of complex communities where the existing information for the involved species is incomplete. A method like this can be used at the last steps of the metabolic reconstruction process, and it can also become part of the efforts for the creation of a microbiome modeling toolbox [79]. However, like every other constraint-based method, our community gap-filling method has some weaknesses. One of the most prominent challenges is that the performance of the algorithm depends highly on the quality of the input models and databases being used. More specifically, the absence of some important reactions from the models can lead to poor predictive power, while unconstrained reaction directionality, especially for exchange reactions, can affect the predicted metabolic interactions and give birth to flux-carrying cycles between the community members. The algorithm performs better with highly curated models, but even then it shows preference over specific pathways. This bias comes from the fact that the process of manual curation and gap-filling for a model is based on the use of specific media and requires knowledge about the growth of the organism under specific conditions. In addition, the composition and the quality of the database reactions is important. Ideally the database should contain balanced biochemical reactions that represent a diversity of metabolic functions observed in prokaryotic organisms, or any type of organisms that represents the models being used. Regarding the solutions of the algorithm that predict reactions with unrealistically big fluxes and reactions that form thermodynamically infeasible cycles, they can either be excluded completely from the analysis of results as they are not trustworthy, or they could be avoided by further constraining the reactions of the metabolic models based on thermodynamic information [80,81]. From the perspective of computational efficiency, the total running time of the algorithm can be further decreased with the use of newer and faster parallel implementations of FVA, such as VFFVA [82].

Conclusion
In this work, we presented an algorithm that performs gap-filling while permitting metabolic interactions in microbial communities. Metabolites produced from one organism can either be released to the environment or used by another member of the community. Therefore, one of the key characteristics of the community gap-filling method is that it attempts to identify if interspecies exchanges of metabolites can compensate for insufficient metabolic capabilities of the community members, instead of just adding biochemical reactions from reference databases to the metabolic models of the community members. Our results showed that the algorithm can successfully predict both cooperative and competitive metabolic interactions in microbial communities and comply with experimental measurements and observations. With the ability to predict elusive metabolic interactions, the community gap-filling method can be used to generate hypotheses about potential metabolic relationships among community members. This direction is useful not only for the improvement of metabolic models, but also for understanding functions at the community level and providing valuable information for selective media composition that can enable the isolation of pure cultures for microorganisms.
The community gap-filling method can be further applied on communities with more than two members as it was demonstrated with the toy E. coli community. A way to use the community gap-filling method for the study of microbial communities could be the following. The community gap-filling method-preferably as formulated in S3 Appendix where the mass balance for the metabolites in the community is ensured-can be applied for the simulation of a microbial community on different media, while the constraints for all the exchange reactions of the participating models in the community are open and set to a proper order of magnitude. In this way, we can generate plenty of data on the function of the community in different environments. Evaluating these results to identify the reactions added to the models, as well as the metabolic interactions between the models, can offer an insight on which reactions are necessary for the metabolic reconstructions and which metabolic interdependencies are prevalent in the community. Such predictions can be further tested experimentally and can be used for improving gene-annotations by identifying gene functions, not based on sequence similarities, but rather on a growth restoration strategy.  (XLSX) S15 Table. Reaction fluxes from the best solution for the toy E. coli community and reactions added to the models. (XLSX) S16 Table. Fluxes of exchange reactions from the ten best solutions for the toy E. coli community.

Supporting information
(XLSX) S17 Table. Fluxes of added database reactions from the ten best solutions for the toy E. coli community.
(XLSX) S18 Table. Reaction fluxes from the best solution for the three-member toy E. coli community and reactions added to the models. (XLSX) S19 Table. Table. Fluxes of exchange reactions from the ten best solutions for the toy E. coli community (formulation with relative abundances). (XLSX) S31 Table. Fluxes of added database reactions from the ten best solutions for the toy E. coli community (formulation with relative abundances). (XLSX) S32 Table. Reaction fluxes from the best solution for the community of Dehalobacter sp. CF and Bacteroidales sp. CF50 and reactions added to the models (formulation with relative abundances). (XLSX) S33 Table. Fluxes of exchange reactions from the ten best solutions for the community of Dehalobacter sp. CF and Bacteroidales sp. CF50 (formulation with relative abundances). (XLSX) S34 Table. Fluxes of added database reactions from the ten best solutions for the community of Dehalobacter sp. CF and Bacteroidales sp. CF50 (formulation with relative abundances). (XLSX) S1 Fig. Time needed for the solution of the MILP problem with and without the prior use of FVA for the reduction of the solution space, depending on the number of organisms in the community. As the number of organisms that make up the microbial community increases, the community gap-filling algorithm needs increasingly more time to solve the MILP problem that restores growth in the community. However, the increase in solving time is significantly reduced with the use of our community gap-filling method that reduces the solution space of the MILP problem by performing FVA for each organism compartment of the community before formulating the optimization problem. (The presented measurements were performed with the models from the toy E. coli community). (TIF)