Inferring microbial interactions with their environment from genomic and metagenomic data

Microbial communities assemble through a complex set of interactions between microbes and their environment, and the resulting metabolic impact on the host ecosystem can be profound. Microbial activity is known to impact human health, plant growth, water quality, and soil carbon storage which has lead to the development of many approaches and products meant to manipulate the microbiome. In order to understand, predict, and improve microbial community engineering, genome-scale modeling techniques have been developed to translate genomic data into inferred microbial dynamics. However, these techniques rely heavily on simulation to draw conclusions which may vary with unknown parameters or initial conditions, rather than more robust qualitative analysis. To better understand microbial community dynamics using genome-scale modeling, we provide a tool to investigate the network of interactions between microbes and environmental metabolites over time. Using our previously developed algorithm for simulating microbial communities from genome-scale metabolic models (GSMs), we infer the set of microbe-metabolite interactions within a microbial community in a particular environment. Because these interactions depend on the available environmental metabolites, we refer to the networks that we infer as metabolically contextualized, and so name our tool MetConSIN: Metabolically Contextualized Species Interaction Networks.


Introduction
Microorganisms have profound impacts on ecosystems ranging from the human gut to forest soils to plant root systems.In humans, recent advances in technology have created a plethora of works describing the differences in microbial community, 1/21 arXiv:2307.02624v1[q-bio.QM] 5 Jul 2023 composition, and function between diseased patients and healthy controls [1][2][3][4][5][6][7], clearly demonstrating that microbial communities play an important role in human health.Likewise, environmental microbial communities have been found to affect biogeochemical cycling in soil [8], leading to changes in plant decomposition and soil carbon sequestration that effect the amount of greenhouse gases in the atmosphere.Even in plants, rhizosphere microbial communities affect growth and resilience [9] as well as response to drought [10].
To understand and predict the effects of microbial communities on their environment, we must first understand how these communities assemble and interact.Biotic interactions between microbes are a driving force in community assembly, with both positive and negative interactions between microorganisms creating different community compositions.Moreover,the ability for non-resident microorganisms to invade the community is also largely controlled by biotic interactions [11], which makes it critical to understand these interactions to accurately predict treatment success for microbiome manipulation.It is well established that community structure is important in determining the impact of the microbiome on its host environment.For example, disease-free asymptomatic individuals will have pathogenic bacteria in their microbiome [7], suggesting that community structure and microbial interactions affect the host-microbe relationship.
The advent of modern sequencing and metabolic pathway analysis has led to an effort to organize this data into useful models of microbes and microbial communities.These models, which represent mathematically the internal network of chemical reactions within a cell's metabolism are called genome-scale metabolic models (GSMs) [12,13].GSMs and the constraint-based reconstruction and analysis (COBRA) methods that make use of them have shown growing promise in predicting and explaining the structure and function of microbial communities [14][15][16][17].However, the complexity of these models means that analysis is often based only on simulation, and is very sensitive to parameters and other assumptions.For example, many modern community metabolic modeling methods seek to predict co-culture growth or biomass at chemostatic equilibrium using artificial community-wide constraints [18][19][20].On the other hand, dynamic methods that use predictions about growth rate and metabolite consumption to construct a dynamical system suffer from dependence on unknown metabolic parameters and initial conditions, as well as heavy computational cost [21][22][23].In fact, most tools for community modeling only provide predictions of species growth rates and metabolite consumption, without providing an understanding of the fundamental interactions that lead to these predictions [24,25].Some qualitative insight into the systems is possible using simulated knock-out experiments [20,26] or simplifying the system [27].However, new methods for qualitative analysis of community metabolic models are needed.
An interaction network provides an interpretable object that can be used to characterize a microbial community in more depth than composition alone [28], and suggest keystone taxa and other functional properties of the community [29,30].These advantages, and the apparent importance of microbial interactions, have led to the use of network inference and analysis for understanding important phenomena including disease treatment [31] and human impact on the climate [32].The most commonly used method for network inference involves computing the propensity for microbes to appear together in a sample, most commonly defined by co-occurrence frequency, correlation, or covariance [33,34].More sophisticated methods for inferring associations between microbes include the use of regression-based and probabilistic models [28,35] or fitting to time-longitudinal data [36].Additionally, some modern methods have sought to combine mechanistic hypotheses with statistical network building using machine learning approaches by incorporating "background knowledge" of known microbial interactions [37] or using simple microbial characteristics along with a set of known interactions [38].
GSMs and COBRA modeling provide an attractive avenue for a "bottom up" approach to network building from underlying metabolic mechanism [39].This can be done using simulated knock-out experiments [20], but this approach suffers from a focus on direct microbe-microbe interactions, which lead to models that lack the complexity of full metabolite mediated networks [40,41].While differences in networks across meta-groups may provide insight, these networks in general provide few avenues for prediction and design.Patterns in network structure cannot be directly related to function without further study, and networks built in this way cannot account for dynamically changing interactions across perturbations in the environment.
In this manuscript, we present a method for inferring interactions between microbes and metabolites within a microbial community by leveraging genome-scale metabolic models (GSMs).This method requires only some method of constructing GSMs as well as an estimate of the metabolic environment of the community.GSMs can be built as long as a genome can be assigned to each member of the community, using automated construction methods such as CarveME [42] or modelSEED [43].Assigning genomes to community members in a sample can be done with genomic or metagenomic data, or if that data is not available, a less accurate assessment can be done by matching amplicon sequence data with previously characterized genomes.
Our method is based on Flux balance analysis (FBA), which allows us to infer microbial growth and exchange of metabolites with the environment.These can be combined into a dynamical system, called dynamic flux balance analysis (DFBA) which in turn can be represented as a sequence of networks.Simulation of dynamic flux balance analysis requires the solution to a linear optimization problem at each time-step.These solutions can be found without repeated optimization by using a basis for an initial solution, which allows us to find new solutions as the problem constraints change simply by solving a linear system of equations.This means that we can reformulate the dynamical system as an ordinary differential equation (ODE) that has solutions that match the solution to the DFBA problem for some time interval.Finally, this ODE system can be naturally interpreted as a network of interactions between microbes and metabolites, and also provides a network of interactions between the metabolites that is mediated by the microbial metabolisms of the community.Figure 1 provides a graphical summary of the method.

Background Dynamic flux balance analysis
Advances in genetic sequencing have led to the construction of genome scale models (GSMs) of the metabolic pathways of microbial cells, and to methods to analyze and draw insight from such large scale models [12].Constraint based reconstruction and analysis (COBRA) is used to model steady state fluxes ψ i through a microorganism's internal metabolic reactions under physically relevant constraints [12].Flux balance analysis (FBA) is a COBRA method that optimizes some combination of internal reaction fluxes which correspond to increased cellular biomass, subject to the constraint that the cell's internal metabolism is at equilibrium.
Precisely, flux balance analysis assumes that cell growth and metabolic flux can be Internal fluxes ϕ maximize growth: max(γ • ϕ) Exchange constrained by metabolite availability: Cell at chemical equilibrium: Reactions constrained: determined by solving the following linear program [23]: where the matrices Γ * , Γ † together represent the stoichiometry of the cell's metabolism, the vector ψ represents the flux through the cell's internal reactions, the objective vector γ encodes the cell's objective, exchange constraints c 1 , c 2 are determined in part by available external metabolites and internal constraints d 1 , d 2 are known.Exchange rates v j of metabolite j between the cell and its environment are in turn determined by internal flux according to v = Γ * ψ.For convenience, we define a vector to be the vector of all of the problems constraints.
Solutions to FBA provide a rate of increase of biomass which can be interpreted as a growth rate for a cell.Furthermore, FBA solutions allow us to compute the vector v, which represents metabolite exchange between the cell and an external metabolite pool.By assuming that constraints on nutrient exchange reactions within the metabolic network are functions of the available external metabolites, the coupled system of microbe and environment can be modeled.For a community of microbes x = (x 1 , ..., x p ) in an environment defined by the concentration of nutrients y = (y 1 , ..., y m ) this model has the form [23]: with ψ i determined separately for each organism according to a linear program of the form eq. ( 1).This system is referred to as dynamic flux balance analysis (DFBA).Note that this is a metabolite mediated model of the community, meaning that the coupling of the growth of the separate microbes is due to the shared pool of metabolites y.

Piece-wise smooth representation
Simulation of the dynamical system given by eqs.( 3) and ( 4) can be accomplished by leveraging the fundamental theorem of linear programming [23], which states that if eq. ( 1) has an optimal solution, then it has an optimal solution that can be represented as the solution to an invertible system of linear equations [44].This means that there is some invertible matrix B and index set B such that is an optimal solution to the linear program (where for ease of notation we substitute c = c B ).The key observation allowing efficient forward simulation of eqs.( 3) and ( 4) is that as the constraints c 1 (y), c 2 (y) vary, the matrix B does not change.In other words, there is some time interval such that we can replace the linear program eq. ( 1) with the linear system of equations for some time-interval, where c i is a subset of the bound functions c i .At the end of this time interval, the solution to eq. ( 6) stops obeying the problem constraints, and new B i must be chosen.Putting this together, we can define a sequence of time intervals [t 0 , t 1 ), [t 1 , t 2 ), ..., [t n−1 , T ) such that solutions to the system defined by dynamic FBA for a community (eqs.( 1), ( 3) and ( 4)) are solutions to the system of ODEs on the interval [t k , t k+1 ) for some invertible matrices B k i .The challenge of efficient forward simulation of DFBA is then in finding the matrices B k i , which may be non-unique.In previous work, we presented a method for choosing the set of B k i that allow forward simulation so that t k+1 > t k , and created a python packaged called SurfinFBA for simulation.Furthermore, we have recently improved this method to increase the length of the time intervals [t k , t k+1 ) (see appendix).This improvement is packaged with the MetConSIN package, which includes SurfinFBA.

MetConSIN network construction
The dynamical system defined by DFBA (eqs.( 1), ( 3) and ( 4)) can be simulated but is difficult to interpret and analyze, especially when accounting for uncertainty in initial conditions and bound functions c 1 i (y), c 2 i (y).However, eqs. ( 7) and ( 8) suggest that the system can be interpreted as a network of interactions between microbes and metabolites on the time interval [t k , t k+1 ) with only mild assumptions on c 1 i (y), c 2 i (y).DFBA therefore implies a sequence of interaction networks representing the dynamics of a microbial community.Furthermore, for a fixed metabolic environment (i.e.fixed y) DFBA provides an interaction network representing a snapshot of community metabolic activity.
Without loss of generality, we may construct eq. ( 1) so that the forward direction (i.e.positive flux) of each of the first m reactions transports one of the m environmental metabolites into the cell.Then we assume that c 2 (y) has the form with non-decreasing c 2 ji (y j ), and that c 1 i (y) = c 1 i is constant.In plain language, we assume that the fluxes of reactions that transport environmental metabolites into the cell are bounded by the availability of the corresponding environmental metabolites, and the other reactions have constant bounds.
Under this assumption, for time-interval k, eq. ( 7) can be rearranged into the form where C k i is a constant that we refer to as intrinsic growth, and the a k ij are combinations of entries in γ i and (B k i ) −1 .Likewise, eq. ( 8) can be rearranged into the form where D k il is a constant, and the b k ijl are entries of the matrix Γ * i (B k i ) −1 .We may now interpret these ODEs as networks of interactions term by term.Equation ( 10) can be interpreted as growth of a microbial population proportional to the population biomass, with growth rate modified by the environmental metabolites y j .This is similar to a generalized Lotka-Volterra model [41], and can be naturally represented by a set of network edges pointing from a metabolite to the microbe with weights a k ij .The terms in eq. ( 11) are slightly more complicated to interpret.The terms D k il x i represent some effect of the microbe i on the available biomass of j over the time interval [t k , t k+1 ) which only changes with the biomass x i of microbe i over this interval.This effect is the result of growth pathways that do not depend on metabolite availability, and may be 0. Additionally, when j = l, the term b k ill x i c 2 il (y l ) can be interpreted as pairwise interactions between microbe i and metabolite l, e.g.consumption of a carbon source.These two sets of terms can be represented by a set of network edges pointing from the microbe to the metabolite, with weights D k il + b k ill .The remaining terms represent interactions that involve two metabolites and one microbe.In the formalism of interaction network theory (see [45][46][47][48]), these remaining terms can be interpreted as reactions of the form 6/21 if b ijl < 0 (and so the interaction increases the available biomass of metabolite l), meaning that microbe i and metabolite j interact to form metabolite l.This means, e.g., that metabolite l is created as a byproduct of the metabolism of metabolite j by microbe i.In this case, we again represent the interaction as a network edge from microbe i to metabolite l, but now annotate the edge with the information that this interaction is mediated by metabolite j.Finally, we may do the same if b ij > 0, although we note now that this represents a non-autocatylitic interaction, meaning that the available biomass of metabolite l is reduced independent of the current available biomass.While this seems counter-intuitive, it arises when metabolite l is consumed in some metabolic pathway but it is not the rate-limiting external metabolite for that pathway.In fact, when enough of the biomass is consumed so that metabolite l becomes rate-limiting, the system will transition to the next interval [t k+1 , t k+2 ) and the ODEs eqs.( 10) and (11) will change.This transition ensures that the non-negative orthant is forward invariant for the DFBA system, meaning the system will non achieve a non-physical state of negative biomass.The mapping from the ODEs to a microbe-metabolite network is summarized in table 1.

Metabolite interaction network construction
In addition to the microbe-metabolite interaction network described above, the last set of interactions suggest a second network can be formed which includes only the metabolites.In the microbe-metabolite network, the terms b k ijl x i c 2 ij (y j ) in eq. ( 11) describe how microbe i effects the available biomass of metabolite l as mediated by metabolite j.We may instead interpret this as the metabolite j effecting the available biomass of metabolite l through reactions carried out by microbe i.This interpretation suggests a network of edges labeled by the microbe whose metabolism contributes the edge.

Microbe interaction network construction
While dynamic FBA can be written as a series of microbe-metabolite interaction networks, researchers are often interested in the emergent interactions between microbes themselves.MetConSIN provides a simple heuristic for inferring these interactions based on the competitive and cross-feeding interactions of the microbe-metabolite network.The heuristic is as follows: to determine the effect of microbe X i on the growth of microbe X j , we find the set of all paths of length two of the form with weights in the microbe-metabolite network w il and w lj .If w il < 0, meaning that X i consumes or otherwise depletes Y l , while w lj > 0, e.g.Y l is a limiting resource for the growth of X j , then this pair of interactions can be interpreted similarly to competition between X i and X j , although this competition does not need to be symmetric.Conversely, if w il > 0 and w lj > 0, then the presence of X i will increase the growth of X j through cross-feeding.We therefore take as the composite edge weight wij of the emergent interaction X i wij − − → X j to be the sum over all such paths of the products of the weights of the edges in the path: wij = l:Xi→Y l →Xj w il w lj . (13)

Bacterial Isolation
Ten bacterial isolates were originally isolated using serial dilution from soils collected in Utah [49]  Tryptic soy Medium (TSB), or Nutrient medium at 30oC for 24-72 hours depending on the strain.Single colonies were then transferred in their respective growth medium and grown for 24-72 hours at 30oC while shaking at 250rpm.Bacterial biomass was harvested from overnight cultures by centrifugation and High Molecular Weight (HMW) DNA extractions were completed using the Qiagen MagAttract HMW DNA Kit (Qiagen, Hilden, Germany) following manufacturer's protocol.Two of the bacterial isolates required an additional clean-up after DNA extraction which was completed using the Qiagen PowerClean Pro Clean-up Kit (Qiagen, Hilden, Germany) following manufacturer's protocol.

Library Preparation
DNA library preparation and sequencing was completed at the LANL Genomics Facility as described in detail below.DNA initial quantification was done using Qubit High sensitivity ds DNA kit (Invitrogen).DNA integrity was assessed on Tape Station using gDNA Screen tape (Agilent Technology).DNA purity ratios were determined on NanoDrop 1 spectrophotometer (ThermoScientific).1ug of genomic DNA for each sample was sheared using g-Tubes (Covaris, USA).Shearing parameters were chosen according to the DNA integrity of a particular isolate.All but two samples were sheared the following way: shear 2 min at 7,000 rpm, flip the tube, and shear 2 min at 7,000 rpm.More fragmented samples were sheared using the following parameters: shear 2 min at 3,500 rpm, flip the tube, and shear 2 min at 3,500 rpm.Sheared DNA was collected and purified using AMPure PB beads (Pacific Bioscience, USA) as per PacBio protocol.The quality and quantity of the purified DNA were assessed using the TapeStation and Qubit as described above.SMRT bell templates were constructed according to the PacBio protocol using Express Template Prep Kit 2.0.First, DNA underwent damage repair, end repair and A-tailing.It was followed by the barcoded overhang adapter ligation and purification with 0.45X volume of AMPure PB beads (Pacific Bioscience, USA).The barcoded samples were pooled in the equimolar amount according to the volumes provided in the PacBio Microbial Multiplexing Calculator.The pooled SMRT bell library was quantified using the Qubit DNA HS kit (Invitrogen) and the average fragment size was determined on Bioanalyzer using the DNA HS kit (Agilent).The conditioned sequencing primer v. 4 was annealed and Sequel II DNA Polymerase 2.0 was bound to the SMRT bell library.The template/ DNA polymerase complex was diluted and purified with 1.2X volume of AMPure PB beads (Pacific Bioscience, USA).Priestia megaterium strain s92 1404 16 Table 2.The 10 bacterial genomes that were used to demonstrate the method were isolated from soil samples and sequenced with PacBio.Genomes were assembled using the procedure detailed in the methods section and are available on the sequence read archive database.We constructed GSMs for these genomes using modelSEED, which are available in the Examples directory of the project repository.Full details of the sequencing results can be found in S2 Table.
instrument, using 1 SMRT cell 8M and Sequencing chemistry 2.0, 30 hour movies were recorded.

Sequencing
The raw PacBio reads were converted to PacBio HiFi reads using the "CCS with Demultiplexing" option in SMRTLink 11.0.0.146107.This resulted in a total of 1,532,731 HiFi reads for a total yield of 8.3 Gbp.The median read quality was Q37 with a mean read length of 5,409 bp.The reads were assembled using Flye v.2.9-b1768.Putative number of plasmids were estimated by looking at the assembly_info.txtfiles output by Flye.This file indicates if a contig is circular and/or a repeat.Contigs that were indicated as circular but not a repeat, as well as under 500 Kbp were assumed to be plasmids.Contigs that had the same attributes but were over 500 Kbp were assumed to be a complete bacterial chromosome.Then, the assemblies were annotated using Prokka v.1.14.6.The taxonomy of the genomes were derived using gtdbtk v.1.5.0.

Genome-Scale Model Reconstruction & MetConSIN Simulation
Genome-scale models for the 10 bacterial genomes were created using modelSEED [43] within the KBase computational platform [50].The models were gap-filled with a complete media.The resulting models were used to test the MetConSIN simulation method, with models labeled according to the genome ID of the corresponding bacterial genomes.For the clarity of the network figures, we label the nodes corresponding to each model with the unique 1-or 2-digit integer that appears in the genome ID.Table 2 lists the IDs, classification, and node labels for the 10 models, and the supplemental file S2 Table contains details of the sequencing results.

Results & Discussion
MetConSIN provides analysis of the dynamic flux balance analysis (DFBA) system by inferring a set of interaction networks from that system.To demonstrate this utility, we simulated the growth of 10 taxa isolated from soil using DFBA, and used MetConSIN to construct the series of interaction networks that the community behaved according to over the course of the simulation.Figure 2 shows the simulated growth of genome-scale models of all 10 taxa in the simulated community on a finite media in an aerobic environment, all of which reached stationary phase when glucose was depleted.The community grew through a set of 3 distinct time-intervals, each with a corresponding species-metabolite and metabolite-metabolite network.These networks, as well as the time-weighted variance between them are shown in fig. 3 and fig. 4. The two major transitions in the simulation both involved a series of basis-changes, meaning that one or more microbes altered their connectivity in the network.The first transition occurred when the model of genome bc1012 altered its connectivity three times in rapid succession.In the second transition, models of genomes bc1010, bc1002, bc1015, bc1003, bc1001, bc1009, and bc1012 all altered their network connectivity, with bc1015 and bc1001 doing so twice.We display which microbe changed its network connectivity at each transition with the style and color of the dashed lines in fig. 2 that indicate the time-points at which the transitions occurred.
The two sets of networks provide a mechanistic explanation of the microbial growth and metabolic activity of the community.These networks tell us which microbes are consuming and producing environmental metabolites, as well as how the environmental metabolites effect cell growth.For any time interval, an edge from a metabolite to a microbe has a non-zero edge weight if and only if the simulated growth rate of the microbe is a function of the concentration of the metabolite during that interval.Inspection of the network reveals that only a few such edges exist, even though many metabolites are depleted by microbes.This is because only a subset of the constraints of flux balance analysis determine the growth rate, as indicated by the basic index set B that is used to solve DFBA.In other words, only rate-limiting metabolites appear as source nodes in the network.
Inspecting the network can reveal interesting time-dependent interactions.For example, we notice in fig. 4 that for t ∈ [0.24, 0.42), community metabolism of D-Glucose causes consumption of Fumarate and production of Succinate.This interaction is very strong during this interval, which lies between two time-points at which the model of genome bc1012 changes its network connections, so we might guess that this interaction is mediated by that model.MetConSIN provides edge data for each edge in the network, including in the case of the metabolite-metabolite networks which microbe mediated the interaction.Inspection of this output reveals that, indeed, the model for genome bc1012 mediates the interactions between D-Glucose, Fumarate, and Succinate.
MetConSIN's analysis provides an avenue for using dynamic FBA to infer how microbes interact and how these interactions vary with community composition and over time.For example, we can infer from MetConSIN that the ten taxa whose genomes we isolated from soil behave antagonistically due to competition for resources, as seen in fig. 5 (a) and (b).
MetConSIN's microbe-microbe interactions are based on a simple heuristic meant to identify competition and cross-feeding.This works well if an interaction between two microbes is based on a single metabolite, but simply summing the interactions is likely not the best approximation.In future work, we plan to define a more rigorous simplification of the metabolite-mediated system as a direct microbe interaction system and characterize the error of this simplification.
Our ten-member community showed only negative interactions in part because the genome-scale models that we used include the core metabolism of each taxa, making competition easy to identify, but do not include many details on the production of secondary metabolites.Secondary metabolites are compounds produced by bacteria that do not have a direct role in cell growth, but can have a profound impact on 10/21 community organization [51][52][53].Genome-scale modeling often focuses on the core metabolism and growth of an organism, meaning that these metabolites are often missing.This omission is a major challenge for any method that seeks to use GSMs to study microbial ecology.For MetConSIN to incorporate interactions mediated by secondary metabolites, the GSMs used must already include pathways that produce these metabolites.Furthermore, FBA constraints must be carefully chosen so that models do not simply ignore secondary metabolites in favor of immediate growth.As genome-scale models improve to include secondary metabolite production, MetConSIN can likewise be improved to infer interactions from secondary metbaolites.
We observe antagonistic interactions in all of the subsets of the community that we simulated in isolation, but the strength of the competition may vary in with different community composition.Indeed, fig. 5 (c) shows that the implied relationships emerging from competition for resources are not the same in a five-model subset of the community as when these five models are simulated as part of the larger community of 10 models.
The species-metabolite and metabolite-metabolite networks provided by MetConSIN offer mechanistic insight into the metabolic activity of microbial communities, including identification of how metabolic connections change with community composition.In fig.6, we investigate how the strengths of the various connections one model, bc1001, had in networks produced by MetConSIN for various communities involving bc1001.For example, when grown in simulated coculture with bc1016 and bc1009, bc1001 tended to form weaker network connections than when grown in other combinations.Interestingly, when bc1001 was grown in simulated coculture with bc1015 and bc1009, it formed stronger connections compared to when simulated with bc1016 and bc1009, even though switching bc1015 for bc1016 had little effect in other combinations.These connection differences are a possible mechanistic explanation for differential metabolic activity between communities, and suggest a course of further investigation into the metabolic impact of the various combinations of modeled organisms.
DFBA provides a model for the population dynamics of microbial communities by leveraging genomic data.This means that dense time-longitudinal data is not required for simulation.Despite this important advantage, the usefulness of the DFBA model is limited.This is because a thorough qualitative analysis of the resulting dynamical simulation is often impractical due to the system's complexity.MetConSIN achieves an important step forward in analyzing DFBA simulations by organizing the complexity of DFBA into a sequence of interaction networks, which are more familiar and readily understood.This tool therefore gives researchers the power to infer important characteristics of the dynamic metabolic activity of a microbial community from genomic data.
MetConSIN depends on dynamic flux balance analysis and the genome-scale metabolic models that define that system.While this does mean that MetConSIN is essentially limited by the quality of the GSMs used, it also means that MetConSIN provides a method by which to assess the quality of these models.With high-quality GSMs, MetConSIN provides the ability to create qualitative predictions about community metabolic activity which can be used to generate testable hypotheses.MetConSIN can created testable hypotheses about the (1) resource competition and (2) community assembly dynamics in our synthetic communities.Furthermore, with MetConSIN, the accuracy of these hypotheses can be used to judge the usefulness of and ways to improve the underlying GSMs.
Ultimately, MetConSIN provides a rigorous interpretation of DFBA that emerges directly from the dynamics of the system.This tool is an important step in increasing the utility of genomic data and COBRA methods in the study of microbial communities and their impact on their environment.The 10 taxa isolated from soil samples all grow to stationary phase at varying rates.The community grows in 3 distinct time-intervals, each with a specific interaction network associated.The dotted lines indicate time-points at which SurfinFBA required a new basis for forward simulation of at least one taxa.When only one new basis was needed, the color of the dotted line corresponds to which taxa required a new basis.A black dotted line (e.g. at t = 0.226) indicates that no new basis was computed, but step size needed to be reduced.Only ten metabolites varied in simulated biomass by more than 1%; five were produced by the simulated community and five were consumed.The remaining environmental metabolites did not vary in simulated biomass by more than 1%, and so are not shown.All metabolites were initially set to the same value and an aerobic environment was simulated by constant inflow of oxygen.See the Example folder of the project repository for exact simulated conditions.) from soil and modeled using ModelSEED behaves according to a series of 3 interaction networks.One powerful application of MetConSIN is an inspection of how the community changes its qualitative behavior as time proceeds.To illustrate the differences, we show here the networks corresponding to each significant time interval as well as a network with edges weighted by the variance of the edges across those intervals.In the three networks corresponding to time intervals, edge color represents the sign of the interaction, with red representing negative and blue representing positive, while edge thickness represents interaction strength.In the variance-weighted network, edge color represents the sign of the time-averaged interaction, while edge thickness represents variance of that interaction across the time intervals.The 10 microbial taxa and the metabolites that are significantly produced or consumed are colored to match fig.  ) as represented by the emergent interactions between metabolites due to microbial metabolisms.This network is not constant over time, and changes with the species-metabolite networks.These networks make clear the importance of D-Glucose (green triangle) and Oxygen (red arrowhead) as rate-limiting metabolites in the community's metabolic activity.Here, we show the network for each significant time-interval, as well as a network with edges weighted by the variance of the edges across those intervals.In the three networks corresponding to time intervals, edge color represents the sign of the interaction, with red representing negative and blue representing positive, while edge thickness represents interaction strength.In the variance-weighted network, edge color represents the sign of the time-averaged interaction, while edge thickness represents variance of that interaction across the time intervals.
(a) Average interaction of a five-member community.
(b) Average interaction of a ten-member community.

Fig 1 .
Fig 1. MetConSIN reformulates dynamic flux balance analysis (represented by the top-left figure) as a series of smooth ordinary differential equations (top-right).This dynamical system provides an model of a microbial community and its environment (represented by the bottom-left figure) as a series of metabolite mediated networks (bottom-right).
Simulated growth of 10 taxa isolated from soil samples.Metabolites produced in simulation.Metabolites consumed in simulation.

Fig 2 .
Fig 2.The 10 taxa isolated from soil samples all grow to stationary phase at varying rates.The community grows in 3 distinct time-intervals, each with a specific interaction network associated.The dotted lines indicate time-points at which SurfinFBA required a new basis for forward simulation of at least one taxa.When only one new basis was needed, the color of the dotted line corresponds to which taxa required a new basis.A black dotted line (e.g. at t = 0.226) indicates that no new basis was computed, but step size needed to be reduced.Only ten metabolites varied in simulated biomass by more than 1%; five were produced by the simulated community and five were consumed.The remaining environmental metabolites did not vary in simulated biomass by more than 1%, and so are not shown.All metabolites were initially set to the same value and an aerobic environment was simulated by constant inflow of oxygen.See the Example folder of the project repository for exact simulated conditions.

Fig 3 .
Fig 3. A community of 10 taxa (detailed in table2) from soil and modeled using ModelSEED behaves according to a series of 3 interaction networks.One powerful application of MetConSIN is an inspection of how the community changes its qualitative behavior as time proceeds.To illustrate the differences, we show here the networks corresponding to each significant time interval as well as a network with edges weighted by the variance of the edges across those intervals.In the three networks corresponding to time intervals, edge color represents the sign of the interaction, with red representing negative and blue representing positive, while edge thickness represents interaction strength.In the variance-weighted network, edge color represents the sign of the time-averaged interaction, while edge thickness represents variance of that interaction across the time intervals.The 10 microbial taxa and the metabolites that are significantly produced or consumed are colored to match fig.2, and the remaining environmental metabolites are shown as unlabeled blue nodes.

Fig 4 .
Fig 4.MetConSIN can infer the metabolic activity of the 10 member community (detailed in table 2) as represented by the emergent interactions between metabolites due to microbial metabolisms.This network is not constant over time, and changes with the species-metabolite networks.These networks make clear the importance of D-Glucose (green triangle) and Oxygen (red arrowhead) as rate-limiting metabolites in the community's metabolic activity.Here, we show the network for each significant time-interval, as well as a network with edges weighted by the variance of the edges across those intervals.In the three networks corresponding to time intervals, edge color represents the sign of the interaction, with red representing negative and blue representing positive, while edge thickness represents interaction strength.In the variance-weighted network, edge color represents the sign of the time-averaged interaction, while edge thickness represents variance of that interaction across the time intervals.

Fig 5 .Fig 6 .
Fig 5.The models of the ten genomes isolated from soil behave antagonistically.Figure (a) shows the time-averaged antagonistic interactions of a subset of five models, while figure (b) shows the time-averaged antagonistic interactions of the ten models simulated together.Figure (c) shows the difference in the time-averaged interactions common to the two networks, with line width corresponding to absolute difference, and color corresponding to signed-difference, where bluer shades represent a stronger interaction in the five model simulation and redder shades represent a stronger interaction in the ten model simulation.
The complex was sequenced on PacBio Sequel II