MASSpy: Building, simulating, and visualizing dynamic biological models in Python using mass action kinetics

Mathematical models of metabolic networks utilize simulation to study system-level mechanisms and functions. Various approaches have been used to model the steady state behavior of metabolic networks using genome-scale reconstructions, but formulating dynamic models from such reconstructions continues to be a key challenge. Here, we present the Mass Action Stoichiometric Simulation Python (MASSpy) package, an open-source computational framework for dynamic modeling of metabolism. MASSpy utilizes mass action kinetics and detailed chemical mechanisms to build dynamic models of complex biological processes. MASSpy adds dynamic modeling tools to the COnstraint-Based Reconstruction and Analysis Python (COBRApy) package to provide an unified framework for constraint-based and kinetic modeling of metabolic networks. MASSpy supports high-performance dynamic simulation through its implementation of libRoadRunner: the Systems Biology Markup Language (SBML) simulation engine. Three examples are provided to demonstrate how to use MASSpy: (1) a validation of the MASSpy modeling tool through dynamic simulation of detailed mechanisms of enzyme regulation; (2) a feature demonstration using a workflow for generating ensemble of kinetic models using Monte Carlo sampling to approximate missing numerical values of parameters and to quantify biological uncertainty, and (3) a case study in which MASSpy is utilized to overcome issues that arise when integrating experimental data with the computation of functional states of detailed biological mechanisms. MASSpy represents a powerful tool to address challenges that arise in dynamic modeling of metabolic networks, both at small and large scales.


Introduction
The availability of genome sequences and omic data sets has led to significant advances in metabolic modeling at the genome scale, resulting in the rapid expansion of available genomescale metabolic reconstructions [1].The attainability of new data has also led to the generation of new metabolic modeling software tools that can process these data [2][3][4].One of the most broadly used metabolic modeling software suites, COnstraint-Based Reconstruction and Analysis (COBRA) [5], provides a scalable framework that is invaluable for the contextualization and analysis of multi-omic data, as well as for understanding, predicting, and engineering metabolism [6][7][8][9][10][11][12][13][14][15][16].While several methods have been developed that allow COBRA models to integrate certain data types to model long timescale dynamics [17][18][19], COBRA models are inherently limited by the flux-balance assumption.
Kinetic modeling methods use detailed mechanistic information to model dynamic states of a network [20].The inclusion of multiple detailed enzymatic mechanisms presents challenges in formulating and parameterizing stable large-scale kinetic models.Further, additional issues arise when integrating incomplete experimental data into metabolic reconstructions, necessitating the need for approximation methods to gap fill missing values that satisfy the thermodynamic constraints imposed by the system [21,22].
Various efforts have been made to bridge the gap between constraint-based and kinetic modeling methods in order to address the challenges associated with dynamic modeling [21][22][23][24].One such methodology is the Mass Action Stoichiometric Simulation (MASS) approach, in which mass action kinetics are used to construct condition-specific dynamic models [24][25][26][27][28].The MASS modeling approach provides an algorithmic, data-driven workflow for generating in vivo kinetic models in a scalable fashion [29].The MASS methodology can be used in tandem with COBRA methods for both steady-state and dynamic analyses of a metabolic reconstruction in a single workflow.MASS models can incorporate the stoichiometric description of enzyme kinetic mechanisms and have been used to explicitly compute fractional states of enzymes, providing insight into regulation mechanisms at a network-level [26,30].The MASS modeling framework has been implemented in the MASS Toolbox [31], but is limited by its reliance on a commercial software platform (Mathematica).
freely available for academic use, with solvers and installation instructions found at their respective websites.Images for Docker containers that include the MASSpy software can be downloaded online from the DockerHub Registry (https://hub.docker.com/r/sbrg/masspy),or they can be built locally to include the licensed commercial optimization solvers.Instructions for MASSpy installation, including instructions for using MASSpy with Docker, are found in the repository README or in the documentation (https://masspy.readthedocs.ioand S2 File).The data, scripts, and instructions needed to reproduce results of the presented examples are also available on GitHub (https://github.com/SBRG/MASSpy-publication)and in the supplement (S3 File).
Here, we detail the Mass Action Stoichiometric Simulation Python (MASSpy) package, a versatile computational framework for dynamic modeling of metabolism.MASSpy expands the modeling framework of the COnstraint-Based Reconstruction and Analysis Python (COBRApy) [32] package by integrating dynamic simulation and analysis tools to facilitate dynamic modeling.Further, MASSpy contains various algorithms designed to address and overcome the issues that arise when incorporating experimental data and biological variation into dynamic models with detailed mechanistic information.By addressing the issues associated with integrating physiological measurements and biological mechanisms in dynamic modeling approaches, we anticipate that MASSpy will become a powerful modeling tool for modeling dynamic behavior in metabolic networks.The source code for MASSpy is freely available on GitHub (https://github.com/SBRG/MASSpy),and the documentation-along with tutorials-can be found at ReadTheDocs.(https://masspy.readthedocs.io).

Developing in Python
The MASSpy software package (S1 File) is written entirely in Python 3, an interpreted objectoriented high-level programming language that has become widely adopted in the scientific community due to its clean syntax with straightforward semantics that make it intuitive to learn [33] and its unique features for high-level scientific computing (e.g., a flexible interface to compiled languages such as C++ [34]).The open-source nature of Python avoids the inherent limitations associated with costly commercial software [35].Consequently, developing in Python provides access to a growing variety of open-source scientific software libraries, including key data science packages of the SciPy ecosystem [33,36].MASSpy relies extensively on these packages for numerical computation (NumPy [37]), symbolic mathematics (SymPy [38]), high-performance data structures that easily transfer and export content across packages (Pandas [39]), and data visualization (Matplotlib [40]).IPython (Jupyter) notebooks [41,42] provide an interactive Python environment for developing and executing various applications.Furthermore, containerization of applications through Docker [43] facilitates their exchange and deployment using portable containers bundled with standardized computational environments.See Table 1 for a list of scientific packages integrated into the MASSpy package and how they are utilized for various purposes within the MASSpy framework.

Building on the COBRApy framework
To facilitate the integration of constraint-based and dynamic modeling frameworks, MASSpy utilizes the COBRApy package [32] as a foundation to build upon and extend in order to support dynamic simulation and analysis capabilities.MASSpy derives several benefits from building on the COBRApy framework, including exploiting the direct inclusion of various COBRA methods already implemented in Python.The inclusion of COBRA methods is made simple using Python inheritance behavior; the three core COBRApy classes (Metabolite, Reaction, and Model) serve as the base classes for three core MASSpy classes (MassMetabolite, MassReaction, and MassModel) as described in the MASSpy documentation (https://masspy.readthedocs.ioand S2 File).Inheritance behavior maintains the functionality of the COBRA methods for gene deletion studies by preserving the Boolean representation of Gene-Protein-Reaction (GPR) rules that indicate whether the genes essential for an enzyme-catalyzed reaction are expressed in the organism [49].Consequently, all methods for COBRApy objects readily accept the analogous MASSpy objects as valid input, preserving the commands and conventions familiar to current COBRApy users.COBRApy is a popular software platform preferred by many in the COBRA community [5]; therefore, preserving COBRApy conventions aids in the adoption of MASSpy among those users.Inheriting from the COBR-Apy classes additionally allows for easy conversion between COBRApy and MASSpy objects without loss of relevant biochemical and numerical information.These two features of Python inheritance are critical in maintaining functionality for COBRApy implementations of various flux-balance analysis (FBA) algorithms in MASSpy.

Adding dynamic simulation capabilities
The creation and simulation of dynamic models requires deriving a set of ordinary differential equations (ODEs) from the stoichiometry of a reconstructed network and assigning kinetic rate laws to each reaction in the network [21].MASSpy utilizes SymPy [38] to represent reaction rates and differential equations as symbolic expressions.All MassReaction objects use mass action kinetics to automatically generate a default rate law; however, user-defined rate laws can be assigned to replace mass action rate laws with alternative equations, such as those derived using Michaelis-Menten kinetics.All MassMetabolite objects generate their associated differential equation by combining the rates of reactions in which they participate and contain the initial conditions necessary to solve the system of ODEs.[32] Escher A Python interface to the Escher visualization package.Used to build, share, and embed visualizations of metabolic pathway and node maps.[44] libRoadRunner High-performance and portable SBML simulation engine for systems biology.Provides dynamic simulation and NLEQ steady state determination methods for SBML compatible models.
[45] libSBML A Python interface for reading and writing models in SBML.[46] Matplotlib Fundamental package for visualization in Python.Provides comprehensive plotting tools for visualizing observations, including dynamic simulation results and phase portraits [40] NumPy Fundamental Python package for numerical computation and data science.Provides efficient array/matrix data types, operations, and random number generation when sampling. [37] Optlang Formulation of optimization problems using symbolic expressions and native Python algebra syntax.Provides a common Python interface for various optimization solver backends. [47] Pandas High-performance data structures and analysis tools for data science in Python.[ To solve the system of ODEs, the MASSpy Simulation class employs libRoadRunner [45], a high-performance Systems Biology Markup Language (SBML) [50] simulation engine that is capable of supporting most SBML Level 3 specifications.The libRoadRunner utilizes a Just-In-Time (JIT) compiler with an LLVM JIT compiler framework to compile SBML-specified models into machine code, making the libRoadRunner simulation engine appropriate for solving large models effectively.Although libRoadRunner has a large suite of capabilities, it is currently used for two purposes in MASSpy: the steady-state determination via NLEQ1 and NLEQ2 global newton methods [51], and dynamic simulation via integration of ODEs through deterministic integrators, including CVODE solver from the Sundials suite [52].Because libRoadRunner requires models to be in SBML format, the Simulation object exports models into SBML format before compiling them into machine code via libRoadRunner.

Model import, export, and network visualization
MASSpy utilizes two primary formats for the import and export of models: SBML format and JavaScript Object Notation (JSON).MASSpy currently supports SBML L3 core specifications [53] along with the FBC [54] and Groups [55] packages, providing support for both constraint-based and dynamic modeling formats.Although SBML is necessary to utilize libRoa-dRunner, there are a number of additional benefits obtained by supporting SBML.In addition to being a standard format among the general systems biology community [50], SBML is a widely used model format specifically among members of the COBRA modeling community [5].
MASSpy also provides support for importing and exporting models via JSON, a text-based syntax that is useful for exchanging structured data between programming languages [56].The MASSpy JSON schema is designed for interoperability with Escher [44], a pathway visualization tool designed to visualize various multi-omic data sets mapped onto COBRA models.The interoperability with Escher is exploited by MASSpy to provide various pathway and node map visualization capabilities.

Mechanistic modeling of enzyme regulation
The reconstruction of all microscopic steps performed by an enzyme (an "enzyme module") represents the full stoichiometric description of an enzyme using mass action kinetics [27].MASSpy facilitates the construction of enzyme modules through the EnzymeModule, Enzyme-ModuleForm, and EnzymeModuleReaction classes, which inherit from the MassModel, Mass-Metabolite, and MassReaction classes, respectively.The EnzymeModule class contains methods and attribute fields to aid in the construction of EnzymeModules based on the steps outlined for constructing enzyme modules in Du et al. [27].Given the number and complexity of possible enzymatic mechanisms [57], MASSpy also provides the ability to group relevant objects into different user-defined categories, such as active/inactive states and different enzyme complexes.The EnzymeModuleDict objects are used to represent enzyme modules once merged into a larger model, preserving user-defined categories and other information relevant to the construction of the EnzymeModule, such as total enzyme concentration.More details can be found in the MASSpy documentation (S2 File).

Ensemble sampling, assembly, and modeling
Ensemble approaches are used to address various issues concerning parameter uncertainty and experimental error in metabolic models [22].Ensemble modeling refers to the assembly of dynamic models with similar mechanistic formulations that span the feasible kinetic solution space.Ensemble modeling allows for the range of possible network phenotypes to be explored, proving to be an valuable tool when parameterization is incomplete or unknown, as is often the case with kinetic models.MASSpy enables ensemble modeling approaches through the use of Markov chain Monte Carlo (MCMC) sampling of fluxes and concentrations [58,59].The flux sampling capabilities can be derived from the COBRApy package and employ two different hit-and-run sampling methods: one with a low memory footprint [60] and another with multiprocessing support [61].To sample metabolite concentrations, MASSpy employs a ConcSolver object to populate the optimization solver with constraints for thermodynamically feasible concentration ranges [28,62,63] and two hit-and-run sampling methods for concentrations were implemented in MASSpy with algorithms analogous to those for flux sampling.MASSpy provides several built-in methods for ensemble generation from sampling data.Once generated, the ensemble of models can be loaded into the MASSpy Simulation object, simulated, and visualized using built-in ensemble visualization and analysis methods.Additional details can be found in the MASSpy documentation (S2 File).

Results
We have conducted a validation, a feature demonstration, and a case study that exemplify how MASSpy features combine to facilitate the dynamic modeling of metabolism (Fig 1).We validated MASSpy as a modeling tool by describing mechanisms of enzyme regulation using enzymes modules to replicate select results presented in Yurkovich et al. [24].A variety of features are demonstrated through a workflow in which an ensemble of stable kinetic models is generated through MCMC sampling to examine biological variability while satisfying the thermodynamic constraints imposed by the network.In the case study, we integrated COBRA and MASS modeling methodologies to create a kinetic model of E. coli glycolysis from a metabolic reconstruction, providing novel insight into functional states of the proteome and activities of different isozymes.See Table 2 for a comparison of explicitly supported MASSpy features with those of other dynamic modeling tools.

Validation as a modeling tool through enzyme regulation in MASS models
Here, we demonstrated MASSpy as a modeling tool and the MASSpy implementation of enzyme modules by replicating the results produced by Yurkovich et al. [24].The authors used the MASS Toolbox [31] to elucidate the systems-level effects of allosteric regulation in the glycolytic pathway of red blood cells (RBCs).Enzyme modules for hexokinase, phosphofructokinase, and pyruvate kinase were reconstructed using the mechanistic formulations of ligandbinding events [26,27] and merged with a MASS model of RBC metabolism comprised of the glycolytic pathway, the Rapoport-Luebering (RL) shunt, and hemoglobin binding events for oxygen and 2,3-diphosphoglycerate [29].Enzyme modules were merged into the glycolytic pathway in different combinations, and dynamic simulations were subsequently performed to characterize the interactions of the three allosterically regulated kinases when responding to disturbances in ATP utilization.
To validate MASSpy as a modeling tool, we used MASSpy to replicate key observations produced by the MASS Toolbox as presented in Yurkovich et al. [24].Using MASSpy, we reconstructed the RBC metabolic model and the enzyme modules for the three key regulatory kinases as previously described.We integrated the reconstructed enzyme modules into the glycolytic model to introduce varying levels of regulation.Because enzyme modules were constructed and parameterized for the steady-state conditions of the MASS model, addition of an enzyme module to a MASS model was a straightforward and scalable process.The overall reaction representation for the enzyme in the MASS model was removed and replaced with the set of reactions that comprise the microscopic steps of the enzyme module (Fig 2).

PLOS COMPUTATIONAL BIOLOGY
We performed dynamic simulations, subject to physiologically relevant perturbations, to provide a fine-grained view of the concentration and flux solution profiles for individual enzyme signals and qualitatively represent the systemic effects of additional regulatory mechanisms.We then used MASSpy visualization methods to replicate key results [24] (S3 File).Through this case study, we have demonstrated how enzyme modules were constructed from enzymatic mechanisms in MASSpy, and we validated MASSpy as a dynamic modeling tool by exploring previously reported systems-level effects of regulation [24].See S3 File for all data and scripts associated with the validation, including kinetic parameters for all three enzyme modules.

Demonstration of features through ensemble sampling, assembly, and modeling
Many ensemble modeling approaches utilize sampling methods to approximate missing values and quantify uncertainty in metabolic models [21,22,28,59,64,65].To demonstrate the sampling and ensemble handling capabilities of MASSpy, we utilized MCMC sampling with an ensemble modeling approach to assess the dynamics for a range of pyruvate kinase enzyme modules (Fig 3).Using RBC glycolysis and hemoglobin as the reference model [24], we used MCMC sampling to generate 15 candidate flux states and 15 candidate thermodynamically Explicitly supported features (via "helper methods" or "helper classes") are those that are directly provided by the tool or software package itself and do not include those that could be piped in using third-party packages or tools (i.e., directly calling a function from the package/tool as opposed to using other Python packages to achieve the goal).
https://doi.org/10.1371/journal.pcbi.1008208.t002feasible concentration states, allowing for variables to deviate from their reference state by up to 80 percent.This procedure resulted in 225 candidate models that represent all possible combinations of flux and concentration states.We calculated pseudo-elementary rate constants (PERCs) and simulated models to steady state.We simulated a 50% increase in ATP utilization to mimic a physiologically relevant disturbance, such as increased shear stress due to arterial constriction [66].Out of the 225 models, 10 models were discarded due to instability, determined by an inability to reach a steady state through simulation.
We then reconstructed enzyme modules for pyruvate kinase [24] for the remaining 215 models.Numerical values of rate constants for each enzyme were determined using the SciPy implementation of a trust-region method for nonlinear convex optimization [67].Without knowledge of physiological constraints, the numerical solutions for rate constants produced by optimization routine were not guaranteed to be physiologically feasible.Therefore, we integrated these enzyme modules into their MASS models and simulated with and without the ATP utilization increase to ensure they had stable steady states before and after the perturbation.All remaining 215 models were able to successfully reach stable steady states and subsequently assembled into the ensemble for dynamic simulation and analysis.
The time-course results for the ensemble energy charge deviation were plotted with a 95% confidence interval.From these results, it can be seen that the mean energy charge decreased at most about 25% from its baseline value (Fig 3C).Steady state analysis of the pyruvate kinase enzyme modules after the disturbance revealed a tendency for the enzyme to remain in an active state, with a median value of approximately 61%.The differences in candidate flux and concentration states resulted in an interquartile range between 36-88% of total pyruvate kinase, with all variations of pyruvate kinase in the ensemble maintaining at least 10% activity.Furthermore, examination of the relative flux load through the R i,AP forms showed that most of the flux load was carried by the R 2,AP and R 3,AP reaction steps while a minuscule fraction was carried by the R 0,AP reaction step, regardless of variation.However, the variations had an effect on whether R 2,AP carried more flux than R 3,AP , and whether the remaining flux was predominantly distributed through the R 1,AP or the R 4,AP reaction step.Through this demonstration, we have demonstrated how MASSpy sampling facilitated the assembly and simulation of an ensemble to characterize the dynamic response of a key regulatory enzyme and quantify its functional states after a physiologically relevant disturbance.See S3 File for all data and scripts associated with the ensemble modeling demonstration.

Case study: Computing functional states of the E. coli proteome
Here, we illustrated unique features of MASSpy in a workflow to compute the functional states of the proteome, providing insight into distribution of catalytic activities of enzymes for different metabolic states.We utilized COBRA and MASS modeling methodologies to incorporate omics data into a metabolic reconstruction of E. coli, formulating a kinetic model containing all microscopic steps for each enzymatic reaction mechanism of the glycolytic subnetwork.Once formulated, we interrogated the model to examine the shift in thermodynamic driving force for E. coli on different carbon sources and to compare the activities of different isozymes, exemplifying the utility of MASSpy.
To construct a kinetic model of the glycolytic subnetwork, we integrated steady-state data for growth on glucose and pyruvate carbon sources from Gerosa et al. [68] into the iML1515 genome-scale metabolic reconstruction of E. coli K-12 MG1655 [69].For each carbon source, we fixed the growth rate for iML1515 and performed FBA using a quadratic programming objective to compute a flux state that minimized the error between known and computed fluxes.For the irreversible enzyme pairs of phosphofructokinase/fructose 1,6-bisphosphatase (PFK/FBP) and pyruvate kinase/phosphoenolpyruvate synthase (PYK/PPS), individual flux measurements were each increased by 10% of the net flux for the enzyme pair without changing the overall net flux value to ensure presence of the enzyme as seen in proteomic data [70].
Once flux states were obtained for each carbon source, the glycolytic subnetwork was extracted from iML1515.Flux units were converted from grams of cellular dry weight into molar units using volumetric measurements of E. coli obtained from Volkmer and Heinemann [71], and equilibrium constants obtained from eQuilibrator [72] through component contribution [73] were set for each reaction.Concentration growth data from Gerosa et al. [68] was integrated into the model and minimally adjusted for thermodynamic feasibility; for metabolites missing concentration data, an initial value of 0.001 M was provided before adjustments through sampling.Concentrations were sampled within an order of magnitude of their current value to produce an ensemble of 100 candidate models for each growth condition.Metabolite sinks were added to the model to account for metabolite exchanges between the modeled subnetwork and the global metabolic network outside of the scope of the model.The relevant growth data for each carbon source, volumetric measurements, and equilibrium constants are found in the S1 Data.
For each model in the ensemble, enzyme modules were constructed for each reaction using a nonlinear parameter fitting package (https://github.com/opencobra/MASSef)and kinetic data extracted from the literature.Additional isozymes of phosphofructokinase (PFK), fructose 1,6-bisphosphatase (FBP), fructose 1,6-bisphosphate aldolase (FBA), phosphoglycerate mutase (PGM), and pyruvate kinase (PYK) were also constructed, bringing the total amount of enzyme modules to 17. Fluxes through individual isozymes were set by splitting the steady state flux between the major and minor isozyme forms.After integrating all enzyme modules into the network, each model was simulated to steady state and exported for analysis.
The Gibbs free energy for each enzyme-catalyzed reaction and fractional abundance of each enzyme form were calculated.Sensitivity analysis of the flux split between the isozyme forms revealed that the Gibbs free energy and fractional abundance of enzyme forms did not show significant variation for either carbon source (S1 Fig); therefore, remaining analyses were done with 75% and 25% of the flux assigned to major and minor isozyme forms, respectively.
Comparison of the glucose and pyruvate growth conditions revealed that the free energy of the reversible reactions remained close to equilibrium, changing from one metabolic state to another as the thermodynamic driving force shifts according to the carbon source, as seen in reversible reactions triose phosphate isomerase (TPI), glucose 6-phosphate dehydrogenase (GAPD), phosphoglycerate kinase (PGK), phosphoglycerate mutase (PGM), and enolase (ENO) (Fig 4A).The reaction pairs, PFK/FBP and PYK/PPS, maintained their flux directions to form a futile cycle across conditions.Steady state analysis of the isozyme fractional abundances elucidated a preference for a specific enzyme state conserved among growth conditions for PFK1, FBA2, and PYK1 (Fig 4C ).Steady state analysis also revealed that the isozyme pairs of PFK, FBP, and PYK primarily existed in their product-bound form, a reflection of the metabolite concentration levels found in S2 Data.Specifically, the relatively high concentrations of fructose 1,6-diphosphate (FDP) observed for glucose growth conditions and of adenosine triphosphate (ATP) observed for pyruvate growth conditions contributed to significant differences between enzyme product and reactant concentration levels, creating the conditions favorable to the product-bound enzyme forms.Both the major and minor PGM isozymes have a similar distribution in their enzyme forms.The fractional abundance of all enzyme states in the glycolytic subnetwork can be found in S2 Fig.Through this case study, we have demonstrated that MASSpy can be used to gain insight into the distribution of functional states for the glycolytic proteome in E. coli without prior knowledge of enzyme concentrations.See S3 File for all data and scripts associated with the case study, including microscopic steps and kinetic parameters for all enzyme modules.

Discussion
We describe MASSpy, a free and open-source software implementation for dynamic modeling of biological systems.MASSpy expands the COBRApy framework, leveraging existing methods familiar to COBRA users combined with kinetic modeling methods to form a single, intuitive framework for constructing and interrogating dynamic models.In addition to enabling dynamic simulation, MASSpy contains tools for facilitating the reconstruction and analysis of enzyme modules, MCMC sampling and ensemble modeling capabilities, interfacing with packages for pathway visualization (Escher [44]), and exchanging models in SBML format (libSBML [46]).Taken together, the presentation of the MASSpy software package has several important implications for practitioners of dynamic metabolic simulation.
MASSpy provides several benefits over existing modeling packages (Table 2).While MASS models provide an algorithmic approach for generating dynamic models that has already proven useful in several metabolic studies [24,[26][27][28][29], a formal implementation of the MASS framework has only existed on a commercial software platform (Mathematica).MASSpy primarily utilizes the MASS approach and therefore integrates a suite of tools into its framework for addressing issues specific to MASS modeling.Unlike other packages for traditional kinetic modeling, MASSpy incorporates both COBRA methods and MCMC sampling methods for estimating missing values for several data types.MASSpy's seamless integration with COBR-Apy offers a vast array of constraint-based and dynamic modeling tools within a single opensource framework.Although alternative python-based packages for constraint-based modeling exist (e.g., PyFBA [2]), the COBRApy package is the most widely used Python package for constraint-based modeling by the COBRA community [5].Furthermore, MASSpy contains unique capabilities to facilitate the construction and analysis of detailed enzyme modules (i.e., microscopic steps), which allow for the dynamics of transient responses to be observed in situations in which the quasi-steady state and quasi-equilibrium assumptions cannot be applied.By directly expanding the COBRApy framework for MASSpy, current COBRApy users will find that MASSpy provides procedures and protocols that they may be familiar with, and allows members of the COBRA community to directly integrate new tools into their existing workflows.
The open source nature of Python also enables users to utilize their preferred tools while integrating MASSpy capabilities, creating workflows suited to their specific needs.For example, users familiar with PyBioNetFit [74], PyDREAM [75], or PyMC3 [76] can implement these packages for parameter fitting and optimization routines, then update their MASSpy models with the resulting parameter sets.Through manipulation of data via Pandas [39], a wide variety of additional interactive statistical visualizations are available through the Altair visualization package [77], while dynamic network visualizations in iPython notebooks [41,42] can be accomplished through PyVIPR [78].By capitalizing on the availability of other systems biology and data science tools in the Python ecosystem, users maintain the freedom to use their preferred tools and gain flexibility to determine how MASSpy is be best utilized for their particular needs.
MASSpy is primarily built for deterministic simulations of a metabolic model and thus may face limitations for other uses.For example, a package like PySCeS [79] could be utilized to perform stochastic simulations.Users who often analyze sensitivity may prefer Tellurium and its implementation of libRoadRunner [45,80] for explicit support of metabolic control analysis (MCA) workflows; however, MASSpy does contain similar MCA methods through its own implementation of libRoadRunner.Other dynamic modeling packages offer certain features not available in MASSpy, such as a graphical user interface (COPASI [81]) or a rule-based modeling approach (PySB [82]): see Table 2 for a comparison of MASSpy's software features with other existing dynamic modeling packages.MASSpy's use of SBML facilitates the transfer of models to other software environments, if desired [53].
Taken together, we have described MASSpy, a Python-based software package for the reconstruction, simulation, and visualization of dynamic metabolic models.MASSpy provides a suite of dynamic modeling tools while leveraging existing implementations of constraintbased modeling tools within a single, unified framework.The case studies presented here validate MASSpy as a modeling tool and demonstrate how the combination of constraint-based and kinetic modeling features support data-driven solutions for various dynamic modeling applications.We anticipate that the community will find MASSpy to be a useful tool for dynamic modeling of metabolism.

Software availability and requirements
The source code for MASSpy is available online (https://github.com/SBRG/MASSpy)and in the S1 File under an MIT license.MASSpy is also hosted as a Python package on the Python Package Index (https://pypi.org/project/masspy/).All required external dependencies integrated and utilized by MASSpy are also available on the Python Package Index (https://pypi.org/) and are licensed under their respective licensing terms.Both the Gurobi Optimizer (Gurobi Optimization, Houston, TX) and the IBM CPLEX Optimizer (IBM, Armonk, NY) are freely available for academic use, with solvers and installation instructions found at their respective websites.Images for Docker containers [43] that include the MASSpy software can be downloaded online from the DockerHub Registry (https://hub.docker.com/r/sbrg/masspy),or they can be built locally to include the licensed commercial optimization solvers.Instructions for MASSpy installation, including instructions for using MASSpy with Docker, are found in the repository README or in the documentation (https://masspy.readthedocs.ioand S2 File).The data, scripts, and instructions needed to reproduce results of the presented examples are also available on GitHub (https://github.com/SBRG/MASSpy-publication)and in the supplement (S3 File).

Documentation
The documentation for MASSpy is available online (https://masspy.readthedocs.io)and in the S2 File.Good documentation is vital to the adoption and success of a software package; it should teach new users how to get started while showing more experienced users how to fully capitalize on the software's features [5,83].For new users, MASSpy provides several simple tutorials demonstrating the usage of MASSpy's features and its capabilities.The MASSpy documentation also contains a growing collection of examples that demonstrate the use of MAS-Spy, including examples of workflows, advanced visualization tutorials, and in-depth textbook [29] examples that teach the fundamentals for dynamic modeling of mass action kinetics (S2 File).

Improvements and community outreach
The MASSpy package is designed to provide various dynamic modeling tools for the openCO-BRA community; therefore a substantial portion of future development for MASSpy will be tailored toward fulfilling the needs of the COBRA community based on user feedback and feature requests.New MASSpy releases will utilize GitHub for version control and adhere to Semantic Versioning guidelines (https://semver.org) in order to inform the community about the compatibility and scope of improvements.Examples of potential improvements for future releases of MASSpy include bug fixes, additional SBML compatibility, new import/export formats, support for additional modeling standards, explicit support for additional libRoadRunner simulation capabilities, and implementation of additional algorithms relevant to MASS modeling approaches.As the systems biology field continues to address challenges in dynamic models of metabolism, MASSpy will continue to expand its collection of modeling tools to support data-driven reconstruction and analysis of mechanistic models.

Fig 1 .
Fig 1. Overview of MASSpy features.(A) MASSpy expands COBRApy to provide constraint-based methods for obtaining flux states.(B) Thermodynamic principles are utilized by MASSpy to sample concentration solution spaces and to evaluate how thermodynamic driving forces shift under different metabolic conditions.(C) MASSpy enables dynamic simulation of models to characterize transient dynamic behavior and contains ensemble modeling methods to represent biological uncertainty.(D) Network properties such as relevant timescales and system stability are characterized by MASSpy using various linear algebra and analytical methods.(E) MASSpy contains built-in functions that enable the visualization of dynamic simulation results.(F) Mechanisms of enzymatic regulation are explicitly modeled in MASSpy through enzyme modules, enabling computation of catalytic activities and functional states of enzymes.https://doi.org/10.1371/journal.pcbi.1008208.g001

Fig 2 .
Fig 2. Enzyme modules are explicit representations of enzymatic regulatory mechanisms.(A) The reaction catalyzed by pyruvate kinase is replaced with the stoichiometric description of the enzymatic mechanism.The steady state values obtained after simulating a 50% increase of ATP utilization are mapped onto a metabolic pathway map drawn using Escher [44].The colors represent flux values and range from red to purple to gray, with red indicating higher flux values and gray indicating lower flux values.(B) Enzyme modules provide a network-level perspective of regulation mechanisms by plotting systemic quantities against fractional states of enzymes as described in Yurkovich et al. [24].(C) The different signals of the enzyme module can be observed to provide enzyme-level resolution of the regulatory response.https://doi.org/10.1371/journal.pcbi.1008208.g002

Fig 3 .
Fig 3. A MASSpy workflow for ensemble creation and modeling using MCMC sampling.A typical ensemble modeling workflow using MASSpy to generate and assemble an ensemble of stable kinetic models for dynamic simulation and analysis.(A) The solution spaces for fluxes and concentrations are sampled using MCMC sampling to generate data for candidate model states.Rate constants are obtained through parameter fitting for elementary rate constants and computation of PERCs.(B) Sampling data is integrated into the candidate models, and models are subsequently filtered based on their stability to assemble the ensemble of stable dynamic models.Once assembled, the ensemble is used to study biological variability in the network of interest through (C) dynamic simulation and (D) analysis.https://doi.org/10.1371/journal.pcbi.1008208.g003

Fig 4 .
Fig 4. Comparison of free energy and isozyme fractional abundances for carbon sources.(A) The Gibbs free energy represents the thermodynamic driving force, shifting the metabolic state depending on the carbon source.(B) The glycolytic subnetwork extracted from E. coli iML1515 consists of 12 reactions represented by the 17 enzyme modules.(C) The fractional abundance for each enzyme form can be computed and compared for the different isozyme pairs, providing insight into how the catalytic activity is distributed across the isozymes in glucose and pyruvate growth conditions.The fractional abundances for all enzymes can be found in the supplement (S2 Fig). https://doi.org/10.1371/journal.pcbi.1008208.g004 equilibrium constants from eQuilibrator used in the case study to construct the E. coli glycolytic subnetwork.(XLSX) S2 Data.Steady state concentration data in the E. coli glycolytic subnetwork.Includes the steady state concentration data for all metabolites and enzymes in the E. coli glycolytic subnetwork in the case study.(XLSX)