COBRAme: A computational framework for genome-scale models of metabolism and gene expression

Genome-scale models of metabolism and macromolecular expression (ME-models) explicitly compute the optimal proteome composition of a growing cell. ME-models expand upon the well-established genome-scale models of metabolism (M-models), and they enable a new fundamental understanding of cellular growth. ME-models have increased predictive capabilities and accuracy due to their inclusion of the biosynthetic costs for the machinery of life, but they come with a significant increase in model size and complexity. This challenge results in models which are both difficult to compute and challenging to understand conceptually. As a result, ME-models exist for only two organisms (Escherichia coli and Thermotoga maritima) and are still used by relatively few researchers. To address these challenges, we have developed a new software framework called COBRAme for building and simulating ME-models. It is coded in Python and built on COBRApy, a popular platform for using M-models. COBRAme streamlines computation and analysis of ME-models. It provides tools to simplify constructing and editing ME-models to enable ME-model reconstructions for new organisms. We used COBRAme to reconstruct a condensed E. coli ME-model called iJL1678b-ME. This reformulated model gives functionally identical solutions to previous E. coli ME-models while using 1/6 the number of free variables and solving in less than 10 minutes, a marked improvement over the 6 hour solve time of previous ME-model formulations. Errors in previous ME-models were also corrected leading to 52 additional genes that must be expressed in iJL1678b-ME to grow aerobically in glucose minimal in silico media. This manuscript outlines the architecture of COBRAme and demonstrates how ME-models can be created, modified, and shared most efficiently using the new software framework.

to be particularly important in E. coli for computing an accurate metabolic/proteomic state using proteomics data. We suspect similar observations would be seen in other organisms.
For non metabolic macromolecules such as ribosome, mRNA, tRNA and RNA polymerase, the coupling constrains coefficients (' ') are derived by essentially back-calculating the individual rates using a measured RNA-to-Protein ratio from Scott et. al. 2010 and measured mRNA, tRNA and rRNA fractions. The coupling constraints coefficients for these macromolecules are derived in detail in O'Brien et. al. 2013. Applying these constraints results in a final nonlinear optimization problem (NLP) shown below. COBRAme reformulates these coupling constraints to embed them directly into the reaction which they are used as follows: The biomass_dilution constraint is discussed below.

Previous ME-model Coupling Constraint Implementation
The previous iterations of ME-models applied coupling constraints using three separate reactions. An example for a generic "enzymatic reaction" could be represented as follows: Enzyme Priming: enzyme Where is the coupling coefficient applied to "coupling", a metabolite (constraint) which effectively determines the minimal rate that the third dilution coupling reaction ( 2) must proceed. For previous ME-model implementations, this coupling constraint was given a "_constraint_sense" in COBRApy of 'L' (less than or equal to) meaning that for this toy example 0 = 1 and 2 ≥ · 1.

COBRAme Coupling Constraint Implementation
With COBRAme it is assumed that the optimal ME solution will never dilute more enzyme than is required by the coupling constraint thus constraining 2 = · 1 and allowing us to combine the implementation of the constraint with the reaction that uses the enzyme to give.
The coupling constraints and coefficients were derived as in O'Brien et. al. 2013. As stated above, however, these were implemented in the current study as equality constraints . Effectively, this means that each ME-model solution will give the computed optimal proteome allocation for the in silico conditions. Previous ME-model formulations have applied the constraints as inequalities thus allowing the simulation to overproduce macromolecule components. While overproduction is seen in vivo in cells, this phenomenon would not be selected as the optimal solution. Furthermore, using inequality constraints greatly expands the size of the possible solution space significantly increasing the time required to solve the optimization. Reformulating the model using equality constraints thus resulted in a reduced ME-matrix with the coupling constraints embedded directly into the reaction in which they are used.

Biomass Dilution Constraints
For metabolic models (M-models), the biomass objective function has been used to represent the amount of biomass production that is required for the cell to double at a specified rate. The metabolites represented in the biomass function are typically building blocks for major macromolecules (e.g. amino acids and nucleotides), cell wall components and cofactors. The coefficients of the biomass objective function are determined from empirical measurements from a cell growing at a measured rate. Since ME-models explicitly compute the predicted amount of RNA, protein, cofactors, etc. necessary for growth, this concept has to be modified for ME-models. This is accomplished via the biomass_dilution variable (reaction), which contains a biomass constraint (pseudo metabolite) that represents the mass produced by the synthesis of each functional RNA or protein. This reaction essentially ensured that the ME-model can only produce biomass at the rate it is being diluted (via growth and division).

Implementation
For each transcription or translation reaction in an ME-model an amount of a biomass constraint (pseudo metabolite) is created with a stoichiometry equal to the molecular weight of the mRNA or protein being made (in kDA). The below figure shows an example of this where a translation reaction produces both the catalytic protein as well as the protein biomass constraint. The formed protein_biomass constraint is a participant in the overall ME-model Biomas_ Dilution reaction which restricts the total production of the major biomass components to equal the rate at which biomass is diluted (i.e. the cell's growth rate, ).
Some biomass constituents do not have a mechanistic function in the ME-model (e.g. cell wall components, DNA and glycogen). These metabolites are included in the biomass_dilution reaction indentical to the M-model biomass reaction . The COBRAme codebase is constructed with the intention of including all of the metabolic processes and complexity associated with gene expression, while still giving a final ME-model that is not prohibitively difficult to use and interpret. To accomplish this COBRAme separates the information associated with each cellular process from the actual ME-model reaction in which the process is modeled. We call these two major python classes ProcessData and MEReaction, respectively. The logic behind each of these classes is briefly presented below, and a description of the class attributes and properties is presented here in the form of a UML Diagram.

ProcessData
The previous ME-models for E. coli and T. Maritima were reconstructed by first establishing a database containing information about all gene expression processes known in the organism being model. This included, for instance, enzyme complex subunit stoichiometries or the E. coli transcription unit architecture. Then, when building the MEmodel, the database was queried to obtain any relevant information and incorporate this into the appropriate reactions. For the COBRAme formulation, this database was replaced by the ProcessData "information storage" class. The ProcessData class generally consists of attributes which use simple python types (string, dictionary, etc.) to describe features of a biological process.
The use of the ProcessData class has the advantage of: 1. Simplifying the process of querying information associated with a cellular function 2. Allowing edits to this information which can easily be applied throughout the model without rebuilding from scratch 3. Enabling additional computations to be performed and seamlessly accessed as ProcessData class methods The ProcessData classes are broken into the following subclasses:

MEReaction
COBRAme compartmentalizes the major reaction types into their own MEReaction classes. Each of these classes contains a single update function which effectively reads in the appropriate ProcessData types, applies the coupling constraints on the macromolecules, and assembles these components into a complete model reaction. This allows changes made to the ProcessData describing a particular cellular process to easily be incorporated into the reactions which it was used.

Overview
Using the major classes described above, reconstructing a ME-model can then be broken down into three steps: 1. Define and construct all necessary ProcessData objects 2. Link the ProcessData to the appropriate MEReaction instance

Overview
COBRAme is constructed entirely over COBRApy. This means that ME-model reactions will have all of the same properties, methods, and functions as a COBRApy reaction. However, one key difference between M and ME models is that many reactions involved in gene expression are effecively templates that are constructed identically but vary based on characteristics of the gene being expressed. For example, a gene with a given nucleotide sequence is always translated following the same rules provided by the codon table for that organism.
In order to facilliate the template nature of many gene expression reactions, COBRAme reactions are constructed and their components are manipulated through the use of ProcessData classes. These act as information vessels for holding the information assocatied with a cellular process in simple, standard datatypes such as dictionaries and strings.
This tutorial will go step-by-step through the process of creating a generic enzyme catalyzed reaction (i.e. metabolic reaction): → which requires the formation and coupling of complex_ab in order to proceed.
In order for this reaction to carry flux in the model we will additionally need to first add the corresponding: 1. Transcription reactions

Complex formation reactions
Once these are added we will add in the synthesis of key macromolecular components (ribosome, RNA polymerase, etc.) and show how they are coupled to their respective reactions. The derived coupling coefficients will also be described. For more on the derivation of the coupling coefficients, reference the supplemental text of O'brien et. al. 2013

Initializing new ME-Models
When applying some constraints in the ME-model, metabolite properties are required. For instance, to calculate the total biomass (by molecular weight) produced by a particular macromolecule, the amino acid, nucleotide, etc. molecular weights are required. To enable these calculations, all metabolites from iJO1366, along with their metabolite attributes are added to the newly initialized ME-model.
Further the reactions from iJO1366 will be added to the ME-model to demonstrate ME-model solving procedures. The ME-model contains a "global_info" attribute which stores information used to calculate coupling constraints, along with other functions. The specifics of each of these constraints will be discussed when they are implemented.

Adding Reactions without utility functions
We'll first demonstrate how transcription, translation, tRNA charging, complex formation, and metabolic reactions can be added to a model without using any of the utility functions provided in cobrame.util.building.py. The second half of the tutorial will show how these utility functions can be used to add these reactions.
The basic workflow for adding any reaction to a ME-model using COBRAme occurs in three steps:

Add Transcription Reaction
Add TranscribedGene metabolite to model Transcription reactions is unique in that they occur at a transcription unit level and can code for multiple transcript products. Therefore the nucleotide sequence of both the transcription unit and the RNA transcripts must be defined in order to correctly construct a transcription reaction.

Update TranscriptionReaction
TranscriptionReaction.update(verbose=True) Creates reaction using the associated transcription data and adds chemical formula to RNA products This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.TranscriptionData): Note: the RNA_polymerase complex is not included in the reaction. This will be added later This reaction now produces a small amount of the a mRNA_biomass metabolite (constraint). This term has a coefficient corresponding to the molecular weight (in ) of the RNA being transcribed. This constraint will be implemented into a _ reaction with the form: A mathematical description of the biomass constraint can be found in Biomass Dilution Constraints in ME-Model Fundamentals.
Note: This is not a complete picture of transcription because the RNA polymerase is missing.

Incorporate RNA Polymerase
For the purposes of this tutorial, we'll skip the steps required to add the reactions to form the RNA_polymerase. The steps however are identical to those outlined in add enzyme complexes below class cobrame.core.component. RNAP(id) Metabolite class for RNA polymerase complexes. Inherits from cobrame.core.component.Complex

Add Translation Reaction Add TranslationData to model
In order to add a TranslationData object to a ME-model the user must additionally specifify the mRNA id and protein id of the translation reaction that will be added. This information as well as a nucleotide sequence is the only information required to add a translation reaction.

Parameters
• id (str) -Identifier of the gene being translated, typically the locus tag

Update TranslationReaction
TranslationReaction.update(verbose=True) Creates reaction using the associated translation data and adds chemical formula to protein product This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.TranslationData): where: • and 0 are phenomenological parameters from Scott et. al. 2010 that describe the linear relationship between the observed RNA/protein ratio of E. coli and its growth rate ( ) where: is the molecular weight of an average mRNA nucleotide. is the fraction of total RNA that is mRNA is the molecular weight of an average amino acid where: • and 0 are phenomenological parameters from Scott et. al. 2010 that describe the linear relationship between the observed RNA/protein ratio of E. coli and its growth rate ( ) where: is the mass of rRNA per ribosome. is the fraction of total RNA that is rRNA is the molecular weight of an average amino acid • , is the rate of translation for • , is number of amino acids in peptide translated from Note: The above reactions do not provide a complete picture of translation in that it is missing charged tRNAs to facillitate tRNA addition.
Below, we'll correct this by adding in an tRNA charging reaction. class cobrame.core.processdata.tRNAData(id, model, amino_acid, rna, codon) Class for storing information about a tRNA charging reaction.
• model (cobrame.core.model.MEModel) -ME-model that the tRNAData is associated with • amino_acid (str) -Amino acid that the tRNA transfers to an peptide • rna (str) -ID of the uncharged tRNA metabolite. As a best practice, this ID should be prefixed with 'RNA + _'

Add tRNAChargingReaction to model
And point tRNAChargingReaction to tRNAData class cobrame.core.reaction.tRNAChargingReaction(id) Reaction class for the charging of a tRNA with an amino acid Parameters id (str) -Identifier for the charging reaction. As a best practice, ID should follow the template "charging_tRNA + _ + <tRNA_locus> + _ + <codon>". If tRNA initiates translation, <codon> should be replaced with START. This reaction creates one generic_charged_tRNA equivalement that can then be used in a translation reaction The coefficient for RNA_d and lys__L_c are defined by: where: • and 0 are phenomenological parameters from Scott et. al. 2010 that describe the linear relationship between the observed RNA/protein ratio of E. coli and its growth rate ( ) where: is molecular weight of an average tRNA. is the fraction of total RNA that is tRNA is the molecular weight of an average amino acid • ℎ , is the rate of charging for Note: This tRNA charging reaction is still missing a tRNA synthetase which catalyzes the amino acid addition to the uncharged tRNA.

Incorporate tRNA Synthetases
. The synthetase coupling was reformulated from O'brien et. al. 2013 enable more modularity in the ME-model. A more complete mathematical description of the tRNA synthetase coupling constraints can be found in the tRNA.ipynb

Add tRNAs to Translation
Here we take advantage of an additional subclass of ProcessData, called a SubreactionData object. This class is used to lump together processeses that occur as a result of many individual reactions, including translation elongation, ribosome formation, tRNA modification, etc. Since each of these steps often involve an enzyme that requires its own coupling constraint, this process allows these processes to be lumped into one reaction while still enabling each subprocess to be modified.
TranslationData objects have an subreaction_from_sequence method that returns any subreactions that have been added to the model and are part of translation elongation (i.e. tRNA). Since no tRNA-mediated amino acid addition subreactions have been added to the model, the below call returns nothing.
Below, we add the SubreactionData (excluding enzymes) for the addition of an amino acid using information from the E. coli codon table. The charge tRNA does not act as an enzyme in this case because it's coupling is handled in the tRNAChargingReaction Add Subreactions for tRNA addition to model class cobrame.core.processdata.SubreactionData(id, model)

Add Complex Formation Reaction Add ComplexData to model
For COBRAme models, the reaction gene-protein-reaction rule (GPR) is replaced with a metabolite representing the synthesis of the enzyme(s) catalyzing a reaction. This metabolite is formed explicitly in a ME model by seperate reaction to transcribe the gene(s) and translate the protein(s) the compose the complex.
class cobrame.core.processdata.ComplexData(id, model) Contains all information associated with the formation of an functional enzyme complex.
This can include any enzyme complex modifications required for the enzyme to become active.

Parameters
• id (str) -Identifier of the complex data. As a best practice, this should typically use the same ID as the complex being formed. In cases with multiple ways to form complex '_ + alt' or similar suffixes can be used. •

Update ComplexFormation reaction
ComplexFormation.update(verbose=True) Creates reaction using the associated complex data and adds chemical formula to complex metabolite product.
This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.ComplexData):

Apply modification to complex formation reaction
Many enzyme complexes in an ME-model require cofactors or prosthetic groups in order to properly function. Information about such processes are stored as ModificationData.
For instance, we can add the modification of an iron-sulfur cluster, a common prosthetic group, by doing the following: Created <Complex complex_ba at 0x7f4bf69a8b38> in <ComplexFormation formation_complex_ab at 0x7f4bf5f After adding modification

Add Metabolic Reaction Add StoichiometricData to model
MetabolicReactions require, at a minimum, one corresponding StoichiometricData. StoichiometricData essentially holds the information contained in an M-model reaction. This includes the metabolite stoichiometry and the upper and lower bound of the reaction. As a best practice, StoichiometricData typically uses an ID equivalent to the M-model reaction ID.
So first, we will create a StoichiometricData object to define the stoichiometry of the conversion of a to b. Only one StoichiometricData object should be created for both reversible and irreversible reactions class cobrame.core.processdata.StoichiometricData(id, model) Encodes the stoichiometry for a metabolic reaction.

Add MetabolicReaction to model
The StoichiometricData for this reversible reaction is then assigned to two different MetabolicReactions (Due to the enzyme dilution constraint, all enzyme catalyzed reactions must be reverisble; more on this later). The MetabolicReactions require: -The associated StoichiometricData -The reverse flag set to True for reverse reactions, False for forward reactions -Enzyme for reaction (discussed later, dafault=65) These fields are then processed and the actual model reaction is created using the MetabolicReaction's update() function class cobrame.core.reaction.MetabolicReaction(id) Irreversible metabolic reaction including required enzymatic complex This reaction class's update function processes the information contained in the complex data for the enzyme that catalyzes this reaction as well as the stoichiometric data which contains the stoichiometry of the metabolic conversion being performed (i.e. the stoichiometry of the M-model reaction analog) Parameters id (str) -Identifier of the metabolic reaction. As a best practice, this ID should use the following template (FWD=forward, REV=reverse): "<StoichiometricData.id> + _ + <FWD or REV> + _ + <Complex.id>" keff float -The turnover rete (keff) couples enzymatic dilution to metabolic flux reverse boolean -If True, the reaction corresponds to the reverse direction of the reaction. This is necessary since all reversible enzymatic reactions in an ME-model are broken into two irreversible reactions

Update MetabolicReactions
MetabolicReaction.update(verbose=True) Creates reaction using the associated stoichiometric data and complex data.
This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.StoichiometricData): Note: the and complex_ab is not included in the reaction since no complex has been associated to it yet

Associate enzyme with MetabolicReaction
The ComplexData object created in the previous cell can be incorporated into the MetabolicReaction using code below.
Note: the update() function is required to apply the change. Forward reaction (before update): a --> b Forward reaction (after update): a + 4.27350427350427e-6 * mu complex_ab --> b Reverse reaction (before update): b --> a Reverse reaction (after update): b + 4.27350427350427e-6 * mu complex_ab --> a The coefficient for complex_ab is determined by the expression which in its entirety represents the dilution of an enzyme following a cell doubling. The coupling constraint can be summarized as followed • , is the flux through the metabolic reaction • is the turnover rate for the process and conveys the productivity of the enzyme complex. Physically, it can be thought of as the number of reactions the enzyme can catalyze per cell division.
By default the for a MetabolicReaction is set to 65 but this can be changed using the code below.

Adding Reactions using utility functions
Add reactions using some of the utility functions provided in cobrame.util.building.py

Transcription
Using the utility functions to create the TranscribedGene metabolite has the advantage of forcing the assignment of sequence, strand and RNA_type.

Parameters
• me_model (cobrame.core.model.MEModel) -The MEModel object to which the reaction will be added • locus_id (str) -Locus ID of RNA product. The TranscribedGene will be added as "RNA + _ + locus_id" • left_pos (int or None) -Left position of gene on the sequence of the (+) strain • right_pos (int or None) -Right position of gene on the sequence of the (+) strain • seq (str) -Nucleotide sequence of RNA product. Amino acid sequence, codon counts, etc. will be calculated based on this string.

Translation
add_translation_reaction assumes that the RNA and protein have the same locus_id. It creates the appropriate TranslationData and TranslationReaction instance, links the two together and updates the TranslationReaction.
cobrame.util.building.add_translation_reaction(me_model, locus_id, dna_sequence, up-date=False) Creates and adds a TranslationReaction to the ME-model as well as the associated TranslationData A dna_sequence is required in order to add a TranslationReaction to the ME-model

Parameters
• me_model (cobra.core.model.MEModel) -The MEModel object to which the reaction will be added • locus_id (str) -Locus ID of RNA product. The TranslationReaction will be added as "translation + _ + locus_id" The TranslationData will be added as "locus_id" • dna_sequence (str) -DNA sequence of the RNA product. This string should be reverse transcribed if it originates on the complement strand.

Complex Formation
Alternatively, ComplexData has a create_complex_formation() function to create the sythesis reaction following the naming conventions. It contains an update() function which incorporates changes in the ComplexData

ComplexData.create_complex_formation(verbose=True) creates a complex formation reaction
This assumes none exists already. Will create a reaction (prefixed by "formation") which forms the complex  Reaction Properties The division of the ME-model into MEReaction and ProcessData classes allows the user to essentially have access to the entire database of information used to construct the model. The following will show how this can be leveraged to easily query, edit and update aspects of the model reactions.

Metabolic Reactions
These are the ME-model representations of all reactions in the metabolic reconstruction, in this case iJO1366 In Through a MetabolicReaction, the user has direct access to the StoichiometricData, ComplexData and keff used to construct the reaction. 3.20942472231275e-6 * mu GAPDH-A-CPLX + e4p_c + h2o_c + nad_c --> 4per_c + 2.0 h_c + nadh_c This reactions is formed using the StoichiometricData (E4PD) and ComplexData (GAPDH-A-CPLX) with keff As a best practice, the ComplexData and StoichiometricData themselves should not be changed. If these need changed then a new MetabolicReaction should be created.

Edit upper and lower reaction bounds
The upper and lower bounds can be edited through the stoichiometric data and updated to the metabolic reaction Important: do not change the upper and lower bounds of a MetabolicReaction directly. If this is done than the change will be overwritten when the update function is ran (shown below) Editing the reaction bounds of the StoichiometricData, however, will edit the bounds of the forward and reverse reaction, as well as any instances of the reaction catalyzed by isozymes

TranscribedGene (RNA) metabolite properties
Transciption occurs via operons contained within the organisms genome or transcription unit (TU). This means that often, a transcribed region will code for multiple RNAs. The E. coli ME-model has 4 possible RNA types that can be transcribed: -mRNA -tRNA -rRNA -ncRNA (noncoding RNA) mRNAs can then translated directly from the full transcribed TU, while rRNA, tRNA and ncRNA are spliced out of the TU by endonucleases. In these cases, in order to know which bases need excized, the RNA metabolites (Tran-scribedGene) themselves have to store information such as: -DNA strand, left and right genome position to identify which TU the RNA is a part of -RNA type to determine whether it needs excised from the TU -nucleotide sequence to determine bases that do/do not need excised if not mRNA and the RNA mass for biomass constraint

TranscriptionReaction/TranscriptionData properties
Each TranscriptionReaction in a COBRAme ME-model is associated with exactly one TranscriptionData which includes everything necessary to define a reaction. This includes: -subreactions To handle enzymatic processes not performed by RNA polymerase -RNA Polymerase Different RNA polymerase metabolite for different sigma factors -RNA Products TUs often contain more than one RNA in sequence -Nucleotide sequence The TranscriptionData for TU containing the gene above is shown below This reaction currently uses a subreaction called Transcription_normal_rho_dependent to account for the elongation factors etc. associated with transcription. This TU also requires a rho factor to terminate transcription. These complexes can be removed from the reaction by running the following This poses a problem where, if RNA_b3201 and RNA_b3202 are not required in equal amounts, the model will become infeasible. To accound for this, all RNAs have a demand reaction associated with them. mRNA_biomass is consumed for each demand reaction with a coefficient equal to the molecular weight of each RNA (in kDa). This prevents the model from overproducing RNA to increase biomass production, and therefore growth rate, in some instances. More on the implications of the biomass constraint can be found in ME-Model Fundamentals As is, this reaction produces two mRNAs so no nucleotides are excised. If one or both is changed to a stable RNA (rRNA, tRNA or ncRNA) bases will be excised. Changing RNA_b3202 to an rRNA and updating the transcription reaction causes both of the RNAs to now be excised from the TU, as indicated by the nucleotide monophosphates that appear in the products. This is not a complete picture because this process is catalyzes by an endonuclease, whose activity can be incorporated as ModificationData.
Updating the reaction after adding these processes incorporates

Translation Reactions
In

TranslatedGene (Protein) metabolite properties
For COBRAme ME-models, proteins are translated directly from mRNA metabolites not from TUs. This means that all information required to construct a TranslationReaction can be found in its TranslationData therefore no extra information is contained in a TranslatedGene object.

TranslationReaction/TranslationData properties
Each TranslationReaction in a COBRAme ME-model is associated with exactly one TranslationData which includes everything necessary to define the reaction. This includes: -subreactions To handle enzymatic processes not performed by ribosome and incorporate tRNAs -mRNA ID of mRAN being translated -term_enzyme Enzyme that catalyzes translation termination -Nucleotide sequence The TranslationData for a TranslationReaction is shown below

ME-model Saving and Loading
There are currently 3 methods that can be used to save/load an ME-model using COBRAme: 1. As a full JSON file 2. As a reduced JSON file 3. As a pickle file

As a full JSON file
This is the recommended way to save, load and share COBRAme ME-models in full detail. This will include all of the model's functionality and information (MEReaction, ProcessData, etc). It uses a defined JSONSCHEMA found in cobrame.io.
Saving and loading a full ME-model (me_model) as a JSON can be done using: where new_me_model is of type cobrame.MEModel

As a reduced JSON file
Alternatively, ME-models can be saved as a COBRApy model. This storage type loses all the additional information contained in a full ME-model, but retains the stoichiometry of all the reactions. In other words, it behaves like an Mmodel with symbolic mu terms in metabolic coefficients and reaction bounds. Therefore it will give identical solutions compared to the full model, but all additional ME-model functionality will be lost.
Saving and loading a reduced Me-model (me_model) as a JSON can be done using: In where new_me_model is of type cobra.Model

As a pickle file
This is the quickest way to save a ME-model in full detail. It can be accomplished using python's pickle dump/load methods. A ME-model named me_model can be saved follows. This is not a recommended way to save a ME-model when sharing or for use over the long term as it can break when using different software versions.

CHAPTER 6 Coupling Constraint Derivations
This section will show in detail how coupling coefficients for two macromolecules (mRNA and ribosome) are derived. The remaining macromolecule coupling derivations follow a similar approach and logic, therefore they are omitted here. For remaining derivations, reference O'Brien et al, 2013.

Parameters
The parameters for the mRNA coupling coefficient derivations are listed below: P = total cellular protein mass fraction ( g aa gDW cell ) Along with an experical relationship between measured ratio of RNA (R) to Protein (P) For E. coli grown at 37 , (Scott et al., 2010) emperically found 0 = 0.087 and = 4.5 1 ℎ .

Derivation of mRNA coupling coefficients
To derive the mRNA dilution and degradation coupling coefficients, we assume that these processes are coupled together as follows.
where 1 and 2 represent the coupling of degradation to dilution and translation to degredation, respectively.
For the remainder of the mRNA coupling derivation we will abbreviate these reaction rates as v dilution , v degradation and v translation for simplicity.
To find these coupling values, we will need to find v dilution , v degradation and v translation . The dilution of mRNA nucleotides as it is passed on to daughter cells is related to the concentration of mRNA nucleotides and the growth rate as follows: similarly the degradation rate can be found using the first order rate constant of mRNA degradation v degradation = k mRNA deg · [nt mRNA ] the rate of translation / protein synthesis rate in ( ℎ ) can be found using the following. This represents the rate which amino acid are incorporated into protein: The concentration of mRNA nucleotides in units of ( ) can be defined as:

Solving for mRNA coupling coefficients
Solving for each of these coupling terms gives: ·P maa substituting for [mRNA] gives: substitution for gives: Simplifying the above relationship, the coupling of dilution to translation is represented by: Therefore kmRNA = 1 · 2 and:

Units of mRNA coupling
Based on the [nt mRNA ] expression above, the units will be: therefore v degradation will be : and for v translation : and for v dilution :

Applying mRNA coupling to translation
Note that the units for each reaction detailed in the above derivations describe the overall coupling of translation, dilution, and degradation cell-wide. For individual proteins and ME-model translation reactions, we will have: v dilutioni = 1 · 2 · len peptidei len mRNAi · v translationi the length terms are required due to the fact that v dilutioni and v translationi will have units of molmRNA i gDW·hr and molprotein i gDW·hr , respectively. Since: however the length of a peptide will always be 1/3 the length of the mRNA that encodes it (3 nucleotides in a codon) therefore we can replace ( molaa·molmRNA i molprotein i ·molnt ) therefore the final coupling of dilution to translation will be: and similarly for degradation coupling:

Plugging ribosome coupling into a ME-model reaction
The coupling of mRNA synthesis to translation will require considering the sum of the mRNA dilution and degration. When imposed in the ME-model, a translation reaction will look similar to following: x · charged_tRNAs + ( 1 3 with the coupling coefficients substituted: where x and y represents the coupling coefficient for the tRNAs and ribosome (the ribosome coupling is derived below). The reaction will produce nucleotides with a coefficient of 1 3 · 2 since these are the product of mRNA degradation.
Note: There is a minor typo in the O'brien et al., 2013 coupling coefficient derivations where the 1 and 2 expresions are multiplied by 3 instead of 1 3 .

Derivation of ribosome coupling coefficients
Like above, we will derive the coupling between translation and ribosome dilution to daughter cells during cell division. Unlike mRNA, ribosomes and rRNA are stable and we assume they are degraded at a neglible rate v dilution ribosome = 3 · v translationaa protein As for the mRNA coupling derivation above, 3 represent the coupling of translation to ribosome dilution. For the remainder of the ribosome coupling derivation, we will abbreviate these reaction rates as v dilution and v translation for simplicity.
The translation of protein is defined as above in the mRNA coupling derivations: v translation = · P m aa and: The concentration of ribosome in units of ( mol ribosome gDW cell ) : [ribosome] = R · f rRNA m rr plugging in this expression for [ribosome] and solving for 3 gives : plugging in the above empirical expression for :

Applying ribosome coupling to translation
Note that the units for each reaction detailed in the above derivations describe the overall coupling of translation to ribosome dilution on a cell-wide level. For individual proteins, we will have: The length term is required due to the fact that in the ME-model v dilutioni and v translationi will have units of mol ribosome gDW cell ·hr and molprotein i gDW cell ·hr , respectively. Since: therefore plugging this into the final coupling of dilution to translation will be:

Applying ribosome coupling to translation
When further imposing ribosome dilution coupling in the ME-model, a translation reaction will look similar to following: x · charged_tRNAs + ( 1 3 with the coupling coefficients substituted: where x represents the coupling coefficient for the tRNAs.

update(verbose=True)
Creates reaction using the associated complex data and adds chemical formula to complex metabolite product.
This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.ComplexData): 1. Complex product defined in self._complex_id 2. Protein subunits with stoichiometery defined in data.stoichiometry 3. Metabolites and enzymes w/ coupling coefficients defined in data.subreactions. This often includes enzyme complex modifications by coenzymes or prosthetic groups. Some components in an ME-model can perform exactly the same function. To handle this, GenericFormation-Reactions are used to create generic forms of these components.
Parameters id (str) -Identifier of the generic formation reaction. As a best practice, this ID should be prefixed with 'metabolite_id + _to_ + generic_metabolite_id' class cobrame.core.reaction.MEReaction(id=None, name='') Bases: cobra.core.Reaction.Reaction MEReaction is a general reaction class from which all ME-Model reactions will inherit This class contains functionality that can be used by all ME-model reactions Parameters id (str) -Identifier of the MEReaction. Should follow best practices of child class add_biomass_from_subreactions(process_data, biomass=0.0) Account for the biomass of metabolites added to macromolecule (protein, complex, etc.) due to a modification such as prosthetic group addition.

• biomass (float) -Initial biomass value in kDa
Returns Initial biomass value + biomass added from subreactions in kDa Return type float add_subreactions(process_data_id, stoichiometry, scale=1.0) Function to add subreaction process data to reaction stoichiometry

Parameters
• process_data_id (str) -ID of the process data associated with the metabolic reaction.
For example, if the modifications are being added to a complex formation reaction, the process data id would be the name of the complex. Overrides method in parent class in order to enable use of optlang interfaces.

Irreversible metabolic reaction including required enzymatic complex
This reaction class's update function processes the information contained in the complex data for the enzyme that catalyzes this reaction as well as the stoichiometric data which contains the stoichiometry of the metabolic conversion being performed (i.e. the stoichiometry of the M-model reaction analog) Parameters id (str) -Identifier of the metabolic reaction. As a best practice, this ID should use the following template (FWD=forward, REV=reverse): "<StoichiometricData.id> + _ + <FWD or REV> + _ + <Complex.id>" keff float -The turnover rete (keff) couples enzymatic dilution to metabolic flux reverse boolean -If True, the reaction corresponds to the reverse direction of the reaction. This is necessary since all reversible enzymatic reactions in an ME-model are broken into two irreversible reactions complex_data Get or set the ComplexData instance that details the enzyme that catalyzes the metabolic reaction. Can be set with instance of ComplexData or with its id.
Returns Complex data detailing enzyme that catalyzes this reaction Return type cobrame.core.processdata.ComplexData stoichiometric_data Get or set the StoichiometricData instance that details the metabolic conversion of the metabolic reaction. Can be set with instance of StoichiometricData or with its id.

update(verbose=True)
Creates reaction using the associated stoichiometric data and complex data.
This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.StoichiometricData): 1. Complex w/ coupling coefficients defined in self.complex_data.id and self.keff 2. Metabolite stoichiometry defined in data.stoichiometry. Sign is flipped if self.reverse == True Also sets the lower and upper bounds based on self.reverse and data.upper_bound and data.lower_bound.
Parameters verbose (bool) -Prints when new metabolites are added to the model when executing update() class cobrame.core.reaction.PostTranslationReaction(id) Bases: cobrame.core.reaction.MEReaction Reaction class that includes all posttranslational modification reactions (translocation, protein folding, modification (for lipoproteins) etc) There are often multiple different reactions/enzymes that can accomplish the same modification/function. In order to account for these and maintain one translation reaction per protein, these processes need to be modeled as separate reactions.
Parameters id (str) -Identifier of the post translation reaction add_translocation_pathways(process_data_id, protein_id, stoichiometry=None) Add complexes and metabolites required to translocate the protein into cell membranes.

Parameters
• process_data_id (str) -ID of translocation data defining post translation reaction Get or set PostTranslationData that defines the type of post translation modification/process (folding/translocation) that the reaction accounts for. Can be set with instance of PostTranslationData or with its id.

update(verbose=True)
Creates reaction using the associated posttranslation data and adds chemical formula to processed protein product This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.PostTranslationData): SummaryVariables are reactions that impose global constraints on the model.
The primary example of this is the biomass_dilution SummaryVariable which forces the rate of biomass production of macromolecules, etc. to be equal to the rate of their dilution to daughter cells during growth.
RNA is transcribed on a transcription unit (TU) level. This type of reaction produces all of the RNAs contained within a TU, as well as accounts for the splicing/excision of RNA between tRNAs and rRNAs. The appropriate RNA_biomass constrain is produced based on the molecular weight of the RNAs being transcribed Parameters id (str) -Identifier of the transcription reaction. As a best practice, this ID should be prefixed with 'transcription + _' transcription_data Get or set the cobrame.core.processdata.TranscriptionData that defines the transcription unit architecture and the features of the RNAs being transcribed.

update(verbose=True)
Creates reaction using the associated transcription data and adds chemical formula to RNA products Reaction class for the charging of a tRNA with an amino acid Parameters id (str) -Identifier for the charging reaction. As a best practice, ID should follow the template "charging_tRNA + _ + <tRNA_locus> + _ + <codon>". If tRNA initiates translation, <codon> should be replaced with START.
tRNA_data Get and set the cobra.core.processdata.tRNAData that defines the translation of the gene. Can be set with instance of tRNAData or with its id.

update(verbose=True)
Creates reaction using the associated tRNA data This function adds the following components to the reaction stoichiometry (using 'data' as shorthand for cobrame.core.processdata.tRNAData): 1. Charged tRNA product following template: "generic_tRNA + _ + <data.codon> + _ + <data.amino_acid>" 2. tRNA metabolite (defined in data.RNA) w/ charging coupling coefficient 3. Charged amino acid (defined in data.amino_acid) w/ charging coupling coefficient 5. Synthetase (defined in data.synthetase) w/ synthetase coupling coefficient found, in part, using data.synthetase_keff This can include any enzyme complex modifications required for the enzyme to become active.

Parameters
• id (str) -Identifier of the complex data. As a best practice, this should typically use the same ID as the complex being formed. In cases with multiple ways to form complex '_ + alt' or similar suffixes can be used.
• Class for storing information about generic metabolites
Creates generic metabolite and generic reaction, if they do not already exist.

Parameters
• id (str) -Identifier for post translation process. float -Some small peptides are more likely to be folded by certain chaperones. This is accounted for using propensity_scaling.

Generic class for storing information about a process
This class essentially acts as a database that contains all of the relevant information needed to construct a particular reaction. For example, to construct a transcription reaction, following information must be accessed in some way: • nucleotide sequence of the transcription unit • RNA_polymerase (w/ sigma factor)

• RNAs transcribed from transcription unit
• other processes involved in transcription of RNAs (splicing, etc.) ME-model reactions are built from information in these objects.
• model (cobrame.core.model.MEModel) -ME-model that the ProcessData is associated with model Get the ME-model the process data is associated with Returns ME-model that uses this process data If subreaction adds a chemical moiety to a macromolecules via a modification or other means, the biomass contribution of the modification process should be accounted for and ultimately included in the reaction it is involved in. If a stable RNA (e.g. tRNA or rRNA) is coded for in the transcription unit, the transcript must be spliced in order for these to function.
This determines whether the transcription unit requires splicing and, if so, returns the count of nucleotides within the transcription unit that are not accounted for in the RNA products, thus identifying the appropriate introns nucleotides.
• model (cobrame.core.model.MEModel) -ME-model that the tRNAData is associated with • amino_acid (str) -Amino acid that the tRNA transfers to an peptide Add all reactions necessary to produce a dummy reaction catalyzed by "CPLX_dummy".

Parameters
• me_model (cobrame.core.model.MEModel) -The MEModel object to which the content will be added • dna_seq (str) -DNA sequence of dummy gene. Should be representative of the average codon composition, amino acid composition, length of a gene in the organism being modeled • update (bool) -If True, run update functions on all transcription, translation, complex formation, and metabolic reactions added when constructing dummy reactions.
cobrame.util.building.add_m_model_content(me_model, m_model, com-plex_metabolite_ids=None) Add metabolite and reaction attributes to me_model from m_model. Also creates StoichiometricData objects for each reaction in m_model, and adds reactions directly to me_model if they are exchanges or demands.

Parameters
• me_model (cobrame.core.model.MEModel) -MEModel that the MetabolicReaction will be added to • stoichiometric_data_id (str) -ID of the StoichiometricData for the reaction being added • directionality (str) --Forward: Add reaction that occurs in the forward direction -Reverse: Add reaction that occurs in the reverse direction • complex_id (str or None) -ID of the ComplexData for the enzyme that catalyze the reaction being added.
It is assumed that each modification adds one equivalent of the modification metabolite. Multiple Intended to be used as a function for large-scale complex addition. Intended for use when adding all reactions from stoichiometric data for the first time.
For adding an individual reaction use add_metabolic_reaction_to_model()
It's assumed every complex modification occurs spontaneously, unless a modification_enzyme argument is passed.
If a modification uses an enzyme this can be updated after the SubreactionData object is already created Parameters me_model (cobrame.core.model.MEModel)cobrame.util.building.add_transcription_reaction(me_model, tu_name, locus_ids, sequence, update=True) Create TranscriptionReaction object and add it to ME-Model. This includes the necessary transcription data.

Parameters
• me_model (cobrame.core.model.MEModel) -The MEModel object to which the reaction will be added • tu_name (str) -ID of TU being transcribed. The TranscriptionReaction will be added as "transcription_+TU_name" The TranscriptionData will be added as just "TU_name" • locus_ids (set) -Set of locus IDs that the TU transcribes • sequence (str) -Nucleotide sequence of the TU.
• update (bool) -If True, use TranscriptionReaction's update function to update and add reaction stoichiometry Returns TranscriptionReaction for the TU Return type cobrame.core.reaction.TranscriptionReaction cobrame.util.building.add_translation_reaction(me_model, locus_id, dna_sequence, up-date=False) Creates and adds a TranslationReaction to the ME-model as well as the associated TranslationData A dna_sequence is required in order to add a TranslationReaction to the ME-model • me_model (cobra.core.model.MEModel) -The MEModel object to which the reaction will be added • locus_id (str) -Locus ID of RNA product. The TranslationReaction will be added as "translation + _ + locus_id" The TranslationData will be added as "locus_id" • dna_sequence (str) -DNA sequence of the RNA product. This string should be reverse transcribed if it originates on the complement strand.

Parameters
• me_model (cobrame.core.model.MEModel) -The MEModel object to which the reaction will be added • gb_filename (str) -Local name of the genbank file that will be used for ME-model construction • tu_frame (pandas.DataFrame) -DataFrame with indexes of the transcription unit name and columns containing the transcription unit starting and stopping location on the genome and whether the transcription unit is found on the main (+) strand or complementary (-) strand.
If no transcription unit DataFrame is passed into the function, transcription units are added corresponding to each transcribed gene in the genbank file.
• element_types (set) -Transcription reactions will be added to the ME-model for all RNA feature.types in this set. This uses the nomenclature of the genbank file (gb_filename) • verbose (bool) -If True, display metabolites that were not previously added to the model and were thus added when creating charging reactions Creates a TranscribedGene metabolite object and adds it to the ME-model

Parameters
• me_model (cobrame.core.model.MEModel) -The MEModel object to which the reaction will be added • locus_id (str) -Locus ID of RNA product. The TranscribedGene will be added as "RNA + _ + locus_id" • left_pos (int or None) -Left position of gene on the sequence of the (+) strain • right_pos (int or None) -Right position of gene on the sequence of the (+) strain • seq (str) -Nucleotide sequence of RNA product. Amino acid sequence, codon counts, etc. will be calculated based on this string.
• strand (str or None) --(+) if the RNA product is on the leading strand -(-) if the RNA product is on the complementary strand • rna_type (str) -Type of RNA of the product. tRNA, rRNA, or mRNA Used for determining how RNA product will be processed.