Bruno Bienfait and Oliver Sacher are employees of Molecular Networks GmbH, which is owned and managed by Johann Gasteiger. All three were involved in the development of a prototype version of the "Reverse Pathway Engineering" program used for the described study. Presently, the software is neither being developed further nor marketed. Mengjin Liu, Arjen Nauta and Jan Geurts are employees of FrieslandCampina. All three have been involved in developing the "Reverse Pathway Engineering" approach and in the associated bioinformatics analyses. This does not alter the authors' adherence to all PLOS ONE policies on sharing data and materials.
Conceived and designed the experiments: JMWG. Performed the experiments: ML. Analyzed the data: ML BB OS JG RJS. Contributed reagents/materials/analysis tools: BB OS JG. Wrote the paper: ML AN JMWG.
The incompleteness of genome-scale metabolic models is a major bottleneck for systems biology approaches, which are based on large numbers of metabolites as identified and quantified by metabolomics. Many of the revealed secondary metabolites and/or their derivatives, such as flavor compounds, are non-essential in metabolism, and many of their synthesis pathways are unknown. In this study, we describe a novel approach, Reverse Pathway Engineering (RPE), which combines chemoinformatics and bioinformatics analyses to predict the “missing links” between compounds of interest and their possible metabolic precursors by providing plausible chemical and/or enzymatic reactions. We demonstrate the added value of the approach by using flavor-forming pathways in lactic acid bacteria (LAB) as an example. Established metabolic routes leading to the formation of flavor compounds from leucine were successfully replicated. Novel reactions involved in flavor formation, i.e. the conversion of alpha-hydroxy-isocaproate to 3-methylbutanoic acid and the synthesis of dimethyl sulfide, as well as the enzymes involved, were successfully predicted. These new insights into the flavor-formation mechanisms in LAB can have a significant impact on improving the control of aroma formation in fermented food products. Since the input reaction databases and compounds are highly flexible, the RPE approach can be easily extended to a broad spectrum of applications, including health/disease biomarker discovery and synthetic biology.
Chemical systems biology, a new discipline linking chemical biology and systems biology, is currently drawing increasing attention
A classical systems biology approach for analyzing cellular processes of microorganisms starts with the reconstruction of biochemical networks based on annotated genomes
Flavor components in various fermented dairy foods are produced by lactic acid bacteria (LAB), which are present in starter cultures. Several reviews have summarized the flavor-forming pathways of LAB, especially the pathways originating from amino acids
To address the above-mentioned knowledge gaps, we developed a novel approach, Reverse Pathway Engineering (RPE), which uses small molecules, i.e. measured flavor compounds, as input and suggests the enzymatic or chemical reactions that can trace them back to known metabolic precursors. Thus, the production routes that synthesize the target compounds can be predicted.
The essential innovative step in the RPE approach is “Retrosynthesis”, an automatic process originally used by organic chemists to find methods to produce a given target compound by applying backward “retro reactions” to the target compound. This process was first proposed by E.J. Corey and led to the development of the retrosynthesis program LHASA
In this paper, we present the novel RPE approach combining chemoinformatics (retrosynthesis) and bioinformatics (comparative genomics) to predict unrevealed reactions in metabolic pathways. We describe several cases illustrating the strength of the approach in predicting flavor-forming reactions/pathways, especially those from leucine and methionine catabolism. Since the branched-chain amino acid degradation pathways, in particular the leucine degradation pathway, are relatively well studied
The flavor compounds used as inputs for the RPE approach were selected on the basis of literature and/or from GC-MS measurements
The BioPath.Database
In order to predict the unknown reactions or pathways leading to flavor formation, a software package called THERESA (THE REtroSynthetic Analyser) for designing organic synthesis reactions was used. THERESA derives a large portion of its knowledge from a reaction database. The extended BioPath.Database described above was utilized as the reaction pool. The combined system is called BioPath.Design as it allows the design of biotechnological processes. For each reaction of the BioPath.Database, BioPath.Design automatically extracted the reaction centers plus the transformation steps and then used them for RPE analyses. After the target structure was submitted, BioPath.Design searched its database of biotransformation rules for all reactions that could be applied to the target structure in the reverse direction (retrosynthesis). If a substructure match was identified, BioPath.Design could build a new complete synthesis reaction by adding the missing co-product(s), copying the atom-atom mapping numbers of the reaction centers to the target compound and generating the structure of the reactants. The predicted reactions were ranked based on the presence of reactants and predicted reactions in the database, as well as the simplicity of structures. The predicted reactions were manually inspected and a candidate reaction was selected based on the information of the corresponding reference reactions found in the database or on prior knowledge. When one of the plausible synthesis reactions was selected, its reactant could be used as the input compound to search for its precursor. This process was repeated iteratively until a known metabolic precursor was found and a complete synthetic route was retrieved.
To account for the reversibility of biochemical reactions, the reversibility information stored with each reaction of the BioPath.Database was used. For an irreversible reaction, or when the reversibility information was unknown (especially for reactions of the L. plantarum dataset), only one biotransformation rule was created; for a reversible reaction, two biotransformation rules, one for each direction, were generated.
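The rule generation and iterative backward search described above can be illustrated with a toy sketch. This is not the BioPath.Design implementation: compounds are represented by plain names instead of structures, substructure matching is replaced by exact name matching, the manual selection of a candidate reaction at each step is replaced by exhaustive enumeration, and the `Reaction` tuple, the `REACTIONS` list and its reversibility flags are hypothetical illustration data.

```python
from collections import deque, namedtuple

# Hypothetical toy reaction pool; reversible is True, False or None (unknown).
Reaction = namedtuple("Reaction", "substrate product reversible")

REACTIONS = [
    Reaction("leucine", "alpha-keto-isocaproate", True),
    Reaction("alpha-keto-isocaproate", "3-methylbutanal", False),
    Reaction("3-methylbutanal", "3-methylbutanol", True),
    Reaction("3-methylbutanal", "3-methylbutanoic acid", None),
]


def build_rules(reactions):
    """Create one rule per irreversible or unknown reaction, two per reversible one."""
    rules = []
    for r in reactions:
        rules.append((r.substrate, r.product))
        if r.reversible is True:
            # Reversible: add a second rule for the opposite direction.
            rules.append((r.product, r.substrate))
    return rules


def precursors(target, rules):
    """Apply rules backwards: each rule producing the target yields a precursor."""
    return [substrate for substrate, product in rules if product == target]


def trace_back(target, rules, known_precursors, max_depth=6):
    """Breadth-first retro search from the target to any known precursor.

    Returns routes as lists ordered from precursor to target.
    """
    routes = []
    queue = deque([[target]])
    while queue:
        path = queue.popleft()
        head = path[0]
        if head in known_precursors:
            routes.append(path)
            continue
        if len(path) > max_depth:
            continue  # give up on overly long routes
        for pre in precursors(head, rules):
            if pre not in path:  # avoid cycling through reversible steps
                queue.append([pre] + path)
    return routes


if __name__ == "__main__":
    rules = build_rules(REACTIONS)
    for route in trace_back("3-methylbutanol", rules, {"leucine"}):
        print(" -> ".join(route))
```

In a real setting each rule would encode a reaction center and transformation rather than a pair of names, and the candidate precursors would be ranked and inspected before the search continues.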
After a plausible reaction was proposed by BioPath.Design, a list of candidate enzymes which might catalyze the reaction was prepared on the basis of the reference reaction in the BioPath.Database or by inference from the transformation rule of the predicted reaction
We used our previously described comparative genomics approach to predict putative enzymes for reactions
To close the gaps between the target chemical compounds and their metabolic precursors, we developed a novel approach named Reverse Pathway Engineering. The pipeline of the RPE approach, combining the scientific disciplines chemo- and bioinformatics, is shown in
The approach enables a flexible input of target compounds and reaction databases, and can result in an output for various analyses or applications. Using 3-methylbutanoic acid as an input compound, two of the proposed synthetic reactions are shown as an example. The complete retrosynthesis trees can be found in
A leucine catabolism network was reconstructed on the basis of several published studies
Three branches of the subsequent degradation of alpha-keto-isocaproate (KICA) are indicated: i) conversion to the corresponding aldehyde, alcohol or carboxylic acid via alpha-keto acid decarboxylation (depicted in black), ii) oxidative decarboxylation (depicted in blue), and iii) an alternative route resulting in alpha-hydroxy-isocaproate (HICA), depicted in gold. The flavor compounds used as input for the RPE approach are indicated in italics. The novel predicted reactions are indicated by red dashed arrows. Enzyme names are: BcAT, branched-chain aminotransferase; GDH, glutamate dehydrogenase; HycDH, hydroxyacid dehydrogenase; KdcA, alpha-ketoacid decarboxylase; AlcDH, alcohol dehydrogenase; AldDH, aldehyde dehydrogenase; EstA, esterase A; KaDH, alpha-ketoacid dehydrogenase complex; PTA, phosphotransacylase; ACK, acyl kinase.
Leucine catabolism can be divided into three major parts, as shown in
The main branches of the leucine degradation pathway and the leucine and isoleucine inter-conversion route were used as positive controls for the validation of the RPE approach. The reactions highlighted in red are novel, so-far unrevealed reactions predicted by the RPE approach and will be described in detail in the following sections (
The main flavor products of leucine degradation are 3-methylbutanal, 3-methylbutanol, and 3-methylbutanoic acid, which provide cheesy, malty and sweaty odors, respectively
One of the retrieved synthetic routes of 3-methylbutanol is shown in
The retrosynthesis tree was obtained from BioPath.Design using 3-methylbutanol as input.
When 3-methylbutanoic acid was used as input, three routes were predicted by the RPE approach (
The retrosynthesis trees correspond to the pathways shown in
The inter-conversion route between leucine and isoleucine degradation was also successfully predicted by the RPE approach. This inter-conversion route, in which isovaleryl-CoA is converted to 2-methylbutyric acid, has been proposed to occur in
Besides the two main biosynthesis routes of 3-methylbutanoic acid (i.e. via alpha-keto acid decarboxylation and oxidative decarboxylation) described above, a third route of 3-methylbutanoic acid synthesis from KICA via alpha-hydroxy-isocaproate (HICA) was proposed by RPE (
The second step of the predicted route suggests 3-methylbutanoic acid to be formed from HICA (
The suggested reaction is shown in the upper part. One of the reference reactions is indicated together with the information on the enzyme which catalyzes it.
In order to further substantiate this hypothesis, we performed additional bioinformatics studies. Lactate 2-monooxygenase activity was identified in
The functional equivalents (orthologs) of LOX from
In addition to the reactions catalyzed by enzymes, chemical conversions also play an important role in the process of flavor formation. For example, Bonnarme et al.
When 2-methylpropanal was used as the target compound for RPE, one of the reactions predicted by BioPath.Design was the decarboxylation of alpha-keto-isovaleric acid derived from valine, analogous to the decarboxylation reaction converting KICA to 3-methylbutanal described above. An exact match of this reaction was found by scanning the BioPath.Database, indicating that the reaction is already stored in the database.
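The exact-match check mentioned above can be sketched as an order-independent lookup. This is a minimal illustration under stated assumptions, not the BioPath.Database query mechanism: compounds are identified by name rather than by structure, stoichiometric coefficients are ignored, and `reaction_key`, `is_known` and the two-entry `DATABASE` are hypothetical.

```python
def reaction_key(reactants, products):
    """Build an order-independent key from reactant and product names."""
    return (frozenset(reactants), frozenset(products))


# Hypothetical miniature stand-in for a reaction database.
DATABASE = {
    reaction_key(["alpha-keto-isovaleric acid"], ["2-methylpropanal", "CO2"]),
    reaction_key(["alpha-keto-isocaproate"], ["3-methylbutanal", "CO2"]),
}


def is_known(reactants, products, database=DATABASE):
    """Return True if the predicted reaction exactly matches a stored one."""
    return reaction_key(reactants, products) in database


if __name__ == "__main__":
    # Predicted decarboxylation, with products listed in a different order.
    print(is_known(["alpha-keto-isovaleric acid"], ["CO2", "2-methylpropanal"]))
    # A predicted reaction with no stored counterpart in this toy set.
    print(is_known(["alpha-keto-isocaproate"], ["2-methylpropanal"]))
```

A predicted reaction that has no exact match, like the second query here, is the kind of case that still needs a reference reaction and manual inspection before it is proposed as novel.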
A yet unrevealed novel chemical reaction converting KICA to 2-methylpropanal was also predicted, which inter-connects leucine and valine catabolism (
The reference reaction converting alpha-keto-γ-methylthiobutyrate to methylsulfanyl-acetaldehyde was derived from the additional part of the BioPath.Database containing the reactions of the flavor-forming pathways from sulfur-containing amino acid degradation.
Methionine catabolism is known as one of the main flavor-forming pathways, giving rise to various volatile sulfur compounds such as H2S, methanethiol, dimethyl sulfide (DMS), dimethyl disulfide (DMDS), and dimethyl trisulfide (DMTS)
Methanethiol, a compound derived from elimination of methionine catalyzed by a C-S lyase, is regarded as the precursor of DMS whose odor is described as “boiled cabbage, sulfurous”
In order to predict the most plausible synthesis reaction, DMS was used as input for RPE. The reaction of methanethiol accepting a methyl group from S-AdoMet to form DMS and S-adenosyl-L-homocysteine (S-AdoHcy), in line with the second hypothesis, was predicted by BioPath.Design (
In the last decades, high-throughput analytical techniques, such as genomics, proteomics and metabolomics, have developed rapidly
When our approach is compared to one of the previously published tools, Pathcomp
For the novel enzymatic reactions described in this study, the putative enzymes, as well as the candidate genes in LAB were proposed by bioinformatics analyses. The conflicting hypotheses on the synthesis of DMS from previous studies were elucidated in this study. In support of the hypothesis that DMS is formed enzymatically, our
The predicted novel reactions not only increase the resolution of metabolic models, but also provide leads for metabolic engineering. Hydroxy acids such as HICA may have a negative effect on flavor formation since they share the same precursors as other flavor compounds such as 3-methylbutanol and 3-methylbutanoic acid
The RPE approach is not limited to the prediction of enzymatic reactions, as chemical reactions are predicted as well. The predicted chemical reaction for forming 2-methylpropanal from KICA was proposed to take place under the same environmental conditions as the reference reaction which chemically converts KMBA to MTAC
Note that the reference reaction of any predicted chemical reaction must be present in the set of reconstructed flavor-forming pathways in the extended BioPath.Database. This shows the importance of compiling a comprehensive and suitable reaction database for the RPE approach. For this reason, an extended BioPath.Database was constructed with additional reactions from the genome-scale metabolic model of
The RPE approach uniquely connects chemical data all the way back to genomic data. This distinguishes it from other retrobiosynthesis methods such as ReBiT (Retro-Biosynthesis Tool)
The RPE approach can be extended to other application fields, thanks to the flexibility of using various reaction databases and target compounds. Possible future applications could be in the field of biomarker discovery. Recently, a metabolomics study described the correlation between human metabolic phenotypes and specific dietary preferences with respect to chocolate consumption
We are grateful to Prof. Dr. van Berkel, Prof. Dr. van der Oost and Dr. Franssen from Wageningen University and Prof. Dr. Hagen and Prof. Dr. Arends from the Technical University Delft for their input and discussions.