Construction of an artificial phosphoketolase pathway that efficiently catabolizes multiple carbon sources to acetyl-CoA

The canonical glycolysis pathway is responsible for converting glucose into 2 molecules of acetyl-coenzyme A (acetyl-CoA) through a cascade of 11 biochemical reactions. Here, we have designed and constructed an artificial phosphoketolase (APK) pathway, which consists of only 3 types of biochemical reactions. The core enzyme in this pathway is phosphoketolase, while phosphatase and isomerase act as auxiliary enzymes. The APK pathway has the potential to achieve a 100% carbon yield to acetyl-CoA from any monosaccharide by integrating a one-carbon condensation reaction. We tested the APK pathway in vitro, demonstrating that it could efficiently catabolize typical C1-C6 carbohydrates to acetyl-CoA with yields ranging from 83% to 95%. Furthermore, we engineered an Escherichia coli strain capable of growth via the APK pathway with glycerol as the carbon source. This novel catabolic pathway offers a promising route for future biomanufacturing and a stoichiometric production platform using multiple carbon sources.


Introduction
Glycolysis, a catabolic pathway comprising 10 cascade biochemical reactions, is responsible for converting glucose to pyruvate, which is further decarboxylated to produce acetyl-CoA. This process serves as an essential energy source and supplier of building blocks for life [1,2]. Over billions of years, natural selection has evolved intricate networks within glycolysis to maintain metabolic and physiological homeostasis [3-5]. While intensive efforts have been devoted to redirecting metabolic homeostasis toward desired compounds with the maximum stoichiometry [6-10], the synthesis of cellular metabolites via glycolysis has evolved in nature according to optimality principles [11,12]. Partially engineering the central carbon metabolism is insufficient to unravel the complexities of intracellular catabolic networks [13,14]. In this study, we capitalized on chemical principles to design and implement a de novo pathway called artificial phosphoketolase (APK), which leverages 3 core reactions to convert any ketose into acetyl-CoA.
Among the key enzymes involved in C-C bond cleavage, thiamin diphosphate (ThDP)-dependent phosphoketolase (PK) plays a significant role [15,16]. PKs can break the C2-C3 bond of diverse ketoses, generating acetyl-phosphate (AcP) as a product. This AcP can then be further converted to acetyl-CoA by phosphate acetyltransferase (PTA) [17]. According to the catalytic mechanism of PK, it is possible that ThDP could accept any ketose, forming a covalent intermediate that yields AcP and releases a free aldose (Fig 1) [15]. At the same time, isomerases enable the interconversion between aldoses and ketoses [18,19]. Therefore, we hypothesized that all ketoses could be completely converted to acetyl-CoA through multiple cycles of carbon cleavage by PK and isomerization (Fig 1). Inspired by these natural and hypothetical catalytic processes, we proposed the construction of the APK pathway to facilitate the production of stoichiometric quantities of acetyl-CoA from any sugar.

Results and discussion
Indeed, the PK gene originated very early in nature and is widely distributed among the 3 kingdoms of life, indicating its extensive application in carbon metabolism throughout evolutionary history (Figs 2A and S1). To assess the activity of PK on different sugars, we selected 23 candidate genes from different species based on the phylogenetic tree analysis (S2 Fig). However, due to significant evolutionary distances, only 11 out of 23 PK genes were successfully expressed in Escherichia coli. Among these 11 candidates, 7 PKs displayed activity not only on fructose-6-phosphate (F6P) or xylulose-5-phosphate (Xu5P) but also on short-chain ketoses, converting them into AcP (Figs 2B and S3-S5 and S1 Table and S1 Data). Among all of the studied genes, PK from Actinobacteria Bifidobacterium (BbPK) displayed relatively high activity on all tested ketoses. We used quantum-chemical analysis to [...]
To further enhance the activity of BbPK on short-chain sugars, we performed directed evolution experiments. Since the formation process of DHEThDP from glycolaldehyde (GALD) and dihydroxyacetone (DHA) differs [20], we independently screened mutants with improved activities using 2 different high-throughput screening methods for GALD and DHA (S13-S16 Figs and S1 Data). Considering that the catalytic center of BbPK forms at the interface of a homodimer [17], we targeted residues located at the active cavity and protein-dimer interface for saturation mutagenesis (Fig 3A). Through screening around 3,060 mutants (S13 and S15 Figs), we identified potential sites that showed remarkable improvement in catalytic activity. For GALD, the potential residues were P136, H142, S440, R524, and S739. For DHA, the potential residues were Q321, E520, G550, E735, and P136. Based on these potential activity-enhancing sites, we implemented random pairwise combinations and iterative saturation mutagenesis. After screening around 3,840 mutants (S14 and S16 Figs), the mutants exhibiting relatively higher activities were sequenced and characterized (Figs 3B and S17 and S5 Table and S1 Data). Surprisingly, the combination mutations did not yield improved activities, and the best mutants were all single-point mutants. The kinetic parameters revealed that PK-E520I and PK-Q321A displayed 2.3-fold and 5-fold higher catalytic efficiency for DHA compared to wild-type BbPK, respectively (S17 Fig and S5 Table). Additionally, PK-H142N showed 8.5-fold higher catalytic efficiency for glycolaldehyde and 3.6-fold higher catalytic [...]. Importantly, the 3 improved mutants exhibited smaller pocket volumes than wild-type BbPK, potentially increasing the likelihood of small substrates reaching a reactive state, which likely contributed to their enhanced activities (Fig 3C).
To validate the functionality of the APK pathway based on BbPK, we used the most abundant hexose, D-glucose, and pentose, D-xylose, as examples; these would be converted into F6P and Xu5P within the cells. Thus, F6P and Xu5P were selected as substrates for the study. BbPK catalyzed the conversion of F6P and Xu5P into AcP and D-erythrose-4-phosphate/glyceraldehyde-3-phosphate (E4P/G3P) [21]. Subsequently, 2 potential pathways for the further conversion of E4P/G3P into AcP were considered, taking into account the order of dephosphorylation and isomerization (S18 Fig). Considering the substrate promiscuity commonly observed in most phosphatases, it was deemed more efficient to reduce the number of phosphorylated intermediates in the APK pathway. Therefore, E4P was first dephosphorylated and then isomerized into D-erythrulose (D-EUS) (Fig 4A). On the other hand, if G3P were dephosphorylated first, there would be no available enzyme to facilitate the isomerization of glyceraldehyde to DHA [22]. Hence, we proposed converting G3P into DHAP and subsequently producing DHA through the action of phosphatase, followed by the reaction catalyzed by BbPK (Fig 4B).
Considering the presence of multiple phosphate intermediates in the APK pathway and the limited selectivity of most phosphatases, it was necessary to identify specific phosphatases with high selectivity for E4P and DHAP. Seven phosphatases from different species were screened for this purpose (S19 Fig). Among them, HAD-like hydrolase (EcHAD) from E. coli displayed the highest specificity for E4P, with activity 2-fold higher than that on F6P (Fig 4C and S1 Data). Additionally, sugar phosphatase from Candida parapsilosis (CpHAD) displayed the highest specificity for DHAP, with activity more than 2-fold higher than that on Xu5P and G3P (Fig 4C and S1 Data). In addition, L-rhamnose isomerase (PsLRhI) from Pseudomonas stutzeri was used to catalyze the isomerization of D-erythrose [23], and triose phosphate isomerase (EcTIM) from E. coli was used to isomerize G3P.
In addition, ketoses with an even number of carbon atoms could be completely converted by PK and isomerase into a number of AcP molecules equal to half their carbon number. However, ketoses with an odd number of carbon atoms would retain 1 molecule of formaldehyde at the end, resulting in carbon loss. To address this problem, we proposed the introduction of the formose reaction to condense formaldehyde into glycolaldehyde or DHA [20,24], which would then be further converted into AcP by BbPK (S20 Fig). The catalytic activities of BbPK on GALD and DHA were compared. Owing to the higher affinity of BbPK for DHA (S5 Table), formolase (FLS) was chosen to recycle formaldehyde in the system. Since AcP is unstable, we quantified AcP by converting it into acetic acid. With an input of 10 mM DHA, the [...]
With all the key enzymes available, we successfully assembled the APK pathway in vitro. For the conversion of F6P, the APK pathway included BbPK, EcHAD, and PsLRhI, while for Xu5P, it involved BbPK, EcTIM, CpHAD, and FLS (Fig 4A and 4B). Under optimal reaction conditions, the final yields of acetic acid from F6P and Xu5P reached 84% and 90%, respectively (Fig 4D and 4E and S1 Data), surpassing the 67% yield of the glycolytic pathway [13]. Metabolite analysis showed that BbPK rapidly cleaved F6P and Xu5P within the first 2 h (Fig 4D and 4E). However, the conversion of E4P or G3P to AcP was slow due to the low catalytic efficiency of BbPK on GALD, DHA, and D-EUS (S5 Table). After 10 h, only small amounts of GALD and D-EUS remained in the F6P reaction system, along with traces of formaldehyde and DHA in the Xu5P reaction system. Furthermore, we tested the APK pathway with C1, C2, C3, and C4 carbon sources in vitro (Figs 4F and S22 and S1 Data). The results demonstrated that the APK pathway achieved nearly stoichiometric synthesis of acetyl-CoA from all tested carbon sources.
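The carbon bookkeeping described above can be sketched in a few lines of Python. This is only an idealized illustration: the function names are hypothetical, every PK cleavage is assumed to go to completion, and formaldehyde recycling is modeled as the net conversion of 2 formaldehyde into 1 AcP (3 formaldehyde condensed to DHA, which PK cleaves into AcP plus 1 formaldehyde).

```python
from fractions import Fraction

# Glycolysis gives 2 acetyl-CoA (4 carbons) per glucose (6 carbons).
GLYCOLYSIS_YIELD = Fraction(4, 6)


def apk_acp_per_substrate(n_carbons: int, recycle_formaldehyde: bool = True) -> Fraction:
    """Idealized AcP molecules obtained from one substrate with n_carbons carbons.

    Each PK cleavage removes one C2 unit as AcP; an odd carbon number
    leaves a single formaldehyde (C1) at the end.
    """
    if n_carbons < 1:
        raise ValueError("n_carbons must be a positive integer")
    acp = Fraction(n_carbons // 2)
    leftover_c1 = n_carbons % 2
    if leftover_c1 and recycle_formaldehyde:
        # Assumed net recycling: 3 HCHO -> DHA -> AcP + HCHO, i.e. 2 HCHO -> 1 AcP.
        acp += Fraction(1, 2)
    return acp


def carbon_yield(n_carbons: int, recycle_formaldehyde: bool = True) -> Fraction:
    """Fraction of substrate carbon recovered in AcP (2 carbons per AcP)."""
    return 2 * apk_acp_per_substrate(n_carbons, recycle_formaldehyde) / n_carbons


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, float(carbon_yield(n, False)), float(carbon_yield(n, True)))
```

Under these assumptions, 10 mM F6P would give at most 30 mM AcP and 10 mM Xu5P at most 25 mM, so the reported yields of 84% and 90% would correspond to roughly 25.2 mM and 22.5 mM acetic acid, respectively.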
To demonstrate the potential of the APK pathway in vivo, we performed growth assays in an engineered E. coli strain using glycerol as the carbon source. The glpk gene was knocked out to prevent the production of glycerol 3-phosphate from glycerol, while the dhak gene was knocked out to prevent the formation of DHAP from DHA. The promoter Ptac was used to overexpress pk from the plasmid (S23 Fig). The results showed that the control strain MG1655ΔglpkΔdhak could not grow on glycerol medium due to the knockout of the related metabolic pathway. However, the growth of the strain MG1655ΔglpkΔdhak harboring the plasmid pBD-PK (SPK) was restored due to the conversion of glycerol to biomass through the APK pathway based on the catalytic activity of PK on DHA (S23 Fig and S1 Data). Although the difference in growth was only noticeable after 20 h due to the low activity of PK, and the growth of SPK was extremely weak, our results demonstrate the potential of the APK pathway and its future use in constructing cell factories.
The versatility of PK enables the conversion of various carbon sources to AcP and the subsequent generation of acetyl-CoA. However, the low activities of BbPK on short-chain ketoses pose a major limitation. In fact, the standard Gibbs free energy changes (ΔG') of the BbPK reactions with short-chain ketoses are all thermodynamically favorable, similar to those with Xu5P or F6P (S6 Table). The irreversible phosphorolytic cleavage of Xu5P or F6P by PK is both thermodynamically and kinetically favorable, which is one of the primary reasons why PK can shift the carbon flux and improve the carbon yield of acetyl-CoA derivatives in metabolic engineering [8,25-27]. Therefore, it should be possible to improve the catalytic efficiency of BbPK for short-chain ketoses and thereby enhance the ability of the APK pathway for in vivo applications in the future. Overall, the APK pathway provides a simple approach to carbohydrate metabolism, offering advantages not only for utilizing complex carbon sources but also in terms of atom economy compared to other pathways (S24 Fig and S7 Table). The design, construction, and testing of the APK pathway indicate the potential for using multiple carbon sources with higher efficiency in biomanufacturing in the future.

Quantum-chemical analysis for PK from Actinobacteria Bifidobacterium (BbPK)
The computational model was obtained using AlphaFold [28]. The complex of BbPK with ThDP and substrate was generated with PyMOL [29]. The model contains 218 atoms with a total charge of +1, including the side chains of His64, His553, Glu479, Tyr501, Gln321, Ser440, His142, Gly155, His320, Asn549, Gln546, His548, His97, and the cofactor ThDP. The glutamate (Glu479) was modeled in the protonated state, as is generally done, so that it forms a hydrogen bond with the N1' atom of ThDP. His97 was considered the most likely proton donor for the dehydration process, and it was modeled in the doubly protonated state. According to the interaction mode of the pocket residues, His64, His553, His320, His142, and His548 were modeled in their singly protonated states. The structural models of short-chain ketoses were obtained from PubChem.
All the calculations were performed using the Gaussian 09 package [30] and the B3LYP method. The 6-31G(d,p) basis set was used for the geometry optimizations, and the electronic energies of the stationary points were refined by single-point calculations with the 6-311++G(2d,2p) basis set. Solvation energies were calculated with the SMD [31] implicit solvent method and a dielectric constant of ε = 4. Previous studies have shown that the effect of solvation diminishes rapidly with the size of the active site model, rendering the particular value used for the dielectric constant less critical. The zero-point energy corrections were done at the same level of theory as the geometry optimizations.
As used in the cluster approach [32,33], a number of atoms were kept fixed at their crystallographic positions in the geometry optimizations (indicated by asterisks in S6 Fig). This coordinate-fixing protocol is very important to avoid large unrealistic movements of the various groups at the active site. This approach deals only with the chemical steps of the enzymatic reactions, implying that substrate binding and product release are usually not explicitly considered by the calculations. Therefore, an implicit assumption in the current model is that neither of these events is rate- or selectivity-determining.
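The text does not state how the three energy contributions are combined. As a sketch, assuming the additive scheme commonly used with cluster models (an assumption, not a statement of the authors' exact procedure), the reported energy of each stationary point would be:

```latex
E_{\mathrm{final}}
  = E_{\mathrm{SP}}\bigl[\mathrm{B3LYP/6\mbox{-}311{+}{+}G(2d,2p)}\bigr]
  + \Delta E_{\mathrm{solv}}\bigl[\mathrm{SMD},\ \varepsilon = 4\bigr]
  + E_{\mathrm{ZPE}}\bigl[\mathrm{B3LYP/6\mbox{-}31G(d,p)}\bigr]
```

Here E_SP is the large-basis single-point electronic energy, ΔE_solv is the solvation correction, and E_ZPE is the zero-point energy evaluated at the geometry-optimization level; the symbols are introduced only for this illustration.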

Molecular docking of substrate to BbPK
The molecular docking tool GNINA [34] was used to predict the potential interactions of different ligands (DHAP, Eu4P, Xu5P, F6P, D/L-EUS, and DHA) with PK from Actinobacteria Bifidobacterium. A structural model of BbPK was obtained using AlphaFold [28], and a 20-nanosecond molecular dynamics simulation was performed to optimize the side chain conformations. The refined PK structural model was treated as the receptor for docking. The structural models of the ligands and ThDP were obtained from PubChem. The binding site was determined by structural alignment to the crystal structure of a known phosphoketolase (PDB ID: 3AHE) [15]. The docking models were ranked by GNINA's CNN pose score. For each ligand, the top-ranked docking model was selected, and energy minimization was performed with OpenMM 7 [35] and the Amber14 [36] force field to optimize the complex model.

Phosphoketolase selection
To investigate the function of PK genes in distant evolutionary branches, we screened and analyzed all potential PKs in the NCBI database. First, we predicted all potential PKs by searching the nonredundant database with the Pfam domain ID PF03894 (hmmscan --cpu 10 --domtblout output.txt -E 1e-4 PF03894.hmm NR.fasta) [37]. Second, we retrieved all PKs from the KEGG database (https://www.genome.jp/entry/pf:xfp). Third, we ran a local blastp search using the PKs from NCBI as the query sequences and the PKs from KEGG as the BLAST database. After the blastp search, 12,185 potential PKs were selected using 3 criteria: the best hit is a D-xylulose 5-phosphate/D-fructose 6-phosphate phosphoketolase (XFP, EC: 4.1.2.9 4.1.2.22, KO: K01621), the identity is more than 40%, and the alignment length is more than 600. Fourth, all PKs were classified into 4,101 groups using OrthoMCL, with an amino acid identity of more than 90% within each group. For each group, we selected the PK gene that most closely approximates the supposed optimal sequence, that is, the sequence consisting of the highest-frequency residues in the multiple sequence alignment. Using a similar strategy, 23 PKs were selected with an identity threshold of 60%.
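The three blastp criteria can be illustrated with a short filter. This is a minimal sketch, assuming the search results were exported in BLAST tabular format 6 with the 12 default columns (the text does not say which output format was used); `filter_pk_hits` and `xfp_subjects` (the set of KEGG subject IDs annotated as XFP) are hypothetical names.

```python
import csv
import io


def filter_pk_hits(blast_tsv, xfp_subjects, min_identity=40.0, min_length=600):
    """Return sorted query IDs whose best hit passes the three criteria.

    Assumes BLAST tabular format 6 with default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}  # query ID -> (bitscore, subject ID, percent identity, alignment length)
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue
        qseqid, sseqid = row[0], row[1]
        pident, length, bitscore = float(row[2]), int(row[3]), float(row[11])
        # Keep only the highest-scoring hit for each query.
        if qseqid not in best or bitscore > best[qseqid][0]:
            best[qseqid] = (bitscore, sseqid, pident, length)
    return sorted(
        query
        for query, (_, subject, pident, length) in best.items()
        if subject in xfp_subjects and pident > min_identity and length > min_length
    )
```

Taking the best hit by bit score is one reasonable reading of "best hit"; ranking by E-value instead would be an equally plausible choice.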

Protein engineering of BbPK
To obtain full mutation coverage, oligonucleotide primers were designed with the degenerate codon NNK, which covers almost all mutations with only 96 clones. Hence, a total of 3,264 clones were screened across 34 single-site saturation mutagenesis libraries. Each single-site saturation mutant library was generated by PCR. The template in the PCR product was digested with DpnI restriction endonuclease, and the product was then transformed into E. coli BL21 (DE3) competent cells for library construction. Each colony was incubated in 200 μL of LB medium at 37˚C for 24 h to reach plateau and then transferred to 1 mL of the same medium for protein expression. The cells, which were induced with IPTG (isopropyl-β-D-thiogalactopyranoside) and cultured overnight at 16˚C, were harvested by centrifugation. The bacterial pellet was washed and resuspended in reaction buffer (50 mM potassium phosphate buffer (pH 7.4) containing 5 mM GALD or 20 mM DHA). After 3 h of reaction at 37˚C, the supernatant was collected by centrifugation for detection of substrate or product.
For GALD, we determined the activity of the mutants by detecting the reduction of substrate. The detection method was as follows: 120 μL of chromogenic reagent was added to 60 μL of sample, and the mixture was heated at 90˚C for 15 min. Subsequently, the residual substrate concentration was measured spectrophotometrically at 650 nm. The chromogenic reagent was prepared by dissolving 1.5 g of diphenylamine in 100 mL of acetic acid and then adding 1.5 mL of pure sulfuric acid.
For DHA, we screened for highly active mutants by detecting the product formaldehyde. The formaldehyde detection method was as follows: 40 μL of sample was mixed with 160 μL of chromogenic reagent and then heated at 60˚C for 10 min. Subsequently, formaldehyde production was measured spectrophotometrically at 440 nm. The chromogenic reagent (100 mL, pH 6.0) contained 25 g of ammonium acetate, 3 mL of acetic acid, and 0.25 mL of acetylacetone solution.
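The claim that 96 clones cover almost all NNK variants can be checked with a short calculation. The sketch below assumes each clone carries one of the 32 NNK codons drawn uniformly and independently, which ignores primer bias and wild-type carry-over; the function names are hypothetical.

```python
# NNK: N = A/C/G/T at positions 1 and 2, K = G/T at position 3.
NNK_CODONS = 4 * 4 * 2  # 32 codons, encoding all 20 amino acids plus one stop codon


def codon_coverage(n_clones: int, n_codons: int = NNK_CODONS) -> float:
    """Probability that a given codon is sampled at least once among n_clones."""
    return 1.0 - (1.0 - 1.0 / n_codons) ** n_clones


def total_clones(n_libraries: int, clones_per_library: int = 96) -> int:
    """Total clones screened across all single-site libraries."""
    return n_libraries * clones_per_library


if __name__ == "__main__":
    print(f"coverage per codon with 96 clones: {codon_coverage(96):.3f}")
    print(f"clones for 34 libraries: {total_clones(34)}")
```

Under this idealized model, each codon has about a 95% chance of appearing in a 96-clone plate, and 34 libraries of 96 clones give the 3,264 clones stated above.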

Activity assay and kinetic properties of PK and mutants
The standard reaction mixture (100 μL) contained 50 mM potassium phosphate buffer (pH 7.5), 5 mM MgSO4, 1 mM ThDP, 10 mM GALD (or DHA or D-EUS), 1 mM ADP, 0.2 mg mL−1 AckA, 5 U hexokinase, 2.5 U glucose-6-phosphate dehydrogenase, 1 mM NADP+, and 10 mM glucose. Approximately 0.5 mg mL−1 PK was added to the reaction system. The reactions were conducted at 37˚C. NADPH was detected spectrophotometrically at 340 nm. Enzyme kinetics with GALD (or DHA or D-EUS) as substrate were determined in assays with substrate concentrations of 0-110 mM. The kinetic parameters kcat and Km were determined by measuring the initial velocities of the enzymatic reaction and curve-fitting according to the Michaelis-Menten equation, using GraphPad Prism 5 software.
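The fitting itself was done in GraphPad Prism; the sketch below is only a stand-in that shows what a least-squares Michaelis-Menten fit involves. It uses a swapped-in technique, not Prism's algorithm: for each trial Km the best Vmax has a closed form, and Km is then located by a golden-section search on a log scale. The function names and search bounds are hypothetical.

```python
import math


def _profile_sse(km, substrate, velocity):
    """Best-fit Vmax and squared error for a fixed Km (v = Vmax * S / (Km + S))."""
    x = [s / (km + s) for s in substrate]
    vmax = sum(v * xi for v, xi in zip(velocity, x)) / sum(xi * xi for xi in x)
    sse = sum((v - vmax * xi) ** 2 for v, xi in zip(velocity, x))
    return sse, vmax


def fit_michaelis_menten(substrate, velocity, km_lo=1e-3, km_hi=1e4, iters=200):
    """Return (Km, Vmax) minimizing squared error; Km in the units of substrate."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(km_lo), math.log(km_hi)
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        # Keep the sub-interval that contains the lower error.
        if _profile_sse(math.exp(c), substrate, velocity)[0] < _profile_sse(math.exp(d), substrate, velocity)[0]:
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2.0)
    return km, _profile_sse(km, substrate, velocity)[1]
```

With Vmax expressed per amount of enzyme, kcat follows by dividing Vmax by the molar enzyme concentration. The search assumes the error has a single minimum in Km over the chosen bounds, which typically holds for well-sampled saturation curves.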

Enzyme activity of BbPK for DHA, L-EUS, and D-EUS
The coding gene of phosphoketolase from Actinobacteria Bifidobacterium was ligated into the expression vector pET-28a via NdeI and XhoI restriction sites. E. coli BL21(DE3) cells carrying the recombinant plasmid were inoculated into 5 mL LB (Luria Broth) medium with kanamycin (100 μg mL−1) and cultured overnight at 37˚C, and then scaled up to 800 mL 2YT medium (16 g L−1 tryptone, 10 g L−1 yeast extract, 5 g L−1 NaCl) containing kanamycin (100 μg mL−1). Gene expression was induced by adding IPTG to a final concentration of 0.5 mM when the OD600 reached 0.6. The cell cultures were grown overnight at 16˚C, harvested by centrifugation at 6,000×g, and resuspended in 50 mL lysis buffer (50 mM potassium phosphate buffer (pH 7.4), 5 mM MgSO4, 0.5 mM ThDP). The resuspended cells were lysed using a high-pressure homogenizer (JNBIO, China), and the cell debris was removed by centrifugation at 10,000×g for 60 min at 4˚C. The soluble protein sample was loaded onto a nickel affinity column (GE Healthcare), which was rinsed with 50 mL wash buffer (50 mM potassium phosphate buffer (pH 7.4), 5 mM MgSO4, 0.5 mM ThDP, and 50 mM imidazole) and then eluted with 20 mL elution buffer (50 mM potassium phosphate buffer (pH 7.4), 5 mM MgSO4, 0.5 mM ThDP, and 200 mM imidazole). The eluted protein was concentrated and dialyzed against lysis buffer (50 mM potassium phosphate buffer (pH 7.4), 5 mM MgSO4, and 0.5 mM ThDP) by ultrafiltration with an Amicon Ultra centrifugal filter device (Millipore, USA) with a 30 kDa molecular-weight cutoff. The protein concentration was determined using a BCA Protein Assay Reagent Kit (Pierce, USA) with BSA as the standard.
The activity of PKs on DHA was determined with 1 mg mL−1 purified recombinant protein in 200 μL reaction mixtures. The reaction system comprised 10 mM DHA, 1 mM ThDP, 5 mM MgSO4, and 50 mM phosphate buffer (pH 7.4). After incubation at 37˚C for 0.5 h, the reaction was stopped by adding 200 μL of acetonitrile. Acetyl-phosphate in the samples was determined by the chromogenic method and liquid chromatography, and the by-product formaldehyde was confirmed by GC-MS.
Formaldehyde detection by GC-MS. Sample derivatization: 100 μL of 2,4-dinitrophenylhydrazine solution was added to 100 μL of sample, and the mixture was incubated at 60˚C for 60 min in the dark. Then, 400 μL of n-hexane was added, and the mixture was centrifuged at 5,000×g for 2 min. The upper phase was separated, an appropriate amount of anhydrous sodium sulfate powder was added, and the mixture was centrifuged at 10,000×g for 10 min. The upper phase was then separated and analyzed by GC-MS. GC-MS conditions: Electron ionization (EI) GC-MS analyses were performed with a model 7890A GC (Agilent) with a DB-5 fused silica capillary column (30 cm length, 0.25 mm inner diameter, 0.25 μm film thickness) coupled to an Agilent 7200 Q-TOF mass selective detector. Injections were performed by a model 7683B autosampler. The GC oven was programmed from 160˚C (held for 1 min) to 240˚C at 10˚C min−1, and then held for 5 min; the injection port temperature was 250˚C, and the transfer line temperature was 280˚C. The carrier gas, ultra-high purity helium, flowed at a constant rate of 1.2 mL min−1. For full-scan data acquisition, the MS scanned from 35 to 550 atomic mass units. Data analysis for GC-MS was performed with Mass Hunter software (Agilent, USA) and the NIST Database.
The activity of PKs on L-EUS and D-EUS was determined with 1 mg mL−1 purified recombinant protein in 200 μL reaction mixtures. The reaction system comprised 10 mM L-EUS or D-EUS, 1 mM ThDP, 5 mM MgSO4, and 50 mM phosphate buffer (pH 7.4). After incubation at 37˚C for 0.5 h, the reaction was stopped by adding 200 μL of acetonitrile. AcP in the samples was detected by the chromogenic method and liquid chromatography, as described for DHA. The by-product glycolaldehyde was confirmed by GC-MS.
Glycolaldehyde detection by GC-MS. Samples were freeze-dried, then 60 μL of 0.2 M PFBOA solution was added and mixed, and the reaction was performed at 30˚C for 90 min. A total of 300 μL of n-hexane was added, and the mixture was centrifuged at 5,000×g for 2 min. The upper phase was separated, an appropriate amount of anhydrous sodium sulfate powder was added, and the mixture was centrifuged at 10,000×g for 10 min. Then, 100 μL of the upper phase was separated, 30 μL of N-methyl-N-(trimethylsilyl)-trifluoroacetamide with 1% trimethylchlorosilane was added and mixed, and the reaction was conducted at 30˚C for 30 min. The product was detected by GC-MS. GC-MS conditions: EI GC-MS analyses were performed with a model 7890A GC (Agilent) with a DB-5 fused silica capillary column (30 cm length, 0.25 mm inner diameter, 0.25 μm film thickness) coupled to an Agilent 7200 Q-TOF mass selective detector. Injections were performed by a model 7683B autosampler. The GC oven was programmed from 60˚C (held for 1 min) to 100˚C at 5˚C min−1, then to 300˚C at 25˚C min−1, and then held for 5 min; the injection port temperature was 250˚C, and the transfer line temperature was 280˚C. The carrier gas, ultra-high purity helium, flowed at a constant rate of 1.2 mL min−1. For full-scan data acquisition, the MS scanned from 35 to 550 atomic mass units. Data analysis for GC-MS was performed with Mass Hunter software (Agilent, USA) and the NIST Database.
The activity of PKs on E4P, D-ETS, D-GCD, and DHAP was determined with 1 mg mL−1 of purified recombinant protein in 200 μL reaction mixtures. The reaction system included substrate (10 mM E4P, D-ETS, D-GCD, or DHAP), 1 mM ThDP, 5 mM MgSO4, and 50 mM phosphate buffer (pH 7.4). After incubation at 37˚C for 0.5 h, the reaction was stopped by adding 200 μL of acetonitrile. AcP in the samples was detected by the chromogenic method and liquid chromatography, as described for DHA.
The activity of PKs on erythrulose-4-phosphate (Eu4P) was determined with 1 mg mL−1 purified recombinant protein in 200 μL reaction mixtures. The reaction system included 10 mM Eu4P, 0.5 mg mL−1 RpiB, 1 mM ThDP, 5 mM MgSO4, and 50 mM phosphate buffer (pH 7.4). After incubation at 37˚C for 0.5 h, the reaction was stopped by adding 200 μL of acetonitrile. AcP in the samples was detected by the chromogenic method and liquid chromatography, as described for DHA.
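The total run time implied by the two oven programs (formaldehyde and glycolaldehyde methods) can be worked out from the stated holds and ramps. The helper below is a hypothetical illustration that ignores solvent delay and oven cool-down between injections.

```python
def oven_run_time(start_temp, initial_hold, ramps, final_hold):
    """Total oven program time in minutes.

    ramps is a list of (target_temp_C, rate_C_per_min) steps applied in order.
    """
    total = float(initial_hold)
    temp = start_temp
    for target, rate in ramps:
        total += abs(target - temp) / rate  # time spent on this ramp
        temp = target
    return total + final_hold


if __name__ == "__main__":
    # Formaldehyde method: 160 C (1 min) -> 240 C at 10 C/min, hold 5 min.
    print(oven_run_time(160, 1, [(240, 10)], 5))
    # Glycolaldehyde method: 60 C (1 min) -> 100 C at 5 C/min -> 300 C at 25 C/min, hold 5 min.
    print(oven_run_time(60, 1, [(100, 5), (300, 25)], 5))
```

Under this reading, the formaldehyde program takes 14 min and the glycolaldehyde program 22 min per injection.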

Expression, purification, and enzyme kinetics of PKs from different species
The PK-coding genes from different species were ligated into the expression vector pET-28a via NdeI and XhoI restriction sites. All genes were expressed in BL21 (DE3) and purified on a Ni-NTA column. Large-scale purification (800 mL) typically produced about 50 mg of enzyme. The protein concentration was determined using the BCA Protein Assay Reagent Kit (Pierce, USA) with BSA as the standard.
Various PKs (0.25 mg mL−1) from different species were added to the reaction system. The reactions were conducted at 37˚C. The production of NADPH was detected at 340 nm. Enzyme kinetics with DHA, L-EUS, and D-EUS as substrates were determined in assays with substrate concentrations of 0-110 mM. Kinetic parameters were determined from triplicate experiments using GraphPad Prism 5 (GraphPad Software, USA).
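In the coupled assay described under "Activity assay and kinetic properties of PK and mutants", each AcP yields one ATP (AckA), one glucose-6-phosphate (hexokinase), and hence one NADPH (glucose-6-phosphate dehydrogenase), so the rate of absorbance increase at 340 nm reports the PK rate. A minimal sketch of the conversion follows, using the standard NADPH extinction coefficient of 6.22 mM−1 cm−1 at 340 nm; the path length is an assumption that must be measured for the plate or cuvette actually used, and the function name is hypothetical.

```python
NADPH_EXT_COEFF_MM = 6.22  # mM^-1 cm^-1 at 340 nm


def specific_activity(delta_a340_per_min, enzyme_mg_per_ml, path_cm=1.0):
    """Specific activity in U/mg (micromol NADPH formed per min per mg enzyme).

    path_cm is an assumed optical path length; 1 cm applies to a standard
    cuvette, not to a 100 microliter microplate well.
    """
    # Beer-Lambert: mM per min equals micromol per mL per min.
    rate_mm_per_min = delta_a340_per_min / (NADPH_EXT_COEFF_MM * path_cm)
    return rate_mm_per_min / enzyme_mg_per_ml


if __name__ == "__main__":
    # Hypothetical slope of 0.0622 absorbance units per min with 0.25 mg/mL PK.
    print(specific_activity(0.0622, 0.25))
```

The slope used here is illustrative only; it is not taken from the study's data.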

Expression, purification, and enzyme kinetics of phosphatases from different species
The phosphatase-coding genes from different species were ligated into the expression vector pET-28a via NdeI and XhoI restriction sites. All genes were expressed in BL21 (DE3) and purified on a Ni-NTA column. Large-scale purification (800 mL) typically produced about 5 to 50 mg of enzyme. The protein concentration was determined using a BCA Protein Assay Reagent Kit (Pierce, USA) with BSA as the standard.
The kinetics of phosphatases on F6P, Xu5P, E4P, G3P, and DHAP were determined by monitoring the production of fructose, xylulose, ETS, GCD, and DHA by HPLC. The standard reaction mixture (100 μL) contained 50 mM potassium phosphate buffer (pH 7.5) and 5 mM F6P (or Xu5P, E4P, G3P, or DHAP). Various phosphatases (0.1 mg mL−1) from different species were added to the reaction system. The reactions were conducted at 37˚C for 0.5 to 2 h, terminated by adding 100 μL of acetonitrile, and then analyzed by HPLC.
Acetic acid and DHA were detected as described above. Formaldehyde was detected by the chromogenic method as follows: the sample was diluted to an appropriate concentration (0.1 to 1 mM), and 80 μL of chromogenic solution was added to 120 μL of the diluted sample. The reaction was conducted at 60˚C for 10 min, and the mixture was centrifuged at 12,000×g for 5 min. The absorbance of 150 μL of the supernatant was measured at 414 nm, and the concentration of formaldehyde in the sample was calculated from the standard curve. The chromogenic solution was prepared as follows: 250 g of ammonium acetate was dissolved in 900 mL of ddH2O, then 30 mL of acetic acid and 2.5 mL of acetylacetone were added, the pH was adjusted to 6 with acetic acid, and ddH2O was added to a final volume of 1 L.
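The standard-curve step can be sketched as an ordinary least-squares line followed by back-calculation with the dilution factor. The function names and the standards in the usage example are hypothetical and are not data from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def formaldehyde_conc(absorbance, slope, intercept, dilution_factor=1.0):
    """Formaldehyde concentration (mM) in the undiluted sample."""
    return (absorbance - intercept) / slope * dilution_factor


if __name__ == "__main__":
    # Hypothetical standards spanning the 0.1 to 1 mM working range.
    standards_mm = [0.1, 0.25, 0.5, 0.75, 1.0]
    a414 = [0.13, 0.25, 0.45, 0.65, 0.85]
    slope, intercept = fit_line(standards_mm, a414)
    print(formaldehyde_conc(0.45, slope, intercept, dilution_factor=10))
```

Readings outside the range of the standards should be re-diluted rather than extrapolated, which is why the protocol dilutes samples to 0.1 to 1 mM first.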

Process analysis of the APK pathway for F6P
The 1 mL reaction system contained 50 mM potassium phosphate buffer (pH 7.5), 5 mM MgSO4, 2 mg mL−1 PK, 0.5 mg mL−1 EcHAD, 1 mg mL−1 PsLRhI, 1 mM ThDP, and 10 mM F6P. The reactions were conducted at 37˚C for 10 h, and samples were taken every 2 h for analysis. The reactions were terminated by heating at 95˚C for 5 min and then cooled to 37˚C. Then, 5 U of alkaline phosphatase was added, the mixture was incubated at 37˚C for 4 h, and an equal volume of acetonitrile was used to terminate the reaction. Fructose, D-ETS, D-EUS, GALD, and acetic acid were detected by HPLC.

Process analysis of the APK pathway for Xu5P
The 1 mL reaction system contained 50 mM potassium phosphate buffer (pH 7.5), 5 mM MgSO4, 2 mg mL−1 PK, 1 mg mL−1 CpHAD, 0.1 mg mL−1 TIM, 2 mg mL−1 FLS, 1 mM ThDP, and 10 mM Xu5P. The reactions were conducted at 37˚C for 10 h, and samples were taken every 2 h for analysis. The reactions were terminated by heating at 95˚C for 5 min and then cooled to 37˚C. Then, 5 U of alkaline phosphatase was added, the mixture was incubated at 37˚C for 4 h, and an equal volume of acetonitrile was used to terminate the reaction. D-xylulose, GCD, DHA, and acetic acid were detected by HPLC. Formaldehyde was detected by the chromogenic method as before.

Fig 2. The distribution and new functions of PKs. (A) The distribution of PKs across the tree of life, as referenced in Jillian F. Banfield's study [38]. The blue color in each pie chart represents the species with PKs among all genome-sequenced species. (B) The catalytic activity of PKs from different species on different substrates. The 7 PKs displayed not only exhibited activity on F6P or Xu5P but also demonstrated the ability to convert short-chain ketoses into AcP. The corresponding table on the right represents the catalytic activity of these 7 candidate PKs on 6 classes of ketose or ketose phosphate. Each color represents a specific enzyme activity (U/mg). Detailed catalytic activity data are shown in S1 Table. The raw data are listed in S1 Data. AcP, acetyl-phosphate; F6P, fructose-6-phosphate; PK, phosphoketolase; Xu5P, xylulose-5-phosphate. https://doi.org/10.1371/journal.pbio.3002285.g002

Fig 3. The directed evolution of BbPK. (A) The complex structure model of BbPK and GALD. The binding pockets of the substrates are situated at the interface of the BbPK homodimer. The 2 chains of the BbPK homodimer are distinguished by their white and orange colors. Residues comprising the pockets and key interfacial residues near the pockets are represented by green and yellow spheres based on their chain IDs. These residues are labeled with their chain ID and residue ID. (B) The relative activity of wild-type BbPK and beneficial mutants on GALD, DHA, and D-EUS. Relative activity was determined by calculating the ratio of the substrate conversion rate of each mutant to that of the wild type. The raw data are listed in S1 Data. (C) The pocket volumes of WT, H142N, E521I, and Q321A. Pink dots represent the volumes of the binding pockets, which were measured as 366.6, 269.3, 254.5, and 346.6 Å3, respectively. The structures of the mutants were obtained through AlphaFold and further subjected to 50 ns molecular dynamics simulations. The figures were generated using PyMOL software version 2.3.0, while the pockets were calculated using POVME 3.0. D-EUS, D-erythrulose; DHA, dihydroxyacetone; GALD, glycolaldehyde. https://doi.org/10.1371/journal.pbio.3002285.g003