
Fig 1.

Sequence-based analysis identifies genetic adaptations unique to Calvin cycle-containing genomes.

Bacterial and archaeal genomes from the Genome Taxonomy Database (GTDB) were subjected to a Hidden Markov Model-based homology search for Prk and Rubisco, which identified Calvin cycle-positive genomes. The Calvin cycle-positive genomes were contrasted as a collective against their closest relatives to identify genes, i.e. Enzyme Commission (EC) numbers and Pfams, associated with the Calvin cycle via three statistical comparison methods. First, enrichment identified genes that were generally depleted or enriched in Calvin cycle-positive genomes using a Wilcoxon rank sum test. Second, phylogenetics-based ancestral character estimation was used on subtrees in order to correlate the emergence of the Calvin cycle to other genes. Third, a random forest machine learning algorithm was employed to distinguish between Calvin cycle-positive and Calvin cycle-negative genomes based on other genes that were thereby ranked according to their importance in the classification task.
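The enrichment step can be illustrated with a small sketch. It assumes that, for one gene, the per-genome copy numbers in Calvin cycle-positive and Calvin cycle-negative genomes are compared with a two-sided Wilcoxon rank sum test using a tie-corrected normal approximation without continuity correction. The function names and copy numbers below are hypothetical and are not taken from the study's pipeline.

```python
import math


def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(pos, neg):
    """Return (U statistic for pos, two-sided p value) via normal approximation."""
    n1, n2 = len(pos), len(neg)
    combined = list(pos) + list(neg)
    ranks = midranks(combined)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    n = n1 + n2
    # Tie correction: sum of t^3 - t over groups of tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical, no evidence of a shift
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical copy numbers of one gene in 8 positive and 8 negative genomes.
cbb_pos = [2, 3, 2, 4, 3, 2, 3, 4]
cbb_neg = [0, 1, 0, 1, 1, 0, 2, 1]
u_stat, p_value = rank_sum_test(cbb_pos, cbb_neg)
direction = "enriched" if u_stat > len(cbb_pos) * len(cbb_neg) / 2 else "depleted"
print(direction, p_value)
```

A U statistic above n1·n2/2 indicates higher copy numbers in the positive group (enrichment), a value below it indicates depletion. In a genome-wide screen, the p values from all genes would additionally need a multiple-testing correction before ranking.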


Fig 2.

The Calvin cycle is present in a diverse range of Bacteria and Archaea.

Bars display the taxonomic distribution of 1,020 CBB-positive (+) and 1,020 CBB-negative (-) genomes analyzed in this study. Bars are grouped by phylum, or class for Proteobacteria. The orders with most members have separate bars, while other organisms are aggregated under the “Other” labels.


Fig 3.

Three methods for identifying and ranking the importance of genes distinguishing Calvin cycle-positive genomes from relatives cooperate to yield a consensus ranking.

The rank of genes, i.e. Enzyme Commission numbers (EC) or Pfam families, within each method (x-axes, logarithmic scale) is plotted against the consensus rank from all three methods (y-axis, logarithmic scale). The orange color intensity (square root scale) indicates the median distance between the gene and Rubisco in number of genes in CBB-positive genomes (S6 Dataset), if the gene was found on the same DNA strand as Rubisco more than 200 times. Genes detected 200 times or fewer on the same DNA strand as Rubisco are shown in light purple. Note that ECs and Pfams were ranked separately in the random forest analysis and thereby each rank is shared by one EC and one Pfam. The random forest analysis included only 1,200 genes due to so-called feature selection preceding ranking (see Materials and methods). The gap between ranks 4,714 and 6,824 in the enrichment analysis is due to 2,110 genes sharing the same q value used for ranking (S2 Dataset). Abbreviations: Ald, fructose-bisphosphate aldolase (EC 4.1.2.13, PF01116); ATPsyn, ATP synthase (PF02823); CbbQ, Rubisco activase CbbQ (PF08406); CbbX, Rubisco activase CbbX (PF17866, “AAA_lid_6”); Fbp, fructose-1,6-bisphosphatase (EC 3.1.3.11, PF00316); GP, glycogen phosphorylase (EC 2.4.1.1); Mdh, malate dehydrogenase (EC 1.1.5.4); Rpe, ribulose-phosphate 3-epimerase (EC 5.1.3.1, PF00834); Tkt, transketolase (EC 2.2.1.1).


Table 1.

Consensus rank of genetic adaptations to the Calvin cycle.


Fig 4.

Tracing the evolution of Calvin cycle genome integration.

The panels show likelihood of ancestral Calvin cycle presence (node fill color; cyan indicates CBB-positive and brown indicates CBB-negative) in bacterial subtrees (A), positive correlation (Spearman r ≈ 0.45) to ancestral gene copy numbers (line color) of fructose-1,6-bisphosphatase (Fbp; EC 3.1.3.11) in Archaea (B), and strong negative correlation (Spearman r ≈ -0.85) to ancestral gene copy numbers (line color) of transcriptional regulator AraC (PF06719) in bacterial subtree 1 (C). Each leaf node (triangles) is one contemporary genome. Outer rings indicate genome taxonomic association. Scale bars show substitutions per site. Asterisks (*) indicate archaeal genomes encoding Rubisco activase CbbQ (PF08406). CbbQ is negatively correlated to the Calvin cycle in Archaea (r ≈ -0.62), which is explained by the fact that most of the archaeal genomes with CbbQ are CBB-negative (brown).
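The Spearman correlations quoted in this caption relate, across tree nodes, the likelihood of ancestral Calvin cycle presence to ancestral gene copy numbers. A minimal sketch of how such a coefficient can be computed is given below, assuming it is taken as the Pearson correlation of tie-averaged ranks. The node values are hypothetical and the helper names are illustrative only.

```python
import math


def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the midranks."""
    return pearson(midranks(x), midranks(y))


# Hypothetical per-node values: likelihood of Calvin cycle presence and
# estimated ancestral copy number of a regulator gene.
cbb_likelihood = [0.1, 0.2, 0.5, 0.8, 0.9]
gene_copies = [3, 3, 2, 1, 0]
r = spearman(cbb_likelihood, gene_copies)
print(round(r, 3))
```

A strongly negative coefficient, as in this toy input, corresponds to the pattern described for AraC in panel C: copy numbers fall as the likelihood of Calvin cycle presence rises.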


Fig 5.

Central carbon metabolism and the pentose phosphate pathway represent a hotspot for Calvin cycle adaptations.

Color indicates the consensus rank of enzymes on a logarithmic scale. Points above the color scale bar represent the consensus rank of individual enzymes. Line thickness indicates whether the enzyme-encoding genes were enriched or depleted in CBB-positive genomes. Dashed lines indicate that the enzyme was not detected or that it was removed because it was Prk or Rubisco (see Materials and methods). Co-factors and small molecules such as CO2 have been omitted from most reactions. Arrows are used where enzymes that mainly catalyze specific directions rank differently. Special characters indicate pyrophosphate-dependent phosphofructo-1-kinase (*) and 2,3-bisphosphoglycerate-dependent phosphoglycerate mutase (†). The map is based on relevant subsystems of KEGG’s central carbon metabolism map (map01200) and related maps. The logarithms of the consensus ranks for Enzyme Commission (EC) numbers were normalized to the range 0 to 1 and encoded as color. We also encoded significant EC enrichment or depletion in CBB-positive genomes as different colors. The EC-to-color tables were submitted to KEGG’s pathway mapping tool (https://www.genome.jp/kegg/tool/map_pathway2.html) to yield annotated maps that were then used as templates for drawing the figure. Note that the reaction SBP to S7P is represented by EC 3.1.3.11, rather than the eukaryotic SBPase EC 3.1.3.37, assuming that EC 3.1.3.11 represents bifunctional F/SBPase (Fbp). Also note that when multiple ECs mapped to the same reaction, only the best ranking EC color was used, unless special patterns of interest were present, e.g. 3PG to 2PG (†).
Abbreviations: 2OG, 2-oxoglutarate; 2PG, 2-phosphoglycerate; 3HP, 3-hydroxypropionate; 3PG, 3-phosphoglycerate; AC, acetate; ACAH, acetaldehyde; AC-CoA, acetyl-CoA; ACP, acetyl phosphate; Ald, fructose-bisphosphate aldolase; BPG, 1,3-bisphosphoglycerate; CIT, citrate; CM-CoA, citramalyl-CoA; Cyt c, cytochrome c; DCHB, dicarboxylate-hydroxybutyrate; DD-Gn6P, 2-dehydro-3-deoxy-gluconate-6-phosphate; DHAP, dihydroxyacetone phosphate; DHFUM, dihydroxyfumarate; E4P, erythrose-4-phosphate; Eda, 2-dehydro-3-deoxy-phosphogluconate aldolase; Edd, 6-phosphogluconate dehydratase; EtOH, ethanol; F6P, fructose-6-phosphate; FBP, fructose-1,6-bisphosphate; Fbp, fructose 1,6-bisphosphate phosphatase; FUM, fumarate; G1P, glucose-1-phosphate; G6P, glucose-6-phosphate; G, glycerate; GAP, glyceraldehyde-3-phosphate; GLX, glyoxylate; Gly, glycine; GLYC, glycolate; Gnd, 6-phosphogluconate dehydrogenase; Gn6P, gluconate-6-phosphate; GnL6P, glucono-1,5-lactone 6-phosphate; HPHB, hydroxypropionate-hydroxybutyrate; HPYR, hydroxypyruvate; Hu6P, arabino-3-hexulose-6-phosphate; ICIT, isocitrate; LAC, lactate; MAL, malate; MAL-CoA, malyl-CoA; MM-CoA, methylmalonyl-CoA; m-TAR, meso-tartrate; OA, oxaloacetate; OGLYC, oxaloglycolate; PEP, phosphoenolpyruvate; PEPC, phosphoenolpyruvate carboxylase; PEPK, phosphoenolpyruvate carboxykinase; Pgl, 6-phosphogluconolactonase; PGLYC, phosphoglycolate; PRPP, 5-phosphoribosyl 1-pyrophosphate; PYR, pyruvate; R1P, ribose-1-phosphate; R5P, ribose-5-phosphate; Ru5P, ribulose-5-phosphate; RuBP, ribulose-1,5-bisphosphate; S7P, sedoheptulose-7-phosphate; SBP, sedoheptulose-1,7-bisphosphate; Ser, serine; SSA, succinate semialdehyde; SUCC, succinate; SUCC-CoA, succinyl-CoA; TAR, tartrate; TARS, tartronate semialdehyde; TCA, tri-carboxylic acid; Tkt, transketolase; Xfpk, phosphoketolase; Xu5P, xylulose-5-phosphate; Zwf, glucose-6-phosphate dehydrogenase.
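The rank-to-color encoding described in this caption can be sketched as follows. The sketch assumes base-10 logarithms, min-max normalization to the range 0 to 1, and a linear blend between two RGB colors. The placeholder identifiers, ranks, and color endpoints are hypothetical, and the exact input format expected by the KEGG mapping tool is not reproduced here.

```python
import math


def log_rank_to_unit(ranks):
    """Map each rank to [0, 1] by min-max normalizing log10(rank)."""
    logs = {key: math.log10(rank) for key, rank in ranks.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    return {key: 0.0 if span == 0 else (v - lo) / span for key, v in logs.items()}


def unit_to_hex(t, low=(255, 245, 235), high=(127, 39, 4)):
    """Blend linearly between two RGB colors; the endpoints are arbitrary choices."""
    channels = [round(a + (b - a) * t) for a, b in zip(low, high)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Hypothetical consensus ranks for three placeholder EC identifiers.
consensus_ranks = {"ec_a": 1, "ec_b": 10, "ec_c": 100}
scaled = log_rank_to_unit(consensus_ranks)
ec_to_color = {key: unit_to_hex(t) for key, t in scaled.items()}
for key, color in ec_to_color.items():
    print(key, color)
```

Working on the logarithm of the rank spreads the top-ranked enzymes across most of the color range, so differences between ranks 1, 10, and 100 remain visible instead of being compressed next to thousands of low-ranked entries.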


Fig 6.

Calvin cycle-positive organisms avoid metabolite-level regulation that may disturb cycle function.

The enzyme arabinose-5-phosphate isomerase (Api; EC 5.3.1.13) was negatively correlated with the Calvin cycle (Spearman r ≈ -0.61) in subtree 1 (A), illustrated by likelihood of ancestral Calvin cycle presence (node fill color) and ancestral Api gene copy numbers (line color). The scale bar (A) shows substitutions per site. Api interferes with Calvin cycle operation (B) by converting ribulose-5-phosphate to arabinose-5-phosphate (A5P). A5P inhibits (-) transaldolase (GAP and S7P to F6P and E4P) and ribose-5-phosphate isomerase (R5P to Ru5P). The map is based on KEGG’s central carbon metabolism map (map01200). Abbreviations: 3PG, 3-phosphoglycerate; A5P, arabinose-5-phosphate; Api, arabinose-5-phosphate isomerase; BPG, 1,3-bisphosphoglycerate; DHAP, dihydroxyacetone phosphate; E4P, erythrose-4-phosphate; F6P, fructose-6-phosphate; FBP, fructose-1,6-bisphosphate; GAP, glyceraldehyde-3-phosphate; R5P, ribose-5-phosphate; Ru5P, ribulose-5-phosphate; RuBP, ribulose-1,5-bisphosphate; S7P, sedoheptulose-7-phosphate; SBP, sedoheptulose-1,7-bisphosphate; TCA, tri-carboxylic acid; Xu5P, xylulose-5-phosphate.
