Fig 1.
Simplified illustration of flavonoid biosynthesis in plants showing the competition between FLS and DFR for their common substrates, the dihydroflavonols.
CHS, chalcone synthase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; F3′H, flavonoid 3′-hydroxylase; F3′5′H, flavonoid 3′,5′-hydroxylase; FLS, flavonol synthase; DFR, dihydroflavonol 4-reductase; ANS, anthocyanidin synthase; arGST, anthocyanin-related glutathione S-transferase; UFGT, UDP-glucose flavonoid 3-O-glucosyl transferase; MT, methyltransferase.
Fig 2.
Illustration of the branching point in flavonoid biosynthesis that leads to either anthocyanins or flavonols.
The asterisk (*) indicates that the reactions from dihydroflavonols to anthocyanins also require the additional enzymes anthocyanidin synthase (ANS), anthocyanin-related glutathione S-transferase (arGST), and UDP-dependent anthocyanidin 3-O-glucosyltransferase (UGT) downstream of DFR. DFR and FLS can process substrates with different hydroxylation patterns, leading to three distinct products each. The characters at the arrows represent the previously reported substrate-preference-determining residues of FLS and DFR. The DFR residues, shown as purple characters, correspond to position 133, while the FLS residues, shown as yellow characters, correspond to position 132 (positions according to AtDFR and AtFLS1, respectively).
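Because positions 133 and 132 are numbered according to AtDFR and AtFLS1, reading the corresponding residue in another sequence requires mapping the reference position to a column of a multiple sequence alignment. The following Python sketch illustrates one way to do this. The function names and the toy alignment are hypothetical and are not taken from the study.

```python
def column_for_reference_position(aligned_ref: str, position: int) -> int:
    """Return the 0-based alignment column that holds the given 1-based
    residue position of the gapped reference sequence."""
    if position < 1:
        raise ValueError("position must be 1-based and positive")
    seen = 0  # number of reference residues (non-gap characters) passed so far
    for col, char in enumerate(aligned_ref):
        if char != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError("reference sequence is shorter than the requested position")


def residues_at_reference_position(alignment: dict, ref_id: str, position: int) -> dict:
    """Return the residue (or '-' for a gap) of every aligned sequence in the
    column that corresponds to `position` in the reference sequence."""
    col = column_for_reference_position(alignment[ref_id], position)
    return {name: seq[col] for name, seq in alignment.items()}


# Toy alignment with made-up sequences; "ref" plays the role of the reference.
toy_alignment = {
    "ref":  "MK-AN",
    "seqB": "MKTAD",
    "seqC": "M--AA",
}

# Position 4 of the ungapped reference lies in alignment column 4 (0-based),
# because the gap in column 2 does not count as a reference residue.
print(residues_at_reference_position(toy_alignment, "ref", 4))
```

In a real alignment, the reference entry would be the aligned AtDFR or AtFLS1 sequence and the position would be 133 or 132, respectively.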
Fig 3.
Selected parts of a multiple sequence alignment of DFR sequences, restricted to 10 species to reduce complexity.
Monocot and dicot species are highlighted in light pink and light green, respectively. The dark violet, mid violet, and light violet colors indicate 100%, >75%, and >50% similarity among the sequences, respectively. Important domains associated with DFR functionality are highlighted and some columns are masked with three dots (…). Red boxes highlight the NADPH-binding domain and the 26-amino-acid-long substrate-binding domain. The residue labeled with a red star, at the third position within this region, is crucial for dihydroflavonol recognition (position 133 in the A. thaliana DFR). The alignment was generated with MAFFT v7.
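The three shading levels can be read as a per-column classification of the alignment. The minimal Python sketch below assumes that conservation is measured as the share of sequences carrying the most frequent residue in a column (identity as a stand-in for similarity) and that gaps count against conservation. The similarity measure actually used for the figure may differ, and the function name is hypothetical.

```python
from collections import Counter


def conservation_class(column: str) -> str:
    """Classify one alignment column into the shading levels 100%, >75%,
    >50%, or <=50%, based on the most frequent non-gap residue."""
    total = len(column)
    residues = [char for char in column if char != "-"]
    if total == 0 or not residues:
        return "<=50%"
    top_count = Counter(residues).most_common(1)[0][1]
    # Integer comparisons avoid floating-point rounding at the thresholds.
    if top_count == total:
        return "100%"
    if top_count * 4 > total * 3:
        return ">75%"
    if top_count * 2 > total:
        return ">50%"
    return "<=50%"


# Made-up columns from a hypothetical 10-sequence alignment.
for column in ["NNNNNNNNNN", "NNNNNNNNND", "NNNNNNDDDA", "NNNNNDDDDA"]:
    print(column, conservation_class(column))
```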
Fig 4.
Selected parts of an FLS multiple sequence alignment restricted to 10 species to reduce the complexity.
Monocot and dicot species are highlighted by light pink and light green color, respectively. The dark orange, mid orange, and light orange colors indicate 100%, >75%, and >50% similarity among the sequences, respectively. Important domains and residues associated with FLS functionality are highlighted and some columns are omitted as indicated by three dots (…). The conserved 2-ODD domain and the Fe2+ binding sites are indicated by black asterisks and red arrowheads, respectively. The black boxes highlight the FLS-specific motifs. The black arrowheads indicate potential DHQ-binding sites where the first residue (position 132) is thought to be critical for the substrate preference of FLS. The alignment was generated by MAFFT.
Fig 5.
The patterns of commonly occurring amino acid residues at substrate-preference-determining positions observed in different orders of angiosperms for FLS and DFR.
Orders are sorted by branching in the evolution of angiosperms according to the Angiosperm Phylogeny Group (2016). The residues were investigated across 172 angiosperm species and are represented here at the order level. Monocot and dicot species are highlighted in light pink and light green, respectively. Rosales II consists of two species: Fragaria × ananassa and Fragaria vesca. n represents the number of analyzed species harboring DFR and FLS, respectively, within each order.
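The order-level representation amounts to grouping per-sequence observations by order and counting residues and species. The Python sketch below shows this aggregation under simple assumptions: each record is an (order, species, residue) tuple, and the order names, species names, and residues in the example are placeholders, not data from the study.

```python
from collections import Counter, defaultdict


def residue_patterns_by_order(records):
    """Summarize residues per order.

    `records` is an iterable of (order, species, residue) tuples, one per
    sequence. Returns, for each order, the number of distinct species (n)
    and the residues sorted from most to least frequent.
    """
    residue_counts = defaultdict(Counter)
    species_seen = defaultdict(set)
    for order, species, residue in records:
        residue_counts[order][residue] += 1
        species_seen[order].add(species)
    return {
        order: {
            "n": len(species_seen[order]),
            "residues": residue_counts[order].most_common(),
        }
        for order in residue_counts
    }


# Placeholder records; a species may contribute more than one sequence.
example_records = [
    ("OrderA", "species1", "N"),
    ("OrderA", "species2", "N"),
    ("OrderA", "species2", "D"),
    ("OrderB", "species3", "H"),
]
summary = residue_patterns_by_order(example_records)
print(summary)
```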
Fig 6.
Phylogenetic analysis of DFR sequences in diverse plant species, highlighting amino acid residue diversity at position 133 associated with substrate specificity.
Gymnosperm, monocot, and dicot species are denoted by light blue, light pink, and light green color stripes, respectively. Non-DFR sequences are represented by dashed gray branches, while the functional DFRs are indicated by solid black branches. The color-coded scheme represents different residues at position 133: asparagine (light purple), aspartic acid (periwinkle blue), alanine (deep purple), and other amino acids (gray). The preferred substrate of the DFR type is written in brackets: DHK, dihydrokaempferol; DHQ, dihydroquercetin; DHM, dihydromyricetin. Distinct clusters of DFRs from major plant orders are labeled for reference. DFR sequences identified in previous studies are highlighted by an asterisk at the start of the terminal branch, with asterisks of functional DFR genes colored in red (S5 File). Leaf labels are hidden to reduce complexity. The outgroup comprises SDR members such as ANRs, CCRs, and other DFR-like sequences (S4 File).
Fig 7.
Phylogenetic diversity of FLS sequences in diverse plant species, highlighting amino acid residue diversity at position 132 associated with DHQ-substrate binding.
Gymnosperm, monocot, and dicot species are denoted by light blue, light pink, and light green color stripes, respectively. Non-FLS sequences are represented by dashed gray branches, while the functional FLSs are indicated by solid black branches. The background color highlights different presumably substrate-preference-determining amino acid residues at position 132: histidine (pale yellow), phenylalanine (dark orange), and tyrosine (dark golden). The hypothesized preferred substrate of the FLS type is written in brackets: DHK, dihydrokaempferol; DHQ, dihydroquercetin; DHM, dihydromyricetin. Distinct clusters of FLSs from major plant orders are labeled for reference. FLS sequences identified in previous studies are highlighted by an asterisk at the start of the terminal branch, with asterisks of functional FLS genes colored in red (S5 File). The branches of Arabidopsis thaliana AtFLS1-AtFLS6 are labeled. Individual leaf labels are hidden to reduce complexity. The outgroup comprises 2-ODD family members such as F3H, ANS, and other FLS-like sequences (S7 File).
Fig 8.
2D density heatmap with marginal histograms showing divergent expression patterns of FLS vs DFR across samples from 43 species.
Each sample shows expression of either FLS or DFR, but not both. The sample size is indicated by n. The color bar depicts the logarithmic density of points in the plot. (a) Combined expression of all FLS and DFR types; (b-g) specific combinations of FLS and DFR types.
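A density heatmap with marginal histograms can be derived from paired expression values by binning. The Python sketch below is a minimal illustration, assuming equal-width bins on both axes and log10 of the per-bin count as the "logarithmic density"; the binning and scaling used for the figure are not specified in the legend, and the function name and example values are hypothetical.

```python
import math


def log_density_grid(xs, ys, bins, lo, hi):
    """Bin paired values into a bins x bins grid over [lo, hi].

    Returns (log_grid, x_hist, y_hist): log10 counts per cell (None for
    empty cells) and the marginal counts along each axis.
    """
    width = (hi - lo) / bins
    grid = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        # Clamp indices so that values at or beyond the limits fall in edge bins.
        i = max(0, min(int((x - lo) / width), bins - 1))
        j = max(0, min(int((y - lo) / width), bins - 1))
        grid[j][i] += 1
    x_hist = [sum(grid[j][i] for j in range(bins)) for i in range(bins)]
    y_hist = [sum(row) for row in grid]
    log_grid = [[math.log10(c) if c > 0 else None for c in row] for row in grid]
    return log_grid, x_hist, y_hist


# Made-up expression values: x is one gene, y the other, one pair per sample.
x_values = [0.5, 0.6, 9.5, 0.4]
y_values = [9.5, 9.1, 0.2, 9.9]
log_grid, x_hist, y_hist = log_density_grid(x_values, y_values, bins=2, lo=0, hi=10)
print(x_hist, y_hist)
```

In this toy example, all samples fall into the two off-diagonal cells, which corresponds to the mutually exclusive expression pattern described in the legend.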