Fig 1.
Sequence space of sequences annotated to EC 1.1.3.15.
Sequences listed in BRENDA and SwissProt as experimentally tested are encircled. (A) Taxonomic origin of sequences. (B) Percentage of sequence identity to the closest experimentally tested or curated S-2-hydroxyacid oxidase. (C) Pfam domain architecture. (D) The mean alignment-based sequence identity between and within domain clusters. Pfam protein domains: FMN_dh (PF01070)—FMN-dependent dehydrogenase, DAO (PF01266)—FAD dependent oxidoreductase, Fer2_BFD (PF04324)—BFD-like [2Fe-2S] binding domain, FAD_binding_4 (PF01565)—FAD binding domain, FAD-oxidase_C (PF02913)—FAD linked oxidases C-terminal domain, CCG (PF02754)—cysteine-rich domain.
Fig 2.
Characterisation of protein cluster with high sequence identity to previously characterised S-2-hydroxyacid oxidases.
(A) Activity screen and protein characteristics. Dendrogram indicates protein relatedness. Superkingdoms: light purple—Bacteria, brown—Eukaryotes. Recorded activities are marked with squares; for proteins active with more than one substrate, the substrate preference is shaded, with the highest activity for each enzyme scaled to 100%. Listed amino acids correspond to conserved residues in a glycolate oxidase from S. oleracea. The cartoons represent the predicted domain and motif composition of the sequences, based on a Pfam search. Domains lacking a full Pfam alignment are represented with a sharp edge. The FMN-binding domain (FMN_dh, PF01070) is marked in magenta, the cytochrome b5-like heme binding domain (Cyt_B5, PF00173) is marked in green, and a prolonged stretch in loop4 is marked in blue. (B) Conserved amino acids of the active site of S-2-hydroxyacid oxidase mapped on a structure of glycolate oxidase from S. oleracea (PDB: 1GOX). Conserved residues are marked in blue, the FMN cofactor is marked in yellow, and the glycolate substrate in green. (C) Superimposed structures of representatives of the FMN-dependent 2-hydroxyacid oxidase/dehydrogenase family, with their distinct motifs shown in cartoon form: glycolate oxidase (magenta, PDB 1GOX), flavocytochrome b2 (green, PDB 1FCB), mandelate dehydrogenase (light blue, PDB 6BFG), lactate 2-monooxygenase (dark blue, PDB 6DVH).
Fig 3.
Characterisation of protein clusters with low sequence identity to previously characterised S-2-hydroxyacid oxidases.
Dendrogram indicates protein relatedness. Superkingdoms: light purple—Bacteria, dark purple—Archaea. Activities are marked with squares; for proteins active with more than one substrate, the substrate preference is shaded. The cartoons represent the predicted domain and motif composition of the sequences, based on a Pfam search. Domains lacking a full Pfam alignment are represented with a sharp edge. Proteins with alternative activities chosen for kinetic characterisation are marked in bold. (A) Characterisation of protein clusters containing a DAO domain. The FAD dependent oxidoreductase domain (DAO, PF01266) is marked in blue, and the BFD-like [2Fe-2S] binding domain (Fer2_BFD, PF04324) is marked in purple. (B) Characterisation of the remaining protein clusters. The FAD binding domain (FAD_binding_4, PF01565) is marked in orange, the FAD linked oxidases C-terminal domain (FAD-oxidase_C, PF02913) is marked in green, and the cysteine-rich domain (CCG, PF02754) is marked in red. (C) Comparison of Pfam domains of sequences annotated to EC 1.1.3.15 in BRENDA versions 2017.1 and 2019.2.
Table 1.
Kinetic parameters of selected proteins with functions distinct from S-2-hydroxyacid oxidase.
Values represent means (± standard error of the mean; n = 3).
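As an illustration of how a "mean ± standard error of the mean" entry can be computed from triplicate measurements, the following is a minimal sketch. It assumes the conventional definition (sample standard deviation divided by the square root of n); the function name `mean_and_sem` and the example values are hypothetical and are not taken from the study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements.

    Assumption: SEM = sample standard deviation (n - 1 denominator) / sqrt(n).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate the SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical triplicate readings (arbitrary units), for illustration only.
replicates = [1.0, 2.0, 3.0]
mean, sem = mean_and_sem(replicates)
print(f"{mean:.2f} ± {sem:.2f} (n = {len(replicates)})")
```

With n = 3 the SEM is estimated from only two degrees of freedom, so it is best read as a rough indication of replicate spread.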
Fig 4.
Exploration of functional annotation throughout all BRENDA enzyme classes.
(A) The total number of representative protein sequences (after clustering at 90% identity) annotated to EC classes in BRENDA, which is approximately 5.3 million. (B) The total number of experimentally characterised/curated enzymes. (C) Histogram showing the number of characterised/curated enzymes per EC class (bin size of 1). Histograms showing the distribution of sequence identities between all 5.3 million cluster representatives and their closest characterised/curated enzyme for Archaea (D), Bacteria (E), and Eukaryota (F) (with a bin size of 1). Proteins that do not have the same Pfam domains as characterised/curated enzymes are coloured in grey.
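The binning behind histograms such as those in panels D to F can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: it assumes identities are given as percentages in the range 0 to 100, that a bin size of 1 means one percentage point, and that a value of exactly 100 is counted in the last bin. The function name `identity_histogram` and the input values are hypothetical.

```python
from collections import Counter


def identity_histogram(identities, bin_size=1.0):
    """Count percent-identity values per bin, keyed by the bin's lower edge.

    Assumptions: values lie in [0, 100]; bins are [edge, edge + bin_size);
    a value of exactly 100 is placed in the last bin.
    """
    last_bin = int(100 // bin_size) - 1
    counts = Counter()
    for pct in identities:
        if not 0 <= pct <= 100:
            raise ValueError(f"identity out of range: {pct}")
        index = min(int(pct // bin_size), last_bin)
        counts[index * bin_size] += 1
    return dict(sorted(counts.items()))


# Hypothetical identities to the closest characterised enzyme, for illustration.
print(identity_histogram([34.2, 34.9, 35.0, 100.0]))
```

Running the same function separately on sequences that share Pfam domains with a characterised enzyme and on those that do not would give the two series distinguished by colour in the figure.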
Table 2.
Overview of annotation to enzyme classes of industrial interest.