Fig 1.
Domain architectures of the GH2 family.
Numbers in parentheses indicate the number of sequences in the Pfam database that represent each DA type.
Fig 2.
Domain architecture of DA type 5 sequences.
Colored and striped boxes correspond to domains identified by the Pfam and InterPro databases, respectively. Modules with more than 40% sequence identity to BIG1 domains identified by InterPro, with a coverage higher than 60%, were tagged as BIG1. Letters (a-i) on the right edge of the figure group sequences with similar DAs. Domain assignment at the I1, I2, and I3 regions (subtype a) was based on the analysis carried out with the β-galactosidase from S. pneumoniae [30]. Numbers on top of non-identified regions indicate the approximate number of residues.
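The BIG1 tagging rule stated in this legend amounts to a pair of threshold checks. The following is a minimal sketch, not the authors' pipeline: the function name `tag_big1`, its argument names, and the assumption that identity and coverage are supplied as percentages are all hypothetical.

```python
def tag_big1(identity_pct: float, coverage_pct: float) -> bool:
    """Return True when a module meets the thresholds stated in the legend.

    Assumption (not from the source): both values are percentages (0-100)
    obtained from a comparison against a BIG1 domain.
    """
    # More than 40% sequence identity AND coverage higher than 60%.
    return identity_pct > 40.0 and coverage_pct > 60.0
```

Both inequalities are strict, following the wording "more than 40%" and "higher than 60%"; a module at exactly 40% identity would not be tagged under this reading.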
Fig 3.
Phylogenetic analysis of the GH2C domain.
The tree was calculated using the Maximum Likelihood method based on the JTT matrix-based model [21] and condensed at < 50% bootstrap support. The analysis involved a selection of 380 amino acid sequences for the different DAs. The tree was drawn using FigTree software (http://tree.bio.ed.ac.uk/software/figtree/). Numbers indicate the DA type. The asterisk marks the subtree further analysed in Fig 4. Sectors corresponding to a single DA type are colored in turquoise (DA type 1), purple (DA type 2), blue (DA type 3), and red (DA type 4). Cluster 5* is colored in light grey. Sectors including mixed DA types are colored in dark grey.
Fig 4.
Phylogenetic subtree corresponding to the region marked 5* in Fig 3.
Letters on the right edge of the figure group sequences with similar DAs, as indicated in Fig 2. Branch numbers indicate bootstrap values. Bc and Bf correspond to Bacillus circulans and Bifidobacterium bifidum β-galactosidases, respectively.
Fig 5.
Docking of a DA type 5 (A) and a DA type 3 (B) β-galactosidase with their main transglycosylation products. The figure shows the domains that compose the architecture of the enzymes, represented in different colors. The residues potentially interacting with β-D-(1,4)-galactosyl-lactose in Bacillus circulans β-galactosidase (A) or with β-D-(1,3)-galactosyl-lactose in Thermotoga maritima β-galactosidase (B) are highlighted on the right side.
Fig 6.
Sequence alignment around the putative catalytic site of proteins analyzed in Fig 4.
Purple boxes indicate residues potentially involved in the active site, as predicted by docking analysis with β-D-(1,4)-galactosyl-lactose. Positions with more than 50% identity are colored in light blue and those with 100% identity in dark blue. Bc and Bf correspond to Bacillus circulans and Bifidobacterium bifidum β-galactosidases, respectively.
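The identity-based shading described in this legend can be illustrated with a short sketch. It is not the tool used to produce the figure. It assumes that the identity of a position is the fraction of aligned sequences sharing the most common residue in that column, and that gaps ("-") never count as a match; the function names are hypothetical.

```python
from collections import Counter


def column_shade(column: str) -> str:
    """Classify one alignment column by the legend's identity thresholds."""
    # Assumption: gaps are ignored when counting the most common residue,
    # but still count toward the total number of sequences.
    counts = Counter(residue for residue in column if residue != "-")
    if not counts:
        return "none"
    top = counts.most_common(1)[0][1]
    if top == len(column):
        return "dark blue"   # 100% identity
    if top / len(column) > 0.5:
        return "light blue"  # more than 50% identity
    return "none"


def shade_alignment(sequences: list[str]) -> list[str]:
    """Return the shade for every column of equal-length aligned sequences."""
    return [column_shade("".join(col)) for col in zip(*sequences)]
```

For example, `shade_alignment(["ACD", "ACE", "AGF"])` classifies the first column as fully conserved, the second as more than 50% identical, and the third as unshaded.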
Fig 7.
Proposed evolutionary model for GH2 enzymes.