An Inserted α/β Subdomain Shapes the Catalytic Pocket of Lactobacillus johnsonii Cinnamoyl Esterase

Background: Microbial enzymes produced in the gastrointestinal tract are primarily responsible for the release and biochemical transformation of absorbable bioactive monophenols. In the present work we describe the crystal structure of LJ0536, a serine cinnamoyl esterase produced by the probiotic bacterium Lactobacillus johnsonii N6.2. Methodology/Principal Findings: We crystallized LJ0536 in the apo form and in three substrate-bound complexes. The structure showed a canonical α/β fold characteristic of esterases, and the enzyme is dimeric. Two classical serine esterase motifs (GlyXSerXGly) can be recognized in the amino acid sequence, and the structure revealed that the catalytic triad of the enzyme is formed by Ser106, His225, and Asp197, while the other motif is non-functional. In all substrate-bound complexes, the aromatic acyl group of the ester compound was bound in the deepest part of the catalytic pocket. The binding pocket also contained an unoccupied area that could accommodate larger ligands. The structure revealed a prominent inserted α/β subdomain of 54 amino acids, from which multiple contacts to the aromatic acyl groups of the substrates are made. Inserts of this size are seen in other esterases, but the secondary structure topology of this subdomain of LJ0536 is unique to this enzyme and its closest homolog (Est1E) in the Protein Data Bank. Conclusions: The binding mechanism characterized here, which involves the inserted α/β subdomain, clearly differentiates LJ0536 from enzymes of fungal origin with similar activity. The structural features described herein, together with the activity profile of LJ0536, suggest that this enzyme should be clustered in a new group of bacterial cinnamoyl esterases.


Introduction
Hydroxycinnamates are natural phenolic compounds with a widespread distribution throughout the plant kingdom. These phytophenols are naturally present in the form of monophenols or polyphenols and participate in the formation of macromolecular structures in plant cells. Gastrointestinal absorbable monophenols are particularly interesting to health researchers due to their innate ability to work as free radical scavengers, anti-inflammatory supplements, and immunostimulants [1][2][3]. The health benefits associated with the consumption of natural phenolics are extensively documented and supported by results obtained both in vitro and in vivo [4,5]. The most abundant bioactive monophenols present in a balanced human diet are ferulic, caffeic, p-coumaric, and sinapic acids [6]. Although the structures of monophenols are similar, the major biochemical differences among them are due to the presence or absence of hydroxyl and/or methyl functional groups attached to the aromatic ring. Monophenols are frequently ester-conjugated to aromatic organic acids to form polyphenols such as oleuropein, chlorogenic acid, or rosmarinic acid, which are present in the diet [7,8].
Dietary phytophenols can act locally immediately following their absorption and systemically following distribution by the circulatory system [9]. In fact, some of these bioactive compounds can pass the blood-brain barrier to reach the central nervous system. Nevertheless, the primary absorption of these compounds at the gastrointestinal level is extremely limited. It was experimentally demonstrated that the monocarboxylic acid transporter is the molecular system capable of mediating cellular uptake of monophenols, such as caffeic and ferulic acids [10]. This system does not have an affinity for esterified polyphenols, since the carboxylic ester group of polyphenols interferes with the recognition system of this specific active transporter. Consequently, these low molecular weight polyphenols, such as rosmarinic acid, are rarely absorbed by paracellular diffusion [9]. In this context, the intestinal hydrolysis of the ester bond is necessary for improving the absorption of bioactive monophenols to maximize their beneficial systemic effects on the host. However, humans do not synthesize enzymes with the cinnamoyl/feruloyl esterase activity required to break down these ester bonds and release the absorbable bioactive moieties efficiently [11]. It is well known that the cinnamoyl or feruloyl esterase activity present in the human intestinal lumen is produced only by the colonic microbiota [12,13]. The first cinnamoyl esterase from a bacterium commonly found in the human colonic microbiota (Lactobacillus johnsonii) was recently purified and biochemically characterized in our laboratory [14]. Interestingly, the producer strain, L. johnsonii N6.2, was naturally abundant in animal models that are genetically predisposed to develop autoimmune diabetes but do not display the characteristic symptoms of the disease [15]. A subsequent study in which the autoimmune diabetes animal models were fed L. johnsonii N6.2 demonstrated a decrease in intestinal oxidative stress and a better survival rate [16]. Even though direct involvement of enzymatic activity was not investigated in that publication, the results could be linked to the antioxidative effects of phenolic compounds released from food components by bacterial cinnamoyl esterases.
Cinnamoyl esterases are classical members of the α/β fold structural family. Due to their application in industrial hemicellulose saccharification processes, several fungal enzymes have been biochemically and structurally studied [17][18][19]. In contrast, few bacterial cinnamoyl esterase structures are publicly available. We previously showed that LJ0536, a cinnamoyl esterase purified from L. johnsonii N6.2, is active towards a variety of substrates including short acyl chain aliphatic esters and phenolic esters [14]. Herein, we present the crystal structure and mutational analysis of LJ0536. A catalytically inactive Ser106Ala mutant was co-crystallized with three different ligands: ferulic acid, ethyl ferulate, and chlorogenic acid (of which only the caffeic acid moiety was resolved). Co-crystallization results are discussed in the light of enzymological data collected from assays using site-directed mutants. An inserted α/β subdomain was identified as a prominent structure necessary for phenolic ring binding and the formation of the catalytic pocket. Multiple structurally homologous enzymes were identified in the public databases, and many of these contain inserted subdomains of similar size. However, only the Est1E feruloyl esterase of Butyrivibrio proteoclasticus [20] contains an inserted α/β subdomain with the same secondary structure and architecture as the insert in LJ0536. We conclude that, despite sharing the inserted subdomain and a nearly identical overall scaffold, the two enzymes discriminate substrates through specific features of their active sites.

General architecture of the LJ0536 structure
Apo-LJ0536 was crystallized in the presence of protease, which facilitated successful crystal growth (the Ser106Ala mutant was also crystallized in the presence of protease). The resulting fragment lacked four C-terminal residues, so nearly the entire full-length protein was visible in the electron density map (residues 1-245, plus the five residues from the expression tag). The structure was solved to 2.35 Å (Fig. 1) using Molecular Replacement (MR) with Est1E (PDB: 2WTM). Crystallization and diffraction statistics are summarized in Table 1. The native molecular weight was determined to be 46.0 kDa by gel filtration (monomeric apparent molecular weight = 27.6 kDa). The molecule displayed a classical α/β hydrolase fold [21]. The overall structure of LJ0536 consisted of a central β-sheet composed of seven parallel β-strands (β1, β3, β4, β5, β6, β11, β12) and one antiparallel β-strand (β2). This central β-sheet shows a left-handed superhelical twist with an approximate 120° angle from β1 to β12. It is flanked by five α-helices, with two α-helices (α1, α9) externally located and three α-helices (α3, α4, α8) internally located towards the dimer interface. The asymmetric unit of the apo-LJ0536 crystal contained two protein molecules that formed an extensive interface (Fig. 1D). This interface is formed by α4, α6, and β1. As calculated by the PDBe PISA (Protein Interactions, Surfaces, Assemblies) server, which identifies interfaces between proteins by measuring their buried surface area, the dimer interface comprises 34 and 37 residues of chain A and chain B, respectively, burying a total of 2373 Å² between the two chains. The interface is primarily hydrophobic (18/37 residues from chain B), although there are six salt bridges as well. Due to these characteristics, this interface is presumed to be the dimer interface of the protein.
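The dimer assignment can be checked with simple arithmetic on the two masses quoted above. The following is a minimal sketch, not part of the original analysis: the function name `estimate_subunit_count` is hypothetical, and rounding the native-to-monomer mass ratio is only a rough heuristic, since gel filtration reports apparent rather than exact masses.

```python
def estimate_subunit_count(native_kda: float, monomer_kda: float) -> int:
    """Round the native/monomer mass ratio to the nearest whole subunit count.

    Hypothetical helper for illustration; gel-filtration masses are apparent
    values, so the ratio is only expected to be close to an integer.
    """
    if native_kda <= 0 or monomer_kda <= 0:
        raise ValueError("molecular weights must be positive")
    return max(1, round(native_kda / monomer_kda))


# Values quoted in the text: 46.0 kDa native, 27.6 kDa apparent monomer.
mass_ratio = 46.0 / 27.6
subunits = estimate_subunit_count(46.0, 27.6)

# Share of chain B interface residues described as hydrophobic (18 of 37).
hydrophobic_fraction = 18 / 37

print(f"native/monomer mass ratio: {mass_ratio:.2f}")
print(f"estimated subunits: {subunits}")
print(f"hydrophobic interface fraction (chain B): {hydrophobic_fraction:.2f}")
```

The ratio falls between one and two subunits and rounds to two, which agrees with the two-molecule asymmetric unit and the large buried interface reported by PISA.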
A sequence of 54 amino acids (Pro131 to Gln184) forms an inserted α/β subdomain between the β6 and β11 strands. This insertion is composed of two short β-hairpins (β7/β8, β9/β10) and three α-helices (α5, α6, α7). The two β-hairpins project towards the entrance of the catalytic site.

The enzyme hydrolytic center
The presumed catalytic pocket resembles an open canal-like feature, which has the shape of a boomerang that ends in a hydrophobic pocket buried between α5 and α6 of the inserted α/β subdomain (Fig. 1A and 1B). Taking Ser106 as the center of the boomerang, one of the two clefts of the catalytic pocket is approximately 13 Å long and 7 Å wide, which is large enough to accommodate the aromatic acyl group of the substrate. The two protruding hairpins from the inserted α/β subdomain decorate the entrance of this cleft and form the "roof" of the catalytic compartment. The other cleft is about 12 Å long and can accommodate the alkoxyl group plus additional atoms from larger substrates.
The catalytic pocket contains the presumed catalytic triad composed of Ser106, His225, and Asp197 (Fig. 1B and 1C), with the catalytic serine residue located at the nucleophilic elbow formed between β5 and α4. The oxyanion hole formed by the backbone nitrogen atoms of Phe34 and Gln107 is buried towards the base of the inserted α/β subdomain. Ser106 takes part in a classical Gly104-X-Ser106-X-Gly108 serine esterase motif previously identified by analysis of the linear sequence [14]. In addition, a potential second motif, Gly66-X-Ser68-X-Gly70, was identified in a highly conserved region (Fig. S1) that is 18 Å from the first catalytic motif (Fig. S2). We validated that the active catalytic triad is Ser106, His225, Asp197, since the Ser106Ala and Asp197Ala mutants displayed no catalytic activity (Table 2). The His225Ala mutant retained approximately one-fifth of the wild-type activity (as measured by kcat), so its activity is nonetheless drastically hampered. In contrast, the structure revealed no formation of a catalytic pocket associated with Ser68. The potential triad (Ser68, His32, Asp61) is not in the correct orientation, and Ser68 is not located at the sharp turn of any nucleophilic elbow (Fig. S2). Circular dichroism analysis of the Ser68Ala mutant confirmed a significant change in the overall secondary structure of the protein (Fig. S3). This analysis indicates that the activity of this mutant is affected by an overall change in the protein structure rather than by a direct change of a catalytic residue.
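Locating GlyXSerXGly motifs in a linear sequence, as described above, amounts to a simple pattern search. The sketch below is illustrative only: `TOY_SEQUENCE` is a made-up peptide, not the LJ0536 sequence (which is not reproduced in this article), and the function name `find_gxsxg_motifs` is hypothetical. A lookahead is used so that overlapping motifs would also be reported.

```python
import re

# Made-up peptide for illustration only; this is NOT the LJ0536 sequence.
TOY_SEQUENCE = "MKTAGHSLGAAPLVGDSAGKW"


def find_gxsxg_motifs(sequence: str):
    """Return (motif_start, motif, serine_position) tuples, 1-based numbering.

    The pattern G.S.G matches Gly-X-Ser-X-Gly, where X is any residue.
    The zero-width lookahead lets overlapping matches be found as well.
    """
    hits = []
    for match in re.finditer(r"(?=(G.S.G))", sequence.upper()):
        start = match.start() + 1          # convert 0-based index to 1-based
        hits.append((start, match.group(1), start + 2))
    return hits


for start, motif, serine in find_gxsxg_motifs(TOY_SEQUENCE):
    print(f"motif {motif} at residue {start}, candidate serine at {serine}")
```

As the structure of LJ0536 shows, a sequence match of this kind only nominates a candidate serine. Whether it is catalytic depends on the three-dimensional arrangement of the triad, which is why the Ser68 motif is non-functional despite matching the pattern.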

Analysis of the crystal structures of Ser106Ala-substrate complexes reveals residues important for substrate binding and catalysis
We determined the structure of the active site mutant (Ser106Ala) and also co-crystallized this mutant with ferulic acid, ethyl ferulate, or chlorogenic acid (Fig. 2). The Ser106Ala-ethyl ferulate complex crystallized in two forms (Form I and Form II, with a dimer and a single chain in the asymmetric unit, respectively). The two ethyl ferulate crystal forms are nearly identical in structure. The root mean square deviation (RMSD) values of 244 Cα atoms of both chains of Form I onto Form II are 0.25 and 0.3 Å, respectively, and the dimer from Form II is essentially identical to the Form I dimer. We focused our analysis on Form II due to better occupancy of the ligand in the active site. Overall, we observed no appreciable differences in the backbone structure between the apo wild-type (WT) enzyme and the Ser106Ala mutant (RMSD of 0.33 Å over all 244 Cα atoms). Ligand binding did not induce major structural changes in the active site, except for a rotamer change in Gln145 and a slight rotation of the side chain of His225. This rotation created a second hydrogen bond to Asp197 (Fig. 2), which is presumably the active conformation of the catalytic triad.
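For readers unfamiliar with the metric, the RMSD values quoted above are root mean square distances between equivalent Cα atoms. The sketch below is a minimal illustration with made-up coordinates. It assumes the two coordinate sets have already been superposed, whereas structure comparison programs first find the optimal rigid-body fit. The function name `rmsd` and the toy coordinates are assumptions introduced here, not part of the original work.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superposed; no fitting is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical C-alpha positions (in angstroms); every atom in the second set
# is displaced by 0.3 along x, so the RMSD should be close to 0.3.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
displaced = [(0.3, 0.0, 0.0), (1.3, 0.0, 0.0), (0.3, 1.0, 0.0)]

print(f"RMSD = {rmsd(reference, displaced):.2f} angstroms")
```

Because the deviations are squared before averaging, a few strongly displaced atoms dominate the value, so sub-ångström RMSDs over 244 Cα atoms indicate that no region of the backbone moves appreciably.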
The excellent diffraction of these crystals (resolutions between 1.58 Å and 1.75 Å) allowed us to compare the position of different substrates and the conformation of the active site residues (Fig. S4A). Gln145 from the inserted α/β subdomain adopted a different conformation, creating a bridge-like structure on top of the catalytic site (Fig. 1B). This feature, along with the side chains of Phe34 and Val199, limits the size of substrate that can enter the catalytic pocket to 7 Å in width. In all three complexes, the substrates in the catalytic groove were oriented with the aromatic acyl moiety of the carbonyl group bound in the deepest part of the pocket (Fig. 2). The opposite end of the ligands, on the far side of the ester moiety, such as the ethoxyl group of ethyl ferulate, rests on a more solvent-exposed area of the groove and has few interactions with the protein. Electron density for the C2 atom of ethyl ferulate is missing in our structure, which is consistent with this atom being part of the leaving group after hydrolysis. Likewise, no clear electron density was resolved for the quinic acid moiety of chlorogenic acid (labeled as caffeic acid-bound), perhaps due to residual enzymatic activity or a lack of productive interactions with the enzyme. The binding cleft does have an area that would accommodate the quinic acid group, or other groups of similar size, formed by the side chains of His32, Ala36, Thr40, Leu42, Leu43, His105, and Cys226 (Fig. 1B and Fig. S4B). This pocket is occupied by water molecules and/or sodium ions in each of our structures (Fig. 2).
The protein forms extensive hydrogen bonding networks at both ends of the ligands, so that it acts as a molecular ruler in which the distance between the aromatic ring and the site of hydrolysis is constrained by these hydrogen bonds and the position of the catalytic triad. Other than the catalytic residues and the oxyanion hole, the enzyme does not contribute any hydrogen bonds at the end of the ligand with the ester group (Fig. 2 and 3). This suggests that substrate discrimination is accomplished by the hydrogen bonds to the aromatic ring and its substituents. More hydrogen bonds are formed with the phenolic rings of the ligands, including those to an ordered water molecule present in all of the complexes (Fig. 2). The 4-hydroxyl group (ethyl ferulate, ferulic acid, and caffeic acid) and 3-hydroxyl group (caffeic acid) of the aromatic ring of the substrates are hydrogen bonded to Asp138 and Tyr169, respectively, from the inserted α/β subdomain at the back of the enzyme cavity. The 3-methoxy (3-hydroxyl in the case of caffeic acid) and 4-hydroxyl groups also interact with an ordered water molecule in all of the complexes. This water is also coordinated by the Oγ1 atom of Thr144 from the inserted α/β subdomain. The 3-methoxy group of ethyl ferulate or ferulic acid is accommodated by a small hydrophobic cavity formed by the benzyl moieties of Phe34 and Phe160, plus the Leu165 residue (Fig. 3). The aliphatic chain separating the aromatic ring from the site of hydrolysis is accommodated by hydrophobic side chains (Phe34, Ala132, Val199, and Val200). One oxygen atom of the carbonyl group that forms the ester interacts directly with the oxyanion hole formed by the backbone nitrogen atoms of Phe34 and Gln107, while the other oxygen interacts with His225 and an ordered water molecule present in the caffeic and ferulic acid structures (the ethoxy group of ethyl ferulate occupies the space of this water molecule).
Due to these interactions, ethyl ferulate rotates slightly and positions the ester bond perpendicular to Ser106 at a distance of 2.73 Å. The ester bond is strained from planarity (bond angle of 116°), suggesting that the hydrolytic mechanism involves a tetrahedral enzyme-ester intermediate, typical of esterases. The fact that the ester bond of the ferulic acid product is in a planar configuration that is parallel to the main axis of the groove further supports this mechanism (Fig. 2C, 2D, and Fig. S4A).
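A bond angle such as the 116° value quoted above is measured from the coordinates of three atoms. The sketch below shows the calculation for arbitrary points. The coordinates are made up to give an idealized trigonal planar angle of 120° and are not taken from the deposited structures, and the function name `angle_degrees` is hypothetical.

```python
import math


def angle_degrees(atom_a, vertex, atom_c):
    """Angle (in degrees) at `vertex` between the bonds to atom_a and atom_c."""
    v1 = [atom_a[i] - vertex[i] for i in range(3)]
    v2 = [atom_c[i] - vertex[i] for i in range(3)]
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("bonded atoms must not coincide with the vertex")
    cosine = sum(a * b for a, b in zip(v1, v2)) / (norm1 * norm2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


# Hypothetical coordinates: two unit-length bonds 120 degrees apart in a plane.
central = (0.0, 0.0, 0.0)
first = (1.0, 0.0, 0.0)
second = (math.cos(math.radians(120.0)), math.sin(math.radians(120.0)), 0.0)

ideal = angle_degrees(first, central, second)
print(f"idealized planar angle: {ideal:.1f} degrees")
print(f"difference from the reported 116 degrees: {ideal - 116.0:.1f} degrees")
```

The comparison with 120° is only a reference point for an ideal planar center. Deviations of a few degrees should be interpreted together with the resolution and refinement restraints of the structure.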
We observed a water molecule 3.3 Å from the non-carbonyl oxygen of the ester bond of ethyl ferulate towards the solvent-exposed face of the pocket. It is possible that this corresponds to the water molecule that is the target for deprotonation by His225 to hydrolyze the tetrahedral intermediate between the ligand and Ser106, regenerating the enzyme and releasing the product. After hydrolysis, the new hydroxyl group forms a polar interaction with the Nε2 atom of His225.
The caffeic acid moiety of chlorogenic acid adopts a position and set of interactions within the catalytic pocket similar to those of ferulic acid. However, caffeic acid has two hydroxyl groups on the benzyl ring (positions 3 and 4), which interact with the side chains of Asp138 and Tyr169 through hydrogen bonding (Fig. 2 and 3). These differences in enzyme-substrate interaction may explain the differences in the turnover number previously reported (chlorogenic acid kcat = 28.1 s⁻¹; ethyl ferulate kcat = 7.9 s⁻¹) [14].
The size of the binding pocket as revealed in the crystal structure helps explain why, in a previous study, this enzyme showed lower substrate affinity (based on Km) for 1/2-naphthyl acetate than for 1/2-naphthyl propionate and butyrate (1-naphthyl acetate: 0.298 ± 0.03 mM; 1-naphthyl propionate: 0.162 ± 0.01 mM; 1-naphthyl butyrate: 0.150 ± 0.01 mM; 2-naphthyl acetate: 0.897 ± 0.22 mM; 2-naphthyl propionate: 0.225 ± 0.02 mM; 2-naphthyl butyrate: 0.222 ± 0.01 mM) [14]. It is possible that the acetate group is not long enough to fully exploit the binding pocket for interactions.
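The practical meaning of these Km differences can be illustrated with the Michaelis-Menten expression for fractional saturation, [S] / (Km + [S]). The sketch below uses the mean Km values listed above (read as 0.298, 0.162, 0.150, 0.897, 0.225 and 0.222 mM) and an arbitrary substrate concentration of 0.1 mM chosen only for illustration. It assumes simple Michaelis-Menten behavior and ignores the reported uncertainties; the names `KM_MM` and `fractional_saturation` are hypothetical.

```python
# Mean Km values (mM) as quoted in the text from reference [14].
KM_MM = {
    "1-naphthyl acetate": 0.298,
    "1-naphthyl propionate": 0.162,
    "1-naphthyl butyrate": 0.150,
    "2-naphthyl acetate": 0.897,
    "2-naphthyl propionate": 0.225,
    "2-naphthyl butyrate": 0.222,
}


def fractional_saturation(substrate_mm: float, km_mm: float) -> float:
    """Fraction of maximal velocity, v/Vmax = [S] / (Km + [S])."""
    if substrate_mm < 0 or km_mm <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_mm / (km_mm + substrate_mm)


# Illustrative substrate concentration (an assumption, not an assay condition
# reported for these naphthyl esters).
SUBSTRATE_MM = 0.1

saturation = {
    name: fractional_saturation(SUBSTRATE_MM, km) for name, km in KM_MM.items()
}

for name, value in saturation.items():
    print(f"{name:<24s} Km = {KM_MM[name]:.3f} mM  v/Vmax = {value:.2f}")
```

Within each naphthyl series the acetate ester gives the lowest fractional saturation at the same concentration, which is the quantitative counterpart of the statement that acetate is too short to exploit the binding pocket fully.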

Site-directed mutagenesis of the inserted α/β subdomain demonstrates a role in substrate preference
The analysis of the catalytic site indicated that the inserted α/β subdomain from Pro131 to Gln184 could be important for substrate binding. We hypothesized that this structure is critical for holding the phenolic ring of the phenolic esters in the correct position for catalysis but has a less important role when aliphatic esters are used as the enzyme substrate. We assessed this hypothesis by introducing a dramatic change to the enzyme and expressing a partial deletion mutant of the inserted α/β subdomain. The purified enzyme carrying the deletion Δ147-173 showed low activity when 4-nitrophenyl butyrate was used as the model substrate (Table 2). In contrast, no activity was detected with any of the phenolic esters (ethyl ferulate, chlorogenic acid, and rosmarinic acid), even when excessive amounts of enzyme (50 mg/ml) were used in the reaction mixtures and analyzed using HPLC. Although we cannot exclude the possibility that this mutant is partially misfolded, the residual activity observed in saturation kinetic assays suggests that the deletion does not grossly disrupt protein folding. A deeper analysis of site-directed mutants confirmed the importance of the inserted α/β subdomain in phenolic ester catalysis (Table 2, Fig. 4). Among these mutants, Asp138Ala and Gln145Ala had the highest impact on the enzymatic activity. A direct comparison using four different substrates at a fixed concentration (0.1 mM) indicated that Asp138Ala and Gln145Ala showed 73.1 ± 2.8% and 87.6 ± 0.3% activity, respectively, towards 4-nitrophenyl butyrate (Fig. 4). The activity of these mutants dropped to less than 10% when caffeic acid esters (chlorogenic acid and rosmarinic acid) were used as substrates (Fig. 4B and 4C). Interestingly, the Gln145Ala mutant retained 21.7 ± 2.8% activity when ethyl ferulate was used as the substrate (Fig. 4D).
These results suggested that the residues Asp138 and Gln145 play a role in interacting with and/or restricting access to the binding pocket for caffeic and feruloyl esters, but not for nitrophenyl-based esters. A possible explanation is that, to maintain the proper orientation of the ester bond for catalysis, 4-nitrophenyl butyrate would need to bind with its 4-nitrophenyl moiety in the other pocket of the boomerang-shaped binding canal. This hypothesis would have to be tested by mutating residues in that pocket, such as Thr40 or His105. Together with the crystallographic data, these results indicated that Asp138 and Gln145 from the inserted α/β subdomain are important in recognizing the caffeic and feruloyl esters.
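The relationship between the inserted subdomain, the partial deletion, and the residues discussed in this article can be made explicit with a small bookkeeping sketch. It assumes that the deletion mutant removes residues 147 to 173 inclusive and that the insert spans residues 131 to 184 inclusive, as stated in the text. The names `INSERT`, `DELETION`, and `locate` are hypothetical.

```python
# Residue ranges taken from the text (inclusive numbering).
INSERT = range(131, 185)     # Pro131 to Gln184
DELETION = range(147, 174)   # residues removed in the 147-173 deletion mutant

# Residues named in the text as catalytic or substrate-contacting.
RESIDUES = {
    "Ser106": 106, "Asp138": 138, "Thr144": 144, "Gln145": 145,
    "Phe160": 160, "Leu165": 165, "Tyr169": 169, "Asp197": 197, "His225": 225,
}


def locate(position: int) -> str:
    """Classify a residue position relative to the insert and the deletion."""
    if position in DELETION:
        return "insert, removed by the 147-173 deletion"
    if position in INSERT:
        return "insert, retained in the 147-173 deletion"
    return "core fold"


print(f"insert length: {len(INSERT)} residues")
print(f"deleted residues: {len(DELETION)} ({len(DELETION) / len(INSERT):.0%} of the insert)")
for name, position in RESIDUES.items():
    print(f"{name}: {locate(position)}")
```

Under these assumptions the deletion removes half of the insert, including Phe160, Leu165, and Tyr169, while the catalytic triad and the two residues probed by point mutation (Asp138 and Gln145) lie outside the deleted segment.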

Sequence and structural comparisons of LJ0536 with the α/β hydrolase superfamily
We undertook PSI-BLAST searches of GenBank [22] to better understand the genomic distribution of the α/β subdomain of LJ0536 and to identify whether it is unique to this enzyme. The PSI-BLAST algorithm develops position-specific scoring matrices to identify remote sequence homology [23]. We reasoned that if the α/β subdomain is unique to LJ0536 and closely related enzymes, PSI-BLAST searches with the subdomain would not reveal significant matches to other types of hydrolase enzymes in other genomes. Conversely, searches using the full-length sequence of LJ0536 should reveal a wide range of hydrolase enzymes from diverse genomes. The results matched this reasoning very well. A PSI-BLAST search with the full-length sequence retrieved various α/β fold hydrolases upon the first iteration, such as esterases, thioester hydrolases, peptidases, and lipases, from Gram-positive or Gram-negative bacteria, archaea, and plants (data not shown). In contrast, a PSI-BLAST search with residues 131-184 of LJ0536 did not retrieve the same range of hydrolases as the search with the full-length sequence, even after the search reached convergence after six cycles. The ranked order of the results is Lactobacillus hydrolases, followed by cinnamoyl hydrolases from Bacteroides and Actinobacteria. The sixth iteration retrieved a match with Est1E from Butyrivibrio proteoclasticus. Therefore, we concluded that the α/β subdomain is unique to cinnamoyl esterases from Firmicutes, Bacteroides, and Actinobacteria.
We also performed structural similarity searches to gain insight into the conservation of the LJ0536 structure and the α/β subdomain. A structural similarity search using the Dali database [24], which calculates intramolecular distances in a structure and then compares the result with similar calculations performed on the contents of the Protein Data Bank, identified many proteins with structural homology to LJ0536. This was based on similarity of the overall α/β hydrolase fold. The top matches were Est1E from Butyrivibrio proteoclasticus (PDB: 2WTM) [20], human monoglyceride lipase (PDB: 3JW8 and 3HJU) [25], bromoperoxidase A1 from Streptomyces aureofaciens (PDB: 1A8Q) [26], and aryl esterase from Pseudomonas fluorescens (PDB: 3HI4) [27]. After structure superposition, these enzymes have sequence identities between 17% and 32%, suggesting a low level of sequence similarity, over nearly the full length of the LJ0536 structure (220-244 matching Cα atoms). Further down the list of matches, LJ0536 has structural similarity with valacyclovir hydrolase (VACVase) (PDB: 2OCG) [28], with 20% sequence identity over 221 matching Cα atoms (nearly the full length of both structures).
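Sequence identity "over matching Cα atoms" is the fraction of structurally aligned residue pairs that carry the same amino acid. The sketch below illustrates that definition on a made-up pair of aligned strings. The alignment is hypothetical and unrelated to LJ0536 or its homologs, and the function name `percent_identity` is an assumption. Other tools may use a different denominator (for example, the length of the shorter sequence).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned (non-gap) positions of two aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matched = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue                      # position not aligned in both chains
        matched += 1
        if res_a == res_b:
            identical += 1
    if matched == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / matched


# Made-up alignment for illustration: 7 aligned positions, 5 of them identical.
toy_a = "GHSLG-AKV"
toy_b = "GDSAGTA-V"

print(f"identity over aligned positions: {percent_identity(toy_a, toy_b):.1f}%")
```

Because gapped positions are excluded from the denominator, two proteins can superimpose over most of their length and still show the low identities (17% to 32%) reported here.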
Superposition of these structures showed that the enzymes are highly similar in the architecture of the esterase fold, but there is structural variation in the inserted subdomains (Fig. 5). LJ0536 and Est1E superimpose with an RMSD of 0.9 Å, indicating a high level of structural similarity over nearly the full length of both structures (231 matching Cα atoms, which include the inserted subdomains). The inserted α/β subdomain of Est1E is highly similar to that of LJ0536, as it contains the same secondary structure topology. However, the other structural homologs listed above have different topologies in their inserted subdomains. These inserted subdomains are always all α-helical, which differs from the mixed α/β insert of LJ0536. For example, while VACVase contains an inserted subdomain, it is composed of four α-helices. An optimal superimposition of LJ0536 and VACVase resulted when the inserted subdomains were excluded (RMSD of 1.6 Å over 140 matching Cα atoms).
The structural variation in the inserted α/β subdomain is reflected in substrate preferences. Est1E is indeed closely related to LJ0536, as seen by its activity towards feruloyl esters [20]. Unlike most esterases, VACVase demonstrated high specificity for amino acid esters. Since VACVase is an enzyme of interest involved in prodrug activation, we purified this enzyme and compared its activity in parallel with LJ0536 to assess their substrate preferences. Previously published results [28] were successfully reproduced, and VACVase was not active towards 4-nitrophenyl esters, ethyl ferulate, chlorogenic acid, or rosmarinic acid. In agreement with this, despite the fact that LJ0536 accepts a large range of substrates, no activity was detected when valacyclovir and L-amino acid benzyl esters were used as substrates for this enzyme.
Considering the sequence, structural, mutational, and substrate preference data, the results suggest that the architecture of the inserted α/β subdomain of cinnamoyl esterases is a unique structure that plays a role in conferring substrate specificity towards cinnamoyl/feruloyl esters.

Discussion
The hydrolytic mechanism of serine esterases is well characterized [29,30]. However, the mechanism by which this group of promiscuous enzymes recognizes their substrates is still a subject of scientific debate. Thus, the discussion of this work is focused on the analysis of the relationship between the substrates used and the amino acids that form the catalytic pocket of LJ0536.
The overall structural features of L. johnsonii cinnamoyl esterase LJ0536 resemble those recently described in Est1E, a predominant esterase encoded in the genome of Butyrivibrio proteoclasticus [20]. However, specific structural differences in the architecture of the catalytic pocket promote different substrate binding preferences. The differences in the catalytic pocket scaffolds become evident when both protein structures are superimposed. The protruding hairpins of LJ0536 at the entrance of the catalytic groove are slightly shifted (2.30 Å and 1.85 Å) with respect to Est1E. Consequently, the catalytic serine of LJ0536 is more exposed to the solvent. The inserted α/β subdomain of LJ0536 adopts the same conformation in the apo and ligand-bound structures, suggesting that this is a rigid structure. In contrast, it was suggested that the catalytic pocket of Est1E changes conformation upon substrate binding [20]. The pocket flexibility of Est1E is based on the rotation of Trp160. When ferulic acid was bound to the catalytic site, the protruding hairpins of Est1E shifted. Trp160, which is located on the second protruding hairpin, flips and creates a small hydrophobic pocket, allowing the binding of substrate. The rotation of Trp160 on this second hairpin, which corresponds to the hairpin formed between β9 and β10 in LJ0536, forms a tunnel to bury the ferulic acid in the catalytic pocket. In addition, the backbone of Leu144 forms a hydrogen bond to the 4-hydroxyl group on the aromatic ring of ferulic acid. This dynamic mechanism is likely not present in LJ0536. Trp160 of Est1E corresponds to Phe160 of LJ0536, and Phe160 adopts the same conformation in the apo structure and in each of the ligand-bound complexes. Leu144 of Est1E corresponds to Gln145 of LJ0536. The bridge-like structure formed by this residue in LJ0536, which lies on the first hairpin of the inserted α/β subdomain, is present in the same location in both apo and holo structures. Hydrogen bonding in the pocket of LJ0536 is instead formed between side chains (Asp138 and Tyr169) and the hydroxyl groups on the aromatic ring of ferulic acid and caffeic acid.
The orientation of substrates in the catalytic pocket of LJ0536 is important for catalysis. The correct conformation is acquired through interaction of the enzyme with the substituents on the aromatic ring at one end of the ligands and with the ester group at the other. These two points of interaction fix the carbonyl group near the oxyanion hole and allow bending of the molecule. Conversely, functionalities beyond the ester group, including the alkoxyl group of ethyl ferulate/ferulic acid and the quinic acid moiety of chlorogenic acid, appear not to play roles in substrate binding. This notion is supported by the fact that we did not observe electron density for the quinic acid moiety of chlorogenic acid, although the binding cleft is large enough to accommodate this group. In addition, there was relatively poor density for the alkoxyl group, perhaps due to the lack of direct interaction with the catalytic pocket. The substrate specificity depends only on the type of phenolic acid present in the ester. Binding of the leaving group is not necessary, which results in a poor electron density map for the leaving group. This hypothesis is supported by Faulds et al. [18], who showed that crystallization of an S133A AnFaeA mutant (a catalytically deficient mutant of the feruloyl esterase from Aspergillus niger) with a feruloylated trisaccharide substrate reveals only the ferulic acid moiety but not the carbohydrate moiety. A similar scenario is observed in the crystal structure of a catalytically deficient S172A FAE-XynZ mutant in complex with feruloyl arabinoxylan [31]. Only the ferulic acid is visible in the structure, even though the authors took extra precautions to avoid substrate hydrolysis during crystallization. Both studies lead to the same conclusion: the absence of the leaving group in the structure is due to the lack of interaction between the enzyme and the leaving group.
With regard to the catalytic mechanism of LJ0536, the active site is formed by the classical triad of Ser, His, and Asp [32]. The role of His225 is to deprotonate Ser106 so that Ser106 can perform a nucleophilic attack on the carbon atom of the carbonyl group of the substrate, while Asp197 stabilizes the protonated His225. An intriguing feature of LJ0536 is the presence of a second GlyXSerXGly motif, which is conserved among LJ0536 orthologs. The GlyXSerXGly motif harboring the catalytic serine was identified at position Gly104-Ser106-Gly108, while the second motif is located at Gly66-Ser68-Gly70. Even though Ser68 is exposed to the exterior of the enzyme and the two other conserved residues, His32 and Asp61, are found in the vicinity of Ser68, the orientation of these amino acids makes it impossible to form an active catalytic site. Analysis of the structure suggested that Asp61 and Ser68 form hydrogen bonds with Val14 on the β2 strand of the central core, and these interactions could play a role in maintaining the proper folding and/or structure of the enzyme. Thus, the drastic decrease in activity detected in the Ser68Ala mutant could be related to changes in the overall structure of the protein, as suggested by the circular dichroism assay (Fig. S3). We did not observe any evidence, such as internal repeated sequences, that would support gene duplications or recombination during evolution.
Structural similarity searches revealed a number of structural homologs of LJ0536. These included VACVase, a biphenyl hydrolase-like protein originally identified in human breast carcinoma and usually produced in large amounts in the liver. This enzyme was also detected in Caco-2 cells, as well as in the intestinal mucosa [28,33]. Since this protein could potentially contribute to phytophenol ester hydrolysis in the human intestine and the substrates used herein were not previously assayed, the activities of both enzymes were analyzed in parallel. Our results indicated that, despite the overall structural conservation of the enzymes, their substrate preferences are completely different. This could be due to the structural variations in the inserted subdomain, which our structural superimpositions demonstrated.
A recent review proposes a novel classification scheme for feruloyl esterases [34] based on a combination of several features, such as enzymatic activity, sequence similarity, ligand profile, and structural conservation. Since only a few ferulic acid esterases of bacterial origin have been fully characterized, the proposed classification scheme relies largely on proteins of fungal origin. Neither LJ0536 nor any of its homologs are yet included in any of the groups in this review [34]. The major differences between the enzymes herein studied and the esterases of fungal origin are related to the architecture of the catalytic pocket and substrate binding. For example, the Aspergillus niger AnFaeA (PDB: 2BJH) pocket [19] is a narrow, open cleft formed by a small loop of 23 amino acids and a short α helix. The ferulic acid is wedged between the two walls of the crevice. The hydrophobic stabilization of the aromatic ring in ferulic acid is contributed by the amino acids on one of the walls; the methoxy group decorating the benzyl ring of ferulic acid is oriented towards a small cavity composed of polar amino acids. AnFaeA, along with the other fungal structures, is clearly different from the inserted α/β subdomain of LJ0536. Consequently, based on substrate binding and the architecture of the pocket, LJ0536 together with other bacterial feruloyl esterases, such as Est1E and CinI [35], should be clustered together in a new group of feruloyl/cinnamoyl esterases.
The functional role of cinnamoyl esterases in the host is a field of great interest as this kind of enzyme can be used as an additive to increase the nutritional value of foods of a vegetal origin. The structures and observations discussed in this study suggest further exploration of feruloyl/cinnamoyl esterases in other bacterial species could reveal further structural and functional diversity.

Chemicals
All chemicals were purchased from Sigma-Aldrich unless specified otherwise. Water was purified with a Synergy® UV Millipore Water Purification System.

Cloning, Expression, and Purification of LJ0536, LJ0536 Mutants, and Human Valacyclovirase (VACVase)
The LJ0536 p15TV-L clone [14] was used as a wild-type plasmid template for the mutations. The mutants were constructed by PCR using Phusion™ high-fidelity DNA polymerase from Finnzymes according to the manufacturer's protocol. The amino acids selected for mutagenesis were replaced with alanine. The 39-nucleotide-long complementary primers containing the desired mutation in the middle of the primers were synthesized by Sigma-Aldrich®. The forward primer 5′-GGGCAACAATTGCCTATTTATGAA-3′ and reverse primer 5′-GGGTCCTTGTGATTACCTTCAA-3′ were used for PCR amplification to generate a deletion mutant of the inserted α/β subdomain (coding region from Val147 to Ala173). The resulting PCR fragment was flanked by SmaI restriction sites. It was treated with T4 DNA ligase to seal the SmaI restriction site to complete the recombinant plasmid. The PCR-amplified plasmids were treated with 20 units of DpnI at 37 °C for 2 hours to digest the methylated wild-type plasmid template. The plasmids containing the desired mutations were transformed into E. coli DH5α. Mutated sequences were confirmed by DNA sequencing. The plasmid containing the gene of interest (pET17b-VACVase) was provided by Dr. Gordon L. Amidon, University of Michigan. The gene of interest was cloned into the p15TV-L vector according to the protocol described by Lorca et al. [36] and transformed into Escherichia coli DH5α. The sequence was confirmed by DNA sequencing. The expression of His6-tagged proteins was carried out using E. coli BL21-DE3 (Stratagene) with 1 mM IPTG (isopropyl β-D-thiogalactopyranoside) as the inducer.
Cells were disrupted by French Press and proteins were purified by nickel affinity chromatography as previously described [14] with the following modification: HEPES-based buffer (pH 7.50) instead of Tris-HCl-based buffer was used throughout the purification process to improve the total yield of proteins. For removal of the His6-tag for enzymatic studies, purified proteins were incubated with TEV protease (60 μg TEV protease per 1 mg of target protein) at 4 °C for 16 hours. The His6-tag was removed by passing the sample through a nickel affinity chromatography column. Collected proteins were dialyzed at 4 °C against a buffer containing 50 mM HEPES pH 7.50, 500 mM NaCl, and 1 mM DTT (dithiothreitol) for 16 hours. Final protein products were flash-frozen and preserved in small aliquots at −80 °C until use.

Crystallization, Data Collection, and Structure Solution
All proteins (with the His6-tag uncleaved) were crystallized using the sitting drop method with Intelliplate 96-well plates and a Mosquito Crystal liquid handling robot (TTP LabTech), mixing 0.5 μL of protein at 15 mg/mL with 0.5 μL of reservoir solution, over 100 μL of reservoir solution. The protein solutions were pre-treated with the proteases subtilisin and V8 for the apo and S106A mutant forms of the enzyme, respectively; the proteases were stored as 1 mg/mL stock solutions and added to a final 1:10 vol/vol protease:protein ratio. Successful crystallization required the presence of the different proteases, a technique often used to increase the success of crystallization by removing disordered/flexible regions that would disrupt crystal formation [37].
Ligands were co-crystallized at a final ligand concentration of 5 mM (25 mM for S106A ethyl ferulate Form II) in the sitting drop, by diluting a stock solution of 100 mM ligand 1:20 v/v with the protein/protease mix; 0.5 μL of this new solution was mixed with 0.5 μL of reservoir solution for crystallization.
All crystals were cryo-protected with reservoir solution supplemented with paratone-N oil [39] prior to flash freezing in an Oxford Cryosystems cryostream. Diffraction data at 100 K at the Cu-Kα wavelength were collected at the Structural Genomics Consortium using a Rigaku FR-E Superbright rotating anode with a Rigaku R-AXIS HTC detector. Diffraction data were reduced with HKL2000 [40].
The LJ0536 apo structure was solved by Molecular Replacement (MR) using Phaser [41], with a poly-alanine form of the structure of feruloyl esterase (Est1E, PDB: 2WTM) from Butyrivibrio proteoclasticus [20] as a search model. The successful MR solution was identified by map inspection using Coot [42] and by a decrease in Rfree after refinement using Refmac [43]. The structure was fully built by manual building and rounds of refinement with Refmac, Phenix.refine [44], and, at the final stages, Buster [45]. Anisotropic B-factors were refined for protein and ligand atoms for all structures. Non-crystallographic symmetry (NCS) restraints were not utilized for any structure. All structures were refined using TLS parameterization (TLS groups were the N-terminal residue to residue 179, and residue 180 to the C-terminal residue), as assigned by the TLSMD server [46]. Addition of TLS restraints resulted in lower R and Rfree values. Water molecules were added automatically using the refinement program applied to each structure (Phenix.refine, Refmac/CCP4/ARP/wARP, or BUSTER). Ions were added after the automatic water building by inspection of the magnitude of residual Fo-Fc density and hydrogen bonding patterns. The final atomic model includes residues 1-245 of LJ0536, with six atoms from the expression tag at the N-terminus of one chain of the asymmetric unit.
The LJ0536 S106A structure was solved by MR using the apo structure. All ligands were identified by the presence of residual Fo-Fc density in the active site of the enzyme after molecular replacement using the apo S106A enzyme. Refinement of ligand structures was executed with geometric restraints generated by the PRODRG server [47] and with a combination of Refmac and/or Phenix.refine. Final validation of the structure of the ligands was performed by calculating simulated annealing omit Fo-Fc maps using Phenix.refine and Cartesian simulated annealing with default parameters, after removing atoms from the ligand and any protein atoms within 5 Å of the ligand atoms.
In the LJ0536 S106A+ethyl ferulate Form I complex (two chains in the asymmetric unit), one ligand was modeled with an occupancy of 1.0 and the other with a manually assigned occupancy of 0.55 (due to lower quality electron density, and higher B-factors than nearby protein atoms, at higher occupancy levels). For the ethyl ferulate Form II complex (one chain in the asymmetric unit), the ligand was modeled with an occupancy of 1.0. All other ligands in their respective complexes were modeled with occupancies of 1.0.
All structures were refined until convergence of Rwork and Rfree values, and reasonable geometries were verified using the Procheck [48] and Molprobity [49] servers.

Structural Analysis
Protein-protein interaction interfaces were identified and analyzed with the PDBe PISA server [50] with default settings; a residue is considered to be in an interface if its accessible surface area changes (i.e., the difference is non-zero) between chain A alone and chain A in complex with chain B. All structural images were generated using PyMOL [51]. Structure similarity searches were performed using the Dali database [24].
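The interface criterion stated above can be sketched in a few lines of Python. This is only an illustration of the selection rule, assuming per-residue accessible surface areas (in Å²) have already been computed for the isolated chain and for the same chain within the dimer; the function name `interface_residues`, the residue keys, the small `tolerance` guarding against floating-point noise, and all numerical values are hypothetical and do not reproduce the PISA implementation or any data from this study.

```python
def interface_residues(asa_free, asa_complex, tolerance=0.0):
    """Return {residue: buried area} for residues that lose accessible
    surface area when the chain is placed in the complex.

    asa_free    -- per-residue ASA of the isolated chain (A^2)
    asa_complex -- per-residue ASA of the same chain in the complex (A^2)
    tolerance   -- hypothetical cutoff; 0.0 means any positive loss counts
    """
    buried = {}
    for residue, free_area in asa_free.items():
        # Residues missing from the complex table are treated as unchanged.
        delta = free_area - asa_complex.get(residue, free_area)
        if delta > tolerance:
            buried[residue] = delta
    return buried


# Hypothetical values for three residues of a chain "A" (illustration only).
asa_alone = {("A", 10): 45.0, ("A", 52): 80.5, ("A", 90): 12.0}
asa_in_dimer = {("A", 10): 45.0, ("A", 52): 20.5, ("A", 90): 0.0}

print(interface_residues(asa_alone, asa_in_dimer))
```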

Enzymatic Assays
The esterase activities with aliphatic (4-nitrophenyl butyrate) and aromatic (ethyl ferulate, chlorogenic acid, and rosmarinic acid) substrates were measured spectrophotometrically using a Synergy HT Biotek Reader at 412 nm for aliphatic esters and 324 nm for aromatic esters. A typical reaction mixture contained 20 mM HEPES pH 7.80, 0.1 mM ester substrate, and 0.3 mg/mL purified enzyme (ethyl ferulate and chlorogenic acid) or 3 mg/mL purified enzyme (4-nitrophenyl butyrate and rosmarinic acid). The reactions were carried out at 25 °C for mutated and wild-type LJ0536 and at 37 °C for VACVase. The extinction coefficients of 4-nitrophenyl butyrate (16300 M⁻¹ cm⁻¹), ethyl ferulate (15390 M⁻¹ cm⁻¹), chlorogenic acid (26322 M⁻¹ cm⁻¹), and rosmarinic acid (15670 M⁻¹ cm⁻¹) were used to determine the amount of substrate hydrolyzed. The assay was performed in triplicate and the percentage activity was calculated based on the average specific activity. Saturation kinetic assays were performed using the classical model substrate 4-nitrophenyl butyrate under the conditions previously described [14]. The kinetic parameters were estimated by non-linear fitting using Origin software (OriginLab, Northampton, MA).
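The conversion from an absorbance change to the amount of substrate hydrolyzed, and from triplicate specific activities to a percentage activity, can be sketched as follows. This is a minimal illustration based on the Beer-Lambert relation and the extinction coefficients listed above; the function names, the 1 cm default path length (the effective path length in a microplate reader depends on the well volume), and the example numbers are assumptions, not values or procedures reported in this study.

```python
from statistics import mean

# Extinction coefficients (M^-1 cm^-1) as listed in the text.
EXTINCTION = {
    "4-nitrophenyl butyrate": 16300,
    "ethyl ferulate": 15390,
    "chlorogenic acid": 26322,
    "rosmarinic acid": 15670,
}


def substrate_hydrolyzed_molar(delta_absorbance, substrate, path_length_cm=1.0):
    """Beer-Lambert: concentration change (M) = |dA| / (epsilon * l).

    path_length_cm=1.0 is an assumed default, not a value from the paper.
    """
    return abs(delta_absorbance) / (EXTINCTION[substrate] * path_length_cm)


def percent_activity(sample_activities, reference_activities):
    """Percentage activity from the average specific activity of replicates."""
    return 100.0 * mean(sample_activities) / mean(reference_activities)


# Hypothetical example: an absorbance change of 0.1539 with ethyl ferulate.
print(substrate_hydrolyzed_molar(0.1539, "ethyl ferulate"))

# Hypothetical triplicates (arbitrary units) for a mutant and the wild type.
print(percent_activity([1.0, 2.0, 3.0], [4.0, 4.0, 4.0]))
```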

High Performance Liquid Chromatography (HPLC) Analysis
HPLC assays were performed using the Hitachi HPLC L-2000 series system with a Symmetry® C18 5 μm, 3.9 mm × 150 mm reversed-phase column and a Symmetry® C18 5 μm guard column. The determination of esterase activity using valacyclovir as the substrate was carried out as described by Lai et al. [28]. The reaction mixture contained 50 mM HEPES pH 7.80, 4 mM valacyclovir, and 10 mg/mL enzyme. The determination of esterase activities using ethyl ferulate, chlorogenic acid, and rosmarinic acid was carried out using linear gradient elution with water/acetic acid/1-butanol (350:1:7, vol:vol:vol) and methanol at a flow rate of 1 mL/min as described by Mastihuba et al. [52]. The reaction mixture contained 20 mM HEPES pH 7.80, 1 mM ester substrate, and 20 mg/mL enzyme.

Circular Dichroism Assay
Protein secondary structure was estimated using an AVIV Quick Start 215 Circular Dichroism Spectrometer with a Hellma #110 Series 0.1 cm quartz cuvette. Protein samples at a concentration of 1.1 mg/mL were dialyzed against 2 mM HEPES buffer with 50 mM NaCl for 16 hours. Protein samples were adjusted to 0.2 mg/mL in 0.5 mM HEPES buffer with 10 mM NaCl after dialysis. Spectra were acquired at 1 nm intervals and averaged over 10 scans. Scans with buffer alone were used for background correction. The final spectra were expressed in molar ellipticity (ME) using the formula ME = θ/(10·n·C·l), where θ is the signal acquired, n is the number of residues, C is the molar concentration of protein, and l is the pathlength of the cuvette.
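The conversion to molar ellipticity described above (the acquired signal divided by 10, the number of residues, the molar concentration, and the pathlength) can be written as a short Python function. This is a sketch under stated assumptions: the signal is taken to be in millidegrees after buffer subtraction, the concentration in mol/L, and the pathlength in cm; the function names are hypothetical, and the numbers in the example are invented for illustration and are not data from this study.

```python
def background_corrected(sample_scans, buffer_scans):
    """Average replicate scans at one wavelength and subtract the buffer signal."""
    sample_mean = sum(sample_scans) / len(sample_scans)
    buffer_mean = sum(buffer_scans) / len(buffer_scans)
    return sample_mean - buffer_mean


def molar_ellipticity(theta, n_residues, molar_conc, path_length_cm):
    """ME = theta / (10 * n * C * l), following the formula in the text.

    theta          -- background-corrected signal (assumed millidegrees)
    n_residues     -- number of residues in the protein
    molar_conc     -- protein concentration in mol/L
    path_length_cm -- cuvette pathlength in cm (0.1 cm cuvette in the text)
    """
    return theta / (10.0 * n_residues * molar_conc * path_length_cm)


# Hypothetical example values (not from the paper) at a single wavelength.
theta = background_corrected([-21.0, -20.5, -20.0], [-0.5, -0.5, -0.5])
print(molar_ellipticity(theta, n_residues=250, molar_conc=8e-6, path_length_cm=0.1))
```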

Multiple Sequence Alignments of LJ0536 Structural Homologs
A structural similarity search was performed using the Dali server to identify proteins with structural homology. The sequences of structural homologs were retrieved from the NCBI database [22]. Multiple sequence alignment was performed using CLUSTAL X2 [53].

Figure S4 Orientation of catalytic residues and structural superimposition of LJ0536 S106A co-crystallized with ferulic acid (FA) and ethyl ferulate (EF). (A). Cartoon representation of LJ0536 S106A co-crystallized with ferulic acid and ethyl ferulate. LJ0536 S106A with ferulic acid bound is colored in light violet. LJ0536 S106A with ethyl ferulate bound is colored in yellow. Water molecules are represented by red spheres. The 4-hydroxyl group on the phenolic ring of ferulic acid and ethyl ferulate is hydrogen bonded with Asp138 in order to orient the phenolic ring in the correct position. Additional polar interactions of the 4-hydroxyl and 3-methoxy groups with water molecules further stabilize the binding of substrate. The oxyanion hole is formed by Phe34 and Gln107. Gln145 positions a water molecule adjacent to the ester bond of the substrate, which might be involved in the activation of Ser106. (B). Cutaway view of the LJ0536 S106A surface representation showing the phenolic ring binding pocket and the leaving groove. (TIF)