Crystal Structure of a Novel Esterase Rv0045c from Mycobacterium tuberculosis

There are at least 250 enzymes in Mycobacterium tuberculosis (M. tuberculosis) involved in lipid metabolism. Some of these enzymes are required for bacterial survival and full virulence. The esterase Rv0045c shares little amino acid sequence similarity with other members of the esterase/lipase family. Here, we report the 3D structure of Rv0045c. Our studies demonstrate that Rv0045c is a novel member of the α/β hydrolase fold family. The structure of Rv0045c contains two distinct domains: the α/β fold domain and the cap domain. The active site of Rv0045c is highly conserved and comprises two residues, Ser154 and His309. We propose that Rv0045c probably employs two kinds of enzymatic mechanisms when hydrolyzing C-O ester bonds within substrates. The structure provides insight into the hydrolysis mechanism of the C-O ester bond and will be helpful in understanding ester/lipid metabolism in M. tuberculosis.


Introduction
M. tuberculosis is the most prevalent pathogen causing tuberculosis in humans and animals [1]. The bacterium is characterized by an unusual waxy coating on the cell surface (primarily mycolic acid), and it expresses more than 250 enzymes related to ester/lipid metabolism. In contrast, only about 50 enzymes are involved in ester/lipid metabolism in Escherichia coli (E. coli) [2,3]. The M. tuberculosis enzymes that catalyze ester/lipid and carbohydrate metabolism are more likely to be essential for bacterial survival [4]. In 2007, a cell wall-associated carboxyl esterase, Rv2224c, of M. tuberculosis H37Rv was identified as a major virulence gene and was further found to be required for bacterial survival in mice [5]. Rv0045c, which participates in ester/lipid metabolism in M. tuberculosis, was predicted to be a hydrolase belonging to the α/β hydrolase fold family based on bioinformatics studies. However, little is known about its substrate specificity and mechanism of action.
The α/β hydrolase fold was identified in 1992 by comparing five hydrolytic enzymes with widely different catalytic functions [6]. Since then, more than 50 members of this family have been identified and characterized by structure determination [7]. The α/β hydrolase fold is found in a variety of enzymes, including esterases, lipases, epoxide hydrolases, dehalogenases, proteases, and peroxidases, making it one of the most versatile protein families known [8]. The conserved feature of the α/β hydrolase fold has been described as a mostly parallel, eight-stranded β-sheet surrounded on both sides by α-helices (only the second β-strand is antiparallel) [9][10][11].
The Rv0045c gene encodes a polypeptide chain of 298 amino acids with putative hydrolase activity. Sequence comparisons show that Rv0045c shares low sequence identity (<30%) with other members of the α/β hydrolase fold family; however, the consensus sequence G-X-S-X-G of the nucleophile elbow and the catalytic residues are highly conserved. As with other α/β hydrolases, Rv0045c has previously been shown to hydrolyze ester bonds within a series of p-nitrophenyl derivatives (C2-C14) [12]. The purified enzyme effectively hydrolyzes p-nitrophenyl derivatives with short hydrocarbon chains, especially C2-C8. We identified p-nitrophenyl caproate (C6) as the most suitable substrate of Rv0045c under assay conditions of 39 °C and pH 8.0 [12].
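The G-X-S-X-G consensus described above is simple enough to locate with a pattern search. The following is a minimal sketch, assuming a one-letter amino acid string as input; the function name and the toy fragment are hypothetical and are not taken from the Rv0045c sequence.

```python
import re

# G-X-S-X-G nucleophile-elbow consensus, where X is any residue.
# The lookahead allows overlapping matches to be reported.
NUCLEOPHILE_ELBOW = re.compile(r"(?=(G.S.G))")


def find_nucleophile_elbows(sequence, first_residue_number=1):
    """Return (serine_position, motif) for every G-X-S-X-G match.

    serine_position is the residue number of the central serine,
    counted from first_residue_number (1 by default).
    """
    hits = []
    for match in NUCLEOPHILE_ELBOW.finditer(sequence.upper()):
        start = match.start(1)            # 0-based index of the first Gly
        serine_index = start + 2          # the Ser is the third motif residue
        hits.append((serine_index + first_residue_number, match.group(1)))
    return hits


# Hypothetical fragment for illustration only; NOT the Rv0045c sequence.
toy_fragment = "MKLVAGHSLGGAIA"
print(find_nucleophile_elbows(toy_fragment))
```

A search of this kind only flags candidate motifs. Whether a matched serine is the catalytic nucleophile still has to be judged from alignment and structure, as done in the text.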
To understand the active site and enzymatic mechanism of esterase Rv0045c, we determined the crystal structure of the enzyme and performed docking experiments. Our studies revealed that 1) Rv0045c contains two distinct domains, the α/β fold domain and the cap domain; 2) Rv0045c from M. tuberculosis is a novel member of the α/β hydrolase fold family; and 3) Rv0045c probably employs two kinds of enzymatic mechanisms (indirect and/or direct), in which S154 attacks the carbonyl carbon within the C-O ester bond with or without the help of an activated water molecule.

Structure determination and features of Rv0045c
The purified Rv0045c protein and selenomethionine (Se-Met) labeled Rv0045c protein were crystallized in the same condition (0.2 M MgCl2, 100 mM imidazole, pH 7.0, 19% (w/v) PEG 4000). However, the two crystals belonged to different space groups (Table 1). The crystal structure of Rv0045c was determined by SAD and was refined to 2.8 Å resolution. The final model (Fig. 1) of Rv0045c consists of residues 38-193 and 205-329; the missing residues are not visible in the density maps. Analysis of the Ramachandran plot with COOT [13] showed that most of the modeled residues were in preferred and allowed regions (Table 1). The model clearly contains two distinct structural domains: an almost globular α/β fold domain (α1β1β2β3α2α3β4α4β5α5β6α6-α9β9α10β10α11α12) and an inserted cap domain (α7α8β7β8) which interacts with the α/β fold domain.
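As a quick consistency check on the residue ranges quoted for the final model, the number of modeled residues and the extent of the unmodeled gap can be counted directly. This is a small illustrative sketch; the helper name is hypothetical, and only the two ranges come from the text.

```python
def residues_in_ranges(ranges):
    """Count residues covered by inclusive (start, end) residue ranges."""
    return sum(end - start + 1 for start, end in ranges)


# Residue ranges of the final model as stated in the text.
modeled_ranges = [(38, 193), (205, 329)]

n_modeled = residues_in_ranges(modeled_ranges)

# The internal gap lies between the end of the first range
# and the start of the second range.
gap = (modeled_ranges[0][1] + 1, modeled_ranges[1][0] - 1)
n_gap = gap[1] - gap[0] + 1

print(f"modeled residues: {n_modeled}")
print(f"unmodeled internal gap: {gap[0]}-{gap[1]} ({n_gap} residues)")
```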

Structure of the α/β fold domain
Like other members of the α/β hydrolase fold family, the α/β fold domain represents the core of Rv0045c (Fig. 1A and 1B). The α/β fold domain of Rv0045c consists of a mostly parallel, 8-stranded β-sheet surrounded by α-helices on both sides (only the second strand is antiparallel), which has been regarded as the "canonical" feature of the α/β hydrolase fold family [8]. The last strand is oriented with a twisting angle of approximately 120° to the first one (Fig. 1B). The topology of the β-α-β motifs (β3-α2-α3-β4, β5-α5-β6 and β9-α10-β10) in the centre displays a right-handed superhelical twist (Fig. 1C). The α/β hydrolase fold domain provides the stable scaffold for the active site of Rv0045c. Sequence alignments revealed that the "nucleophile elbow" with the G-X-S-X-G sequence motif is located in the sharp turn connecting β5 and α5 (Fig. 1C and 2) and is highly conserved among these enzymes (Fig. 2), although Rv0045c shows no significant sequence homology to any other α/β hydrolase fold family member.

Structure of the cap domain
The polypeptide region Arg205-Ile252 in Rv0045c forms the cap domain. The cap domain comprises two sequential α-helices (α7, α8) followed by two consecutive β-strands (β7, β8). Unlike the α/β fold domain, structural homologies of the cap domain cannot be unambiguously identified among the superposed α/β hydrolase fold family members (Fig. 3A and 3B). The alignment and orientation of the α-helices and β-strands within the cap domain differ slightly among these enzymes. The inserted cap domain is thought to be involved in substrate binding in both E-2AMS hydrolase [14] and esterase ybfF [15] and may provide clues about the substrate specificity of these two enzymes; however, no contribution from the cap domain of Rv0045c was observed when p-nitrophenyl caproate was docked into the active site (Fig. 4A and 4B). Residues 194-204 are missing from the model because this region is highly flexible and shows very poor electron density.

Active site of Rv0045c
The putative active site of Rv0045c was identified via sequence alignment (Fig. 2) and structural homology (Fig. 3) with other α/β fold hydrolases. The active site formed by Ser154 and His309 is shielded by the cap domain. The putative nucleophilic residue Ser154 is located at the beginning of α5. The docking results indicated that Gly90, Gln92, Leu155, Ile252 and Phe255 help position p-nitrophenyl caproate in the active site, and that these residues comprise the binding site of Rv0045c (Fig. 4B). Three hydrophobic residues, Leu155, Ile252 and Phe255, contribute to the stable conformation of the hydrocarbon chain of p-nitrophenyl caproate. The substrate is further stabilized by two hydrogen bonds contributed by Gly90 and Gln92 (Fig. 4B, blue dotted line). The residues forming the active site and the binding site together shape the active groove (Fig. 4C), which can readily accommodate p-nitrophenyl caproate (Fig. 4D).

Discussion
The α/β hydrolase fold family has been structurally well characterized and comprises a variety of enzymes, including esterases, lipases, epoxide hydrolases, dehalogenases, proteases, and peroxidases, that catalyze myriad reactions. Analysis of the primary sequence of Rv0045c using BLAST suggested that this enzyme shares little sequence identity with other members of the α/β hydrolase fold family, although the enzyme was structurally characterized as a novel member of the family. A DALI [16] search was performed using the structure of Rv0045c, and the results confirmed that Rv0045c shows little sequence identity but high structural similarity to other members of the α/β hydrolase fold family. Related members of this family are shown in Table 2, and their similarity to Rv0045c is presented by Z score, rmsd, identity and number of aligned residues [14,15,17-19]. Superposition of Rv0045c with E-2AMS hydrolase and esterase ybfF showed that the cores of the three enzymes, each composed of an eight-stranded β-sheet with α-helices on both sides, overlap with each other (Fig. 3A and 3B).
The members of the α/β hydrolase fold family utilize a highly conserved catalytic nucleophile, which is a serine, cysteine or aspartic acid residue [8]. The nucleophile of Rv0045c is Ser154, positioned as the first residue at the beginning of α5. The active site of Rv0045c (Ser154 and His309) identified by sequence alignment is highly conserved among the enzymes aligned. In both E-2AMS hydrolase and esterase ybfF, the cap domain directly contributes to substrate binding, which is not observed in Rv0045c when p-nitrophenyl caproate is docked into the active site. As shown in the docking experiment, when a small substrate such as p-nitrophenyl caproate is bound, the cap domain is not involved in binding the substrate to the protein. The binding site is located on the surface of the α/β fold domain.
Three hydrophobic residues (Leu155 in α5, Ile252 and Phe255 after β8) help the hydrocarbon chain of p-nitrophenyl caproate adopt an optimum conformation that lowers the binding energy. The orientation of the C-O ester bond of p-nitrophenyl caproate is stabilized via two hydrogen bonds contributed by Gly90 and Gln92 after β3. Ser is already known to be the catalytic residue in both E-2AMS hydrolase and esterase ybfF [14,15]. To confirm the activity of Ser154 within Rv0045c, we generated a mutant of this enzyme, and no activity could be detected (data not shown).
A previous study of the biochemical activity of Rv0045c suggested that the enzyme can hydrolyze the ester bond of p-nitrophenyl derivatives, and p-nitrophenyl caproate was identified as the most effective substrate [12]. As an esterase, Rv0045c can hydrolyze the C-O ester bond of p-nitrophenyl caproate to produce p-nitrophenol and caproic acid (Fig. 5A). In the model of Rv0045c binding p-nitrophenyl caproate, the hydroxyl oxygen of Ser154 is 3.2 Å (purple dotted line, Fig. 4B) from the carbonyl carbon of the C-O ester bond of the substrate. Indirect and direct enzymatic mechanisms of Rv0045c can therefore be hypothesized (Fig. 5B). It is probable that Ser154 interacts with the C-O ester bond indirectly, using an activated water molecule (Mechanism 1, Fig. 5B), because the distance (3.2 Å) is too long for Ser154 to directly attack the carbonyl carbon within the C-O ester bond. As in the mechanism proposed in the model of E-2AMS hydrolase [14], some small molecules, for instance water molecules, would have to mediate the hydrolysis reaction. In detail, the hydroxyl oxygen of Ser154 is first polarized by the adjacent His309 before Ser154 attacks a hydrogen atom of a free water molecule; the activated water molecule then attacks the carbonyl carbon within the C-O ester bond.
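The 3.2 Å separation quoted above is an interatomic distance measured in the docked model. A minimal sketch of such a measurement is given here, assuming Cartesian atom coordinates in Å; the coordinates are hypothetical values chosen only so that the separation equals 3.2 Å, and they are not taken from the Rv0045c model.

```python
import math


def distance(p, q):
    """Euclidean distance between two 3D points given in Å."""
    return math.dist(p, q)


# Hypothetical coordinates (Å) for illustration only; NOT from the structure.
ser154_og = (10.00, 5.00, 2.00)    # hydroxyl oxygen of the serine side chain
carbonyl_c = (11.92, 7.56, 2.00)   # carbonyl carbon of the ester bond

d = distance(ser154_og, carbonyl_c)
print(f"Ser154 OG to carbonyl C: {d:.2f} Å")
```

The same function could be applied to coordinates read from a docked complex to check whether a conformational change brings the serine oxygen closer to the carbonyl carbon.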
However, it cannot be ruled out that the binding of substrate to Rv0045c causes a conformational change in the enzyme. In that case, Ser154 might be close enough to directly attack the carbonyl carbon within the C-O ester bond, and the enzyme would employ a direct mechanism (Mechanism 2, Fig. 5B). Rv0045c can catalyze the hydrolysis of a number of substrates with hydrocarbon chains of different lengths. We infer that Rv0045c may adopt different enzymatic mechanisms (direct and/or indirect) when binding different substrates. We have attempted cocrystallization with ligands; however, no Rv0045c-substrate complex has been successfully crystallized so far. We will continue to seek solvable crystals of the Rv0045c-substrate complex to clarify the catalytic mechanism of Rv0045c.
Tuberculosis is a contagious respiratory disease caused by M. tuberculosis infecting the lungs of mammals. M. tuberculosis can withstand harsh conditions and weak disinfectants and can survive in a dry state for weeks. It was reported that the unusual cell wall, rich in lipids, is likely responsible for this resistance [20]. Rv0045c is proposed to be an esterase or hydrolase involved in lipid metabolism. Our study determines the structure of Rv0045c for the first time and gives further insight into the mechanism of ester or lipid hydrolysis in M. tuberculosis. This work will help in designing and screening inhibitors against Rv0045c to verify the function and role of this enzyme in M. tuberculosis.

Protein preparation
The expression construct was generated using a standard PCR procedure. Full-length Rv0045c was sub-cloned into the pET28a vector (Invitrogen). Protein expression was induced with 0.3 mM IPTG and carried out at 16 °C for 20 h in the E. coli BL21 (DE3) strain (Novagen). The soluble fraction of Rv0045c from the cell lysate was purified to homogeneity on Ni Sepharose™ 6 Fast Flow resin (GE Healthcare) and further polished by ion-exchange chromatography (Resource Q and S 1 mL, GE Healthcare) and gel filtration chromatography (Superdex 75 10/300 GL, GE Healthcare). Se-Met labeled Rv0045c was produced by growing the E. coli cells in a minimal medium containing selenomethionine and purified in the same way as described above.

Crystallization and data collection
The diffracting crystals of native and Se-Met labeled Rv0045c were grown at 16 °C using the hanging-drop vapor-diffusion method by mixing 1 µL protein (5 mg/mL) with an equal volume of reservoir solution. The Crystal Screen kit I and Crystal Screen kit II of Hampton Research (Aliso Viejo, CA, USA) were used for preliminary screening. Both the native and Se-Met labeled Rv0045c were crystallized in the same condition, consisting of 0.2 M MgCl2, 100 mM Tris-HCl pH 8.5, 30% (w/v) PEG4000, but in different space groups. The native crystals belong to space group P3₁, with unit cell parameters a = b = 73.465 Å, c = 48.063 Å, and the Se-Met labeled crystals to P3₁21, with a = b = 130.330 Å, c = 48.785 Å. For data collection, 20% (v/v) glycerol was added to the crystallizing precipitant as a cryoprotectant and the crystals were flash frozen in a -173 °C nitrogen-gas stream. A complete 2.8 Å native dataset and a complete 2.6 Å Se-Met MAD dataset were collected on beamline BL17U at the Shanghai Synchrotron Radiation Facility (SSRF, Shanghai, China) and beamline BL17A at the Photon Factory (Tsukuba, Japan), respectively, and processed using the HKL-2000 program package [21].
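The unit cell parameters reported for the two crystal forms can be turned into cell volumes with the general formula V = abc·sqrt(1 - cos²α - cos²β - cos²γ + 2·cosα·cosβ·cosγ). The sketch below assumes the conventional hexagonal axes for the trigonal space groups (α = β = 90°, γ = 120°); these angles are the standard setting and are not quoted explicitly in the text, and the function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in Å^3 from edge lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Cell edges from the text; gamma = 120 degrees is the assumed hexagonal setting.
v_native = unit_cell_volume(73.465, 73.465, 48.063, gamma=120.0)
v_semet = unit_cell_volume(130.330, 130.330, 48.785, gamma=120.0)

print(f"native cell volume: {v_native:.0f} Å^3")
print(f"Se-Met cell volume: {v_semet:.0f} Å^3")
print(f"volume ratio (Se-Met / native): {v_semet / v_native:.2f}")
```

Comparing the two volumes gives a feel for how different the packing of the two crystal forms is, even though both grew from the same condition.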

Structure determination
The structure of Rv0045c was determined by single-wavelength anomalous dispersion (SAD). Selenium atom coordinates were determined using the HKL2MAP [22] program suite, and initial SAD phases were calculated and improved with the program SOLVE/RESOLVE [23,24]. The residues of Rv0045c were built manually using the program COOT [13], and refinement was performed with CCP4 refmac5 [25]. The Rv0045c crystal structure was refined to 2.8 Å resolution, with working and free R factors of 21.69% and 28.57%, respectively. The PyMOL molecular graphics program (http://www.pymol.org) of DeLano Scientific was used to present the final structure and to produce figures. The data statistics are summarized in Table 1.
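The working and free R factors quoted above follow the conventional definition R = Σ||Fobs| - k|Fcalc|| / Σ|Fobs|, where the free R factor is evaluated over a test set of reflections excluded from refinement. The sketch below illustrates that definition on toy amplitudes; the function name, the scale factor default and the numbers are hypothetical and do not reproduce how refmac5 computes its statistics internally.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional crystallographic R factor for paired amplitudes.

    f_obs, f_calc: sequences of observed and calculated structure-factor
    amplitudes for the same reflections. scale: factor k applied to f_calc.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - scale * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Toy amplitudes for illustration only; NOT data from this study.
work_obs, work_calc = [100.0, 50.0, 25.0], [90.0, 55.0, 25.0]
free_obs, free_calc = [80.0, 20.0], [60.0, 25.0]

r_work = r_factor(work_obs, work_calc)
r_free = r_factor(free_obs, free_calc)
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
```

Because the test-set reflections are not used in refinement, the free R factor is normally somewhat higher than the working R factor, as is the case for the values reported here.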

Docking experiment
For the docking experiment, the AutoDockTool [26][27][28] software was used for macromolecule and ligand preparation, macromolecule-ligand docking and result analysis. The nitro group and hydrocarbon chain of p-nitrophenyl caproate were allowed to rotate until a favorable docking position and conformation were found. The docking did not require reorientation of the macromolecule side chains.