Crystal Structure of the N-Acetylmannosamine Kinase Domain of GNE

Background
UDP-GlcNAc 2-epimerase/ManNAc 6-kinase, GNE, is a bi-functional enzyme that plays a key role in sialic acid biosynthesis. Mutations of the GNE protein cause sialuria or autosomal recessive inclusion body myopathy/Nonaka myopathy. GNE is the only human protein that contains a kinase domain belonging to the ROK (repressor, ORF, kinase) family.
Principal Findings
We solved the structure of the GNE kinase domain in the ligand-free state. The protein exists predominantly as a dimer in solution, with small populations of monomer and higher-order oligomer in equilibrium with the dimer. Crystal packing analysis reveals the existence of a crystallographic hexamer, and shows that the kinase domain dimerizes through the C-lobe subdomain. Mapping of disease-related missense mutations onto the kinase domain structure revealed that the mutation sites can be classified into four groups based on their location: dimer interface, interlobar helices, protein surface, or within other secondary structural elements.
Conclusions
The crystal structure of the kinase domain of GNE provides a structural basis for understanding disease-causing mutations and a model of the hexameric wild type full length enzyme.
Enhanced Version
This article can also be viewed as an enhanced version in which the text of the article is integrated with interactive 3D representations and animated transitions. Please note that a web plugin is required to access this enhanced functionality. Instructions for the installation and use of the web plugin are available in Text S1.


Introduction
Sialic acids are N- or O-substituted terminal monosaccharides with a nine-carbon backbone that are highly expressed on eukaryotic cell surfaces [1]. Sialylation of glycoproteins and glycolipids modulates a wide range of biological and pathological events including early development [2], tumorigenesis [3], viral and bacterial infection, and immunity [4,5]. In vertebrate systems, N-acetylneuraminic acid (Neu5Ac) is the metabolic precursor of all known naturally occurring sialic acids [6]. Neu5Ac is synthesized in the cytosol from UDP-N-acetylglucosamine (UDP-GlcNAc) by four consecutive reactions; UDP-GlcNAc is a derivative of fructose-6-phosphate and the end-product of the hexosamine biosynthesis pathway (Figure 1).
The first two steps of the biosynthesis of Neu5Ac from UDP-GlcNAc are catalyzed by the bi-functional enzyme UDP-GlcNAc 2-epimerase/N-acetylmannosamine kinase (GNE). GNE contains an N-terminal epimerase domain and a C-terminal kinase domain [7]. The epimerase domain converts UDP-GlcNAc to N-acetylmannosamine (ManNAc), which is then phosphorylated at the 6 position by the kinase domain. GNE is feedback-inhibited by the activated form of Neu5Ac, i.e., cytidine-monophosphate N-acetylneuraminic acid (CMP-Neu5Ac). The kinase domain belongs to the ROK (Repressor, ORF, Kinase) family. The ROK family consists of a set of bacterial proteins that include repressors for sugar catabolic operons, and sugar kinases [8]. GNE is the only known gene in the entire human genome that encodes a ROK domain-containing protein.
Three protein isoforms have been described for human GNE, where isoform 1 is ubiquitously expressed and is believed to be responsible for the basic supply of sialic acids. Isoforms 2 and 3 are generated by alternative splicing and show tissue-specific expression patterns. Isoforms 2 and 3 have reduced epimerase activities but almost intact kinase activities and may fine-tune the production of sialic acids [9]. Wild type GNE forms a homohexamer in solution [10], and allosteric regulation of the epimerase and kinase activities of GNE is important for the normal function of the protein [10,11]. Mutations in the epimerase domain lead to the rare congenital metabolic disorder sialuria, which results in the production of high levels of Neu5Ac due to loss of the allosteric feedback control of the UDP-GlcNAc 2-epimerase activity by CMP-Neu5Ac [12]. Late onset autosomal recessive inclusion body myopathy, which is also known as hereditary inclusion body myopathy (hereinafter referred to as HIBM), and allelic Nonaka myopathy are neuromuscular disorders that are caused by a number of different mutations within the gne gene. The mutations are located in either the epimerase domain or the kinase domain [13] and lead to hypoactivity of the enzyme [11]. Mutagenesis and enzymatic activity analysis revealed that the activities of the epimerase domain and the kinase domain are interrelated such that a single mutation in one domain could affect the activities of both domains [11]. Here, we solved the structure of the dimeric GNE kinase domain in the ligand-free state. The structure reveals the dimerization interface of the kinase domain and also suggests a possible hexameric assembly of the protein. Furthermore, the structure provides insights into the relationship between GNE mutations and GNE-related metabolic disorders.

Results and Discussion
Overview of the GNE kinase domain monomer
The overall structure adopts a typical bi-lobal kinase architecture. Both the N-lobe and the C-lobe have the α/β fold. Each lobe consists of a central β-sheet flanked by α-helices on both sides of the sheet. The last helix C-terminal to the C-lobe is part of the N-lobe and perpendicular to the interfacial helix of the C-lobe. Residues 475-498 of the N-lobe are not visible in the electron density map (Figure 2).
The GNE kinase domain contains a type I zinc-binding motif GHx₉₋₁₁CxCGx₂G(C/H)xE, which forms an HC3 type zinc finger with residues H569, C579, C581 and C586. The zinc-binding motif is a characteristic feature of all ROK family members [14]. The kinase domain also contains a DxGxT type ATP-binding motif [15,16]. The side chains of the residues in this ATP-binding motif point toward the cleft between the N-lobe and the C-lobe. Comparison with the actin/hexokinase/hsp70 ATPase domains suggests that the disordered residues 475-498 form part of the binding pocket for the adenosine moiety of ATP [17] and are located near the DxGxT ATP-binding motif. Taken together, these findings suggest that the ATP binding pocket of the GNE kinase domain is located in the cleft between the two lobes.
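The two sequence motifs named above can be written as regular expressions, which makes the spacing requirements explicit. The following Python sketch is an illustration only and was not part of the original analysis. It assumes that "x" stands for any residue and that the numbers attached to "x" are repeat counts; the test sequence is synthetic and is not the GNE sequence.

```python
import re

# Zinc-binding motif, PROSITE-like reading: G-H-x(9-11)-C-x-C-G-x(2)-G-[CH]-x-E
ZINC_MOTIF = re.compile(r"GH.{9,11}C.CG.{2}G[CH].E")
# ATP-binding motif: D-x-G-x-T
ATP_MOTIF = re.compile(r"D.G.T")


def find_motifs(sequence, pattern):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Synthetic sequence assembled piece by piece so the spacing is easy to check.
# It is hypothetical and only serves to exercise the two patterns.
toy_sequence = (
    "MK"
    + "DAGGT"          # D-x-G-x-T
    + "LL"
    + "GH" + "A" * 9   # G-H followed by nine arbitrary residues
    + "CACG"           # C-x-C-G
    + "AA"             # x(2)
    + "GCAE"           # G-[CH]-x-E
    + "KK"
)

zinc_hits = find_motifs(toy_sequence, ZINC_MOTIF)
atp_hits = find_motifs(toy_sequence, ATP_MOTIF)

for start, text in zinc_hits + atp_hits:
    print(start, text)
```

With nine residues between the histidine and the first cysteine, the pattern reproduces the spacing of the zinc ligands listed in the text (H569 to C579, then C581 and C586).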

Oligomeric state of the GNE kinase domain
A previous deletion mutant study suggested that the GNE kinase domain is responsible for dimerization, while a segment between the epimerase and kinase domains, residues 360-382, is a potential site for trimerization [18]. Our gel filtration data (Figure 3) show that the kinase domain exists predominantly as a dimer in solution, with small amounts of monomer and a higher-order oligomer. The apparent molecular weight of the oligomer fits a hexamer of the kinase domain (see also below). However, the possibility of a tetramer [19] cannot be completely ruled out due to the low resolution of the gel-filtration column at this molecular size.
The GNE kinase domain was crystallized in space group P3₁21 with three molecules in the asymmetric unit. Protein interface and assembly analysis using the PISA server [20] suggests that two of the three molecules dimerize through the C-lobe with an average buried surface area of 1587 Å² per molecule (Figure 4), whereas the third molecule dimerizes with a two-fold symmetry-related molecule through the same C-lobe (Figure 5a). The solvation free energy gain upon formation of the interface, ΔⁱG, is −24.2 kcal·mol⁻¹, indicating that the dimer interface is very stable and may not simply be a crystal packing artifact.
A crystallographic hexamer can be produced when a two-fold rotational symmetry operation is applied to the three molecules (one and a half dimers) in the asymmetric unit (Figure 5). In this hexamer, the N-lobes of three kinase molecules point to the same side of the "hexamerization plane", while the N-lobes of the other three molecules point to the opposite side of the plane (Figure 5b). This assembly mode of the kinase domain allows the epimerase domain to be located further away from the hexamerization plane and is consistent with the proposition that the interdomain segment (residues 360-382) is the site of trimerization [18].
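The symmetry expansion described above, in which a two-fold rotation maps the three molecules of the asymmetric unit onto three more, can be sketched geometrically. This is only an illustration under stated assumptions: each molecule is reduced to a single point, the coordinates and the axis direction are made up, and no lattice translation is applied. It does not reproduce the actual space group operators or the deposited coordinates.

```python
import math


def twofold_matrix(axis):
    """Rotation by 180 degrees about a unit axis n: R = 2 * n * n^T - I."""
    norm = math.sqrt(sum(c * c for c in axis))
    n = [c / norm for c in axis]
    return [
        [2 * n[i] * n[j] - (1.0 if i == j else 0.0) for j in range(3)]
        for i in range(3)
    ]


def rotate(matrix, point):
    """Apply a 3x3 rotation matrix to a 3D point."""
    return tuple(
        sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3)
    )


# Hypothetical molecular centres for the three molecules of the asymmetric
# unit, all placed on the same side (z > 0) of the plane z = 0.
asymmetric_unit = [(10.0, 0.0, 5.0), (-5.0, 8.66, 5.0), (-5.0, -8.66, 5.0)]

# Hypothetical two-fold axis lying in the plane z = 0.
R = twofold_matrix((1.0, 0.0, 0.0))

symmetry_mates = [rotate(R, p) for p in asymmetric_unit]
hexamer = asymmetric_unit + symmetry_mates

above = [p for p in hexamer if p[2] > 0]
below = [p for p in hexamer if p[2] < 0]
print(len(hexamer), len(above), len(below))
```

Because the assumed axis lies in the plane, the rotation flips the sign of z, so three copies end up on each side of the plane. This mirrors the arrangement described for the N-lobes, and applying the operation twice returns every point to its starting position, as expected for a two-fold axis.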

Structure comparison with other ROK family members
A structural homology search of the GNE kinase domain using the FATCAT (Flexible structure AlignmenT by Chaining Aligned fragment pairs allowing Twists) server [21] revealed the top four non-redundant hits to be PDB codes 2aa4, 1xc3, 1z05 and 1z6r. All these structures contain the signature zinc-binding motif of the ROK family. The structure of E. coli putative ManNAc kinase (PDB 2aa4) was the top hit with a twist-adjusted r.m.s. deviation (opt-rmsd value) of 1.94 Å. The structure of a putative sugar kinase from Bacillus subtilis was the second best hit (PDB 1xc3). The other two homologous structures were transcription repressors that belong to the ROK family (PDB: 1z05 and 1z6r). Vibrio cholerae transcriptional regulator (PDB: 1z05) is a homolog of the E. coli Mlc protein (PDB: 1z6r). The latter is a transcriptional repressor that controls the expression of malT, the central transcription activator of the E. coli maltose system [22]. The structure of the GNE kinase domain aligns well with the E. coli Mlc structure: the N-lobe of the GNE kinase domain aligns to the E-domain of Mlc and the C-lobe aligns to the O-domain of Mlc. It is interesting to note that the Mlc O-domain is responsible for the oligomerization of the Mlc protein [22] in a way similar to the dimerization of GNE kinase through the C-lobe. However, these four structures do not contain sugar ligands that would help inform on a substrate binding mode for GNE.
To evaluate the putative sugar binding site, the sequence of the GNE kinase domain was aligned with that of E. coli glucokinase complexed with glucose (PDB: 1sz2) [23], which is the closest homologous structure containing a bound substrate currently available in the Protein Data Bank. The five residues involved in sugar binding are conserved in GNE (N516, D517, E566, H569, E588, GNE numbering). These five residues are arranged to accommodate the sugar substrate (Figure 6). Two residues, H569 and E588, are located in the ROK family zinc-binding signature motif and H569 directly coordinates the zinc ion. This finding suggests that zinc may play a catalytic role in sugar substrate binding, as well as a structural role.

Structural mapping of disease-related GNE mutations
Since the identification of the relationship between gne mutations and HIBM [13], more than 60 mutations have been found to be associated with HIBM [24]. Among these mutations, 25 missense mutations at 23 unique sites are located in the kinase domain of the GNE protein. These 23 mutation sites can be classified into four groups based on their solvent accessibility and their locations (Table 1).
The first group of residues, I557, G559, V572, and G576, is located at the dimerization interface of the C-lobe, and mutation of these residues may interfere with dimerization of the kinase domain. It is noteworthy that kinase domain dimerization does not affect the solvent accessibility of G576 (Table 1), indicating that G576 is not directly involved in dimerization. The amino acid side chain of a mutant at this position would point into a hydrophobic niche that also accommodates the side chain of L574 from another monomer. The G576E mutation would exert both charge and steric hindrance on the side chain of L574 and thus disrupt the dimerization (Figure 7), consistent with the previous observation that the G576E mutant of the full length GNE remains as a trimer [11].
The location of this group of residues is also close to the residues involved in sugar substrate binding (Figure 6b). Residues V572 and G576 are located on the zinc-binding signature motif of the ROK family (Figure 6b, 8b), which could play both a functional and a structural role. Mutations of these residues could thus also affect the sugar substrate binding affinity of the kinase domain indirectly.
The second group of residues includes those located at the interfacial helices between the N-lobe and the C-lobe, i.e. N519, A524, F528, G708, and M712. Since the interlobar cleft is the site of ATP and carbohydrate binding as well as where phosphoryl transfer occurs, mutation of these residues could change the interlobar movement during catalysis and thus affect the kinase activity of the protein. For example, the first identified HIBM-related mutation, M712T [13], would likely abolish the hydrophobic interaction of the side chain of M712 with that of L523 from the C-lobe helix (Figure 8a). In a previous study [11], the M712T mutation was shown to cause a 30% reduction in the kinase activity without affecting the epimerase activity of full length GNE. In contrast, mutations of other residues in this second group reduce not only the kinase activity but also the epimerase activity of the full length protein ([11], Table 1). This suggests that the kinase domain is allosterically coupled to the epimerase domain. The structure of the full length GNE is needed to fully understand the coupled effects of the kinase and epimerase domains.
The third group currently includes residue P511. P511 has the highest relative solvent accessibility (>40%) among the 23 mutation sites and is located on a loop region of the structure. The underlying mechanism for the association of the P511H and P511L mutations with HIBM is elusive without further data, but mutation of a proline to any other residue type will inevitably change the flexibility of the loop region around this residue and could thus change the allostery of full length GNE in the higher-order oligomeric state.
The fourth group of residues includes the remaining mutation sites in Table 1. All of these residues have hydrophobic side chains and low solvent accessibilities, and are located within secondary structural elements. Mutations of these residues may disrupt the secondary structural elements at the mutation sites, and could interfere with the hydrophobic interactions between secondary structure elements that stabilize the protein quaternary structure.
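The four-group classification can be summarized as a small lookup keyed on residue number. The Python sketch below is an illustration, not the procedure used in the study. It encodes only the sites named explicitly in the text for groups one to three, and it assumes that any other site passed in is one of the remaining Table 1 sites, which the text does not enumerate. The function and variable names are hypothetical.

```python
import re

# Sites named in the text for the first three groups (GNE numbering).
GROUPS = {
    "dimer interface": {557, 559, 572, 576},
    "interlobar helices": {519, 524, 528, 708, 712},
    "protein surface": {511},
}
# Assumption: every other site given to classify() is a Table 1 site,
# which the text assigns to the fourth group.
FALLBACK_GROUP = "other secondary structural elements"


def parse_mutation(label):
    """Split a label such as 'M712T' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a missense mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def classify(label):
    """Return the structural group for a kinase-domain missense mutation."""
    _, position, _ = parse_mutation(label)
    for group, sites in GROUPS.items():
        if position in sites:
            return group
    return FALLBACK_GROUP


# The four mutations discussed by name in the text.
mutations = ["G576E", "M712T", "P511H", "P511L"]
unique_sites = {parse_mutation(m)[1] for m in mutations}

for m in mutations:
    print(m, "->", classify(m))
print(len(mutations), "mutations at", len(unique_sites), "unique sites")
```

Counting unique positions separately from mutation labels shows, on a small scale, why the number of mutations (25) exceeds the number of sites (23): some sites, such as P511, carry more than one substitution.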

Summary
We show here the 3D structure of the N-acetylmannosamine kinase domain of GNE, the only ROK family kinase encoded in the human genome. The kinase domain dimerizes through an interface at the C-lobe. This is consistent with mutagenesis data from other groups on the full length GNE protein [11,18]. The crystallographic hexamer, which consists of a trimer of kinase dimers, could serve as a prototype of a proposed full length GNE hexamer. Structure comparison of the GNE kinase domain with previously studied proteins revealed potential substrate binding sites at the interlobar cleft and also the structural and functional importance of the signature zinc-binding motif of the ROK family. Four groups of missense mutations associated with hereditary inclusion body myopathy are classified and their effects on the enzymatic activity can mostly be explained by the structure model.

Table 1 footnotes:
a Data of missense mutations were extracted from references [11] and [24].
b UDP-GlcNAc epimerase and ManNAc kinase activities are percentage values relative to the corresponding activities of the full length wild type GNE. Data extracted from reference [11].
c Oligomeric state of the full length mutant GNE. Data extracted from reference [11].
d Relative solvent accessibility of the residue calculated using the DSSP program [30] and normalized according to values in reference [31]. A value of 1 means full exposure of the residue while a value of 0 means the residue is fully buried.
e The type of the secondary structure element in which the residue is located was assigned using the DSSP program [30].
doi:10.1371/journal.pone.0007165.t001

DNA cloning, protein expression and purification
The cDNA template encoding the kinase domain of GNE was codon optimized for overexpression in E. coli and synthesized commercially (Codon Devices, Inc.). The DNA fragment encoding GNE residues 406-720 was PCR amplified and subcloned into the pET28-MHL vector (gi:134105571) using an In-Fusion dry-down PCR cloning kit (ClonTech). Protein was overexpressed in E. coli BL21(DE3) CodonPlus-RIL cells (Stratagene) grown in terrific broth medium. The culture was grown at 37 °C in a LEX bubbling system (Harbinger Biotech. & Engineering Corp.) until the OD600 reached 3.0. The temperature of the culture was then lowered to 15 °C and the cells were induced with 0.5 mM isopropyl 1-thio-β-D-galactopyranoside and allowed to grow further overnight. Cells were harvested by centrifugation, flash frozen in liquid nitrogen and stored at −80 °C until purification. Frozen cells were thawed and resuspended in 10 mM HEPES buffer (pH 7.5) containing 500 mM sodium chloride, 5% glycerol and 2 mM β-mercaptoethanol, supplemented with 5 mM imidazole, and mechanically lysed using a microfluidizer (Microfluidics, model M-110EH) at 1,000 bar pressure. The lysate was clarified by centrifugation. GNE protein was bound to nickel-nitrilotriacetic acid (Ni-NTA) beads (Qiagen) at a ratio of 2.5 mL of 50% Ni-NTA slurry per litre of cell culture. The bound protein was washed twice with the same HEPES buffer containing 30 mM or 75 mM imidazole, and finally eluted with the HEPES buffer supplemented with 300 mM imidazole. The eluate containing the GNE protein was further purified by Superdex-75 size exclusion chromatography (GE Healthcare). The eluted fractions were pooled, concentrated to a final concentration of 40 mg per mL, and stored in a buffer containing 10 mM HEPES, pH 7.5, 500 mM sodium chloride, 5% glycerol and 5 mM dithiothreitol. The purity of the protein was better than 95% as judged by SDS-PAGE.
Selenomethionine (SeMet) labelling of the protein was carried out using a prepacked M9 SeMet growth media kit (Medicilon) following the manufacturer's instructions.

Crystallization and structure determination
Crystals of the ligand-free form were grown at room temperature in sitting drops. ADP (5 mM final concentration) and chymotrypsin (1:100, w/w) were added to the protein stock solution, and 0.5 µL of protein solution was mixed immediately with 0.5 µL of well solution containing 15% polyethylene glycol (PEG) 4000, 0.2 M ammonium acetate and 0.1 M sodium citrate, pH 5.6, and set up for vapour diffusion crystallization. The SeMet crystal used for structure determination was grown in 14.55% PEG 4000, 0.2 M ammonium acetate and 0.1 M sodium citrate, pH 6.0, with 1:100 chymotrypsin (w/w) and 5 mM ADP in a sitting drop setup. Crystals grew to a mountable size within 24 hours. Paratone oil was used to cryo-protect the crystals.
Diffraction data of a selenomethionyl derivative of the GNE kinase domain were collected at beamline 19ID of the Advanced Photon Source (APS) at a wavelength of 0.9792 Å. Initial phases were obtained by single wavelength anomalous diffraction with SOLVE and density modification with RESOLVE [25]. For model building, the phases were combined with data collected at APS beamline 23ID-B at a wavelength of 0.9793 Å (see Table 2). The refined model of the target resulted from iterative application of density modification with DM and RESOLVE, interactive model building with COOT [26], coordinate and B-factor refinement with REFMAC [27] and PHENIX [28], and geometry validation with MOLPROBITY [29]. Diffraction data and refinement statistics are summarized in Table 2. The current model was deposited at the Protein Data Bank with PDB ID 3EO3.
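The two data collection wavelengths can be converted to photon energies to see why they suit selenium anomalous scattering. The short Python sketch below is an added illustration, not part of the original methods. It uses the standard relation E = hc/λ with hc ≈ 12.39842 keV·Å; the value quoted for the selenium K absorption edge (about 12.658 keV) is a commonly tabulated approximate figure and is included here as an assumption for comparison.

```python
# Conversion constant h*c expressed in keV * angstrom.
HC_KEV_ANGSTROM = 12.39842

# Approximate tabulated selenium K-edge energy (assumed for comparison only).
SE_K_EDGE_KEV = 12.658


def wavelength_to_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in angstroms to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths reported for beamlines 19ID and 23ID-B.
wavelengths = {"19ID": 0.9792, "23ID-B": 0.9793}
energies = {
    name: wavelength_to_energy_kev(value) for name, value in wavelengths.items()
}

for name, energy in energies.items():
    offset_ev = (energy - SE_K_EDGE_KEV) * 1000.0
    print(f"{name}: {energy:.3f} keV, {offset_ev:+.1f} eV from the assumed Se K-edge")
```

Both wavelengths correspond to roughly 12.66 keV, within a few electronvolts of the assumed edge value, which is consistent with collecting single wavelength anomalous diffraction data on a selenomethionyl derivative.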

Supporting Information
Datapack S1 Standalone iSee datapack: contains the enhanced version of this article for use offline. This file can be opened using free software available for download at http://www.molsoft.com/icm_browser.html. Found at: doi:10.1371/journal.pone.0007165.s001 (ICB)
Text S1 Instructions for installation and use of the required web plugin (to access the online enhanced version of this article). Found at: doi:10.1371/journal.pone.0007165.s002 (PDF)