A Novel N-Acetylglutamate Synthase Architecture Revealed by the Crystal Structure of the Bifunctional Enzyme from Maricaulis maris

Novel bifunctional N-acetylglutamate synthase/kinases (NAGS/K) that catalyze the first two steps of arginine biosynthesis and are homologous to vertebrate N-acetylglutamate synthase (NAGS), an essential cofactor-producing enzyme in the urea cycle, were identified in Maricaulis maris and several other bacteria. Arginine is an allosteric inhibitor of NAGS but not NAGK activity. The crystal structure of M. maris NAGS/K (mmNAGS/K) at 2.7 Å resolution indicates that it is a tetramer, in contrast to the hexameric structure of Neisseria gonorrhoeae NAGS. The quaternary structure of crystalline NAGS/K from Xanthomonas campestris (xcNAGS/K) is similar, and cross-linking experiments indicate that both mmNAGS/K and xcNAGS/K are tetramers in solution. Each subunit has an amino acid kinase (AAK) domain, which is likely responsible for N-acetylglutamate kinase (NAGK) activity and has a putative arginine binding site, and an N-acetyltransferase (NAT) domain that contains the putative NAGS active site. These structures and sequence comparisons suggest that the linker residue 291 may determine whether arginine acts as an allosteric inhibitor or activator in homologous enzymes in microorganisms and vertebrates. In addition, the angle of rotation between AAK and NAT domains varies among crystal forms and subunits within the tetramer. A rotation of 26° is sufficient to close the predicted AcCoA binding site, thus reducing enzymatic activity. Since mmNAGS/K has the highest degree of sequence homology to vertebrate NAGS of NAGS and NAGK enzymes whose structures have been determined, the mmNAGS/K structure was used to develop a structural model of human NAGS that is fully consistent with the functional effects of the 14 missense mutations that were identified in NAGS-deficient patients.


Introduction
In most microorganisms, fungi, and plants, two different enzymes catalyze the first two steps in arginine biosynthesis, N-acetyl-L-glutamate synthase (NAGS, EC 2.3.1.1) and N-acetyl-L-glutamate kinase (NAGK, EC 2.7.2.8). However, in Xanthomonas campestris and some other bacteria, these reactions are catalyzed by a single bifunctional N-acetylglutamate synthase/kinase (NAGS/K), which has been proposed to have evolved from the fusion of ancestral NAGK and N-acetyltransferase [1,2]. In vertebrates, the major physiological role of NAGS seems to be to regulate flux through the urea cycle via activation of carbamyl phosphate synthase by N-acetyl-L-glutamate (NAG), the product of the NAGS reaction. Vertebrate NAGS do not have kinase activity, although they retain an amino acid kinase (AAK)-like domain. Vertebrate NAGS have 25-35% sequence identity with bacterial NAGS/K, more than their sequence identity with other bacterial, fungal or plant NAGS, which, like vertebrate NAGS, have a non-functional AAK domain coupled to the N-acetyltransferase (NAT) domain.
Many NAGS, NAGK, and NAGS/K enzymes are allosterically regulated by L-arginine. In organisms that have a linear arginine biosynthetic pathway such as Escherichia coli, the target of arginine feedback inhibition is NAGS. In organisms that have a cyclic pathway such as Pseudomonas aeruginosa, the main target of feedback inhibition is NAGK [3]. However, in terrestrial tetrapods, arginine is an activator of NAGS. The transition from inhibition to activation appears to have occurred when tetrapods migrated from sea to land and needed a robust system for eliminating ammonia [4]. In the only NAGS crystal structure that has been determined, that from Neisseria gonorrhoeae (ngNAGS), the arginine binding site is located in the AAK-like domain, and arginine binding is accompanied by substantial structural changes [5].
Although the crystal structures of several NAGK enzymes have been determined [6,7,8,9], NAGS enzymes have proven more challenging and the only NAGS structure that has been determined is that of Neisseria gonorrhoeae [10]. This enzyme is a member of the classical bacterial and plant NAGS family, and has less than 18% sequence similarity to mammalian NAGS enzymes. We have now succeeded in solving the structures of two NAGS/K enzymes, from Maricaulis maris (mmNAGS/K) and X. campestris (xcNAGS/K), that have substantially higher sequence similarity to vertebrate NAGS. In both crystals and solution, ngNAGS is consistently hexameric, while mmNAGS/K and xcNAGS/K, as determined herein, are tetrameric. Thus these new structures are of interest in terms of understanding the evolution and mechanisms of NAGK, NAGS, and NAGS/K enzymes, providing potential insights into human NAGS, and perhaps in developing a strategy for determining the crystal structure of the human enzyme.

Enzymatic activity
We have previously shown that xcNAGS/K has both NAGS and NAGK activity and that its NAGS activity, but not NAGK activity, is inhibited completely by 1 mM L-arginine [1]. As shown in Figure 1, we demonstrate here that mmNAGS/K has both NAGS (7.0 μmol/min/mg) and kinase activity (20.1 μmol/min/mg) and that its NAGS activity is inhibited 14-fold by 1.0 mM L-arginine. NAGK activity is not inhibited by L-arginine at physiological concentrations; instead, slight activation is observed at 1 mM L-arginine. Even at 20 mM L-arginine, 76% of the maximal kinase activity is retained. These results are consistent with those reported previously for xcNAGS/K [1], which has 7.0 μmol/min/mg kinase activity at pH 6.0 and is only slightly inhibited by 1 mM L-arginine. The kinase activity of mmNAGS/K is somewhat lower than that of Thermotoga maritima NAGK (tmNAGK) (45 μmol/min/mg), P. aeruginosa NAGK (130 μmol/min/mg) and E. coli NAGK (ecNAGK) (64 μmol/min/mg) [11,12], perhaps because of different assay conditions. Arginine appears to regulate arginine biosynthesis in both X. campestris and M. maris mainly by inhibiting their NAGS activity, as is the case for E. coli [3]. This pattern may be characteristic of bacteria that have a linear arginine biosynthetic pathway and do not have an ornithine N-acetyltransferase gene.
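The fold-inhibition figure quoted above can be turned into a residual activity with one line of arithmetic. The sketch below is only an illustration of that conversion; the function names are hypothetical, and the activity value is used in whatever units the text reports.

```python
def residual_fraction(fold_inhibition: float) -> float:
    """Fraction of the uninhibited activity that remains after n-fold inhibition."""
    if fold_inhibition <= 0:
        raise ValueError("fold inhibition must be positive")
    return 1.0 / fold_inhibition


def percent_inhibition(fold_inhibition: float) -> float:
    """Percent of the uninhibited activity that is lost."""
    return 100.0 * (1.0 - residual_fraction(fold_inhibition))


# Values quoted in the text: NAGS specific activity of 7.0 (units as reported)
# and 14-fold inhibition at 1.0 mM L-arginine.
uninhibited_nags = 7.0
inhibited_nags = uninhibited_nags * residual_fraction(14.0)
```

Under these numbers a 14-fold inhibition leaves about 7% of the NAGS activity, compared with the 76% of kinase activity reported to remain at 20 mM L-arginine.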

Cross-linking experiments
To determine the oligomerization state of both xcNAGS/K and mmNAGS/K in solution, cross-linking experiments were performed with dimethyl suberimidate as the cross-linking agent. Four major bands were seen for both enzymes with SDS-PAGE, with molecular weights corresponding to oligomers of 1, 2, 3 and 4 subunits (Figure 2, lanes 3 and 4). Thus, xcNAGS/K and mmNAGS/K appear to exist primarily as tetramers in solution.
As noted in Methods, crystals of mmNAGS/K were obtained in trigonal space group P3₁21, hexagonal space group P6₂22 and orthorhombic space group P2₁2₁2₁. Crystals of the first two forms diffracted poorly and were difficult to reproduce. Although the last crystal form was also difficult to reproduce, crystals that diffracted to ~3.0 Å were obtained (Table 1). Several MAD data sets collected from Se-Met substituted wild-type protein were not of sufficient quality to locate the selenium positions. To increase phasing power, three additional amino acid codons were mutated to methionine (I106M, I294M and L367M). Crystals from this mutant protein diffracted to 2.7 Å (Table 1) and phase information was obtained from the MAD dataset. The asymmetric unit was found to consist of four subunits assembled as a tetramer, and unit cell parameters are consistent with four subunits per asymmetric unit and 50% solvent content [13].
Figure 1. Effect of L-arginine on the NAGS and NAGK activity of mmNAGS/K. A. NAGS specific activity (μmol/min/mg) was measured at 2.5 mM AcCoA and 10 mM L-glutamate in a buffer containing 50 mM Tris-HCl pH 8.5 and a range of L-arginine concentrations (0, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 5 and 10 mM). B. NAGK specific activity (μmol/min/mg) was measured at 20 mM ATP and 100 mM NAG in a buffer containing 100 mM NaCl, 40 mM MgCl₂, 400 mM hydroxylamine, 20 mM Tris-HCl pH 7.4 and a range of concentrations of L-arginine (0, 1, 2, 4, 10 and 20 mM). Reactions were performed at 310 K for 30 min. doi:10.1371/journal.pone.0028825.g001
Cell parameters for the trigonal crystal form of mmNAGS/K are a = b = 95.1 Å, c = 253.0 Å and γ = 120°, consistent with two subunits per asymmetric unit and 42% solvent content. Even though this crystal form diffracts to only 4.3 Å resolution, the structure solution in space group P3₁21 could be found by molecular replacement using the A-X dimer of the P2₁2₁2₁ crystal form as the search model. Packing analysis indicates that the 2-fold symmetry axis of the molecular tetramer is aligned with the 2-fold crystallographic axis perpendicular to the plane of the tetramer ring.

xcNAGS/K structure determination
Crystals of xcNAGS/K diffracted anisotropically and had high solvent content (75%) [2]. MAD data collected from these crystals allowed the selenium sites to be identified, but did not have sufficient resolution for reliable model building, particularly of the NAT domain, and subsequent model refinement. However, the low resolution electron density suggested that there is only one subunit in the asymmetric unit, and that four subunits assemble to form a tetramer aligned with crystallographic symmetries, so that the xcNAGS/K tetramer has exact P222 point symmetry. The molecular replacement solution with the mmNAGS/K subunit structure as the search model confirmed these conclusions and assignment to space group P6₂22 (Table S1).
Since the cell parameters of the hexagonal crystal form of mmNAGS/K are similar to those of xcNAGS/K crystals, this structure would be expected to be isomorphous to the xcNAGS/K structure, with only one subunit in the asymmetric unit.

Structural variation within the tetramer
Within the P2₁2₁2₁ mmNAGS/K tetramer, there are significant differences between the subunits, indicating that they have sufficient innate flexibility to respond to different packing environments. The RMSD for superposition of the four subunits is 1.5-2.0 Å, and subunits X and A are better defined than subunits Y and B. The resolution of the electron density of several loop regions in subunit Y, such as H4-H5, B9-B10 and H8-H9, does not allow their structures to be modeled. The RMSD value decreases to 0.8-1.2 Å when only AAK domains are superimposed and to 0.4-0.7 Å if only NAT domains are superimposed, indicating that the AAK domain has more structural flexibility than the NAT domain. The RMSD value decreases further to 0.2-0.5 Å if only the core β-sheets are superimposed for both NAT and AAK domains, demonstrating conservation of these core structures. Structural variation among subunits in an asymmetric unit has been observed in P. aeruginosa NAGK, where the average RMSD was as high as 2.1 Å when all Cα atoms in the 12 subunits in the asymmetric unit were superimposed [8].
The structural variation between equivalent subunits in the native protein and Se-Met substituted mutant is much lower than between subunits within a tetramer. The RMSD between equivalent subunits is 0.4-0.5 Å for subunits A, B, and X, and 0.8 Å for subunit Y.
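The RMSD values quoted in this section compare paired Cα positions after superposition. As a minimal sketch, the function below computes the RMSD for two coordinate lists that are assumed to be already superimposed and already paired residue by residue; the least-squares fitting step itself is not shown, and the function name and input layout are illustrative assumptions rather than the procedure used for the reported values.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation (same units as the input, e.g. angstroms)
    between paired atoms of two pre-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

Restricting the paired atoms to one domain or to the core β-sheets, as done in the text, removes the contribution of inter-domain movement and flexible loops, which is why the reported values fall as the atom selection narrows.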

Structures of AAK and NAT domains
As shown in Figure 3A, each mmNAGS/K subunit has two domains, the AAK domain (residues 1-290) and the NAT domain (residues 292-441) connected by a hinge residue, Gly291.
The AAK domain has the typical AAK fold seen in other NAGK structures, with an eight-strand twisted β-sheet core (arranged as B3↓, B6↓, B2↓, B1↓, B9↓, B11↓, B12↑, B10↑), with three α-helices (H5, H3 and H10) on one side and four α-helices (H6, H7, H9 and H8) on the other, forming an α3β8α4 sandwich structure (Figure 3A). α-Helix H4 extends from the C-edge of β-strand B2. A long loop connecting H4 and H5 hangs over the C-edge of the β-sheet core and contains a putative NAG binding residue, Arg99, equivalent to Arg66 in ecNAGK. A small β-sheet consisting of four short β-strands (arranged as B8↑, B7↓, B4↓, B5↑) protrudes from the C-edge of the core and forms part of the flat dimerization interface with H5, B3 and H6.
The NAT domain has a characteristic GCN5-related N-acetyltransferase (GNAT) fold with a central twisted seven-strand β-sheet surrounded by six α-helices (Figure 3A). The central β-sheet consists of a four-strand anti-parallel sheet (N-terminal arm, residues 292-380, arranged as B13↓, B14↑, B15↓, B16↑) and a three-strand anti-parallel sheet (C-terminal arm, residues 381-441, arranged as B17↑, B19↓, B18↑). These two sheets form a V-shaped structure, with adjacent β-strands (B16 and B17) parallel. V-shaped central β-sheets are characteristic of the GNAT family and are believed to be essential for catalysis of the acetyl transferase reaction [15,16,17]. The best match with the NAGS/K NAT domain found in a structural neighbor search is the GNAT protein that catalyzes acetylation of the ribosomal protein S18 (PDB 2cnm; Z = 13.2, RMSD = 2.4 Å) with over 123 aligned residues and 19% sequence identity [18]. GNAT proteins have highly variable sequences, probably because bound AcCoA interacts primarily with backbone rather than side-chain atoms, and, as a result, the 7th best match (PDB 3ne7; Z = 12.0, RMSD = 2.4 Å), with over 119 aligned residues, has only 6% sequence identity. Although the NAT domain from ngNAGS is not in the top 50 matches, superimposition of the NAT domain from mmNAGS/K with that of ngNAGS resulted in an RMSD of 2.5 Å with 112 aligned residues and 15.2% sequence identity (Figure 3C). The V-shaped core structure of the central β-sheet in the NAT domain is similar in both structures [10]. However, there are significant differences, particularly in the C-terminal arm. The loop that links β-strands B18 and B19 is much shorter in mmNAGS/K than in ngNAGS, which has two extra helices (H14′ and H14). Instead, the structure of mmNAGS/K has a long H15 α-helix, occupying the position equivalent to H14 of ngNAGS and replacing its B24-B25 linker. Since this linker contributes Arg425 and Ser427 to the glutamate binding site of ngNAGS, mmNAGS/K must bind glutamate with other residues.

Tetramer structure
The four subunits of the mmNAGS/K tetramer form an elongated 35-74 Å thick ring with a long axis of 140 Å and a short axis of 108 Å (Figure 4A). The tetrameric structures of the other two crystal forms (space groups P6₂22 and P3₁21) are similar (Figure S1A). Since the same tetramer is formed regardless of crystal conditions and crystal packing, it would be expected to be the predominant form in solution, and cross-linking experiments confirm that xcNAGS/K and mmNAGS/K exist as tetramers in solution (Figure 2).
Interface interactions between subunits are extensive and of three types. There are two types of interactions between AAK domains: interactions between adjacent subunits (A-B or X-Y) within the ring, and interactions between subunits located on opposite sides of the ring (A-X or B-Y). The third type is between the NAT domains of opposite subunits.
The first AAK-AAK interface, between adjacent subunits in the ring, is similar to the dimerization surface of ecNAGK and other enzymes of the AAK family [19] (Figure 4B). However, the specific interactions are unique to mmNAGS/K, and different from those of any dimer interface previously reported for any NAGK or any other members of the AAK family, such as carbamate kinase, glutamate 5-kinase and UMP kinase [19,20,21]. Here, the interface consists of β-strand B3 and two α-helices, H5 and H6, arranged as an αβα sandwich. The interacting α-helices, H5 and H6 (equivalent to helices αC and αD in ecNAGK), and β-strand B3 are almost parallel to the equivalent elements from the adjacent subunit. Residues from H6 interact with those from H5 of the adjacent subunit via numerous hydrophobic interactions and hydrogen bonds. Since the distance between the two β-strands (B3) from adjacent subunits is about 5.0 Å, their backbone atoms cannot hydrogen bond directly. Instead, a small β-strand, B8, interacts with the equivalent element from the adjacent AAK domain in an antiparallel mode to form an interface unique to mmNAGS/K (Figure 4B). In total, approximately 20 hydrogen bonds, three pairs of salt bridges (Arg168-Asp122, Arg110-Glu140, and Arg136-Glu180) and numerous hydrophobic interactions are involved in forming this interface.
Figure 4. Structure of the mmNAGS/K tetramer and interfaces between subunits shown as ribbon diagrams. A. The tetramer is shown in two different orientations, perpendicular to the plane of the ring, and parallel to the plane of the ring. Subunit A (red), subunit B (green), subunit X (purple-gray) and subunit Y (yellow). The bound CoA molecule is shown as a space-filling model. The two 2-fold non-crystallographic rotation axes in the plane of the ring are indicated by arrows and the 2-fold non-crystallographic rotation axis perpendicular to the plane of the ring is indicated by a filled oval. B. ecNAGK-like AAK-AAK domain interface between subunits A and B; α-helices, H5 and H6, and β-strands, B3 and B8, form this interface. C. N-terminal helix interface between subunits A and X, formed by interactions between the two N-terminal α-helices and two neighboring helices, H3 and H10. D. NAT-NAT domain interface between subunits A and X. Two α-helices, H14 and H15, and one β-strand, B18, form this interface. doi:10.1371/journal.pone.0028825.g004
The second interface, between the AAK domains of subunits on opposite sides of the ring, involves interactions between parallel helices of the N-terminal segment (Figure 4C), analogous to the interactions of N-terminal helices of the AAK domains of ngNAGS and ecNAGK; however, the specific interactions are different. The calculated inaccessible surface of 968 Å² for the A-X subunit interface is larger than that of 555 Å² for the B-Y subunit interface. The smaller value for the latter interface may reflect the distinctive conformation of subunit Y, and/or the increased disorder of its N-terminal helices relative to the other three subunits, which led to fewer residues being included in the structural model. The residues involved in this interface interaction are mainly hydrophobic and include Ile12, Leu15, Leu16, Met19 and Phe63, as well as one salt bridge between Arg20 and Asp21 of opposing subunits and two hydrogen bonding interactions, between Met19 O and Arg276 NH2, and between His18 NE2 and Ser59 OG. In contrast, in ngNAGS and arginine sensitive NAGK structures, the N-terminal helices interlace with adjacent AAK domains to form hexameric structures [6,7,8,10].
At the third interface, the C-terminal arms of NAT domains of opposite subunits (A-X and B-Y) form a continuous 6-strand antiparallel β-sheet (Figure 4D). This extensive interface has a buried area of 1485 Å². The primary interactions consist of backbone hydrogen bonding between equivalent B18 β-strands and cation-π-π-cation stacking interactions between alternating side chains of Phe398 and Arg406 from opposite subunits. Similar interfaces are found in other enzymes with GNAT folds that form dimers in solution using their C-terminal arms as an interface [15,17]. In contrast, no NAT-NAT domain interactions were identified in the ngNAGS structure [10].

The putative NAGK active site and arginine binding site in the AAK domain
Superimposition of the AAK domain of the mmNAGS/K structure with the ecNAGK (PDB 1gs5) structure [9] strongly suggests that the substrate binding sites and catalytic mechanism of their NAGK reactions are very similar. Active site residues K8, D181 and K217 that interact with the phosphate groups of ATP in ecNAGK are also found in mmNAGS/K (K44, D192 and K249, respectively) and are located in similar positions (Figure S2A). Active site residues Gly185 and Gly213, which are part of the ATP binding pocket in ecNAGK, are also conserved (equivalent residues Gly215 and Gly245 in mmNAGS/K). Other important residues such as Gly11 and Gly44 (equivalent residues Gly47 and Gly77 in mmNAGS/K), whose backbone nitrogen atoms hydrogen bond to the γ-phosphate group of ATP or the phosphate group of NAG phosphate in ecNAGK, are conserved as well. Key residues involved in binding NAG are also conserved (G64, R66 and N158 in ecNAGK, equivalent residues G97, R99 and N190 in mmNAGS/K) (Figure S2A).
The structure of the arginine binding site is also conserved in arginine sensitive NAGK and ngNAGS [5,8]. The corresponding site in mmNAGS/K consists of the cavity formed by the loop connecting helix H10 and β-strand B12 (residues 277-287) (Figure S2B). In tmNAGK, Arabidopsis thaliana NAGK (atNAGK), and ngNAGS structures, residues in the N-terminal helix (H1) (Tyr15 in tmNAGK, Lys32 and Phe33 in atNAGK, Tyr17 in ngNAGS) form part of the arginine binding site [5,7,8]. In mmNAGS/K, Tyr28 and nearby residues in the second N-terminal helix, H2, are positioned to play this role.
The mechanisms by which arginine binding strengthens interdomain interactions in ngNAGS and mmNAGS/K may also be similar. In ngNAGS, arginine binding enhances AAK-NAT domain interactions by creating new hydrogen bonding interactions between Arg255-Asp334, Tyr17-Asn336 and Arg274-Gln362 [5]. The arginine binding site in mmNAGS/K is close to the NAT domain "P-loop", which is likely to interact with the pyrophosphate group of AcCoA. This creates the possibility of bound arginine or nearby residues in the AAK domain interacting with residues such as Glu366 in the "P-loop", thereby regulating the binding of AcCoA.
Since mutations in the arginine binding site of the kinase domain of both mouse NAGS and xcNAGS/K [4] eliminate arginine's effect on NAGS activity, bifunctional NAGS/K would not be expected to have two arginine binding sites, one affecting NAGS activity and the other one affecting NAGK activity. Although arginine has been reported to inhibit the NAGS activity of the short version of NAGS, which consists of only the NAT domain, from Mycobacterium tuberculosis [24], it has not yet been shown that this NAGS domain has a specific arginine binding site.

Non-productive and putative CoA binding sites
Although both native and mutant mmNAGS/K were incubated with 25 mM CoA before crystallization, only one bound CoA molecule per tetramer, with incomplete occupancy, was identified in the cleft between the NAT domains of subunits B and Y in the mutant structure (Figure 6A). In the native structure, this CoA molecule was visible, as well as a second molecule in the cleft between subunits A and X (Figure S1B). This CoA binding site does not correspond to the CoA binding site in ngNAGS [10] and other GNAT enzymes [17,25] and is unlikely to be functional, since most of the residues forming the binding site are not conserved and there is no glutamate binding site close by. The suboptimal pH and high ionic strength of the crystallization conditions may be a factor in CoA binding to a non-functional site rather than the native active site. Substrate binding at a non-functional site rather than the active site has been observed in many other enzymes [26,27].
Although sequence similarity among GCN5-fold acetyltransferases is low (6-12% sequence identity), the AcCoA binding site is conserved across all enzymes that have been studied [25,28]. Therefore, the protein residues that interact with AcCoA in mmNAGS/K can be predicted by superimposing other known acetyltransferase structures [5,10,18] (Figure 3C and Figure S2C). The "P-loop" (Arg364-Gly365-Glu366-Gly367-Leu368-Gly369) is consistent with the consensus sequence [Q/R]-x-x-G-x-[G/A] [29] characteristic of AcCoA binding sites and is located close to the pyrophosphate moiety. The S-acetylpantetheine moiety of AcCoA probably forms a pseudo-antiparallel β-sheet interaction with B16, positioning the acetyl group in almost the same plane as the β-sheet. Tyr397 and Ser387 are within hydrogen bonding distance of the sulfhydryl group of AcCoA and could act as the active site acid and base, respectively. A highly conserved tyrosine, which is equivalent to Tyr397 in mmNAGS/K, has been identified in several other GCN5-related protein structures and has been considered to be catalytically important [28].
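The agreement between the P-loop and the consensus [Q/R]-x-x-G-x-[G/A] can be checked mechanically. The sketch below writes the consensus as a regular expression, with "x" read as any residue, and tests the one-letter sequence of Arg364-Gly369 (RGEGLG). The helper names are illustrative, and the flanking "AA" residues in the scanning example are placeholders, not the real mmNAGS/K sequence.

```python
import re

# Consensus quoted in the text, [Q/R]-x-x-G-x-[G/A]; "x" becomes ".".
P_LOOP_CONSENSUS = re.compile(r"[QR]..G.[GA]")

# One-letter code for Arg364-Gly365-Glu366-Gly367-Leu368-Gly369.
MM_P_LOOP = "RGEGLG"


def matches_consensus(segment: str) -> bool:
    """True if a six-residue segment fits the consensus exactly."""
    return P_LOOP_CONSENSUS.fullmatch(segment) is not None


def find_consensus(sequence: str, first_residue_number: int = 1):
    """Return (residue number, segment) for every six-residue window that
    fits the consensus, including overlapping windows."""
    hits = []
    for i in range(len(sequence) - 5):
        window = sequence[i:i + 6]
        if matches_consensus(window):
            hits.append((first_residue_number + i, window))
    return hits


# Placeholder flanks ("AA") around the P-loop, numbered so that R is 364.
example_hits = find_consensus("AARGEGLGAA", first_residue_number=362)
```

A match of this kind only shows that the segment is compatible with the motif; the assignment of the segment as the AcCoA binding loop in the text rests on the structural superposition.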

Glutamate binding site
Although the sequence similarity between mmNAGS/K and ngNAGS is too low to locate the glutamate binding site, structure alignment enables the probable site to be identified. In all known acetyltransferase structures, the substrate that accepts the acetyl group from AcCoA approaches from the opposite side of the β-sheet (the same side as α-helix H15) for in-line nucleophilic attack on the Re face of the acetyl group of AcCoA. Therefore, the glutamate site in mmNAGS/K is likely to correspond to this site in ngNAGS (PDB 3d2m) [10]. In the native mmNAGS/K structure, there is residual electron density in subunits A and Y at this site that can be modeled as L-glutamate (Figure 6B), suggesting that this site is likely to be the biologically relevant L-glutamate binding site. The equivalent site in the Se-Met substituted mutant structure has residual electron density that can be modeled as malonate, which was present at a high concentration in the crystallization buffer.
The γ-carboxyl group of L-glutamate appears likely to be anchored by Arg386, Lys356 and Thr436 (Figure 6B), while the α-carboxyl group may be fixed by hydrogen bonding interactions with the main chain N atom of Phe357 and the main chain O atom of Arg386. The side chains of hydrophobic residues Phe316, Leu437 and Trp410 are also part of the L-glutamate binding site. The residues involved in binding L-glutamate are highly conserved. Lys356 and Phe357 are part of the conserved motif Tyr353-Leu354-Asp355-Lys356-Phe357 (YLDKF), and Arg386 is part of the conserved motif Trp385-Arg386-Ser387-Arg388 (WRSR). These conserved motifs are found in all bifunctional NAGS/K, as well as vertebrate and fungal NAGS, implying a common binding mechanism (Figure 5). Similarly, Tyr397, Ser387 and Asn391, which have roles in the catalytic reaction, are also highly conserved, implying a common catalytic mechanism.

Protein flexibility encoded intrinsically in the structure
The variability in the structures of the four subunits in the asymmetric unit of the P2₁2₁2₁ crystal and in crystals obtained under different conditions and in different space groups provides an opportunity to study the conformational range of NAGS/K and its relationship to the catalytic and regulatory mechanisms. When the β-sheet cores of the AAK domain of the Se-Met substituted mutant are superimposed, it is immediately apparent that the NAT domain can adopt different orientations relative to the AAK domain (Figure S3A). Relative to subunit Y, the NAT domains of subunits B, X, and A are rotated 25.2°, 24.7°, and 16.9° toward the AAK domains, respectively. To test whether the difference in relative domain orientation in subunits Y and B might be related to L-arginine binding, CoA and L-arginine were modeled in their proposed binding sites. The clefts between the AAK and NAT domains of subunits B and X are in a closed conformation, creating a steric clash between bound CoA and the arginine binding loop (residues 281-287) (Figure S3B). This clash does not exist in subunit Y (Figure S3C). The closed conformation of subunits B and X may represent the conformation that exists when L-arginine is bound, while the open conformation of subunit Y and probably subunit A may represent the active form without L-arginine bound. Interestingly, in native mmNAGS/K, glutamate can only be identified in subunits Y and A (Figure S1B).
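A domain rotation such as those quoted above is commonly read off the rotation matrix that maps one domain onto its counterpart once the reference domain has been superimposed. The text does not state how the angles were computed, so the sketch below is an assumption about the general approach, not the authors' procedure: it uses the standard relation cos θ = (trace(R) − 1) / 2 for a proper 3×3 rotation matrix. The helper `rotation_about_z` exists only to build a test matrix.

```python
import math
from typing import List, Sequence

Matrix3 = Sequence[Sequence[float]]


def rotation_angle_deg(rotation: Matrix3) -> float:
    """Rotation angle in degrees encoded by a proper 3x3 rotation matrix."""
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    cos_theta = (trace - 1.0) / 2.0
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg: float) -> List[List[float]]:
    """Illustrative helper: rotation by angle_deg about the z axis."""
    t = math.radians(angle_deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]


# Round trip for the largest inter-domain rotation quoted in the text.
recovered = rotation_angle_deg(rotation_about_z(25.2))
```

The angle alone does not define the hinge; the rotation axis, and hence whether the movement opens or closes the inter-domain cleft, has to be taken from the same matrix.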
There may also be differences in the inter-lobe movements of the AAK domains of different mmNAGS/K subunits and between mmNAGS/K and xcNAGS/K. AAK inter-lobe movement has been demonstrated in ecNAGK [30]; upon ATP binding the C-terminal lobe rotates 24°-28° towards the N-terminal lobe, where the NAG binding site is located. The AAK domains in our structures are all in the open conformation, which is consistent with structures without ATP or ADP bound.
In addition to the large inter-domain rotation between the AAK and NAT domains, and inter-lobe movement within the AAK domain, several loops in the AAK domain may exhibit large movements. Specifically, the NAG binding loop (residues 89-104), which contains a NAG binding residue, Arg99 (Figure S2A), would be expected to move more than 5.0 Å when NAG binds. In the mmNAGS/K structure, the NAG binding loop is very flexible, as reflected by weak electron density.
In contrast to the AAK domain, the NAT domain appears to be less flexible, with the inter-arm rotation among different subunits varying by only 2-3°. However, the "P-loop" (residues 360-370), which is likely to be involved in binding of the pyrophosphate group of AcCoA, probably moves ~1.0 Å when the substrate binds.

Human NAGS model
Since the primary sequences of mmNAGS/K and human NAGS have ~31% identity while the sequence identity of ngNAGS and human NAGS is only 17%, mmNAGS/K is likely to be a more reliable structural model for human NAGS than ngNAGS [5,10]. The human NAGS structure built with mmNAGS/K as the model using the SWISS-MODEL web server [31,32,33] is shown in Figure 7, with naturally occurring missense mutations identified in patients with clinical hyperammonemia shown as spheres. Among 14 missense mutations, 6 are located in the AAK domain and 8 are located in the NAT domain. The four neonatal missense mutations (S410P, L430P, W484R and A518T) are all located in the NAT domain close to the putative substrate binding sites. The model predicts that the side chain of Cys200 will be close to the side chain of Cys259 and could potentially form a disulfide bond, as previously predicted [34]. It also predicts that the arginine binding site and AcCoA binding site will be close to each other, and that the orientation of the NAT domain relative to the AAK domain will be intermediate between those of subunits Y and B in the mmNAGS/K structure. The model predicts that arginine binding will not cause closure of the AcCoA binding site because of the steric restraints imposed by the non-glycine hinge residue (Ala375). This prediction is consistent with arginine's role as an allosteric activator of human NAGS.
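The identity percentages used to choose the modeling template are computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over the columns where both sequences have a residue; the text does not state which denominator was used for its values, so this convention, the function name, and the toy alignment in the usage line are all assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns in which neither sequence
    has a gap. Both inputs must be rows of the same pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned residue pairs")
    return 100.0 * identical / aligned


# Toy alignment (not real NAGS sequence): 4 identities over 5 aligned columns.
toy_identity = percent_identity("ACDE-F", "ACEEGF")
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives lower values for the same alignment.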

Discussion
Although the subunit structures of bifunctional NAGS/K and classical bacterial NAGS have some similarities, there are substantial differences between them. Both have two-domain structures, consisting of a typical AAK fold and a GCN5-related NAT fold. The AAK domains are very similar, except for an extra N-terminal helix (Figure 3B) in mmNAGS/K. However, there are significant differences in the structures of the NAT domain, particularly in the C-terminal arm (Figure 3C). Importantly, the linker between the two domains consists of one amino acid in mmNAGS/K vs. three amino acids in ngNAGS, allowing much stronger interdomain interactions in mmNAGS/K than in ngNAGS, and different relative domain orientations (Figure S4A-B). As a result, the putative arginine and AcCoA binding sites are in proximity only in mmNAGS/K, creating the possibility of allosteric interactions between the binding sites.
However, the largest difference between mmNAGS/K and ngNAGS involves the quaternary structure. While the mmNAGS/K holoenzyme is organized as a tetrameric ring, ngNAGS functions as a hexamer (Figure S4C-D) [10]. The tetramer of mmNAGS/K is formed by AAK-AAK and NAT-NAT domain interactions which are unique to mmNAGS/K, while the hexamer of ngNAGS is stabilized by interactions between AAK domains at two major interfaces (the ecNAGK-like interface and the arginine-sensitive NAGK-like N-terminal interlaced helix interface) without involving the NAT domain. Arginine sensitive bacterial NAGKs that do not have a NAT domain have similar hexameric ring structures [8], confirming that a NAT domain is not important in the hexameric quaternary structure.
The mechanism of NAGS activity regulation by arginine appears also to be different in the two enzyme groups. In ngNAGS, the conformational changes induced by arginine propagate from the AAK domain to the NAT domain via the interdomain linker, re-orienting the NAT domain, and as a consequence disordering the glutamate binding loops to reduce enzyme activity [5]. In contrast, in mmNAGS/K, the interdomain interaction is stronger and the marked relative domain rotation proposed to occur upon L-arginine binding would close the AcCoA binding cleft preventing AcCoA from binding. The proposed regulatory mechanism of arginine is shown in Figure 8A. These differing allosteric mechanisms for arginine in the two enzyme groups are consistent with differences in arginine titration experiments. While arginine decreases ngNAGS activity by decreasing glutamate binding affinity [5], in xcNAGS/K and mmNAGS/K, arginine binding probably prevents binding of AcCoA [1,4].
As discussed above, and consistent with the above mechanisms, the length of the inter-domain linker appears to play a key role in the strength of the interactions between the AAK and NAT domains and the proximity of the arginine and CoA binding sites to each other. In addition, sequence comparisons indicate a clear correlation between the sequence of the linker and the allosteric effect of arginine (Figure 8B). In all bifunctional NAGS/K and fish NAGS (fugu fish, zebra fish and tetraodon) in which arginine inhibits NAGS activity, the linker consists of a glycine. When the effect of arginine is neutral, as in frogs, or activating, as in mammals, the linker contains an alanine, cysteine, or threonine, but never glycine. It appears that the steric constraints imposed by a non-glycine residue limit the magnitude of the conformational changes that can be induced by arginine and give rise to the variable allosteric effect of arginine on NAGS activity. Partial inhibition of NAGS activity by arginine in fish is further influenced by neighboring amino acid residues [4]. Even though the differences in the linker appear to play a major role in the variable arginine effect on NAGS activity, other residues, such as Arg276, His281 and Glu366, which are located at the interface between the NAT and AAK domains, may also be involved.
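The correlation described above can be written down as a small lookup. This is only a restatement of the pattern reported in the text, not a validated predictor; the function name and return strings are illustrative assumptions.

```python
# Linker residues named in the text, grouped by the reported effect of arginine.
# One-letter amino acid codes; the grouping restates the text, nothing more.
INHIBITORY_LINKERS = {"G"}                 # glycine: bifunctional NAGS/K, fish NAGS
NON_INHIBITORY_LINKERS = {"A", "C", "T"}   # alanine, cysteine, threonine


def reported_arginine_effect(linker_residue: str) -> str:
    """Return the arginine effect the text associates with a linker residue."""
    res = linker_residue.strip().upper()
    if res in INHIBITORY_LINKERS:
        return "inhibition"
    if res in NON_INHIBITORY_LINKERS:
        return "neutral or activation"
    # Any other residue falls outside the sequences compared in the text.
    return "not covered by the reported correlation"


for residue in ("G", "A", "C", "T", "P"):
    print(residue, "->", reported_arginine_effect(residue))
```

Because neighboring residues and interface residues may also contribute, as noted above, a single-residue lookup of this kind should be read as a summary rather than a mechanism.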
Twenty-one deleterious mutations that cause human disease (hyperammonemia) have been reported to date [34]. Among them, 14 are missense substitutions that may have structural relevance. Since patients with NAGS deficiency can be rescued by the administration of the NAG analogue, N-carbamylglutamate, identifying deleterious mutations that are likely to produce hyperammonemia is useful clinically [35,36]. It is therefore important to be able to predict the effects of amino acid substitutions on the structure and function of the enzyme. In this regard, a reliable protein structure of mammalian NAGS would be very useful. Unfortunately, mammalian NAGS has proven recalcitrant to crystallization, and the current mmNAGS/K structure provides the most reliable model to date for human NAGS. This model allows the potential impact of NAGS mutations on structure and function to be examined.
Known naturally occurring amino acid substitutions predicted from mutations in the NAGS gene have been mapped onto a human NAGS model (Figure 7) based on the mmNAGS/K structure [34]. Since the active site is located within the NAT domain, while the AAK domain has only structural and regulatory roles, with arginine enhancing NAGS activity in human NAGS, mutations in the NAT domain might be expected to have more severe functional and clinical consequences, presenting clinically at birth or shortly thereafter, while mutations in the AAK domain might be expected to allow residual enzymatic activity and to have milder phenotypes. Indeed, 5 of the 6 mutations in the AAK domain are associated with a milder (late onset) phenotype, while the age of onset of the patient with the sixth mutation (G236C) is unknown. In contrast, 4 of the 8 missense mutations in the NAT domain are associated with a severe (neonatal onset) phenotype.

Cloning and protein expression and purification
The argA/B gene was PCR amplified from M. maris MCS10 genomic DNA (generously donated by Dr. Craig Stephens, Biology Department, Santa Clara University, 500 El Camino Real, Santa Clara, CA) using Phusion™ polymerase (Finnzymes, New England BioLabs) and the primers 5'-CATATGAATCCGAATGCACCGGG-3' and 5'-GGATCCTCATTGCGGCGCCTCAAGGGT-3'. The PCR product was cloned into a TOPO vector using a Zero-Blunt TOPO cloning kit (Invitrogen). NdeI and XhoI (New England BioLabs) were used to transfer the argA/B gene from TOPO to the expression vector pET28a (Novagen) using T4 DNA ligase (Invitrogen), and the resulting construct was transformed into Rosetta 2 cells (Novagen) for expression. The sequence was confirmed by DNA sequencing on an ABI PRISM 3130 Genetic Analyzer (Applied Biosystems). Protein overexpression was induced by incubating overnight with 0.2 mM IPTG at room temperature (~298 K). The expressed protein has 20 non-native amino acid residues (MGSSHHHHHHSSGLVPRGSH) at its N-terminus, including six His residues and a thrombin recognition site (underlined). The cells were harvested by centrifugation, suspended in 30 ml of Ni-affinity lysis buffer (20 mM NaH₂PO₄, 300 mM NaCl, 10% (v/v) glycerol, 10 mM β-mercaptoethanol, pH 7.4) and disrupted by sonication. The protein was then purified using an ÄKTA FPLC system (GE Healthcare) following the protocol described previously [2]. Protein concentrations were determined by the Bradford method using the BioRad protein-assay dye reagent with bovine serum albumin as a standard [37].
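The composition of the vector-encoded leader quoted above can be checked with a few lines of code. This minimal sketch counts its residues, measures the His tag, and locates the thrombin site. It assumes the usual thrombin recognition sequence LVPRGS with cleavage between Arg and Gly; the text does not state that the tag was cleaved, so the last function is purely illustrative.

```python
# N-terminal leader encoded by the expression vector, as quoted in the text.
LEADER = "MGSSHHHHHHSSGLVPRGSH"
# Assumed thrombin recognition sequence; cleavage occurs between Arg and Gly.
THROMBIN_SITE = "LVPRGS"


def longest_run(seq: str, residue: str) -> int:
    """Length of the longest uninterrupted run of `residue` in `seq`."""
    best = current = 0
    for aa in seq:
        current = current + 1 if aa == residue else 0
        best = max(best, current)
    return best


def residues_left_after_cleavage(seq: str, site: str = THROMBIN_SITE) -> str:
    """Leader residues that would remain on the protein after thrombin cleavage."""
    start = seq.find(site)
    if start < 0:
        raise ValueError("thrombin site not found")
    # Cleavage is after the Arg, the fourth residue of LVPRGS.
    return seq[start + 4:]


print(len(LEADER))                           # number of non-native residues
print(longest_run(LEADER, "H"))              # length of the His tag
print(residues_left_after_cleavage(LEADER))  # residues left if the tag were removed
```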
The Se-Met substituted NAGS/K protein was prepared using the Overnight Express Auto-induction System 2 (Novagen) as described previously [2]. In brief, the expression plasmid was transformed into the metE minus host strain B834(DE3) (Novagen). The clone was inoculated into 1 L of sterile deionized water supplemented with the chemicals provided in the kit, 125 mg L-selenomethionine and 50 mg kanamycin. Vitamin B12 (cyanocobalamin) was added to a final concentration of 100 nM and the culture was incubated at 303 K for 16 hours. After reaching stationary phase, the cells were harvested and the protein was purified as described above for the native protein.
The Se-Met substituted protein was characterized using a 4700 ABI TOF/TOF mass spectrometer (Applied Biosystems) operated in reflector positive ion mode as described previously [38]. Approximately 10 mg of purified native or Se-Met substituted protein were digested overnight at 310 K using Promega sequencing grade trypsin (enzyme/protein ratio, 1:50, w/w) in 50 mM ammonium bicarbonate (pH 7.4). After desalting, 0.3 µL of the resulting peptide solution was mixed with 0.3 µL of saturated α-hydroxycinnamic acid and spotted on the MALDI plate. The Se-Met substituted peptides were identified using the characteristic features of the isotopic distribution of selenium. The intensity was compared to the native peptide signal at a position of 257 Da to establish that more than 80% of the protein was Se-Met substituted.
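As a rough illustration of the comparison described above, the sketch below computes the monoisotopic mass shift expected when sulfur is replaced by selenium in a peptide and estimates incorporation from a pair of peak intensities. The isotope masses are standard values for ³²S and ⁸⁰Se. The simple intensity ratio is an assumption made for illustration and is not necessarily the calculation used in the study, and the example intensities are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotopes of sulfur and selenium.
MASS_S32 = 31.97207
MASS_SE80 = 79.91652


def semet_mass_shift(n_met: int) -> float:
    """Expected monoisotopic mass increase for a peptide with n_met Met -> Se-Met."""
    if n_met < 0:
        raise ValueError("n_met must be non-negative")
    return n_met * (MASS_SE80 - MASS_S32)


def incorporation_fraction(i_semet: float, i_native: float) -> float:
    """Fraction of substituted peptide, assuming equal ionization efficiency."""
    total = i_semet + i_native
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return i_semet / total


print(round(semet_mass_shift(1), 3))       # shift for a peptide with one Met
print(incorporation_fraction(85.0, 15.0))  # hypothetical peak intensities
```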

Site-directed mutagenesis
Site-directed mutant genes of M. maris NAGS/K were created using primers containing the desired mutations (Table 2) and the QuikChange Mutagenesis Kit according to the manufacturer's protocol (Stratagene). Initially, a thermal cycle was applied to denature the double-stranded M. maris NAGS/K wild-type plasmid, and then the appropriate mutagenic primer was annealed to it. PfuTurbo DNA polymerase from the QuikChange kit was used to extend the primer without primer displacement and to seal nicks. The product was treated with DpnI to digest parental plasmids, which are susceptible because of their methylated DNA. Then, transformation into XL10-Gold ultracompetent cells allowed conversion of mutated single-stranded DNA to double-stranded plasmid DNA. Finally, the mutant DNA sequences were verified on an ABI PRISM 3130 Genetic Analyzer (Applied Biosystems) using the commercially available primer pair that anneals to the plasmid promoter and terminator regions.
To increase the number of methionine residues available for experimental phasing, Ile106, Ile294 and Leu376 were mutated to Met simultaneously, using the sequences of homologous proteins from other bacteria as a guide. The mutant was overexpressed and purified in the same way as the wild-type protein.
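To give a sense of how small these substitutions are at the DNA level, the sketch below counts the base changes needed to convert each isoleucine or leucine codon of the standard genetic code into the single methionine codon ATG. The codons actually present at positions 106, 294 and 376 of the M. maris gene are not given in this section (the primers are listed in Table 2), so the calculation covers all possibilities instead of the real ones.

```python
# Standard genetic code codons (DNA alphabet) for the residues involved.
ILE_CODONS = ("ATT", "ATC", "ATA")
LEU_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")
MET_CODON = "ATG"


def base_changes(codon: str, target: str = MET_CODON) -> int:
    """Number of positions at which `codon` differs from `target`."""
    if len(codon) != len(target):
        raise ValueError("codons must have the same length")
    return sum(a != b for a, b in zip(codon, target))


for name, codons in (("Ile", ILE_CODONS), ("Leu", LEU_CODONS)):
    for codon in codons:
        print(f"{name} {codon} -> Met {MET_CODON}: {base_changes(codon)} change(s)")
```

Every isoleucine codon is one base away from ATG, whereas a leucine codon needs one or two changes depending on which codon is used.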

Biochemical characterization: Activity assays and cross-linking experiments
The NAGS and NAGK activities of mmNAGS/K were measured in the presence of different concentrations of L-arginine using the method described previously [1]. In NAGS assays, 0.16 mg of enzyme were incubated in 100 µl of assay solution containing 2.5 mM AcCoA, 10 mM L-glutamate and 50 mM Tris-HCl pH 8.5 at 293 K for 5 min. The reaction was stopped with 100 µl of 30% TCA. NAG was quantified using liquid chromatography-mass spectrometry. The arginine titration curve was obtained using different concentrations of L-arginine. For NAGK activity, the enzyme (0.2 mg) was incubated in 500 µl of assay buffer containing 20 mM ATP, 100 mM NAG, 100 mM NaCl, 40 mM MgCl₂, 400 mM hydroxylamine and 20 mM Tris-HCl pH 7.4 for 30 min at 310 K. The reaction was terminated by adding 450 µl of ferric chloride solution (5% FeCl₃, 5% TCA and 0.3 M HCl). The absorbance of the colored reaction mixture was measured at 540 nm. Cross-linking experiments were performed using the protocol described by Davies and Stark [39]. mmNAGS/K and xcNAGS/K (0.15 mg) were incubated with the cross-linking reagent dimethyl suberimidate (0.25 mg) in 50 µl of solution containing 200 mM triethanolamine, pH 8.25, for three hours at 298 K. Samples with and without cross-linking reagent were subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (NuPAGE 4-12% Bis-Tris gel) in MES SDS buffer (50 mM MES, 50 mM Tris base, 0.1% SDS, 1 mM EDTA, pH 7.3) and stained with Coomassie blue. BenchMark protein standards, a premixed ladder of proteins of different molecular weights, were purchased from Invitrogen.
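When reading a cross-linking gel of this kind, it helps to know where the monomer through tetramer bands should run relative to the standards. The sketch below lists the expected masses of covalently linked oligomers for a given subunit mass. The 50 kDa subunit mass is a hypothetical placeholder, not a value taken from the text, and the small mass added by the cross-linker itself is ignored.

```python
def oligomer_ladder(subunit_kda: float, max_subunits: int = 4) -> list[tuple[int, float]]:
    """Expected masses (kDa) of cross-linked species from monomer up to max_subunits."""
    if subunit_kda <= 0 or max_subunits < 1:
        raise ValueError("subunit mass and subunit count must be positive")
    return [(n, n * subunit_kda) for n in range(1, max_subunits + 1)]


# Hypothetical subunit mass used only for illustration.
HYPOTHETICAL_SUBUNIT_KDA = 50.0

for n, mass in oligomer_ladder(HYPOTHETICAL_SUBUNIT_KDA):
    print(f"{n}-mer: {mass:.0f} kDa")
```

For a tetramer in solution, the heaviest cross-linked band is expected near four times the subunit mass, with no species above it.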

Crystallization
The purified protein was concentrated to 16 mg/ml with an Amicon-Y30 membrane concentrator (Millipore). Screening for crystallization conditions was performed using sitting-drop vapor diffusion in 96-well plates (Hampton Research) at 291 K by mixing 2 µl of the protein solution with 2 µl of the reagent solution from the sparse matrix Crystal Screens 1 and 2, and Index screen (Hampton Research). Further optimizations of the crystallization conditions were carried out using the hanging-drop vapor diffusion method.
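Mixing equal volumes of protein and reagent halves each component at the start of equilibration. The sketch below works through that dilution for the screening drops described above; the function name is an illustrative assumption, and the volumes may be in any unit as long as both use the same one.

```python
def initial_drop_concentration(stock: float, stock_volume: float,
                               other_volume: float) -> float:
    """Concentration of a component right after mixing, before equilibration."""
    if stock_volume <= 0 or other_volume < 0:
        raise ValueError("volumes must be positive")
    return stock * stock_volume / (stock_volume + other_volume)


# Protein at 16 mg/ml mixed with an equal volume of reagent solution.
print(initial_drop_concentration(16.0, 2.0, 2.0))  # mg/ml in the fresh drop
```

During vapor diffusion the drop loses water to the reservoir, so the protein concentration then rises from this starting value.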
Before crystallization, the enzyme was incubated with 25 mM CoA and 100 mM glutamate at 277 K for 1 hour. Different crystallization conditions yielded several different crystal forms: orthorhombic (space group P2₁2₁2₁), trigonal (space group P3₁21) and hexagonal (space group P6₂22). The best orthorhombic form crystals of native mmNAGS/K were grown from a well solution containing 25% PEG3350, 200 mM NaCl and 100 mM Bis-Tris, pH 6.5. The best crystallization conditions for the Se-Met substituted mutant were 25% PEG3350, 200 mM sodium malonate, pH 7.0. The trigonal form crystals were obtained from a well solution containing 23% PEG400, 100 mM Bis-Tris pH 6.5 and 1 mM 5-amino-2,4,6-triiodoisophthalic acid. Hexagonal form crystals were produced from a solution containing 25% PEG3350, 100 mM Li₂SO₄ and 100 mM Tris pH 9.0 and have the same morphology as the hexagonal bipyramid crystals of xcNAGS/K [2].

Data collection and structure determination
Before data collection, crystals were transferred from the cover slip on which they were grown to a well solution supplemented with 25% ethylene glycol and then frozen by direct immersion into liquid nitrogen. Data sets for Se-Met substituted proteins were collected at the selenium absorption edge, the inflection point and a remote position at the SER-CAT beamline of the Advanced Photon Source. Data sets for the native crystals were collected to ~3.1 Å resolution for the orthorhombic crystal form and ~4.3 Å resolution for the trigonal crystal form. All data were processed using the HKL2000 package [40]; statistics are summarized in Table 1. The diffraction data for the hexagonal crystal form of xcNAGS/K were reported previously [2].
The mmNAGS/K structure was solved using the three wavelength MAD (3W-MAD) protocol of Auto-Rickshaw, the EMBL-Hamburg automated crystal structure determination platform [41]. The input diffraction data were prepared and converted for use in Auto-Rickshaw using programs in the CCP4 suite [42]. The structure-factor amplitudes of anomalous scatterers (FA values) were calculated with the program SHELXC [43]. Based on an initial analysis of the data, the maximum resolution for substructure determination and initial phase calculation was set to 3.2 Å. The 25 Se atoms were identified using the program SHELXD [44] and the correct hand for the substructure was determined using the programs ABS [45] and SHELXE [43]. Occupancies of all substructure atoms were refined and initial phases were calculated with MLPHARE [42]. Density modification, phase extension and NCS-averaging were performed with RESOLVE [46]. A partial α-helical model containing 1503 residues out of a total of 1764 was produced with HELICAP [47]. After model adjustments with Coot [48], structural refinements were performed with Phenix [49]. During the initial stages of the refinement, NCS restraints were used and R and Rfree dropped to 32.0% and 42.6%, respectively, but did not decline further. In subsequent refinements, NCS restraints were removed and the structural models for each subunit were adjusted individually, revealing significant conformational differences between subunits. The final refinement, without NCS restraints but with translation/libration/screw parameters [50] included, resulted in R and Rfree values of 18.9% and 25.6%, respectively. The rather large difference between R and Rfree is probably due to anisotropic diffraction. The translation/libration/screw groups were selected based on five structural regions per subunit as shown in Figure 1A. The final model contains 4 protein subunits, 1 CoA, 2 malonates, 2 ethylene glycols, 1 sulfate group and 84 water molecules.
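The R and Rfree values quoted above follow the standard crystallographic residual, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, with Rfree computed over a test set of reflections excluded from refinement. The sketch below is a simplified illustration of that definition, not the Phenix implementation: it assumes the calculated amplitudes are already on the same scale as the observed ones, and the reflection values are made up.

```python
def r_factor(f_obs: list[float], f_calc: list[float]) -> float:
    """Crystallographic R factor, assuming f_calc is already scaled to f_obs."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by test-set flag and return (R_work, R_free)."""
    if not (len(f_obs) == len(f_calc) == len(free_flags)):
        raise ValueError("all inputs must have the same length")
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_free


# Made-up amplitudes: three working reflections and one test-set reflection.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 60.0, 30.0]
flags = [False, False, False, True]
print(r_work_and_free(f_obs, f_calc, flags))
```

Because the test set plays no part in refinement, a gap between the two values that keeps widening is a common warning sign of overfitting, which is why both numbers are reported.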
The native structure was refined and modeled in the same way as the Se-Met substituted protein, but with a new set of random reflections chosen for the calculation of Rfree. The final native structure model has 4 protein subunits, 2 CoAs, 2 glutamates and 6 water molecules. Refinement statistics for the final refined model are given in Table 1.
The structure of the trigonal crystal form was solved in space group P3₁21 with the dimer of subunits A and X of the orthorhombic crystal form as the search model and the program Phaser [51,52]. Rigid body refinement brought R and Rfree values to 42.9% and 43.1%, respectively, confirming the correctness of the structural solution. Further refinement with restraints from the reference model (the Se-Met substituted mmNAGS/K structure) resulted in R and Rfree values of 27.4% and 41.9%, respectively (Table S1). The structural refinement greatly improved when additional restraints from known homologous structures were introduced [53].
The structure of the hexagonal form of xcNAGS/K was solved in space group P6₂22 using CaspR, the web-server for automatic molecular replacement [54], and the structure of mmNAGS/K as the search model. The solution from molecular replacement is consistent with the electron density map constructed using experimental phases from MAD data [2]. Rigid body refinement reduced R and Rfree values to 49.6% and 48.7%, respectively. After mutating residues from the mmNAGS/K sequence to the xcNAGS/K sequence and manual model rebuilding, further refinement brought R and Rfree values down to 31.9% and 38.4%, respectively. Since the dataset was collected at the selenium edge from Se-Met substituted crystals and contained anomalous signals, scattering factors were included in the refinement. The refinement improved significantly for low resolution data with the inclusion of anomalous diffraction data, as reported [55]. Data collection and final refinement statistics for the trigonal crystal form of mmNAGS/K and the hexagonal form of xcNAGS/K are listed in Table S1.

Structural modeling
The structural model for the human NAGS subunit was built using the Swiss-Model web server with the mmNAGS/K structure (subunit A) as the template [31,32,33]. The model, checked using the program PROCHECK [56], has good stereochemical geometry, with 86.2% of dihedral angles in the most favored region of the Ramachandran plot and 11.9% in the generally allowed region. There are 9 bad contacts in the model. Coordinates for the human NAGS model are provided in Supporting Information S1 (hNAGS-model.pdb).

Protein Data Bank accession numbers
The final refined coordinates of the native and Se-Met substituted mutant structures of mmNAGS/K in orthorhombic space group P2₁2₁2₁ and in trigonal space group P3₁21, and the Se-Met substituted structure of xcNAGS/K in hexagonal space group P6₂22 have been deposited in the RCSB Protein Data Bank with accession codes 3S6H, 3S6G, 3S7Y and 3S6K, respectively.
Figure S1 A. Molecular packing of mmNAGS/K in the unit cell in space groups P2₁2₁2₁ and P3₁21, and of xcNAGS/K in space group P6₂22. Different tetramers are shown in different colors. B. Ribbon diagram of the native mmNAGS/K tetramer structure with subunit A (red), subunit B (green), subunit X (purple-gray) and subunit Y (yellow). Two bound CoA and glutamate molecules are shown as space-filling models. The glutamate binding site is remote from the non-functional CoA binding site. (TIF)
Table S1 Diffraction data and refinement statistics for mmNAGS/K and xcNAGS/K in space groups P3 1 21 and P6 2