Structure of a murine norovirus NS6 protease-product complex revealed by adventitious crystallisation.

Murine noroviruses have emerged as a valuable tool for investigating the molecular basis of infection and pathogenesis of the closely related human noroviruses, which are the major cause of non-bacterial gastroenteritis. The replication of noroviruses relies on the proteolytic processing of a large polyprotein precursor into six non-structural proteins (NS1–2, NS3, NS4, NS5, NS6pro, NS7pol) by the virally-encoded NS6 protease. We report here the crystal structure of MNV NS6pro, which has been determined to a resolution of 1.6 Å. Adventitiously, the crystal contacts are mediated in part by the binding of the C-terminus of NS6pro within the peptide-binding cleft of a neighbouring molecule. This insertion occurs for both molecules in the asymmetric unit of the crystal in a manner that is consistent with physiologically-relevant binding, thereby providing two independent views of a protease-peptide complex. Since the NS6pro C-terminus is formed in vivo by NS6pro processing, these crystal contacts replicate the protease-product complex that is formed immediately following cleavage of the peptide bond at the NS6-NS7 junction. The observed mode of binding of the C-terminal product peptide yields new insights into the structural basis of NS6pro specificity.


Introduction
Noroviruses are members of the calicivirus family of positive sense RNA viruses. In humans noroviruses cause rapid onset diarrhoea and vomiting, a condition often referred to as gastric flu. Noroviral gastroenteritis is estimated to affect 21 million people annually in the United States [1], and to be responsible for up to 200,000 deaths a year in developing nations [2]. Despite the obvious medical importance of norovirus infections, there are still no vaccines or antiviral treatments.
One of the reasons for the slow progress in drug and vaccine development has been the lack of cell culture and small animal model systems for studying the molecular mechanisms and pathology of noroviral gastroenteritis. These limitations have been partly overcome thanks to the relatively recent development of cell culture [3] and reverse genetics techniques for murine norovirus (MNV) [4][5][6]. MNV is closely related to human noroviruses and some strains are known to cause gastroenteritis in STAT1(−/−) mice [7]. The virus has therefore emerged as an important surrogate model for studying human norovirus infections at the molecular level.
The norovirus genome is approximately 7.5 kb in length and contains three open reading frames (ORFs). ORF2 and ORF3 encode the major and minor capsid proteins respectively [8,9], while ORF1 encodes a non-structural polyprotein that is cleaved into functional units by the viral protease (NS6pro) [10][11][12][13][14]. There are five cleavage junctions in the norovirus polyprotein, so processing ultimately releases six cleavage products, all of which play essential roles in the intracellular replication cycle of the virus [12,14]. These include NS1-2, a protein that disrupts cellular trafficking [15]; NS3, an ATPase and putative helicase [16]; NS4, a protein that also interferes with cellular protein trafficking within infected cells [17]; NS5, also known as VPg, the genome-linked protein that has key roles in the translation and replication of the viral RNA genome [18]; NS6pro, the viral protease itself [19]; and NS7pol, an RNA-dependent RNA polymerase [20].
Although there is some variation in the sequence of the five cleavage sites recognised by norovirus NS6pro, common features are identifiable. The amino acid on the N-terminal side of the cleaved peptide bond (the P1 position in the nomenclature of Schechter and Berger [21]) is typically glutamine or glutamic acid, while the P1′ amino acid on the C-terminal side is usually glycine or alanine; the residues at the P2 and P4 positions tend to be large and hydrophobic in nature [12,14].
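The consensus features described above can be expressed as a simple rule check. The following Python sketch is illustrative only: the function name, the membership of the "large hydrophobic" residue class and the example P1′ residue are assumptions made for the example, not part of the published analysis.

```python
# Assumed residue class for "large and hydrophobic"; the exact membership
# is an illustrative choice, not a definition taken from the paper.
LARGE_HYDROPHOBIC = set("FILMVWY")


def check_consensus(p_side: str, p1_prime: str) -> dict:
    """Compare a junction with the consensus preferences described in the text.

    p_side   -- residues N-terminal to the scissile bond, ending at P1
                (at least four residues, i.e. P4..P1)
    p1_prime -- the single residue immediately after the scissile bond
    """
    if len(p_side) < 4 or len(p1_prime) != 1:
        raise ValueError("need at least P4-P1 and exactly one P1' residue")
    p1, p2, p4 = p_side[-1], p_side[-2], p_side[-4]
    return {
        "P1 is Gln or Glu": p1 in {"Q", "E"},
        "P1' is Gly or Ala": p1_prime in {"G", "A"},
        "P2 is large hydrophobic": p2 in LARGE_HYDROPHOBIC,
        "P4 is large hydrophobic": p4 in LARGE_HYDROPHOBIC,
    }


# Hypothetical junction: a P4-P1 sequence of LEFQ followed by an assumed P1' Gly.
result = check_consensus("LEFQ", "G")
for rule, satisfied in result.items():
    print(f"{rule}: {satisfied}")
```

A check of this kind only encodes the broad tendencies listed above; it says nothing about cleavage efficiency, which has to be measured experimentally.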
X-ray crystallographic analysis of the NS6 protease from three different strains of human norovirus [19,22,23] has revealed the protein to be a cysteine protease with a chymotrypsin-like fold, similar to that of the 3C proteases from picornaviruses [24,25]. The NS6pro fold is composed of two β-barrel domains tightly packed against each other; the interface between these two domains forms the peptide-binding cleft, at the centre of which is the protease active site, a catalytic triad consisting of cysteine, histidine and aspartic or glutamic acid [22]. While the catalytic mechanism of NS6pro and 3Cpro enzymes has not been fully defined, it has been shown to involve all three amino acids of the triad [22][26][27][28] and is therefore likely to resemble the mechanism of the serine proteases.
In one of these crystallographic studies, the structure of human norovirus NS6pro was determined in complex with a peptide-like inhibitor [23]. In this structure the N-terminal half of the peptide (positions P5 to P1) was covalently attached, by the action of a C-terminal Michael acceptor, to the cysteine nucleophile in the active site of the enzyme. Although the mode of binding was similar to that observed for other chymotrypsin-like proteases, the chemical modifications of the peptide that were made to add the Michael acceptor to the inhibitor resulted in steric clashes that distorted the protease active site. In this report we extend the structural work on noroviral proteases by describing the X-ray structure of full-length MNV NS6pro. Fortuitously, the structure has been solved in a crystal form in which the C-terminus of one protease extends into the active site of another. This provides new insights into peptide recognition by MNV NS6pro.

Results and Discussion
To prevent autolysis of the protease at the high concentrations used for crystallisation trials, MNV NS6pro was inactivated by mutation of the active site nucleophile to alanine (C139A). This mutant protease was expressed in E. coli and purified using affinity and size exclusion chromatography as described in the Materials and Methods. The protein crystallised by hanging drop vapour diffusion in 20% w/v PEG 3350, 0.2 M KSCN, 0.1 M Bis-Tris propane pH 7.5 to yield crystals that belong to space group C222₁ and have two molecules in the asymmetric unit. They diffracted X-rays to 1.6 Å resolution, producing an electron density map of extremely good quality for residues 1-123 and 132-183 in both molecules of the asymmetric unit. However, electron density for the solvent-exposed loop formed by residues 124-131 was very weak, presumably due to disorder, and so this loop was not included in the NS6pro model. The evident flexibility of this loop is consistent with previous crystallographic analyses of human norovirus NS6 proteases, which found the loop to be either disordered, as in the Norwalk virus NS6pro structure [22], or to adopt very dissimilar conformations, as in the structures of Chiba virus (CV) and Southampton virus (SV) NS6pro [19,23]. The final refined crystal structure of MNV NS6pro has an Rfree value of 20.8% and very good stereochemistry (full data collection and refinement statistics are summarised in Table 1).
The overall structure of the MNV NS6 protease is nearly identical to those of Norwalk virus, SV and CV NS6pro (Fig. 1), which comes as no surprise given the high degree of amino acid identity (~60%) between these proteins. As described previously, the structure is an abbreviated form of the chymotrypsin-like 3C proteases from picornaviruses [19,22], which comprise two β-barrel domains. Specifically, in norovirus NS6 proteases, the β-strands on one side of the N-terminal β-barrel are so much shorter that the domain is better considered as a single β-sheet decorated by loops (Fig. 1A). The active site triad of H30, D54 and C139 (A139 in our mutated protein) is located in the centre of the peptide-binding cleft formed at the interface of these two domains.
In the norovirus NS6pro crystal structures published to date it has already been observed that, whereas the N-terminal helix found in picornavirus 3C proteases is retained in NS6pro, the C-terminal helix in 3Cpro is not found in the norovirus proteases [19,22,23]. Instead the C-terminal polypeptide of NS6pro appears to be largely unstructured; it is either disordered or adopts very different conformations that appear to be determined by crystal contacts (Fig. 1B). Strikingly, in our structure the flexibility of the C-terminal polypeptide has actually allowed it to associate with the peptide-binding site of a neighbouring molecule in the crystal, a happy accident that has revealed the structure of an MNV NS6pro-peptide complex (Fig. 2A). The interaction occurs for both molecules of the asymmetric unit, although in each case the C-terminus extends from the body of the protease in a different direction towards its neighbour (Fig. 2B). For chain A, the interaction is symmetric: it donates its C-terminus to a neighbouring A-chain that is related by crystallographic two-fold symmetry and binds the C-terminus received from this molecule. This pair of molecules thus forms a closed complex. In contrast, chain B donates its C-terminus to a neighbouring B-chain but binds the C-terminus of a third molecule (also a B-chain). Despite these differences, the interactions made with the neighbouring protease by the residues corresponding to the P4-P1 amino acids of the donated C-terminus are in each case essentially identical (Fig. 2C).
Superposition of the crystal structure of Southampton virus NS6pro with a covalently-attached peptide-like inhibitor shows that the positions of the P4-P1 amino acids are very similar to those in our MNV NS6pro 'co-crystal' structure [23] (Fig. 3A, C). The positions of the P4-P1 amino acids observed in the peptide-binding site of MNV NS6pro are also very similar to the equivalent residues in the co-crystal structure of the FMDV 3Cpro:peptide complex [29] (not shown). This indicates that, although the formation of an MNV protease-peptide complex is an accident of crystallisation, crystal-packing constraints do not prevent the C-terminal residues from adopting a physiologically-relevant conformation in the binding site of a neighbouring molecule.
The adventitious insertion of the NS6pro C-terminus into the peptide-binding cleft of neighbouring proteases in our crystals has provided us with a detailed view of a structure that corresponds to the protease-product complex of MNV NS6pro that forms following cleavage of the NS6-NS7 junction in the viral polyprotein. All of the amino acids on the P-side of the cleavage junction are visible in the electron density map and are included in the model. The primary contacts with the protease are made by the carboxylate group of P1-Gln and the side-chains of P1-Gln, P2-Phe and P4-Leu. Although the P5 and P6 amino acids make some contact with the protease, their positions differ for molecules A and B in the asymmetric unit. These differences are likely to be due to differences in crystal packing, so our structure does not give a clear indication that they are important for peptide recognition.
Table 1 footnotes. 1 Values for highest resolution shell given in parentheses. 2 $R_{merge} = 100 \times \sum_{hkl} |I_j(hkl) - \langle I_j(hkl) \rangle| / \sum_{hkl} \sum_j I(hkl)$, where $I_j(hkl)$ and $\langle I_j(hkl) \rangle$ are the intensity of measurement $j$ and the mean intensity for the reflection with indices $hkl$, respectively. 3 $R_{work} = 100 \times \sum_{hkl} ||F_{obs}| - |F_{calc}|| / \sum_{hkl} |F_{obs}|$. 4 $R_{free}$ is the $R_{model}$ calculated using a randomly selected 5% sample of reflection data that were omitted from the refinement. 5 RMS, root-mean-square; deviations are from the ideal geometry defined by the Engh and Huber parameters [45]. doi:10.1371/journal.pone.0038723.t001
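The R-factor definitions given in the Table 1 footnotes can be illustrated numerically. The sketch below is a minimal Python example under stated assumptions: the structure-factor amplitudes and the free-set flags are invented for illustration, and the function name is hypothetical; it is not the refinement software used for this structure.

```python
def r_factor(pairs):
    """Return 100 * sum(||Fobs| - |Fcalc||) / sum(|Fobs|) for (Fobs, Fcalc) pairs."""
    pairs = list(pairs)
    denominator = sum(abs(f_obs) for f_obs, _ in pairs)
    if not pairs or denominator == 0:
        raise ValueError("need at least one reflection with non-zero |Fobs|")
    numerator = sum(abs(abs(f_obs) - abs(f_calc)) for f_obs, f_calc in pairs)
    return 100.0 * numerator / denominator


# Invented amplitudes, for illustration only.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 20.0]
# Flags marking reflections withheld from refinement (the "free" set).
# In practice the free set is a random sample (5% in this paper).
is_free = [False, False, False, True]

working_set = [(o, c) for o, c, flag in zip(f_obs, f_calc, is_free) if not flag]
free_set = [(o, c) for o, c, flag in zip(f_obs, f_calc, is_free) if flag]

r_work = r_factor(working_set)  # computed on reflections used in refinement
r_free = r_factor(free_set)     # same formula, computed on withheld reflections
print(f"Rwork = {r_work:.1f}%, Rfree = {r_free:.1f}%")
```

Because the free reflections never influence refinement, Rfree serves as a cross-validation statistic: the formula is the same as for Rwork and only the set of reflections differs.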
The co-crystal structure of SV NS6pro with a P5-P1 (EFQLQ) peptide-based inhibitor covalently attached to the active site nucleophile [23] reveals a conformation for the bound peptide over the region P1-P4 that is very similar to that in the MNV NS6pro-peptide complex, although there are some differences in detail caused by sequence variation and a notable structural alteration at the active site due to the chemical modification of the peptide inhibitor needed to introduce the reactive Michael acceptor (see below). P1 residue recognition by MNV NS6pro is mediated primarily by a specificity triad of residues composed of H157, T134 and Y143. The side-chain carbonyl of the P1-Gln in the peptide (Q183) forms hydrogen bonds with the imidazole ring of H157 and the T134 hydroxyl group (Fig. 3A); in turn, H157 is stabilised by a hydrogen bond from the hydroxyl group of Y143. This mode of interaction could be replicated by a P1-Glu residue and accounts for the ability of NS6pro to cleave substrates with P1-Gln or P1-Glu. The specificity triad, which is also found in human norovirus NS6pro structures, is conserved in picornaviral 3Cpro enzymes from poliovirus, human rhinovirus, coxsackie A virus (CAV) and FMDV [24][30][31][32] and explains their ability to cleave after P1-Gln or P1-Glu, although FMDV 3Cpro is the exception in not having a strong preference for P1-Gln [29,33].
The P1-Gln of the peptide 'product' is also stabilised by interactions made by its carboxylate group with the main-chain amides of G137 and A139 (the oxyanion hole), and with the ε-amine group of H30 (Fig. 3A). An essentially identical set of interactions was observed to stabilise the carboxylate group of the P1 residue in the co-crystal structure of the picornavirus CAV 3Cpro in complex with a peptide corresponding to the N-terminal half of the cleavage product [30] (Fig. 3F), and in the crystal structure of the Tobacco Etch virus 3C-like NIa protease where, as in our structure, the C-terminus of the protease is observed to be bound in the active site (although in the case of NIa crystals it is not known whether this is a cis or trans insertion) [34]. In contrast, this mode of interaction did not occur in the complex of SV NS6pro with a peptide-like inhibitor because of the non-standard nature of the C-terminus of the SV peptide [23]; in that case the carboxylate group at the end of the peptide is separated from the Cα of the P1 residue by an additional C-C bond, which has the effect of distorting the geometry of the SV NS6pro active site (Fig. 3C). In particular, the side chain of H30 in the catalytic triad of SV NS6pro is rotated out of the position needed for catalysis. The active site geometry observed in the MNV NS6pro structure, on the other hand, is almost identical to that of the unliganded Norwalk virus protease, allowing for the fact that the general acid in the catalytic triad is a Glu in Norwalk virus NS6pro but an Asp in MNV (Fig. 1A, B) [19,22].
The side-chain of F182 at the P2 position in the MNV NS6pro structure is sandwiched between H30 and I109 at the entrance to the hydrophobic S2 pocket (Fig. 3A, B) but also contacts V114, V158, A159 and D54. The largely apolar nature of the S2 pocket accounts for the general selectivity for hydrophobic amino acids at the P2 position in norovirus cleavage junctions [14] (Fig. 4). An exception to this is the NS4/NS5 junction of the MNV polyprotein, which has a P2-Ser. It seems unlikely that the side-chain hydroxyl of this amino acid could interact specifically with the S2 pocket, suggesting that the NS4/NS5 junction may be a sub-optimal substrate for the protease. Intriguingly, the S2 pocket is more enclosed in human norovirus NS6pro structures because the Q110-G111-R112 sequence at the top of the loop that forms one flank of the pocket is replaced in MNV NS6pro by the G110-S111-A112 sequence, which has much smaller amino acids (Fig. 3A-D). The side-chain of P3-Glu (E181) in the MNV NS6pro-peptide complex makes no specific interactions with the protease, though its position is stabilised by β-strand-like interactions between its main-chain amide and carbonyl groups and the main-chain surrounding A160 (Fig. 3A), a mode of binding that is common to chymotrypsin-like proteases. The absence of any interaction with the P3 side-chain explains the diversity of residues (Q|G|K|N|E) observed at this position in MNV cleavage junctions [14], a feature that is also shared by picornavirus 3Cpro cleavage junctions [29,30].
In contrast, at the P4 position in the 'peptide' the side-chain of L180 occupies the hydrophobic S4 specificity pocket, which is made up of T166, T161, I109, A159, I168 and I107. This mode of interaction is very similar to the binding of apolar P4 amino acids observed in structures of SV NS6pro [23] and picornaviral 3C proteases [29,35], reflecting the general, but not universal, conservation of hydrophobic functional groups at P4 in specific substrates for these enzymes [36]. In MNV cleavage junctions there is a strong preference for large hydrophobic residues at P4 (W|Y|I|F|L), all of which could be accommodated in the deep, apolar S4 pocket.
Beyond the P4 position there appears to be little specific interaction between the peptide and the protease. In MNV NS6pro the side chain of the P5 residue (A179) does not contact the protease. This finding echoes the lack of observable contact between FMDV 3Cpro and the P5 amino acid of its peptide substrate [29]. Moreover, the high diversity of amino acids at P5 in MNV cleavage junctions (E|D|W|K|A) is consistent with the notion that this residue is not a determinant of specificity (Fig. 4). Although the P5-Glu in the co-crystal structure of SV NS6pro with a peptide-like inhibitor was observed to interact with the main chain amide of Q109 via a water molecule (Fig. 3C), this could be an artefact of crystallisation [23]. While it remains true that efficient catalysis by NS6pro requires substrates that have at least five residues on the P-side of the scissile bond (removal of the P5 residue from a synthetic substrate for SV NS6pro resulted in a nine-fold decrease in the kcat/KM specificity constant [23]), there is no evidence that residues beyond P4 contribute to specificity.
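The correspondence between residue numbers and Schechter and Berger positions used throughout this discussion follows from counting back from the C-terminal residue, which occupies P1. The short Python sketch below illustrates that bookkeeping; the function name is hypothetical, and the five residues are those named in the text (A179, L180, E181, F182 and Q183).

```python
def label_p_side(c_terminal_residues: str, last_residue_number: int) -> dict:
    """Label residues P_n..P1, counting back from the C-terminal (P1) residue.

    c_terminal_residues -- one-letter codes in sequence order, ending at P1
    last_residue_number -- sequence number of the final (P1) residue
    """
    n = len(c_terminal_residues)
    labels = {}
    for index, amino_acid in enumerate(c_terminal_residues):
        position = n - index  # the first residue in the string is P_n
        number = last_residue_number - (n - 1 - index)
        labels[f"P{position}"] = f"{amino_acid}{number}"
    return labels


# The last five residues of MNV NS6pro as named in the text.
labels = label_p_side("ALEFQ", 183)
for position, residue in labels.items():
    print(position, residue)
```

The same numbering extends to P6 and beyond simply by supplying a longer C-terminal segment.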
It is of interest to note that, despite the observation in the crystal structure of numerous specific contacts between the P4-P1 residues and the protease, comparative analysis of the proteolysis of mutated NS6pro cleavage sites in the polyprotein has tended to suggest that there are relatively few strong determinants of cleavage specificity [28,37]. Although these studies reported tolerance of substitutions at positions P4, P2, P1 and P1′, this interpretation is based on analyses of self-processing of polyprotein precursors at a single time point following translation in rabbit reticulocyte lysates [28] or following over-expression in E. coli [37]. Such techniques are lacking in sensitivity. For example, our previous work with FMDV 3Cpro showed that analysis of polyprotein cleavage rates by over-expression in E. coli could not distinguish the wild-type enzyme from a mutant that had only 1% of wild-type activity in in vitro assays (using purified protein and synthetic peptide substrates) [26]. We suspect therefore that, consistent with the crystal structure reported here (Fig. 3) and broad patterns of sequence conservation (Fig. 4), NS6pro does indeed discriminate between amino acids at the P1, P2 and P4 positions. The contribution to specificity of residues on the P′ side of the scissile bond remains to be fully investigated.
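The contrast between conserved and variable positions can be made concrete using the residue sets quoted in the text for the MNV cleavage junctions. The Python sketch below is a rough illustration: the hydrophobic residue class is an assumption chosen for the example, and the fractions describe only the listed residues, not cleavage kinetics.

```python
# Residues reported in the text at each position of the MNV cleavage junctions.
OBSERVED = {
    "P5": "EDWKA",
    "P4": "WYIFL",
    "P3": "QGKNE",
}

# Assumed hydrophobic class; the membership is an illustrative choice.
HYDROPHOBIC = set("AFILMVWY")


def hydrophobic_fraction(residues: str) -> float:
    """Fraction of the listed residues that fall in the assumed hydrophobic class."""
    return sum(aa in HYDROPHOBIC for aa in residues) / len(residues)


fractions = {pos: hydrophobic_fraction(res) for pos, res in OBSERVED.items()}
for position, fraction in fractions.items():
    print(f"{position}: {fraction:.1f}")
```

Under this assumed classification every P4 residue is hydrophobic, whereas P3 and P5 are mixed, in line with the view that P4 is a specificity determinant while P3 and P5 are not.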
In summary, crystallisation of MNV NS6pro has provided an unexpected insight into the nature of protease-peptide interactions because of the insertion of the C-terminus of the protein (a product of NS6pro cleavage) into the peptide-binding cleft of neighbouring molecules in the crystal. Crystallisation has in effect captured a snapshot of the protease-product complex that forms immediately following resolution of the covalent intermediate that forms during catalysis. The protease-peptide interactions revealed in this structure provide a basis for the engineering of specific NS6pro inhibitors.
It has occurred to us that by extending the C-terminus of MNV NS6pro to incorporate the sequence of the NS6-NS7 cleavage junction, or by modifying the sequence to that of other cleavage junctions, it may be possible to exploit the same crystal form to obtain structures of NS6pro-substrate complexes. This would permit examination of protease-peptide interactions on the P′ side of the cleavage junction, and work towards this goal is now in progress in our laboratory.

Materials and Methods
cDNA for MNV1 strain CW1 NS6pro (a kind gift from Dr Ian Goodfellow; GenBank accession number YP724460) was amplified by PCR and ligated into the expression vector pETM11 [38], which contains a thrombin-cleavable N-terminal 6x polyhistidine tag. QuikChange mutagenesis (Stratagene) was used to introduce a C139A mutation into NS6pro to knock out the active site nucleophile of the enzyme. The primers used in cloning and mutagenesis are given in Table 2. The protein was expressed in BL21 (DE3) pLysS Escherichia coli cells by induction with 1 mM IPTG for 3 hours at 37 °C. The protein was purified on TALON resin (Clontech); the polyhistidine tag was removed by overnight incubation using approximately 10 U of thrombin (Sigma-Aldrich) per milligram of purified protein. The cleaved protein was further purified by size exclusion chromatography using a HiLoad 16/60 Superdex 75 column (GE Healthcare). Peak fractions were concentrated to 750 μM in 25 mM Tris-HCl, pH 8.0 containing 200 mM NaCl and 5 mM dithiothreitol. The purified protein contained 185 residues comprising a non-native GS sequence at the N-terminus, followed by all 183 residues of MNV NS6pro.