Crystallography of a Lewis-Binding Norovirus, Elucidation of Strain-Specificity to the Polymorphic Human Histo-Blood Group Antigens

Noroviruses, an important cause of acute gastroenteritis in humans, recognize the histo-blood group antigens (HBGAs) as host susceptible factors in a strain-specific manner. The crystal structures of the HBGA-binding interfaces of two A/B/H-binding noroviruses, the prototype Norwalk virus (GI.1) and a predominant GII.4 strain (VA387), have been elucidated. In this study we determined the crystal structures of the P domain protein of the first Lewis-binding norovirus (VA207, GII.9) that has a distinct binding property from those of Norwalk virus and VA387. Co-crystallization of the VA207 P dimer with Ley or sialyl Lex tetrasaccharides showed that VA207 interacts with these antigens through a common site found on the VA387 P protein which is highly conserved among most GII noroviruses. However, the HBGA-binding site of VA207 targeted at the Lewis antigens through the α-1, 3 fucose (the Lewis epitope) as major and the β-N-acetyl glucosamine of the precursor as minor interacting sites. This completely differs from the binding mode of VA387 and Norwalk virus that target at the secretor epitopes. Binding pocket of VA207 is formed by seven amino acids, of which five residues build up the core structure that is essential for the basic binding function, while the other two are involved in strain-specificity. Our results elucidate for the first time the genetic and structural basis of strain-specificity by a direct comparison of two genetically related noroviruses in their interaction with different HBGAs. The results provide insight into the complex interaction between the diverse noroviruses and the polymorphic HBGAs and highlight the role of human HBGA as a critical factor in norovirus evolution.


Introduction
Noroviruses, a genus in the family Caliciviridae, are a group of small-round structured, nonenveloped RNA viruses causing epidemic acute gastroenteritis in all age in both developing and developed countries [1,2,3]. Structurally, norovirus possesses an outer protein capsid that encapsulates a single-stranded, positive sense RNA genome of ,7.5 kb. Norovirus capsid displays an icosahedral symmetry that is composed of 180 copies of a single major structural protein, the capsid protein (VP1), which organize into 90 dimers [4]. Each VP1 can be divided into two major domains, the N-terminal shell (S) and the C-terminal protruding (P) domains, linked by a short hinge. The S domain forms the interior shell; the P domain constitutes the arch-like P dimer of the capsid, while the hinge provides flexibility between the two major domains [4]. The P domain is further divided into P1 and P2 subdomains, forming the leg and the head of the protruding P dimer, respectively [4].
In vitro expression of full-length VP1 assembles empty virus-like particles (VLPs) that are structurally and antigenically indistinguishable from the authentic virus. The S and the P domains were found to be structurally and functionally independent. Expression of the S domain alone formed smooth, thin-layer S particles without the protruding structures [5,6], while production of P domain in vitro resulted in P dimer, P particle, and small P particle [6,7,8,9,10,11,12,13]. Functional analyses showed that the P dimer and the P particle, but not the S particle, bind to histo-blood group antigens (HBGAs), the host susceptible factors of noroviruses (see below), indicating that the P domain is the HBGA binding domain [3,6,8,13,14]. Since human noroviruses are not cultivatable in the laboratory, these VLPs and subviral particles are valuable tools for studying the virus-host interaction and as vaccine candidates for noroviruses. The P particles are particularly useful due to their simple procedures and high yields of production in a bacterial system.
Noroviruses recognize the HBGAs as important host susceptible factors [3,14,15]. HBGAs are the determinants of blood types that are complex carbohydrates presenting at the outermost ends of Nor O-linked glycans or glycolipids [16,17] on the mucosal epithelia of intestinal and genitourinary tracts. They are also present as free oligosaccharides in biologic fluids such as blood, saliva and milk. Noroviruses interact with HBGAs in a strain-specific manner, in which all common HBGAs such as A, B, H, and Lewis antigens are involved in norovirus recognition. The carbohydrate binding interfaces of noroviruses have been elucidated by crystallography [18,19,20] followed by mutagenesis studies [10,11,21]. They are located at the distal end of the arch-like P dimer, corresponding to the outermost surface of the viral capsid.
The HBGA-binding sites of two noroviruses, Norwalk virus (GI.1) and VA387 (GII.4), each representing a major genogroup (GI and GII) of human noroviruses, have been elucidated [18,19,20]. The HBGA-binding sites of the two strains differ in their precise locations, structures, sequence compositions and binding modes to carbohydrates, although they share the same targets of A and H antigens [22,23]. Further study showed that the HBGA-binding sites are conserved within but not between the two major genogroups of human noroviruses [11,19,20], although the two genogroups share the same ligand repertoire of human HBGAs. These data indicate that the HBGA binding is a prerequisite of norovirus infection and therefore important in the evolution of human noroviruses [11].
It has been noted for Norwalk virus (GI.1) that the same HBGAbinding interface can interact with different carbohydrate antigens by distinct bonds. For example, in the interaction with the type A trisaccharides, the a-GalNAc is the major contact, while in the case of the H pentasaccharide, both the b-Gal and the a-1, 2 Fuc are the major contacts [18,20]. On the other hand, VA387 (GII.4) recognizes both the A and B trisaccarides through the same a-1, 2 Fuc as major contact [19]. These data indicate a great flexibility in the interaction between noroviruses and HBGAs, in which the same binding interface can bind to different HBGAs (Norwalk to A/H and VA387 to A/B antigen), while distinct binding interfaces are able to bind the same HBGAs (Norwalk and VA387 to A antigen). This structural information helps the explanation of the complex virus-host interaction and evolution of noroviruses.
Both Norwalk virus and VA387 [3,14] recognize the secretor antigens (A/B/H), but not the non-secretor antigens. In this study, we performed a crystallographic study on a first non-secretor binding strain (VA207, GII.9) [23]. Our data indicated that VA207 uses the common genogroup II HBGA binding interface described for VA387 but interacts with a different set of HBGAs (Le y and sialyl Le x ) through a completely different binding mode. Instead of targeting at the secretor epitopes of the HBGAs in secretor binders (VA387 and Norwalk virus), VA207 targets at the non-secretor (Lewis) epitope (the a-1, 3 fucose) as the major contact. Mutagenesis study identified amino acids that are responsible for the type-specificity of VA207. These data for the first time elucidate the genetic and structural basis of strainspecificity by a direct comparison of two genetically related noroviruses in their interaction with different HBGAs, highlighting the role of human HBGA as a critical factor in norovirus evolution. The structural data would help antiviral development for disease control and prevention of noroviruses.

Expression and purification of recombinant P protein for crystallization
The cDNA fragment encoding P domain of norovirus VA207 (GII.9) (amino acid 222-537, GenBank accession#: AY038599) was cloned into the expression vector pGEX-6P-1 (GE Healthcare Life Sciences) and was expressed as a GST fusion protein in E. coli BL21 DE3 at 16uC overnight induced with 0.5 mM Isopropyl b-D-1-Thiogalactopyranoside (IPTG). The recombinant protein was purified using glutathione-sepharose 4B (GE Healthcare Life Sciences) according to the manufacturer's protocol. The P protein was cleaved from the GST tag with Prescission Protease (GE Healthcare Life Sciences) at 4uC overnight. The eluted P protein was confirmed by Mass Spectrometry Fingerprint to bind to E. coli chaperone GroEL, which could be further separated from P protein by Mono Q anion ion exchange (GE Healthcare Life Sciences) at pH 7.3. P protein was eluted at ,300 mM NaCl while GroEL was eluted at ,500 mM NaCl. P protein was then dialysed against 20 mM HEPES (pH 7.3), 150 mM NaCl before crystallization.

Production of P particle for mutagenesis study
The expression construct of wild type P particles of VA207 were generated previously [11] by cloning the P protein-encoding sequences into the plasmid pGEX-4T-1 (GE Healthcare Life Sciences). Mutant P particles with single amino acid mutation at the HBGA binding interface were constructed by site-directed mutagenesis using the wild type constructs as templates. Sitedirected mutagenesis was conducted using the QuickChange Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA) using following primer pairs: caggtgacgccacggcggcccatgaggcaag/cttgcctcatgggccgccgtggcgtcacctg (R346A), ctcaacctcaagcgcttttgaaacaaacc/ggtttgtttcaaaagcgcttgaggttgag (D374A), ccaataggtatcgccattgagggcaattct/ agaattgccctcaatggcgatacctattgg(Y389A), ccaggagctagtgcccacacaaatggg/cccatttgtgtgggcactagctcctgg (G440A), ccaggagctagtggcgccacaaatggggagatg/catctccccatttgtggcgccactagctcctgg (H441A), ttcatcccaggagctgctggccacacaaatgg/ccatttgtgtggccagcagctcctgggatgaa (S439A), as described previously [6,10,11,21]. The wild type and mutant P particles were expressed in E. coli BL21 DE3 and then purified using the Glutathione Sepharose 4 flow (GE Healthcare Life Sciences, Piscataway, NJ) according to the manufacturer's protocol as described elsewhere [6,8,9,10,11]. The P proteins were released from GST tag by thrombin (GE Healthcare Life Sciences) digestion. The formation of P particle was determined by gel filtration using a

Author Summary
The interactions of noroviruses with histo-blood group antigens (HBGAs) are diverse, in which strains in both genogroups I and II (GI and GII) recognizing either the secretor or the non-secretor (Lewis) HBGAs have been reported. The crystal structures of the HBGA binding interfaces of two secretor binders (Norwalk virus, GI.1 and VA387, GII.4) have been elucidated. In this study we determined the crystal structure of the HBGA-binding interface of the first Lewis-binder (VA207, GII.9) and compared it with those of the two secretor binders. VA207 binds to the Lewis antigens via the Lewis epitope (a-1, 3 fucose) as the major interacting residue, which is distinct from the two secretor binders that interact with the secretor antigens through the A or H epitope as a major interacting residue. In addition, precursor saccharide was involved in binding and has a role in strain-specificity of VA207. VA207 shares a conserved HBGA binding interfaces with VA387, suggesting a strong selection of human HBGAs in norovirus evolution. The distinct binding modes between these two GII strains suggest a potential host-driving force on the diversity of noroviruses by the polymorphic HBGAs. The crystal structures resolved in this study also would facilitate the antiviral drug design against noroviruses.
size-exclusion column Superdex 200 (GE Healthcare Life Sciences) powered by an AKTA-FPLC system (model 920, GE Healthcare Life Sciences) followed by SDS-PAGE electrophoresis, in which the P particles formed a peak at ,830 kDa. None of the designed single residue mutations in this study affected P particle formation and none of the mutants revealed any detectable reduction in reactivity to the hyperimmune serum against VLP compared with that of the wild type P particle.
Crystallization of P protein and its complexes with Le y and sialyl Le x tetrasaccharides P protein was concentrated to 12 mg/ml. Native crystals were grown using the hanging drop vapor diffusion method by mixing 1. Data collection and processing X-ray diffraction data were collected from flash-cooled crystals using an in-house Rigaku FR-E rotating anode X-ray generator with an R-AXIS IV++ detector at a wavelength of 1.5418 Å for native P protein crystals and P protein complexed with Le y tetrasaccharide crystals. Diffraction data for P protein complexed with SLe x tetrasaccharide crystals were collected at BL17A of the KEK Photon Factory (Japan) at a wavelength of 1.0000 Å . The cryoprotectant solution contained 12% (w/v) PEG 3350, 50 mM magnesium formate and 15% (v/v) glycerol. The data were processed, scaled and merged using the HKL2000 program package [24]. Statistics for data collection and processing are summarized in Table 1.

Structure determination and refinement
We used the crystal structure of VA387 P domain as the search model and the program Phaser [25] to solve the phases of the crystal structure of VA207 P protein. The amino acid sequence was then replaced with that of VA207 P domain, and manual adjustments were carried out with the program COOT [26] by the guidance of (2Fo-Fc) and (Fo-Fc) electron density maps, where Fo is the observed structure factor and Fc is the calculated structure factor. Further adjustment and refinement were carried out with the programs CNS [27], Refmac [28] and Phenix [29]. At the final stage of refinement, water molecules were added in (Fo-Fc) map at peaks (.3s) where they could form good hydrogen bonds with nearby residues. Statistics for structure refinement are summarized in Table 2. The refined structures were validated with the program PROCHECK [30]. No residue was found in a disallowed region of the Ramachandran plot. The structure analysis was performed by programs EdPDB [31] and PyMOL (DeLano Scientific LLC ).

HBGA-binding assay
The synthetic oligosaccharide-based binding assays were performed basically as described elsewhere [22,23]. The affinitycolumn purified P particles were diluted to 10 mg/ml as working solution. Synthetic oligosaccharides representing H1, H2, H3, A, B, Le a , Le b , Le x , and Le y were purchased from GlycoTech (Gaithersburg, MD), while the other two representing SLe a and SLe x were from The Consortium for Functional Glycomics (CFG, USA). The oligosaccharides were coated on 96-well microtiter plates (Dynex Immulon; Dynatech, Franklin, MA). After blocking by nonfat milk, P particles at 10 ng/ml were added. The bound P particle were detected using a rabbit anti-VA207 VLP antiserum (1:3300), followed by the addition of HRP-conjugated goat antirabbit IgG (ICN, Aurora, OH). Protein data bank accession The coordinates and structure factors for the VA207 P protein (3PUM) as well as its complexes with Le y tetrasaccharide (3PUN) and SLe x tetrasaccharide (3PVD) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ.

Crystallization of VA207 P protein
Norovirus VA207 (GII.9) has been demonstrated to recognize Le y , Le x and SLe x antigens [23] and thus was chosen as a model to study the structure of the binding interface to Lewis antigens. The P domain of VA207 (amino acid 222 to 537) was expressed in E. coli and the HBGA binding profile of the resulting P protein was verified (data not shown) before being used in the crystallography study. The protein sample was concentrated to 12 mg/ml and the crystallization was performed in the presence or absence of the Le y or SLe x tetrasaccharide. Structures of the native P protein and the P protein complexed with the oligosaccharide were solved with the molecular replacement method using diffraction data up to 2.2 and 2.0 angstroms (Å ) resolution, respectively.
The overall structure of VA207 P protein protomer The crystal of the native P protein of VA207 belongs to space group of P2 1 2 1 2 and contains a homodimer of P protein in an asymmetric unit. Most residues from 226 to 527 could be modeled in the electron density map, while residues 295 to 297 and 392 to 394 could not be modeled due to lack of interpretable electron density map in these loop regions. However, in both complex structures with the Le y and SLe x tetrasaccharides, the above missing residues could be modeled, suggesting that binding to either of the two tetrasaccharides stabilizes the two flexible loops.
The P protein protomer of VA207 shares similar structure features with that of VA387 (GII.4) [18,19] and Norwalk virus (GI.1) [4,18,20] (Figures 2 and 3). It can also be divided into two subdomains, P1 and P2. P2 subdomain, spanning from residue 275 to 415, is an insertion into P1 subdomain, splitting the latter into two fragments (222 to 274 and 416 to 537, Figure 2). The P1 subdomain has a mixed a/b structure. Two twisted antiparallel b sheets (b1-b8-b10 and b14-b1-b8-b13-b12-b11-b15, where b1 and b8 strands are shared within the two b-sheets) and an a helix form a hydrophobic core (Figure 3). While P1 forms the legs, P2 constitutes the head of the P dimer with a b-barrel made of 6 antiparallel b strands folded as a Greek key topology. The structure of the P protein is very similar between VA207 and VA387, with a perfect match in their P1 subdomains. However, significant differences were observed in the four loops on the P2 subdomain connecting b2-b3, b4-b5, b6-b7 and b7-b8, as a result of sequence variations in the four loops between the two strains ( Figure 2).

Structure of VA207 P protein dimer
The P protein forms a homodimer along a non-crystallographic two-fold axis, and the dimerization is essential for forming a functional carbohydrate binding site on the outermost surface of the virus capsid ( Figure 3) (see below). The P dimer has a dimension of 56 by 66 by 70 Å and contains a large buried surface area of 3, 300 Å 2 (for two protomers) formed by both polar and non-polar residues. Both subdomains contributed to the dimer interface: the a helices in the P1 subdomain interacts with each other by hydrogen bonds and hydrophobic interaction, while strands b5 and b9 from the b barrel in P2 subdomain was bound mainly by hydrogen bonds. Similar to the protomers, the general structure of VA207 P dimer is very similar to that of VA387, particularly in the P1 subdomain region. However, clear difference can be seen on the top surface of the P dimer (Figures 3).

The carbohydrate-binding interface of VA207
Co-crystallization of the P protein with Le y and SLe x tetrasaccharides has resulted in crystals in a new form of space group P2 1 2 1 2 1 instead of the P2 1 2 1 2 native form. In this new crystal form, an asymmetric unit also contains one homodimer along a noncrystallographic two-fold axis. The two complex structures were solved with the molecular replacement method using the native P protein dimer structure as the initial model. In both complex crystals the four saccharide rings could be clearly discerned from the (fo-fc) electron density map at a resolution of Figure 2. Structure-based sequence alignment between VA207 and VA387 P domains. Boundaries between P1 and P2 subdomains are indicated by blue arrows. Identical residues are highlighted in blue background, while similar residues are shown in red character. The secondary structures that are assigned by the program DSSP are shown for both P domains above and below their respective sequence. Coils represent g helices and a helices, whereas black arrows represent b strands. doi:10.1371/journal.ppat.1002152.g002 2Å , and their structures were modeled accordingly (Figure 4). Torsion angles of glycosidic linkages for the two tetrasaccharides are listed in Table 3.
The carbohydrate binding site of VA207 is located at the distal surface of the P dimer (Figure 3), corresponding to the outermost surface of a norovirus capsid. Like VA387 [19] the ligand binding site is located in the interface between two P proteins, indicating both protomers contribute to the ligand binding. The binding pocket for the Le y and SLe x tetrasaccharide is constituted by 7 amino acid residues, of which T345, R346, and D374 are from one protomer forming the ''bottom'' (T345 and R346) and a ''wall'' (D374), while G440, H441, S439 and Y389 are from the other protomer constituting the other ''walls'' of the pocket (Figures 5, 6 and 7).
Extensive hydrogen bond networks were observed between these amino acid residues and the tetrasaccharides of the Lewis antigens. In particular, T345, R346, D374 and G440 interact with the a-1, 3 fucose (Lewis epitope), Y389 and S439 with the N- acetylglucosamine saccharide, one of the two precursor saccharides, and H441 with the a-1, 2 fucose (H epitope on Le y tetrasaccharide) via a water molecule as a bridge (Figures 5 and 6). It is noted that 5 of the 7 amino acid residues that participate directly in interaction of VA207 with the Le y antigen are conserved in VA387 (GII.4) [11,19], supporting the notion that these two GII strains share a conserved genogroup II HBGAbinding interface as predicted previously [11]. However, unlike VA387 that recognizes A and B antigens through the a-1, 2 fucose (H epitope) as its major contact [19], VA207 interacts with the two Lewis antigens through the a-1, 3 fucose (Lewis epitope) as its main recognition moiety. Thus, a conserved carbohydrate binding interface can recognize different HBGAs with different modes among genetically related but distinct GII viruses.

The sialic acid did not participate in binding with VA207
The sialyl modified Le x was recently suggested to participate in binding to some GII noroviruses [32,33], and our binding assays showed that VA207 binds strongly to the SLe x ( Figure 8A). Therefore, we particularly studied whether the sialic acid residue participates in binding of VA207 by co-crystallization of the P protein with SLe x . Our data showed the sialic acid was far away from the P dimer surface ( Figure 6) and there seemed no hydrogen bond or other direct interaction with the VA207 capsid. The number of hydrogen bonds between the SLe x and P dimer was reduced to 9, which is one less than that in P dimer-Le y complex, due to the absence of the a-1, 2 fucose in the SLe x tetrasaccharide (Figures 5 and 6).
Both a-1, 3 and a-1, 2 fucoses are critical for interacting of VA207 with HBGAs To further characterize the role of individual amino acids in the HBGA-binding interface, mutations were introduced to residues that form hydrogen bonds with carbohydrate ligands followed by binding assays of the mutant P particles [7,8] to HBGA receptors. When single amino acid mutations were introduced to amino acids interacting with the a-1, 3 fucose-(R346A, D374A and G440A) and the a-1, 2 fucose (H441A) (Figures 5 and 6), all mutant P particles lost their binding completely (R346A, D374A and G440A) or nearly completely (H441A) to H3, Le y , Le x and SLe x compared with the wild type P particles (Figure 8 A to E). Saliva-based binding assay confirmed the results (data not shown) [11]. These data indicated that both a-1, 3 and a-1, 2 fucoses are important in VA207 recognition of HBGAs, in which the interactions of the a-1, 3 fucose may be more important.  The N-acetylglucosamine binding site affects the binding specificity of VA207 When single mutations were introduced to the two amino acids that form the N-acetylglucosamine (GlcNAc) binding site (S439A and Y389A) observed for VA207, the resulting VA207 P particle mutants demonstrated a change in binding specificity to HBGAs (Figure 8, comparing A with F and G). While both mutants lost binding to the Le y and SLe x , the Y389A mutant gained a strong binding to Le b (Figure 8F), and the S439A mutant retained binding to H type 3 antigen ( Figure 8G). Since the GlcNAc of Le b , Le y /Le x , and H type 3 are from type I, II, and III precursors, respectively, with different linkages and side chains (Figure 9), our data indicated that this GlcNAc binding site determines the specificity of VA207 in a type-specific manner (see discussion).

Discussion
In this study we determined the atomic structure of the carbohydrate binding interface of the first Lewis-binding norovirus (VA207, GII.9) using X-ray crystallography followed by a mutagenesis study. As a member of GII noroviruses, VA207 is expected to share a conserved carbohydrate binding interface [11] with that of VA387 (GII.4) [10,19]. This has been now demonstrated in this study by showing that the HBGA binding sites of VA207 and VA387 share highly conserved location, amino acid composition and overall structures of the binding pockets. We also elucidated the structural basis of the two strains in recognizing different sets of HBGAs through different interaction modes. VA207 recognizes the Lewis antigens through the a-1, 3 fucose Figure 6. Extensive interaction network between VA207 P dimer and 39 sialyl-Lewis X tetrasaccharide. (A) Stereo view of the binding cavity of VA207 P dimer and the bound 39 sialyl-Lewis X tetrasaccharide. Protein structure is in ribbon representation, while tetrasaccharide and major amino acid residues involved in interaction are in stick style. Hydrogen bonds are indicated by blue dotted lines; carbon, oxygen and nitrogen atoms are in grey, red and blue, respectively. The two P protomers are in cyan and purple and residues involved in saccharide interactions are labeled. (B) Schematic diagram of P protein and 39 sialyl-Lewis X tetrasaccharide interaction. Residues of P protein involved in the interaction are labeled with detailed atomic information. Hydrogen bonds were shown as dashed lines with indication of H-bond lengths. doi:10.1371/journal.ppat.1002152.g006 (the Lewis epitope) as a major contact (Figures 5 and 6), while VA387 interacts with the secretor antigens through the a-1, 2 fucose (the H epitope) as the major binding target [10,19]. These data provide structural evidence on how subtle changes in the HBGA binding interface affect binding specificity and revolution of genetic linkage of noroviruses under the selection by the polymorphic human HBGAs.
Genetically, VA207 and VA387 belong to two genotypes within the same genogroup II but share only 54% amino acid identity in the P domains. However, the P dimers of the two strains share very similar structures with a nearly perfect match between their P1 subdomains. The basic structures of the carbohydrate binding pockets of the two strains are also highly conserved, in which 4 of the 5 amino acids constituting the core structure of the binding pockets are identical between the two strains. These data indicate that the recognition of HBGAs is a prerequisite of norovirus infection, which plays an important role in norovirus evolution. In contrast, no such similarity is found with Norwalk virus (GI.1), confirming the genetic linkage between the two GII noroviruses. Major sequence variations between these two GII noroviruses have also been found at the top surface (Figure 7, top panel) of the P2 subdomains. This region is associated with the host immune selection in addition to HBGA recognition.
The conservation of the HBGA binding sites raises a question on how the strain-specificities of HBGA binding are determined. VA207 binds the Le x , Le y and H type 3 antigens, while VA387 has a broader binding spectrum to A, B, H type 3, Le b , and Le y [22,23]. Our mutagenesis analysis on both VA387 [10] and VA207 (this report) indicated that the integrity of the core structure of the HBGA binding pocket is strictly required for the binding function. Introducing single mutations to any residue  involved in the core structures have resulted in completely loss of binding to all HBGA types ( [10], Figure 8 B to E). These core structures include those interacting to the a-1, 3 fucose of VA207 (this report), the a-1, 2 fucose of VA387 and Norwalk virus, and the GalNAc of Norwalk virus [10,11,18,19,20].
However, some residues outside the core structures also play a role in determining the binding specificity of noroviruses and mutations at these residues can change carbohydrate targets depending on the types of HBGAs being involved. For example, Q331 and K348 in VA387 interact with GalNAc of the A antigen and a-Gal of the B antigen [10,11], while S439 and Y389 in VA207 interact with GlcNAc (Figures 4 and 5), a b-1, 4 linked saccharide in the type II precursor of Le x and Le y antigens. These residues are located in the similar positions outside the core structures of the binding pockets of the two GII strains (Figure 7), providing further evidence of selection of noroviruses by the host HBGAs. The demonstration of carbohydrate target switch helps the understanding of the complex interactions between the diverse noroviruses and polymorphic HBGAs of human hosts.
Additional strain-specificity of VA207 binding to HBGAs was observed through the mutagenesis study of S439 and Y389. Several hydrogen bonds between the GlcNAc of Le y and SLe x and the side chains of S439 and Y389 were evident in crystallography studies ( Figures 5 and 6). These bonds, together with those with the a-1, 3 and the a-1, 2 fucoses, form an interacting network responsible for the binding of VA207 to the Le y and the SLe x antigens. A single mutation of Y389A lost the binding to the type II Lewis antigens (Le x , SLe x , and Le y ) but gained strong binding to the type I Lewis antigen (Le b ), while no change in binding to the H type III antigens (Figure 8, comparing A with F). Similarly, the S439A mutant abolished the binding to Le x and Le y without altering the binding to H type III (Figure 8, comparing A with G). These results can be well explained by the structural difference between the type I (Le a and Le b ) and type II (Le x and Le y ) Lewis antigens. Both Le x and Le y are derivatives of the type II precursor (Galb1-4GlcNAcb-R, Figure 9B), of which the b-1, 4 GlcNAc interacts with S439/Y389 (Figures 5, 6, and 9B), whereas the Le b is a derivative of the type I precursor (Galb1-3GlcNAcb-R, Figure 9A) with a b-1, 3 GlcNAc. Thus, in Le b , the glycosidic linkages of the GlcNAc with a-Fuc and b-Gal should be switched compared with Le y , resulting in the GlcNAc residue flipping by about 180uaround the axis which is across and perpendicular to the C1-O5 and C3-C4 bonds of the GlcNAc. Consequently, the -CH3OH group of the GlcNAc in Le b would collide with the side chain of S439 (1.3 Å , Figure 9A, right), making the Le b sterically difficult to bind to the wild type VA207. However, the explanation as to how the two mutations (Y389A and S439A) changed the binding specificity of VA207 P protein remains unidentified based on the current complex structures. Since a simple replacement of the Y389 and S439 with an alanine in the crystal structure cannot explain well the observed change of binding specificity, unknown conformational change in the vicinity of Y389 and S439 may occur due to the mutation, which remains to be defined. These data indicate that the precursor may be a determinant of the typespecific HBGA interaction of noroviruses.
The facts that VA207 binds strongly to Le y but weakly to Le x and that the Y389A mutant binds strongly to Le b but not to Le a suggest that the interaction between H441 and a-1, 2 fucose may be important for binding activity. H441 forms a water-mediated hydrogen bond with a-1, 2 fucose of Le y (Figure 5), most likely also of Le b in the Y389A mutant, which was absent in either Le a or Le x ; this may be the reason of difference of binding between Le a / Le x and Le b /Le y . It may also explain why the H441A mutation abolishes all binding function. In addition, we noticed the presence of Van der Waals interaction between the side chain of H441 and 6-methyl group of the a-1, 3 fucose (Lewis epitope), which further emphasizes the importance of the H441 in the binding function of VA207. A similar structure formed by Y443 is also present in VA387 which interacts through Van der Waals force with the a-1, 2 fucose of the A/B antigens, and this hydrophobic interaction has been shown to be vital for the binding to HBGA receptors [19].
The role of sialic acid in binding of SLe x to VA207 was suggested by our in vitro binding assays ( Figure 8A). Similar binding activities of SLe x to GII.4 viruses were also observed in two previous studies [32,33]. However, our crystallography data excluded a direct participation of the sialic acid in binding to VA207 by the lack of direct interaction between the sialic acid and the P protein in the crystal structure. We hypothesize that the addition of the sialic acid residue may stabilize the structure of the SLe x antigen and thus impact positively the binding to VA207 compared with Le x without sialic acid. Since VA387 is the only GII.4 virus with known crystal structure of the HBGA binding interface, we measured whether VA387 binds to SLe a and SLe x antigens. We did not observe binding of VA387 to either antigens. We also noticed weaker interactions of VA207 wild type and mutants with Le x and Le a than Le b and Le y (Figure 8), in which the occurrence of the a-1, 2 fucose (H epitope) in the Le b and Le y antigens clearly play a role. Further study is needed to clarify this issue.
The binding of VA207 to the H type 3 (Fuca1-2Galb1-3GalNAc-R1, H3) antigen in addition to the two Lewis antigens (Le x and Le y ) extends our understanding on flexibility of carbohydrate binding mode. Since the H antigen does not have an a-1, 3 fucose (Lewis epitope, [3,14]), VA207 must recognize another saccharide as the major contact. The fact that single mutations at the core binding pocket abolish binding to all HBGAs indicates that VA207 must recognize the H type 3 antigen through the same binding interface. However, mutations of S439A and Y389A did not decrease the binding to H type 3 antigen, suggesting that this GlcNAc binding site is not required for binding to the H type 3. Thus, future study would be of significance to find out how the same binding interface of VA207 binds to the H type 3 antigen with a different binding mode in addition to the known binding to Le x and Le y antigens.
In summary, our study elucidates the structural basis of norovirus-Lewis antigen interaction at atomic resolution, which well explains the type-specific HBGA recognition. The structural and functional data generated in this study are valuable to understanding the complex interaction between diverse noroviruses and the polymorphic HBGAs of human hosts. Our current results also highlight the importance of human HBGAs as a critical factor in norovirus evolution, in which a functional carbohydrate binding interface is a prerequisite for norovirus survival. The high resolution pictures of the Lewis antigen binding interface illustrated in this study expand the foundation of our strategies to control and prevent norovirus-associated diseases.