The Murine Coronavirus Hemagglutinin-esterase Receptor-binding Site: A Major Shift in Ligand Specificity through Modest Changes in Architecture

The hemagglutinin-esterases (HEs), envelope glycoproteins of corona-, toro- and orthomyxoviruses, mediate reversible virion attachment to O-acetylated sialic acids (O-Ac-Sias). They do so through concerted action of distinct receptor-binding (“lectin”) and receptor-destroying sialate O-acetylesterase (“esterase”) domains. Most HEs target 9-O-acetylated Sias. In one lineage of murine coronaviruses, however, HE esterase substrate and lectin ligand specificity changed dramatically as these viruses evolved to use 4-O-acetylated Sias instead. Here we present the crystal structure of the lectin domain of mouse hepatitis virus (MHV) strain S HE, resolved both in its native state and in complex with a receptor analogue. The data show that the shift from 9-O- to 4-O-Ac-Sia receptor usage primarily entailed a change in ligand binding topology and, surprisingly, only modest changes in receptor-binding site architecture. Our findings illustrate the ease with which viruses can change receptor-binding specificity, with potential consequences for host, organ and/or cell tropism, and for pathogenesis.


Introduction
To initiate infection, viruses must bind to an appropriate host cell. Selectivity of binding is ensured by attachment proteins on the virion, tailored to recognize one or, at the most, a limited number of cell surface molecules. Remarkably, a large number of viruses, representative of at least 11 distinct families, several of which are of clinical and/or veterinary importance, use sialic acid (Sia) as receptor determinant. Owing to differential modification, Sia structural diversity exceeds that of any other monosaccharide [1]. The most common type of Sia substitution, O-acetylation at carbon atoms C4, C7, C8 and/or C9, occurs in a host-, organ- and even cell-specific fashion, such that even individual cells of the same type and tissue may differ in their Sia expression profile [2][3][4]. Viruses have evolved to selectively use particular Sia variants and their attachment proteins are high-specificity sialolectins, the binding of which might depend on the identity of the penultimate residue in the sugar chain, the type of glycosidic linkage and/or the presence or absence of substitutions [5][6][7][8][9]. Ultimately, this preference in Sia receptor usage affects host, organ and cell tropism [10][11][12][13][14], the course and outcome of infection [15][16][17][18], as well as the efficacy of intra- and cross-species transmission [14,19], all to extents not yet fully appreciated.
The hemagglutinin-esterases (HEs) are a class of Sia-binding envelope glycoproteins found in some negative-stranded RNA viruses, namely in influenza C and infectious salmon anemia virus (family Orthomyxoviridae; [5,20]), but also in toro- and coronaviruses, positive-stranded RNA viruses in the order Nidovirales [21,22]. From phylogenetic and comparative structural analyses it appears that toro- and coronaviruses acquired their HE proteins separately via horizontal gene transfer, with a hemagglutinin-esterase-fusion (HEF)-like protein as progenitor [22][23][24][25]. Like influenza C virus HEF, most nidovirus HEs bind to 9-O-acetylated (9-O-Ac) Sias and, correspondingly, display sialate-9-O-acetylesterase receptor-destroying enzyme activity [25]. Murine coronaviruses, however, occur in two closely related biotypes that differ in HE ligand/substrate preference. One of these, represented by mouse hepatitis virus (MHV) strain DVIM, displays the presumptive ancestral specificity and targets 9-O-Ac-Sias, while the other, represented by MHV strain S, appears to have evolved to use 4-O-Ac-Sias instead [6][25][26][27] (for supplementary introduction see Text S1 and Figure S1). Given the stereochemical differences between these Sia variants (Figure 1) and the essentially different requirements for ligand and substrate recognition by the respective HEs, the question arises how this major shift in receptor usage was achieved and what changes must have occurred in the receptor-binding and O-acetylesterase domains to make this transition possible.
The crystal structures of a number of 9-O-Ac-Sia-specific nidovirus HEs have been solved [23,24]. Unlike the receptor-binding site (RBS) of influenza C virus HEF [28], the RBSs of the corona- and torovirus HEs seem to be exceptionally plastic, as they appear to have undergone significant changes and adaptations that altered their overall architecture in a relatively short evolutionary time span. Based on these observations, we anticipated and speculated [23] that this plasticity might have allowed for even more substantial adjustments in the RBS of the murine coronavirus HE so as to produce an entirely novel binding site specific for 4-O-acetylated Sias.
We now present the crystal structure of the MHV-S HE receptor-binding domain, both in its native state and in complex with a receptor analogue. The data reveal in exquisite detail how the RBS changed to accommodate 4-O- instead of 9-O-acetylated Sias. Surprisingly, however, this shift in receptor usage seems to have involved primarily a change in ligand binding topology and relatively modest changes in RBS architecture.

Results/Discussion
Expression, purification, and biochemical characterization of MHV-S HE
We produced the ectodomain of MHV-S HE (residues 25-403) as an Fc-fusion protein, either in enzymatically active (HE-Fc) or inactive form (HE⁰-Fc), by transient transfection of HEK293 cells. MHV-S HE⁰-Fc bound to horse serum glycoproteins (HSG), which are decorated with 4-O-acetylated sialic acids (4-O-Ac-Sia), but carry little to no 9-O-Ac-Sias (Figure 2A; [29]). The receptor determinants in HSG could be destroyed by treatment with MHV-S HE-Fc, but not by treatment with BCoV-Mebus HE-Fc (a sialate-9-O-acetylesterase; Figure 2B). No binding of MHV-S HE⁰-Fc was observed to bovine submaxillary mucin (BSM), a glycoconjugate devoid of 4-O-Ac-Sias (Figure 2A; [30]). The MHV-S HE ectodomain, released from HE-Fc by thrombin cleavage, retained proper sialate-4-O-acetylesterase activity when assayed for substrate specificity with a synthetic di-O-acetylated Sia (5-N-acetyl-4,9-di-O-acetylneuraminic acid α-methylglycoside, αNeu4,5,9Ac₃2Me; Figure 2C). In hemagglutination assays, MHV-S HE⁰ specifically bound to 4-O-acetylated Sias (Figure 2D). The combined findings show that the recombinant MHV-S HE proteins are biologically active, both as Fc fusion proteins (Figures 2A and B) and after the removal of the Fc tail by thrombin cleavage (Figures 2C and D), which we take as an indication of proper folding and protein stability.

Author Summary
Glycans cover the surface of every living cell. In vertebrates, these sugar trees commonly terminate with sialic acid (Sia) and, in consequence, Sias have become the attachment factors of choice for a multitude of pathogens: protozoa, bacteria and viruses alike. To ensure selectivity, viruses evolved to target distinct Sia species. Whether a particular type of Sia serves as receptor may depend, amongst others, on the absence or presence of specific Sia modifications. For example, most group A betacoronaviruses attach to 9-O-acetylated Sias. However, some murine coronaviruses have switched to using 4-O-acetylated Sias instead. In chemical/molecular terms this represents a momentous shift in receptor usage. We have now crystallized the hemagglutinin-esterase protein (HE) of a murine coronavirus and have solved the structure of its sugar-binding domain. Our findings reveal in exquisite detail the interactions between Sia binding site and cognate receptor. The data allow a reconstruction of how, during coronavirus evolution, the switch in receptor usage may have come about.

Structure determination and overall structure
Crystals of free MHV-S HE and of a complex of HE⁰ with αNeu4,5Ac₂2Me diffracted to 2.1 and 2.5 Å resolution, respectively. The structures were solved by molecular replacement using BCoV-Mebus HE (PDB ID 3CL5; [23]) as template (BCoV-Mebus and MHV-S HE share 59% sequence identity; for crystallographic details, see Table 1).
In overall structure, the HE of MHV-S closely resembles that of BCoV-Mebus. It assembles into homodimers and the monomers are composed of three modules: a small membrane-proximal (MP), a receptor-binding (R), and a central esterase (E) domain (Figures 3A-C; [23]). The MP domain is virtually identical to that of BCoV-Mebus HE, with a root mean square difference (rmsd) on main chain Cα atoms of only 0.48 Å. Unfortunately, the residues in the E domain that form the catalytic site were disordered in both crystals. Hence, the molecular basis for the unusual substrate specificity of MHV-S HE remains unknown. The structure of the R domain, however, was resolved, and in the complex the ligand molecule is well-defined (Figure S2). The R domains of MHV-S and BCoV-Mebus HE are highly similar, with an rmsd on main chain Cα atoms of 0.79 Å.
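As a brief illustration of the rmsd values quoted above, the following minimal sketch computes a root mean square difference over matched Cα positions. It assumes the two coordinate sets have already been superimposed and paired residue by residue; the function name `rmsd` and the coordinates are hypothetical and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square difference between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superimposed and that the i-th
    point in each list refers to the same (matched) C-alpha atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates (in Å) for three matched C-alpha atoms, for illustration only.
ca_model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
ca_model_2 = [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 0.5, 0.0)]

print(round(rmsd(ca_model_1, ca_model_2), 2))
```

In practice the superposition step (which this sketch omits) determines the value obtained, so figures such as 0.48 Å and 0.79 Å depend on which atoms are matched and how the domains are aligned.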

MHV-S HE has a unique receptor binding-site
The receptor-binding sites of BCoV-Mebus and MHV-S HE are very much alike in architecture. This is particularly surprising given the considerable differences in ligand preference and in their requirements for binding (i.e. binding of 9-O-Ac-Sia in a 9-O-Ac-dependent fashion versus binding of 4-O-Ac-Sia in a 4-O-Ac-dependent fashion, respectively; Figure 1). The MHV-S HE receptor-binding site (RBS), like that of BCoV-Mebus HE, is composed of 5 surface-exposed loops, four of which extend from the conserved 8-stranded "Swiss roll" core structure (loops R1 through R4; Figures [23]). While the overall organization of the MHV-S RBS is similar to that of BCoV-Mebus HE, the orientation of the receptor analogue with respect to the RBS is strikingly different (Figures 4B, 5A and B). As compared to the ligand in the BCoV-Mebus HE binding site (Figures 5C and D), the αNeu4,5Ac₂2Me receptor analogue is rotated by about 90° and shifted by about 2.5 Å. Figures 5A and B show how residues from the four R-loops, the E-loop and the RBS-hairpin interact with the Sia receptor molecule. Two hydrogen bonds are formed between the nitrogen and oxygen main-chain atoms of Lys217 and the oxygen of the C4 acetyl group and the nitrogen of the 5-N-acetyl group, respectively. The Ser220 main chain nitrogen accepts an additional, weak hydrogen bond from the C8 hydroxyl group of the ligand (Figure 5B).
Most remarkably, the hydrophobic pocket that in BCoV-Mebus HE accommodates the 9-O-acetyl moiety of the receptor (comprised of Leu161, Tyr184, Leu266 and Leu267), arguably the most crucial element of the BCoV HE RBS, is conserved in MHV-S HE (comprised of Ile166, Tyr189, Tyr281, and Leu282), but it now accepts the Sia 5-N-acetyl group, while the Sia glycerol side chain is solvent exposed (Figure 5A). Moreover, the hydrophobic patch in the BCoV-Mebus HE RBS that interacts with the Sia 5-N-acetyl group (Figure 5C) apparently changed into a shallow pocket that accommodates the Sia 4-O-acetyl moiety (Figure 5A). The residues orthologous to BCoV-Mebus HE Thr114, Leu161, Phe211, and Leu266 were replaced by Leu119, Ile166, Ser216, and Tyr281, respectively, and Leu260 was recruited from the R4-loop, which in MHV-S HE is reoriented as compared to the one in BCoV-Mebus HE (Figure 4A). These residues, together with conserved Phe212, form the hydrophobic lining of the newly shaped pocket (Figures 5A and B). As the Sia-4-O-acetyl group is crucial for ligand recognition by MHV-S HE, this pocket must be key to receptor binding. In accordance, single Ala substitutions of Leu119, Ile166, Phe212, Leu260, or Tyr281 all reduced receptor-binding activity (although that of Ile166 to a lesser extent), as shown by hemagglutination assay (Figure 5E) and solid-phase lectin binding assay (Figure 5F).

Relatively modest changes in the MHV-S receptor-binding site change ligand specificity
The data not only reveal in minute detail the mode of interaction between MHV-S HE and its cognate receptor determinant, but also clarify how a CoV HE RBS for 9-O-Ac-Sia might have transformed into one that now specifically binds 4-O-Ac-Sia. The most striking observation is that this major shift in ligand specificity required only minimal changes in the protein and that the binding site architecture was essentially maintained. How this was possible can be explained from the mode of lectin-ligand interaction, based largely on the docking of the methyl groups of the Sia-acetyl moieties into hydrophobic pockets, and from the structures of the two types of ligands. The juxtaposition of the Sia 5-N- and 9-O-acetyl moieties is quasi-similar to that of the Sia 4-O- and 5-N-acetyl groups. The distance between the groups may be different (7.1 versus 5.7 Å as measured between the methyl carbon atoms, respectively), but for each combination the acetyl groups are located in roughly the same plane and at roughly similar angles (Figure S3). Thus, it can be envisaged that a pre-existing site for 9-O-Ac-Sia was converted to accommodate 4-O-Ac-Sia instead by (i) having the ligand rotate (with binding of the ligand in the novel orientation facilitated through hydrogen bonding with residues introduced by substitutions and/or insertions in the R3 loop) and (ii) by bringing the original 9-O-acetyl binding pocket and 5-N-acetyl binding patch more closely together so that they now can accept the 5-N- and 4-O-acetyl moieties, respectively (Figure S3). From attempts to fit αNeu5,9Ac₂2Me into the MHV-S RBS by in silico modelling, the 9-O- and 5-N-acetyl groups would seem to be spaced too far apart to conveniently dock into the acetyl-binding pockets. Moreover, were the ligand to bind in this orientation, the Sia carboxylate would clash with the modified R3 loop. These findings thus provide an explanation for exclusion of the original ligand and for the specificity of MHV-S HE for 4-O-Ac-Sias (Figure S3 and Video S1).
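The spacing argument can be made concrete with a small sketch. The two methyl-methyl distances (7.1 and 5.7 Å) are the values quoted in the text; the helper name `methyl_distance` and the example coordinates are hypothetical and serve only to show how such a distance would be measured from atomic coordinates.

```python
import math


def methyl_distance(atom_a, atom_b):
    """Euclidean distance (Å) between two atoms given as (x, y, z) tuples."""
    return math.dist(atom_a, atom_b)


# Hypothetical coordinates (Å) of two acetyl methyl carbons, for illustration only.
example = methyl_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
print(example)

# Methyl-methyl spacings quoted in the text (Å).
SPACING_5N_9O = 7.1  # 5-N- and 9-O-acetyl groups
SPACING_4O_5N = 5.7  # 4-O- and 5-N-acetyl groups

# Difference in spacing that the two acetyl-binding sites would have to absorb
# for one ligand type to be exchanged for the other.
spacing_difference = round(SPACING_5N_9O - SPACING_4O_5N, 1)
print(spacing_difference)
```

The difference of roughly 1.4 Å is consistent with the description of the two binding pockets having moved more closely together, although the sketch says nothing about pocket depth or shape.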
The structure of the MHV-S HE-receptor complex allows only guarded predictions of how glycosidic linkage or additional Sia modifications might affect ligand binding. The C2-oxygen through which glycosidically bound Sia would be linked to the penultimate residue of the glycan chain is exposed to the solvent and we would therefore expect the lectin to bind Sias in a linkage-independent fashion. Still, the R4- and/or E-loops, as they are proximal to Sia C2 (Figure 4A), might affect ligand binding such as to cause a preference for a particular linkage type. The pocket for the Sia 5-N-acetyl group would seem sufficiently wide to also accommodate the slightly larger 5-N-Gc substituent (Figure 5A); whether the lectin does accept 5-N-glycolylated Sias as ligands remains to be shown, however. Finally, from the topology and orientation of αNeu4,5Ac₂2Me in the RBS of MHV-S HE, ligand binding would seem to be tolerant to modifications at the Sia glycerol side chain (Figure 5A). Yet, as demonstrated by hemagglutination assay with native and sialate-9-O-acetylesterase-treated erythrocytes, MHV-S HE apparently prefers 4-mono-O- over 4,9-di-O-acetylated Sias [27].
The occurrence of two distinct MHV lineages, exemplified by strains S and DVIM, that through their HE proteins bind to widely different Sia subtypes poses an interesting conundrum. While the structure reported here provides clues to how an HE protein ancestral to that of MHV-S may have changed to bind to 4-O- rather than to 9-O-acetylated Sias, the conditions that selected for this shift in ligand specificity and the biological consequences thereof are unknown. The limited data available on the in vivo role of HE suggest that it promotes viral spread [31]. Entry of murine coronaviruses, however, is mediated not by HE, but by the S protein, a type I fusion protein that binds to the principal receptor CEACAM1a [32][33][34]. We propose that HE may act during the very early stages of the infectious cycle as a molecular timer for temporary virion attachment. Through the concerted actions of its lectin and sialate-O-acetylesterase domains, HE would allow virus particles to bind with high avidity and yet reversibly to sialylated surfaces. The time allowed for virions to remain attached would be a function of HE binding affinity/avidity, esterase activity and local Sia density. By binding to the ubiquitous and highly accessible Sias in the glycocalyx, virions would buy time for the S protein to find and bind the main receptor at the cell's surface as an obligatory prelude to penetration. Such a strategy would be advantageous particularly under conditions of low receptor density or poor receptor accessibility. If, within the allotted time, HE-mediated virion attachment did not progress to this next stage of entry (for example, because the particle attached not to a susceptible cell, but to decoy receptors on a non-cell-associated glycoconjugate), the default would be for the virus to elute and "take its business elsewhere". In this model, MHV HE would appreciably contribute to host cell selection, its ligand preference potentially affecting host, organ and cell tropism. Our findings pave the way to study the function of CoV HE and to assess the importance of ligand and substrate specificity through an approach of structure-guided mutagenesis, reverse genetics and animal experimentation in a natural infection model.

Surface representation of the MHV-S HE receptor-binding site reveals two pockets accommodating the 4-O- and 5-N-acetyl groups of the receptor, respectively. Note that crystals were soaked with αNeu4,5,9Ac₃2Me, but most likely as a result of the low pH crystallization conditions, the 9-O-Ac group was lost [42]. (C) Surface and (D) stick representation of the BCoV-Mebus HE receptor-binding site. Note that the topology of the two hydrophobic pockets is conserved, except they bind different substituents of the receptor analogue. (E) The effect of Ala substitutions on receptor binding. Relative binding affinity of wild-type HE⁰ (wt) and its derivatives was assessed by hemagglutination assay with rat erythrocytes and twofold serial dilutions of each of the HE⁰-Fc chimeras (5,000 to 10 ng per well, arrow). (F) Binding of twofold serial dilutions of wild-type (wt) HE⁰-Fc chimera and its derivatives in a solid-phase lectin-binding assay towards horse serum glycoproteins (HSG) as described in Figure 2A.

Protein expression and purification
A synthetic DNA with human codon-optimized sequence for the HE ectodomain of MHV strain S (MHV-S; amino acid residues 25-403) was cloned in pCD5-Ig [23,24], a derivative of expression plasmid S1-Ig [35]. The resulting construct, pCD5-MHV-S-HE-T-Fc, codes for a chimeric HE protein provided with an N-terminal CD5 signal peptide and, at its C-terminus, preceded by a thrombin cleavage site, the Fc domain of human IgG1 (HE-Fc). The QuikChange XL II site-directed mutagenesis kit (Stratagene) was used to construct pCD5-MHV-S-HE-T-Fc derivatives that code for an enzymatically inactive HE-Fc with the esterase catalytic residue Ser45 replaced by Ala (HE⁰-Fc), and for HE⁰-Fc mutants with Ala substitutions in the receptor-binding site. For analytical purposes, HE-Fc fusion proteins were produced by transient expression in HEK293T cells and then purified from the cell culture supernatants by protein A-affinity chromatography and low-pH elution (0.1 M citric acid, pH 3.0). The pH of the eluate was neutralized by adding Tris pH 8.0 to a final concentration of 0.2 M and the protein solution was dialyzed against phosphate-buffered saline (PBS). For crystallography, HE-Fc fusion proteins were transiently expressed in HEK293 GnTI(-) cells [36] and the MHV-S ectodomain was purified by protein A-affinity chromatography and on-the-beads thrombin cleavage as described [23,24].

Solid-phase lectin binding assay (SLBA)
Maxisorp 96-well plates (NUNC) were coated for 16 hrs at 4 °C with horse serum glycoproteins (HSG; 10% v/v horse serum in PBS) or bovine submaxillary mucin (BSM; 10 μg/ml; Sigma) at 100 μl per well. The wells were washed with washing buffer (PBS, 0.05% Tween-20) and treated with blocking buffer (PBS, 0.05% Tween-20, 2% bovine serum albumin, BSA) for 1 hr at RT. Twofold serial dilutions of HE⁰-Fc lectins were prepared in blocking buffer (starting concentration 100 μg/ml) and 100 μl samples of these dilutions were added to the glycoconjugate-coated wells. Incubation was continued for 60 min, after which unbound lectin was removed by washing three times. Bound lectin was detected using an HRP-conjugated goat anti-human IgG antiserum (1:10,000 in blocking buffer; Southern Biotech) and TMB Super Slow One Component HRP Microwell Substrate (BioFX) according to the manufacturer's instructions. The staining reaction was terminated by addition of 0.3 M phosphoric acid, the optical density was measured at 450 nm, and graphs were constructed using GraphPad software. To assess and compare the enzymatic activities of BCoV-Mebus and MHV-S HE-Fc towards 4-O-acetylated Sias, HSG coated in Maxisorp plates was treated with samples from twofold serial dilutions of either enzyme (starting at 100 ng/ml in PBS, 100 μl/well) for 2 hrs at 37 °C. The destruction of 4-O-Ac-Sia receptor determinants was determined by SLBA with MHV-S HE⁰-Fc (5 μg/ml in blocking buffer) as described above. Enzymatic de-O-acetylation of αNeu4,5,9Ac₃2Me was analyzed by gas chromatography-electron impact mass spectrometry (GC-MS) as described [24,25,37].
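The twofold serial dilutions used in this assay follow a simple halving scheme, sketched below. The function name `twofold_series` and the number of dilution steps are assumptions made for illustration; the protocol states only the starting concentration, and the values returned are in whatever unit the starting concentration is given in.

```python
def twofold_series(start, steps):
    """Return `steps` concentrations, each half of the previous one.

    `start` is the starting concentration; the result is expressed in the
    same unit. The number of steps is not given in the protocol and is a
    free parameter here.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start / 2 ** i for i in range(steps)]


# Example: eight wells starting from a nominal value of 100 (assumed step count).
series = twofold_series(100.0, 8)
for well, concentration in enumerate(series, start=1):
    print(well, concentration)
```

The same scheme applies to the enzyme dilutions used for the receptor-destruction experiment, with a different starting concentration.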

Hemagglutination assay
Hemagglutination assays were performed in V-shaped 96-well plates (Greiner Bio-One). Twofold serial dilutions in 50 μl PBS, 0.1% BSA of HE⁰-Fc or of purified HE⁰ ectodomains (starting amounts indicated in the text) were mixed with 50 μl of a rat erythrocyte suspension (Rattus norvegicus strain Wistar; 0.5% in PBS) and incubated for 2 hours on ice.

Crystallization
Crystallization conditions were screened by the sitting-drop vapor diffusion method using a Honeybee 961 (Genomic Solutions). Drops were set up with 0.2 μl of HE protein solution in 10 mM Tris-HCl pH 8.0 and 0.2 μl reservoir solution. Crystals with space group P2₁2₁2₁ were obtained from 0.2 M KH₂PO₄, 0.2 M sodium malonate, 15% (w/v) PEG3350 and 0-5% (w/v) glycerol at 18 °C. Crystals for diffraction experiments were grown with the hanging-drop vapor diffusion method, set up by hand with a reservoir to protein solution ratio of 1:1 (1.6 μl total) at 18 °C, and grew to a final size of up to 0.25 × 0.20 × 0.20 mm within one week. For data collection, crystals were flash-frozen in liquid nitrogen using reservoir solution containing 20% (w/v) glycerol as the cryoprotectant. To determine the HE structure in complex with its receptor, crystals of HE⁰ were soaked by adding 2 μl of 10 mM αNeu4,5,9Ac₃2Me in cryoprotectant solution directly into the margin of the drop, resulting in a final substrate concentration of about 7 mM. Crystals were flash-frozen after 5 to 10 minutes.
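The final ligand concentration during soaking follows from simple dilution arithmetic, sketched here. The drop volume at the time of soaking is not stated in the protocol, so the value used below is an assumption, as are the function names; volumes may be given in any unit as long as the same unit is used for both.

```python
def final_concentration(stock_mm, added_volume, drop_volume):
    """Concentration (mM) after adding a ligand stock to a ligand-free drop."""
    return stock_mm * added_volume / (added_volume + drop_volume)


def implied_drop_volume(stock_mm, added_volume, final_mm):
    """Drop volume that would give `final_mm` after adding the stock."""
    return added_volume * (stock_mm / final_mm - 1.0)


# Stock concentration (10 mM), added volume (2) and final value (about 7 mM)
# are taken from the text; the drop volume is back-calculated, not reported.
drop = implied_drop_volume(10.0, 2.0, 7.0)
print(round(drop, 2))
print(round(final_concentration(10.0, 2.0, drop), 1))
```

Under these assumptions, a final concentration of about 7 mM would correspond to a drop somewhat smaller than the volume at which it was set up, as might be expected after equilibration by vapor diffusion.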

Data collection and structure solution
Diffraction data of crystals of MHV-S HE and its complex (Table 1) were collected at ESRF stations ID-14-1 and ID-14-3, respectively. Diffraction data of native and ligand-soaked HE crystals were processed using XDS [38] and scaled using SCALA from the CCP4 suite [39]. Molecular replacement was performed using PHASER with BCoV-Mebus HE as template (PDB ID: 3CL5; [23]). Models were built manually with Coot [40] and refinement was carried out using REFMAC [41]. Water molecules were added using ARP/WARP, and graphics were generated with PYMOL (http://pymol.sourceforge.net).
In the Ramachandran plot, three residues are found in disallowed regions. The electron density of these residues supports the modeled conformation. In both HE monomers present in the asymmetric unit of the crystal structure of free as well as ligand-bound HE, the active site region of the esterase domain is largely disordered. No electron density is observed for esterase domain residues A52-A59, B51-B59, A108-A114, A308-A314, A335-A347 and B338-B346, while residues 44-50, 60-72, 332-334, and 348-358 adopt different conformations in the two monomers. Modeling of chain A residues 397-401 and chain B residues 334-337 and 394-398 should be considered tentative. C-terminal residues 396-403, followed by the 7-residue thrombin recognition sequence of the cleavable Fc-fusion, are stabilized by crystal packing interactions, suggesting that the observed conformation is not physiologically relevant.

Stick representation of αNeu4,5,9Ac₃2Me in gray with (left) the 5-N-Ac- and 9-O-Ac- or (right) the 4-O-Ac- and 5-N-Ac-groups colored in red (oxygen), blue (nitrogen) and cyan (carbon). Arrows indicate the distances between Ac methyl groups and asterisks the position of the O2 atom through which Sia would be linked to the penultimate residue of the glycan chain. The patch that accommodates the Sia-5-N-Ac group in BCoV-Mebus HE, and the newly formed pocket that accommodates the Sia-4-O-Ac group in MHV-S HE, are colored in cyan. The pocket that harbors the Sia-9-O-Ac group in BCoV-Mebus HE and now accommodates the Sia-5-N-Ac group in MHV-S HE is colored in green. (B) Surface representation of the MHV-S HE receptor-binding site with Neu5,9Ac₂2Me modeled in silico in a topology corresponding to that in BCoV HE. The model predicts that 9-O-Ac-Sia will not be accepted as ligand because of (i) the spatial arrangement of the two hydrophobic pockets at too close a distance to each other and (ii) a clash of the Sia carboxylate with residues of the extended R3 loop.