Structural basis of mammalian glycan targeting by Vibrio cholerae cytolysin and biofilm proteins

Vibrio cholerae is an aquatic gram-negative microbe responsible for cholera, a pandemic disease causing life-threatening diarrheal outbreaks in populations with limited access to health care. Like most pathogenic bacteria, V. cholerae secretes virulence factors to assist colonization of human hosts, several of which bind carbohydrate receptors found on cell-surfaces. Understanding how pathogenic virulence proteins specifically target host cells is important for the development of treatment strategies to fight bacterial infections. Vibrio cholerae cytolysin (VCC) is a secreted pore-forming toxin with a carboxy-terminal β-prism domain that targets complex N-glycans found on mammalian cell-surface proteins. To investigate glycan selectivity, we studied the VCC β-prism domain and two additional β-prism domains found within the V. cholerae biofilm matrix protein RbmC. We show that the two RbmC β-prism domains target a similar repertoire of complex N-glycan receptors as VCC and find through binding and modeling studies that a branched pentasaccharide core (GlcNAc2-Man3) represents the likely footprint interacting with these domains. To understand the structural basis of V. cholerae β-prism selectivity, we solved high-resolution crystal structures of fragments of the pentasaccharide core bound to one RbmC β-prism domain and conducted mutagenesis experiments on the VCC toxin. Our results highlight a common strategy for cell-targeting utilized by both toxin and biofilm matrix proteins in Vibrio cholerae and provide a structural framework for understanding the specificity for individual receptors. Our results suggest that a common strategy for disrupting carbohydrate interactions could affect multiple virulence factors produced by V. cholerae, as well as similar β-prism domains found in other vibrio pathogens.


Introduction
The recognition of carbohydrate receptors on host cell-surfaces is an important strategy for achieving the selectivity and potency of virulence factors including adhesion molecules, toxins, and biofilm proteins [1][2][3]. Often mediated by a canonical set of lectin domains with conserved folds, these proteins may broadly recognize terminal sugars on the end of long glycan chains, or specific polysaccharide motifs with complex branched stereochemistry [4]. Understanding the structural mechanism for glycan specificity by lectin domains is important in determining how effector proteins recognize specific host cells and for developing drugs against pathogenic proteins [5][6][7].
Vibrio cholerae is a pernicious human pathogen that secretes factors that utilize carbohydrate receptors, most notably the classical cholera toxin (CT), which binds to GM1 gangliosides on the intestinal epithelium [8]. V. cholerae also secretes a pore-forming toxin called Vibrio cholerae cytolysin (VCC), which helps defend the bacteria from the host immune system according to mouse models [9,10]. VCC recognizes complex N-glycans commonly found on animal cells [11] through a carboxy-terminal domain with a type I β-prism fold (Fig 1A); deletion of this domain results in a more than 99.9% loss in cytolytic activity [11]. Even though complex N-glycans are the preferred target of VCC, the exact footprint recognized by the VCC β-prism domain and the structural mechanism for this interaction are currently unknown.
β-prism folds in V. cholerae are not unique to VCC: three additional β-prism domains exist in two biofilm matrix proteins called RbmC (rugosity and biofilm structure modulator C) [12] and Bap1 (biofilm associated protein 1) [13] (Fig 1B). The formation of biofilms by V. cholerae is an important survival strategy [14][15][16] that facilitates bacterial attachment to surfaces, helps protect against environmental insults [17], and is also implicated in human transmission of the disease [18,19]. The biofilm itself is primarily composed of secreted molecules including an exopolysaccharide [17] called VPS (Vibrio polysaccharide) assembled and exported by two clusters of VPS-related genes (vps I and vps II) [20], several matrix proteins produced by the rbm gene cluster, and a mixture of nucleic acids and small biomolecules [20]. RbmC and Bap1 are related multidomain proteins with overlapping functions [12,21], involved in the surface attachment of biofilms and encapsulation of growing cell clusters [21][22][23]. Simultaneous deletion of RbmC and Bap1 leads to colonies deficient in biofilm formation; however, either gene alone is sufficient to restore function [21]. Interestingly, mass spectral analysis of isolated biofilm material shows that RbmC, Bap1, and VCC are all retained in the biofilm matrix and may therefore display affinity for a subset of related ligands [21].
To better understand the structural specificity of β-prism-carbohydrate interactions in V. cholerae biofilm proteins and toxins, we cloned and expressed the two β-prism domains from RbmC and determined their glycan specificity by screening against a chip-based mammalian glycan library [24,25]. Using isothermal titration calorimetry (ITC) and a fluorescence-based binding assay, we determined the binding affinity of the vibrio β-prisms to multiple fragments of target N-glycans to determine the unique footprint recognized by these domains. We crystallized and solved the structure of one β-prism domain from RbmC bound to two fragments of the glycan core. Finally, we show that VCC and both RbmC β-prism domains can target glycans present on rabbit blood cells, a model system that contains complex N-glycans on the cell surface. Our results illustrate a common structural mechanism by which vibrio toxin and biofilm β-prism domains target glycans on host cell-surfaces, facilitating cell lysis in one case and cell attachment in another. Our structures provide a model for how Vibrio cholerae targets the invariant core of cell-surface receptors commonly found on vertebrate cells while allowing for heterogeneity in the rest of the glycan, a model that may apply to β-prism domains found elsewhere in nature.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. KK is supported by an NIH training grant in molecular biophysics (Grant T32 GM008271). Glycan screening was conducted by the Protein-Glycan Interaction Resource of the CFG, which is supported by the National Institutes of Health grants GM62116 and GM098791. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

The V. cholerae genome contains four β-prism domains
A protein BLAST [26] search (blastp) was conducted using the C-terminal Vibrio cholerae cytolysin β-prism domain (residues 587 to 716 from PDB ID 1XEZ in Vibrio cholerae strain N16961), which identified three additional β-prism domains in two open reading frames (Fig 1B), belonging to rbmC (NCBI gene ID = 2614150) and bap1 (NCBI gene ID = 2613517), which encode biofilm matrix proteins [12,27]. For simplicity, we refer to the two β-prism domains of RbmC as RbmC1 and RbmC2 (Fig 1B). A sequence alignment of the four V. cholerae β-prism domains (Fig 1C) indicates that the sequence identity between VCC and the other three domains is 36.5%, 40.5%, and 33.3% for RbmC1, RbmC2, and Bap1, respectively. Attempts at expressing full-length RbmC and Bap1 proteins in E. coli were unsuccessful; however, expression of the two β-prism domains from RbmC was achieved by generating a fusion construct with bacterially optimized GFPUV. Unfortunately, expression of the Bap1 β-prism domain in a soluble form was not possible, even as a GFPUV-fusion construct.
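As an illustration of how a pairwise identity figure like those above can be derived from an alignment, the following minimal sketch counts identical residues over aligned columns. The convention used here (ignoring any column that contains a gap) is an assumption; the source does not state which denominator was used, and the toy sequences are hypothetical, not the real β-prism sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumes both strings come from the same alignment (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
toy_a = "GDS-WTFKLD"
toy_b = "GDTAWSF-LD"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different percentages for the same alignment, so reported values are only comparable when the same convention is used.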

Glycan-binding repertoire of RbmC β-prism domains
To determine the glycan specificity of RbmC1 and RbmC2, we labeled the purified proteins with a fluorescent tag and subjected them to glycan screening by the Consortium for Functional Glycomics (CFG) against the mammalian glycan chip v. 5.2, which contains 609 mammalian glycans arrayed via amino linkers on an N-hydroxysuccinimide-activated glass slide (http://www.functionalglycomics.org/). Screening results indicated a similar pattern of glycan recognition between RbmC1, RbmC2, and VCC [11], with top hits containing a similar pattern of biantennary complex N-glycans (Fig 2A and 2B). The glycans identified in the screen typically contained an NGA2-type conserved heptasaccharide motif (GlcNAcβ1-2Manα1-6(GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4GlcNAc), although binding to some glycans missing one or both antennae (truncated at the mannotriose core, see Fig 2) was also observed for all three proteins. The top four previously determined VCC hits [11] all contained a complete heptasaccharide NGA2 core, although there was one single-antenna complex N-glycan within the top 10 glycans identified. As was observed for VCC, binding to high-mannose type glycans (which contain highly branched mannose chains past the mannotriose core) was not observed. The glycan chip results are not necessarily quantitative, meaning the fluorescence signal should not be interpreted as being directly proportional to the binding affinity since the glycan density on the chip may vary. For this reason, further binding experiments were required to determine the affinity of the target proteins for individual glycan fragments.
Our results show that the VCC toxin and RbmC β-prisms target a similar repertoire of carbohydrate moieties on cell surfaces, suggesting a common strategy of cell-surface recognition. While complex N-linked glycans are found in all higher eukaryotes including plants and animals [28], later processing steps in the Golgi vary across different organisms, yielding differently modified repertoires of cell-surface glycans [29,30]. Complex N-glycans are abundantly expressed on differentiated epithelial cells in the gastrointestinal tract [31] and are also present on mucin proteins [32]. While the processing of N-glycans on surface proteins yields a heterogeneous mixture of branching and modification characteristics, the core structures found in the screen are likely present in glycans found on the intestinal epithelium.
Owing to the design of the glycan chip, our results indicate that the β-prism domains of VCC and RbmC are capable of targeting mammalian cell-surface glycans. While it is possible that the β-prism domains of RbmC might also bind to the exopolysaccharide abundant in V. cholerae biofilms (VPS), it is reasonable to posit that binding of this tetrasaccharide repeat is preferentially coordinated by any of the additional domains found in RbmC, or by other biofilm matrix proteins, and that the primary role of the RbmC β-prism domains is to target host cell glycans. This hypothesis is supported by the strong affinity (low nanomolar) of VCC and RbmC β-prisms for N-glycans and absence of hits that resemble the VPS tetrasaccharide repeat [33]. Additionally, another biofilm matrix protein, RbmA, has been implicated in VPS binding in the V. cholerae biofilm [34].

Mapping glycan binding by vibrio β-prism domains
To better understand the binding footprint for complex N-glycan recognition by vibrio β-prism domains, we measured the binding affinity of the VCC and RbmC β-prisms to complex N-glycan fragments using a top-down approach (Fig 3 and S1 Fig). Due to the variable availability of different glycan fragments, both isothermal titration calorimetry (ITC) and intrinsic tryptophan fluorescence spectroscopy were used to characterize the glycan footprint of the vibrio β-prism domains. For glycan fragments that could not feasibly be obtained in the quantities required for ITC measurements, the intrinsic tryptophan fluorescence of purified β-prism domains was used to monitor ligand association.
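A fluorescence titration of this kind is commonly analyzed with a single-site binding isotherm, in which the fraction of protein bound is [L] / (Kd + [L]). The sketch below is an assumption-laden illustration, not the authors' fitting procedure: it assumes a normalized signal between 0 and 1, ligand in large excess over protein so that free ligand approximates total ligand, and a simple log-spaced grid search in place of a nonlinear least-squares fit. The synthetic data and the 500 μM Kd are hypothetical.

```python
import math


def fraction_bound(ligand_conc: float, kd: float) -> float:
    """Single-site binding isotherm: fraction of protein with ligand bound."""
    return ligand_conc / (kd + ligand_conc)


def fit_kd(ligand_concs, signals, kd_min=1e-9, kd_max=1e-1, steps=2000):
    """Estimate Kd by scanning log-spaced candidates and keeping the one
    with the smallest sum of squared residuals against the normalized signal."""
    log_min, log_max = math.log10(kd_min), math.log10(kd_max)
    best_kd, best_err = None, float("inf")
    for i in range(steps + 1):
        kd = 10 ** (log_min + (log_max - log_min) * i / steps)
        err = sum(
            (fraction_bound(conc, kd) - sig) ** 2
            for conc, sig in zip(ligand_concs, signals)
        )
        if err < best_err:
            best_kd, best_err = kd, err
    return best_kd


# Synthetic, noise-free titration generated with a hypothetical Kd of 500 uM.
true_kd = 500e-6
concs = [50e-6, 100e-6, 250e-6, 500e-6, 1e-3, 2e-3, 5e-3]  # molar
signals = [fraction_bound(c, true_kd) for c in concs]
estimated_kd = fit_kd(concs, signals)
print(f"estimated Kd = {estimated_kd * 1e6:.0f} uM")
```

For tight binders, where the protein concentration is comparable to Kd, the ligand-excess assumption fails and a quadratic binding equation that accounts for ligand depletion would be needed instead.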
The footprinting results indicate that the VCC β-prism binds most tightly to the NGA2 heptasaccharide core, with a measured binding affinity of 0.2 μM. Glycan fragments smaller than NGA2 bound less tightly, including a 228.6 μM affinity for the mannotriose branch point and 2.7 mM for the methyl-α-mannose monosaccharide (as reported previously in [11]). We were unable to detect binding for the N-acetyl-D-lactosamine (LacNAc) disaccharide, which is present in one arm of a typical complex N-glycan. Glycans with arms longer than NGA2, including the NA2 and A2 glycans (which respectively extend galactose and sialic acid modifications onto the terminal ends of the biantennary arms), did not display tighter binding affinities. To investigate whether the vibrio β-prism domains are also capable of binding a mammalian-derived glycoprotein, we measured the binding affinity of VCC to bovine asialofetuin. While asialofetuin is a good indicator of general binding to complex N-glycans, the affinities measured by ITC are likely overestimated because the glycoprotein contains three glycosylation sites occupied by heterogeneous mixtures of bi- and tri-antennary glycans [35]. VCC bound this protein as previously reported [36] with an affinity in the low micromolar range.

Binding affinities for various glycans were determined using ITC (blue) or intrinsic tryptophan fluorescence spectroscopy (red). Errors are reported as 95% confidence levels (for ITC data) or the standard error of the mean (for fluorescence data). Dashes represent experiments not carried out due to lack of feasibility. Carbohydrates are represented as described in Fig 2A. * Asialofetuin is a glycoprotein containing three primary glycosylation sites, which contain a mixture of bi- and tri-antennary glycans [35]. Data were fit assuming three sites per glycoprotein, but due to the heterogeneity of glycosylation these numbers may represent an overestimation of the binding affinity. The asialofetuin schematic shows the predominant tri-antennary glycan.

Binding studies were repeated for several of the glycan fragments with the RbmC1 and RbmC2 β-prism domains. In contrast to the 2.7 mM methyl-α-mannose affinity with VCC, RbmC1 and RbmC2 displayed tighter affinities of 178.8 μM and 502.6 μM, respectively (Fig 3). We were also unable to observe binding of RbmC1 or RbmC2 to the LacNAc disaccharide, while RbmC2 did bind N-acetylglucosaminyl-β-1,2-mannose (GlcNAc-Man) with an affinity of 158.8 μM. The mannotriose core (1,3-α-1,6-α-D-mannotriose) also bound to the RbmC1 and RbmC2 β-prism domains with affinities of 0.6 μM and 4.1 μM, respectively. Binding of NGA2 to RbmC2 exhibited one of the highest affinities measured for any β-prism domain to a glycan fragment, with an affinity of 1.1 nM. A pentasaccharide fragment of NGA2 missing the two anchoring GlcNAc sugars provided a similar affinity of 1.5 nM, which is still much tighter than binding to mannotriose (4.1 μM). This suggests that, at least for RbmC2, the pentasaccharide core is the likely footprint targeted by the β-prism domain. RbmC1 and RbmC2 also bound tightly to asialofetuin, with apparent affinities of 15.9 nM and 18.4 nM, respectively.
If we assume that the three β-prism domains employ a similar footprint for glycan recognition (as supported by the glycan chip data), our results indicate that the binding footprint is likely centered around the mannotriose core. At least for VCC, glycans larger than NGA2 do not bind any tighter, likely because the extended chains do not strongly interact with the β-prism. Similarly, for RbmC2 the pentasaccharide and NGA2 glycans had similar affinities (both tighter than mannotriose) and asialofetuin did not bind any better than the pentasaccharide. It is possible that the minimal recognition motif is a tetrasaccharide made up of mannotriose plus a single N-acetylglucosamine; however, due to the difficulty in obtaining asymmetrically branched glycans, this interaction could not be tested. High-mannose type glycans (or oligomannose, containing only mannose past the mannotriose core) are present on the mammalian glycan chip, but did not elicit strong binding by any of the vibrio β-prism domains tested. In general, the RbmC β-prism domains bound glycan fragments approximately 50- to 100-fold more tightly than the VCC toxin domain.
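To make the comparison between domains concrete, the dissociation constants reported above can be converted to standard binding free energies with ΔG° = RT ln(Kd) and compared as simple ratios. The following sketch does this for the values quoted in the text; the temperature of 298.15 K and the 1 M standard state are assumptions, since the source only states that ITC was performed at room temperature.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)
TEMP_K = 298.15       # assumed temperature; the text says only "room temperature"

# Dissociation constants (molar) as quoted in the text.
KD_MOLAR = {
    ("VCC", "NGA2"): 0.2e-6,
    ("VCC", "mannotriose"): 228.6e-6,
    ("VCC", "methyl-alpha-mannose"): 2.7e-3,
    ("RbmC1", "methyl-alpha-mannose"): 178.8e-6,
    ("RbmC2", "methyl-alpha-mannose"): 502.6e-6,
    ("RbmC1", "mannotriose"): 0.6e-6,
    ("RbmC2", "mannotriose"): 4.1e-6,
    ("RbmC2", "pentasaccharide"): 1.5e-9,
    ("RbmC2", "NGA2"): 1.1e-9,
}


def delta_g_kcal(kd_molar: float, temp_k: float = TEMP_K) -> float:
    """Standard binding free energy (kcal/mol) from Kd, 1 M standard state."""
    return R_KCAL * temp_k * math.log(kd_molar)


def fold_tighter(kd_weak: float, kd_tight: float) -> float:
    """How many times tighter the second interaction is than the first."""
    return kd_weak / kd_tight


for (domain, glycan), kd in sorted(KD_MOLAR.items()):
    print(f"{domain:6s} {glycan:22s} dG = {delta_g_kcal(kd):6.2f} kcal/mol")

ratio = fold_tighter(KD_MOLAR[("VCC", "mannotriose")],
                     KD_MOLAR[("RbmC2", "mannotriose")])
print(f"RbmC2 binds mannotriose about {ratio:.0f}-fold tighter than VCC")
```

Because ΔG° depends on the logarithm of Kd, a 10-fold change in affinity corresponds to roughly 1.4 kcal/mol at this temperature, which is comparable to the contribution of a single hydrogen bond.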

Structure of the RbmC2 β-prism domain bound to mannotriose
To gain insight into the mechanism of glycan recognition by vibrio β-prism domains, we performed crystallization trials in the presence of N-glycan core fragments. Although we were unable to obtain well-diffracting crystals of any β-prism domain with a bound pentasaccharide fragment, crystals were obtained for RbmC2 bound to mannotriose and N-acetylglucosaminyl-β-1,2-mannose (GlcNAc-Man). X-ray data were collected to 1.5 Å resolution and the structure solved by molecular replacement using the VCC β-prism lectin domain (Table 1). The electron density map displayed unambiguous density for the mannotriose ligand in a single orientation (Fig 4A). Calculations using PISA [37] indicated 384 Å² of buried surface area between the ligand and the protein. As previously observed in VCC and other β-prism lectins, a conserved aspartic acid residue (D617 in VCC, D853 in RbmC2) mediates hydrogen-bonding interactions with hydroxyl groups on one of the mannose rings (Fig 4B). In VCC, mutation of this residue to alanine results in a 50-fold loss in hemolytic activity of the toxin. In the RbmC2 structure, D853 interacts with the α1,6-linked mannose, whose O4 and O6 hydroxyl groups make putative hydrogen-bonding interactions with the D853 side-chain. This position is highly conserved across β-prism domains found throughout nature and is likewise found in all four of the vibrio β-prism domains expressed by V. cholerae (Fig 1C).
Analysis of bonding interactions within a distance of less than 3.35 Å indicated a total of ten direct hydrogen bonds between RbmC2 and the mannotriose ligand (Figs 4B, S2 and S3), with eight of these interactions targeting the α1,6-linked mannose. Of the eight hydrogen bonds with the α1,6-linked mannose, six are interactions with the peptide backbone, and two are with the oxygens of the D853 side-chain. The position occupied by the α1,6-linked mannose is the same as that observed in the structure of the VCC β-prism in complex with the monosaccharide methyl-α-mannose [11]. The α1,6-linked mannose also makes the most extensive surface contacts with RbmC2, with 165 Å² of buried surface area. The central and α1,3-linked mannose rings each make a single hydrogen bond to the protein, both to side-chain atoms. Not surprisingly, the two more "loosely" bound sugars had higher average crystallographic B-factors (3.5 Å², 7.4 Å², and 14.0 Å² for the α1,6-linked, central, and α1,3-linked mannose sugars, respectively). Binding of RbmC2 to these saccharides also resulted in progressively less buried surface area, at 121 Å² and 98 Å² for the central and α1,3-linked mannose, respectively. The high resolution of the structure confirmed that the central mannose best refined as the β-anomer, as expected [30]. Also of note is a centrally located water molecule (B-factor = 4.2 Å²) coordinated between ligand and protein groups (Figs 4B and S3), including W948 and N871. The latter residue is on a flexible loop that adopts a different conformation in the apo structure, suggesting that this particular water molecule is only present when the glycan is bound.
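The distance criterion used above (donor-acceptor separation below 3.35 Å) can be sketched as a simple pairwise search between protein and ligand polar atoms. The example below is illustrative only: the atom labels and coordinates are hypothetical placeholders rather than values from the deposited structure, and a distance-only test ignores the angular geometry that dedicated hydrogen-bond analysis programs also consider.

```python
import math

HBOND_CUTOFF = 3.35  # Å, the distance cutoff quoted in the text


def find_polar_contacts(protein_atoms, ligand_atoms, cutoff=HBOND_CUTOFF):
    """Return (protein_label, ligand_label, distance) for every pair of polar
    atoms closer than the cutoff. Inputs map atom labels to (x, y, z) in Å."""
    contacts = []
    for p_label, p_xyz in protein_atoms.items():
        for l_label, l_xyz in ligand_atoms.items():
            d = math.dist(p_xyz, l_xyz)
            if d < cutoff:
                contacts.append((p_label, l_label, round(d, 2)))
    return contacts


# Hypothetical coordinates (Å) for illustration; not taken from the structure.
toy_protein = {
    "D853 OD1": (0.0, 0.0, 0.0),
    "D853 OD2": (2.2, 0.0, 0.0),
}
toy_ligand = {
    "MAN O4": (0.0, 2.7, 0.0),
    "MAN O6": (2.2, 2.8, 0.0),
    "MAN O2": (10.0, 10.0, 10.0),
}

toy_contacts = find_polar_contacts(toy_protein, toy_ligand)
for protein_atom, ligand_atom, dist in toy_contacts:
    print(f"{protein_atom} ... {ligand_atom}: {dist} Å")
```

In practice the atom dictionaries would be restricted to nitrogen and oxygen atoms parsed from the coordinate file, so that carbon-carbon van der Waals contacts are not counted as hydrogen bonds.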

Structure of the RbmC2 β-prism domain bound to GlcNAc-Man
RbmC2 was crystallized in the presence of N-acetylglucosaminyl-β-1,2-mannose (GlcNAc-Man), and the structure solved to 1.8 Å resolution. Density for the GlcNAc-Man ligand was observed in the same binding pocket identified in the RbmC2/mannotriose and VCC β-prism/methyl-α-mannose complexes. While occupancy refinement of the mannotriose sugar suggested near full occupancy, the N-acetylglucosaminyl-β-1,2-mannose ligand density suggested a mixture of the carbohydrate and glycerol (overlapping with the C4, C5, and C6 carbons of the mannose sugar). Alternate configurations for the two ligands were built into the density and their relative occupancies were refined, yielding a ratio of 65% GlcNAc-Man to 35% glycerol (presumably from the cryoprotectant). The mannose moiety occupies the same location as the α1,6-linked mannose in the mannotriose structure (Fig 4C), making a similar constellation of hydrogen-bonding interactions. Coordination of the disaccharide by RbmC2 resulted in a total of 338 Å² of buried surface area with a contribution of 160 Å² by the mannose moiety and 179 Å² by the N-acetylglucosamine. Interestingly, the centrally located water molecule observed in the mannotriose structure was also present, even though the fourth coordination site (previously made by the central mannose moiety) is missing in this structure (Figs 4D and S4).

In the apo structure (with three protein molecules in the asymmetric unit), glycerol molecules from the cryo-protection solution are found in the glycan binding pocket. In two copies in the asymmetric unit, a water is present bound to W948 as in the mannotriose and GlcNAc-Man structures. In the third copy, a hydroxyl group from a glycerol molecule occupies this position instead. As previously stated, the loop containing the N871 residue is unstructured in two of the three copies in the asymmetric unit, and the ordered copy is in a different conformation than in the ligand-occupied structures. This suggests that binding of the ligand may lead to the ordering of this loop and that the water molecule coordinated by N871 and W948 may help mediate this interaction.
The N-acetylglucosamine moiety of the GlcNAc-Man structure was observed projecting away from the mannotriose binding cleft. Fewer hydrogen-bonding contacts are made between RbmC2 and this sugar, perhaps contributing to the higher B-factors and less well-defined electron density for this group. In fact, only a single hydrogen bond was observed with the GlcNAc group: between the acetyl O7 atom and the guanidinium group of R876. This residue is not conserved among the various vibrio β-prism domains, but is instead replaced by an asparagine in the RbmC1 sequence (and presumably structure), while in the VCC β-prism structure this location is occupied by a serine, which is also present in the Bap1 β-prism sequence. The only other observed contact between RbmC2 and the GlcNAc moiety is a van der Waals contact between the sugar ring and the side-chain of V874. The distance between the CG2 carbon of this valine and the C4 carbon of GlcNAc is 3.6 Å. This residue is not conserved across the other three β-prism domains, suggesting that interactions with this sugar may vary between the different vibrio β-prism domains.

RbmC2 β-prism domain apo structure
RbmC2 apo crystals were obtained with space group C222₁, with three copies in the asymmetric unit. In general, RbmC2 displayed only slight differences between the apo, mannotriose, and GlcNAc-Man structures. The all-atom RMSD between the two ligand-bound structures is 0.21 Å, and approximately 0.34 Å between liganded and apo structures (compared to 1.2 Å for RbmC2 vs. the VCC β-prism). Comparing glycan-bound structures to the three copies of RbmC2 in the apo structure asymmetric unit revealed differences in a loop (with the sequence PVQGT), which is not present in the other vibrio lectin domains (Figs 1C and S5). Threonine 870 is the only amino acid within this loop that directly contacts the ligand, forming a hydrogen bond between the peptide backbone and O3 of the α1,6-linked mannose group. Aside from the rearrangement of this loop, only subtle side-chain rearrangements of F850, W896, and W948 are observed among the apo, mannotriose-bound, and GlcNAc-Man-bound structures.
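For reference, an RMSD of the kind quoted above is the square root of the mean squared distance between matched atoms, and therefore carries units of Å. The sketch below computes it for two coordinate lists under the assumption that the structures have already been superposed and that atoms are listed in the same order; the superposition step itself (for example, a Kabsch alignment) is not shown, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two pre-superposed,
    atom-matched coordinate lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical three-atom example: the second set is shifted 0.3 Å along x.
toy_ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_shifted = [(x + 0.3, y, z) for x, y, z in toy_ref]
toy_rmsd = rmsd(toy_ref, toy_shifted)
print(f"RMSD = {toy_rmsd:.2f} Å")
```

An all-atom RMSD includes side-chain atoms and so is generally larger than a Cα-only value for the same pair of structures.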
Taking all of the observed protein-ligand interactions into account, the increased binding affinity of the RbmC2 domain over VCC is likely due to substitutions of several key residues that interact with the bound ligand. The F850 and W896 residues that participate in van der Waals and ring-stacking interactions with the mannotriose ligand are conserved in RbmC1 and RbmC2, but replaced by an alanine (A614) and tyrosine (Y654), respectively, in VCC. The increased surface area available for ring-stacking due to the substitution of tryptophan in RbmC1 and RbmC2 for the tyrosine in VCC may account for some of the increased affinity of the RbmC domains for glycans. Furthermore, the side-chain hydroxyl group of Y894 that forms a hydrogen bond with the central mannose group in the RbmC2/mannotriose structure is conserved in RbmC1, but replaced with a hydroxyl-lacking phenylalanine in VCC. The remaining residues contacting the ligand through side-chain or backbone interactions are either conserved in all three domains, or between VCC and one of the two RbmC β-prism domains.

Effects of mutations on vibrio β-prism function
To gain a better understanding of the functional importance of complex N-glycan interactions, we made mutations to residues that line the ligand binding pocket. Because we do not have a functional assay for RbmC biofilm activity, mutations were made in VCC, where the lytic activity against mammalian cells (rabbit erythrocytes in our model system) can be monitored. As an enteric pathogen, Vibrio cholerae is unlikely to encounter erythrocytes during an infection. However, we demonstrated previously that VCC lyses rabbit blood cells as well as human T-cells, monocytes, and neutrophils [36], the latter of which is a likely target [10]. Furthermore, rabbit erythrocytes are highly decorated with biantennary complex N-glycans [39], making them a good model for targeting studies. The change in activity of the mutant is reported as the ratio of HD50 values (the concentration that elicits 50% cell lysis, Fig 5A) between the mutant and wild-type toxin (which is typically around 100 pM [36]).
We classified the effects of individual mutations (Fig 5B) into those that elicited a 0-10-fold loss (10-100% WT activity), 10-100-fold loss (1-10% WT activity), and >100-fold loss in activity (<1% WT activity). To put these data into context, we built a composite ligand model in which we extended the mannotriose core outwards, including the next two β1,2-linked N-acetylglucosamine moieties seen in complex N-glycans (Fig 5C). We used the GlcNAc-Man structure to place the GlcNAc moiety connected to the α1,6-linked mannose, and modeled in the second GlcNAc moiety, followed by energy-minimization. D617K, a mutation targeting the residue that forms two hydrogen bonds with the α1,6-linked mannose group in both VCC and [...] A previous study showed that D617K and D617A had similar deleterious effects on VCC activity [11]. Unexpectedly, mutation of L707 to alanine, a residue that forms the floor of the binding pocket and is conserved between three of the four vibrio β-prism domains (S2 Fig), led to a 505-fold loss in activity. Mutations resulting in 10- to 100-fold activity loss included three aromatic residues that contact the ligand in RbmC2: F652, Y654, and W706 (VCC numbering). W706 adopts two distinct rotamer conformations across several crystal structures of VCC [11] and could potentially form a stacking interaction with the N-acetylglucosamine residue attached to the α1,6-linked mannose (yellow in Fig 5C). To see how disrupting the N-acetylglucosamine residue that exits the opposite side of the binding pocket (attached to the α1,3-linked mannose) might affect binding, we mutated G613 (a glycine in all three lectin domains investigated here) to a lysine residue to sterically block the glycan exit channel. Lysine was selected because its bulky side-chain is not tightly constrained in rotamer positions and its positive charge helps maintain protein solubility. This mutation led to a 47-fold loss in activity, suggesting that this side of the binding channel must remain unoccluded to allow the glycan arm to exit the pocket (Fig 5C). While mutation of glycine residues to any amino acid can cause issues with allowed backbone Ramachandran angles, the two glycine mutants could still be expressed in a soluble form and retained some hemolytic activity, suggesting that the mutations did not lead to a substantial misfolding of the domain.
Several mutations only displayed modest effects on the activity of the VCC toxin. N630 and G631 line the bottom lip of the binding pocket in a similar position to the attachment point of the PVQGT loop in RbmC2. The mutants N630A and G631K led to 1.7-fold and 4.9-fold losses in activity, respectively, indicating that these loop side-chains do not interact substantially with bound glycans in VCC. Likewise, Q658A had virtually no effect (1.2-fold loss) on VCC activity. This residue is at a position just outside W706 designating the outer boundary of the glycan footprint extending from the α1,3-linked mannose. Together, these results support a model where the bound glycan projects antennae outward through both sides of the binding pocket (Fig 5C). Such a bound conformation would tolerate both heterogeneity in the N-glycan branches, as well as core fucosylation of the first N-acetylglucosamine that attaches the glycan to cell-surface proteins (a common modification on glycosylated proteins [30]).
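The activity classification used for the mutants can be expressed as a short calculation: the fold loss is the ratio of mutant to wild-type HD50, and the percent of wild-type activity is its reciprocal. The sketch below applies the three bins defined in the text to the fold-loss values quoted for individual mutants. How values falling exactly on a bin boundary are assigned is an assumption, as is the example mutant HD50 of 4.7 nM, which is back-calculated from the approximate 100 pM wild-type value purely for illustration.

```python
def fold_loss(hd50_mutant: float, hd50_wild_type: float) -> float:
    """Fold loss in activity: ratio of mutant to wild-type HD50 (same units)."""
    return hd50_mutant / hd50_wild_type


def percent_wt_activity(fold: float) -> float:
    """Percent of wild-type activity implied by a given fold loss."""
    return 100.0 / fold


def classify_fold_loss(fold: float) -> str:
    """Assign a fold loss to one of the three bins used in the text.
    Boundary handling (exactly 10 or 100) is an assumption."""
    if fold <= 0:
        raise ValueError("fold loss must be positive")
    if fold > 100:
        return ">100-fold"
    if fold > 10:
        return "10-100-fold"
    return "0-10-fold"


# Fold-loss values quoted in the text for individual VCC mutants.
REPORTED_FOLD_LOSS = {
    "L707A": 505.0,
    "G613K": 47.0,
    "G631K": 4.9,
    "N630A": 1.7,
    "Q658A": 1.2,
}

for mutant, fold in REPORTED_FOLD_LOSS.items():
    print(f"{mutant}: {classify_fold_loss(fold)}, "
          f"{percent_wt_activity(fold):.1f}% of WT activity")

# Hypothetical HD50 of 4.7 nM against a wild-type HD50 of about 100 pM.
example_fold = fold_loss(4.7e-9, 100e-12)
```

Because HD50 is a concentration, a higher value means a less active toxin, so fold losses greater than 1 indicate reduced activity relative to wild type.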

Binding of RbmC2 to mammalian cells
To determine whether the RbmC1 and RbmC2 β-prism domains can recognize mammalian cells with complex N-glycans on their surface, we used GFPUV fusion proteins to label rabbit whole blood cells and imaged them with fluorescence microscopy. Attempts to label cells with a GFPUV-VCC fusion were unsuccessful, possibly due to the lower affinity of VCC for complex N-glycans. To ensure that our imaging assay reflected specific binding to these cells, we made point mutations in RbmC1 (D539A) and RbmC2 (D853A) analogous to the D617 mutation in VCC that greatly diminishes hemolytic activity against rabbit cells. ITC experiments against bovine asialofetuin confirmed that the wild-type fusion was active (S6 Fig) and that the mutant fusion exhibited greatly diminished binding. ITC experiments also identified a lack of binding of RbmC2 D853A to mannotriose. Both constructs expressed at similar levels, appeared identical on an SDS-PAGE gel, and exhibited monodisperse behavior on a size-exclusion chromatography column, suggesting that the mutation did not disrupt the structure of the β-prism domain.
To further confirm that mutation of the key sugar binding aspartic acid did not disrupt the folding or stability of the protein, we subjected RbmC2 WT and RbmC2 D853A β-prism domains to a thermal melt monitored via circular dichroism (CD). CD spectra for RbmC2 WT and RbmC2 D853A before melting (at 20˚C) exhibited similar profiles (S7A Fig) and estimated secondary structure composition (calculated with β-structure selection (BeStSel) [40], S7B Fig), also consistent with the RbmC2 crystal structure. RbmC2 WT and RbmC2 D853A maintained mostly β-strand secondary structure through 96˚C, and fitting with BeStSel showed an increase in parallel and left-twisted antiparallel β-sheets, suggesting that unfolding likely led to the formation of amyloid structures (S7B Fig). Although RbmC2 D853A displayed decreased CD signal beyond 54˚C (perhaps due to lower solubility caused by the D853A mutation or a different high-temperature state), our results suggest that mutation of the aspartic acid residue did not destabilize the RbmC2 domain at temperatures below 50˚C (S7C Fig), supporting the validity of the blood binding assays and ITC, which were performed at room temperature. Due to the low solubility of the RbmC1 domain following cleavage from the GFPUV fusion protein, we did not perform a CD melt on the RbmC1 mutant. Both RbmC1 WT and RbmC1 D539A fusion proteins expressed and purified in a soluble form and led to monodisperse peaks on a sizing column, suggesting that, like RbmC2, they adopt a folded structure at room temperature.
Fluorescence images showed binding of both the RbmC1 and RbmC2 wild-type fusions to the rabbit blood cells (Fig 6A and 6B), while binding of the aspartic acid substitution mutants was not detected. RbmC1 and RbmC2 clearly recognized and bound tightly to mammalian blood cells known to display complex N-glycans, and did so in a specific manner (dependent on interactions with D539/D853), suggesting that the RbmC1 and RbmC2 domains have the ability to target mammalian cell-surface glycans with high affinity.

Discussion
Our results indicate that the β-prism domains of VCC and RbmC share a strong preference for complex N-glycans and are likely directed against similar targets, whether in the environment or in the host. Similar to results seen with VCC [11], screening against the glycan library failed to show binding of RbmC1 and RbmC2 to high-mannose type glycans, suggesting that vibrio β-prism domains prefer glycans highly represented on mammalian cell surfaces [29,41]. Our protein-glycan complex structures illustrate the basis for this selectivity. The vibrio β-prism domain sugar-binding pocket is able to accommodate branching of carbohydrate moieties extending from the α1,3-linked mannose and a select subset of carbohydrate extensions from the α1,6-linked mannose of the mannotriose core. In complex N-glycans, the GlcNAc moiety attaches to the O2-position of the core α1,6-linked mannose [30], a configuration tolerated by the vibrio β-prism binding pocket. In branched high-mannose glycans, α1,3- and α1,6-linkages to the α1,6-linked mannose of the mannotriose core are utilized, both of which are sterically incompatible with the vibrio β-prism domain glycan binding pocket. Complex N-glycans with more than two antennae on the α1,6-linked mannose residue are also unlikely to bind, because these additional branches extend with β1,4- and β1,6-linkages, and both positions are also sterically blocked in the sugar binding pocket. Furthermore, Vibrio cholerae VPS consists of modified gulose, glucose and galactose monomers linked in α- and β1,4-linkages [33]. This linear polymer conformation is distinct from the bent shape made by α1,6- and α1,3-linkages in the mannotriose core and lacks the O4 position necessary for crucial hydrogen-bonding interactions with the key aspartic acid position in the β-prism domain. Therefore, we expect that VPS is sterically unlikely to bind to VCC and RbmC β-prism domains.
We were unable to express the β-prism domain from the biofilm matrix protein Bap1 and therefore cannot comment on whether Bap1 has a similar preference for complex N-glycans as VCC and RbmC. While Bap1 maintains the key aspartate residue (D348) and ring-stacking tryptophan (W387) that anchor the mannotriose ligand in RbmC2 (S2 Fig), the core tryptophan and leucine residues (W706/L707 in VCC, W948/L949 in RbmC2) are missing in Bap1 and instead replaced by a seven-residue insertion. RbmC and Bap1 have similar, but not identical, roles in biofilm formation [21,22] and could thus target a different assortment of glycan receptors. This may involve the attachment of growing colonies to a host cell surface, or the adsorption of clusters of cells already surrounded by biofilms to host surfaces in the gut. To the best of our knowledge, the majority of experiments involving biofilm formation using V. cholerae strains are performed on glass coverslips, glass liquid culture tubes, polystyrene plates or Teflon surfaces, all of which are unlikely to contain N-glycan anchor points. Under these conditions, we would not expect β-prism domains to be utilized in making adhesive contacts. Future experiments investigating biofilms grown on cultured cells or tissue surfaces may uncover variations in how the biofilm matrix forms, particularly when these additional adhesive interactions are part of the complex system.
Our structural and functional results suggest a model for N-glycan recognition by vibrio β-prism domains. Rather than targeting motifs on antennae termini that may vary heterogeneously across different cell types, they selectively bind to the invariant core. This arrangement is also less likely to be affected by further modifications, like core fucosylation of the N-acetylglucosamine stem [42]. Some glycans have an additional bisecting N-acetylglucosamine residue projecting out of the central mannose in the mannotriose core with a β1,4-linkage [42]. The orientation in which this core is bound in RbmC2 leaves little space for this modification, although binding could occur if W948 flips to a different rotamer. Plant complex N-glycans often contain an additional β1,2-xylose modification on the central mannose of the mannotriose core [28]. This position is relatively unhindered in our structures (Fig 4B), raising the possibility that vibrio biofilms could attach to plant surfaces, although these glycans were not represented on the mammalian glycan chip. By accommodating these and other modifications, the β-prism lectin may help toxins and biofilm proteins target a wider range of glycan structures in the environment and in hosts. Because the core structure described in this study is shared by all complex N-glycans, it would also be nearly impossible for a host to evolve resistance without substantial truncation of cell-surface glycans.
Our results also indicate that the affinity for N-glycans varies among the different vibrio β-prism domains. We determined the affinity for mannotriose and heptasaccharide glycan cores to be at least 100-fold stronger for RbmC β-prism domains than for the VCC β-prism domain. This 100-fold difference theoretically requires approximately 2.7 kcal/mol of binding energy, which can be explained by the presence of one or two additional hydrogen bonds or <150 Å² of newly buried surface area. The increased binding energy required for the 100-fold difference observed between RbmC and VCC β-prism domains can be accounted for by the additional hydrogen bonding interaction of Y894 (Y575 in RbmC1, F652 in VCC), the additional surface area provided by W896 (W577 in RbmC1, Y654 in VCC), or the additional hydrogen bond facilitated by T870 in RbmC2. VCC lyses cells at picomolar concentrations, so comparing these affinities, we expect RbmC to bind very tightly to cell surfaces containing N-glycan groups, particularly due to avidity effects from having two β-prism domains. Bap1, with only one β-prism domain, may bind less strongly. A mutant form of RbmC missing the C-terminal β-prism domain (RbmC2) is still able to rescue biofilm formation by a ΔrbmC/Δbap1 double mutant [21], likely due to the redundant nature of the β-prism domain. Both Bap1, with a single β-prism domain, and full-length RbmC are also able to rescue biofilm formation in a double knockout [12].
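The approximately 2.7 kcal/mol figure quoted above is consistent with the standard thermodynamic relation ΔΔG = RT ln(affinity ratio). The short Python sketch below reproduces that arithmetic. The temperature of 298.15 K (25˚C) is an assumption standing in for room temperature, and the function name is illustrative only.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def ddg_from_affinity_ratio(fold, temp_k=298.15):
    """Free-energy difference (kcal/mol) implied by a fold difference in affinity.

    temp_k defaults to 25 C, an assumed value for room temperature.
    """
    if fold <= 0:
        raise ValueError("fold difference must be positive")
    return R_KCAL * temp_k * math.log(fold)


# A 100-fold affinity difference, as discussed in the text
ddg = ddg_from_affinity_ratio(100.0)
print(f"{ddg:.2f} kcal/mol")
```

Because the dependence on the ratio is logarithmic, each further 10-fold gain in affinity costs the same increment of roughly 1.4 kcal/mol at this temperature.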
Our results demonstrate a common mechanism by which vibrio biofilm and toxin proteins utilize complex N-glycans to attach to and attack host cell membranes. While the role of biofilm attachment to complex N-glycans in the disease cycle has not yet been explored, it is possible that clusters of biofilm-encapsulated bacteria could strongly attach to the human intestinal epithelium following ingestion, helping to seed new colonies. Alternatively, newly established colonies in the gut could utilize epithelial surfaces as biofilm-dependent anchors. Interestingly, biofilm formation is negatively regulated by quorum-sensing pathways [43,44], suggesting that it may be advantageous to form biofilms during early stages of colonization, but not so when the pathogen prepares to leave the host. At the same time, VCC could help defend against immune cells [10,45] or cause localized inflammation by attacking the epithelial surface directly before quorum sensing eventually leads to the down-regulation of VCC [46] and activation of classical cholera toxin and other important virulence factors [47].
A better understanding of the role fulfilled by these glycan-interactions is important for devising interventions to block intoxication and biofilm formation by bacterial pathogens and may provide additional methods to target specific cell types. For example, cancer cells often display modified glycan structures on their cell membranes [48], a marker that might be exploited by agents specifically targeted against these glycans. These results also suggest further experiments aimed at studying biofilm formation on cellular substrates, where carbohydrate adhesion interactions may further modify the growing biofilm matrix.

Construction of RbmC and VCC β-prism lectin constructs
Full-length RbmC and Bap1 were cloned by PCR from Vibrio cholerae El Tor strain N16961 genomic DNA into the pET28b vector (Novagen, Inc.). Individual β-prism lectin domains (denoted RbmC1 and RbmC2, from residues S505 to T640 and S823 to Y957, respectively) were cloned into the pNGFP-BC vector to form GFPUV fusion proteins [49], confirmed by DNA sequencing, and transformed into T7 Express E. coli for expression. The full-length VCC gene (hlyA, from Vibrio cholerae O1 El Tor strain 8731) [50], the VCC β-prism lectin domain cloned into the pET-28b vector [11], and the RbmC2 β-prism domain in pNGFP-BC were subjected to site-directed mutagenesis using a previously described procedure [51]. Briefly, complementary primers containing the desired mutation were used to amplify the entire gene-containing plasmid by PCR and the resulting DNA was digested by DpnI to remove parental DNA. The reaction was then transformed into NEB 5α cells. The resulting clones were miniprepped and sequenced to confirm the introduction of the site-directed mutation.

Expression and purification of β-prism lectin and VCC full-length constructs
For expression of the RbmC1 and RbmC2 β-prism domains, LB broth supplemented with 100 mg/L ampicillin was inoculated with overnight cultures (1:60 dilution, v/v), grown at 37˚C to an O.D. of 0.6, induced with 1 mM IPTG, and incubated at 30˚C for 4 hours. Cells were pelleted by centrifugation at 3500 x g in a Sorvall LYNX 6000 centrifuge (F9-6x1000 LEX rotor) and lysed by passing three times through an Emulsiflex-C5 high-pressure homogenizer (Avestin, Inc.). The lysate was cleared at 40,000 x g for 30 minutes at 4˚C (F20-12x50 LEX rotor) and the resulting supernatant loaded onto a 10 ml TOYOPEARL AF-Chelate-650M nickel column (Tosoh Corporation) equilibrated in TBS buffer (20 mM Tris pH 7.6, 150 mM NaCl). The column was washed with 10 column volumes of TBS buffer containing 40 mM imidazole and the protein eluted in TBS buffer containing 250 mM imidazole. To remove the GFP tag, the fusion proteins were incubated with 1:100 (wt/wt) human α-thrombin (Haematologic Technologies) for 4 hours at room temperature or 1:500 (wt/wt) trypsin (Sigma Aldrich) for 1 hour at room temperature and the reaction stopped with 20 mM EDTA and 1 mM AEBSF. Wild-type RbmC β-prism domains were separated from the polyhistidine-tagged GFP fusion partner by passing them over a Superose 6 10/300 size-exclusion column (GE Healthcare) in TBS buffer. RbmC2 D853A was separated from cleaved GFP by passage over a Superdex 200 Increase 10/300 column (GE Healthcare) in TBS buffer. VCC full-length and VCC β-prism lectin domain constructs were expressed and purified similarly, with the following modifications. VCC full-length protein was expressed in SHuffle T7 E. coli cells (New England Biolabs) for 4 hours at 30˚C and the VCC β-prism lectin domain was expressed for 8 hours at 37˚C in T7 Express E. coli cells. Both VCC proteins were purified over a 5-ml HisTrap Ni-NTA column (GE Healthcare) followed by a Superdex 200 10/300 size-exclusion column (GE Healthcare) run in TBS buffer.

Glycan chip assay
Purified RbmC1 and RbmC2 GFPUV fusions were fluorescently labeled by primary amine chemistry using an AlexaFluor 488 succinimidyl ester reagent (Thermo Fisher Scientific). Proteins were concentrated to 5.5 or 3.4 mg/ml and diluted into 100 mM sodium bicarbonate buffer, pH 9.0. While stirring, 0.15 mg of the dye (resuspended in 0.15 ml DMSO) was added to 1.1 ml of protein and incubated for 1 hour at room temperature. Unreacted dye was bound by adding 0.1 ml 1.5 M Tris pH 8.5 and removed from labeled proteins by running over a Superose 6 10/300 size exclusion column equilibrated in TBS buffer with 1 mM EDTA and 1 mM sodium azide. Labeled proteins were sent to Core H of the CFG for analysis against the mammalian glycan screen v. 5.2. In brief, glycan chips were incubated with 180 μg/ml of labeled lectin for 1 hour, washed three times to remove non-specific binding, and dried under nitrogen before imaging using a Perkin Elmer ProScanArray XL4000 scanner. The data are reported as the average response units of six replicates after removing the highest and the lowest data points. The entire dataset is freely available through the CFG website (www.functionalglycomics.org).
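The averaging rule described above (six replicates, with the highest and lowest values dropped) can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the CFG processing pipeline. The function name and the example values are hypothetical.

```python
def trimmed_average(replicates):
    """Mean of the replicates after discarding one highest and one lowest value."""
    if len(replicates) < 3:
        raise ValueError("need at least three replicates to trim both extremes")
    kept = sorted(replicates)[1:-1]  # drop the minimum and the maximum
    return sum(kept) / len(kept)


# Hypothetical response units for one glycan spot (six replicates)
spot = [10, 200, 210, 190, 205, 900]
print(trimmed_average(spot))
```

Dropping the two extremes makes the reported value less sensitive to a single failed or saturated spot than a plain mean of all six replicates.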

Fluorescence binding assay
VCC β-prism lectin domain was buffer exchanged into PBS (20 mM sodium phosphate pH 7.4, 150 mM NaCl) by running over a Superdex 200 10/300 size exclusion column (GE Healthcare). Fluorescence assays were performed using a Fluoromax-2 spectrophotometer (Horiba Jobin-Yvon Inc.). Data were collected in 6 mm x 6 mm quartz cuvettes (Starna Cells, Inc.) continuously stirred throughout the span of the experiment. The intrinsic protein fluorescence was monitored by exciting protein samples at 295 nm with a band-pass of 4 x 4 nm. Upon addition of glycans (dissolved in PBS buffer), the increase in intrinsic fluorescence was measured and the fractional increase calculated and normalized to the highest point. All data were acquired in triplicate and the results plotted in Origin v. 8.0 (OriginLab Corporation) and fit using the RandoA function.
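The normalization step described above (fractional increase in intrinsic fluorescence, scaled to the highest point) can be sketched as follows. This is a minimal illustration that assumes a single baseline reading taken before glycan addition and no correction for dilution. It does not reproduce the curve fit performed in Origin, and the function name and readings are hypothetical.

```python
def normalized_fractional_increase(baseline, readings):
    """Fractional fluorescence increase relative to baseline, scaled so the maximum is 1."""
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    fractional = [(f - baseline) / baseline for f in readings]
    top = max(fractional)
    if top <= 0:
        raise ValueError("no fluorescence increase to normalize against")
    return [x / top for x in fractional]


# Hypothetical intensities (arbitrary units) across a glycan titration
curve = normalized_fractional_increase(100.0, [100.0, 110.0, 120.0, 125.0])
print(curve)
```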

Hemolysis assay
Purified VCC was activated by proteolytic cleavage using α-chymotrypsin (1:350 wt/wt) for 30 minutes at room temperature and serially diluted. Activated VCC dilutions were added to wells in a 96-well clear bottom plate containing defibrinated rabbit whole blood diluted to an absorbance at 595 nm of 1.0 in blood dilution buffer (20 mM sodium phosphate pH 7.4, 150 mM NaCl, 1 mg/ml BSA). The absorbance was monitored at 595 nm every 15 seconds at room temperature in an iMark 96-well plate reader (Bio-Rad Laboratories, Inc.). Raw absorbance data were converted into % lysis and HD50 values calculated as described previously [36] using KaleidaGraph v. 4.1.3 (Synergy Software).

Crystallization, X-ray data collection, and structure refinement
Purified RbmC2 lectin domain in TBS was concentrated to ~5 mg/ml using a 3 kDa cutoff Vivaspin concentrator (GE Healthcare Life Sciences). Crystals were grown using vapor diffusion in 24-well tissue culture plates by mixing 1:1 (v/v) protein and precipitant solution (0.1 M sodium acetate pH 4.6, 2 M ammonium sulfate for all crystals) and suspending drops on siliconized cover slips over a 0.5 ml reservoir solution. For crystallization with ligands, a 1:10 molar excess of sugar was added to the RbmC2 protein solution. Apo and mannotriose X-ray diffraction data were collected on crystals (cryoprotected in mother liquor supplemented with 20% glycerol and flash-cooled in liquid nitrogen) using an Oxford Xcalibur Nova X-ray generator with an Onyx CCD detector (Oxford Diffraction) and indexed using CrysAlisPro (Rigaku Corporation) and Aimless [54]. GlcNAc-Man X-ray data were collected on a Rigaku HighFlux HomeLab system with an R-AXIS IV++ detector and processed using iMOSFLM [55]. Molecular replacement using the VCC β-prism lectin domain (PDB ID 1XEZ) as a search model was carried out by Phaser [56]. Multiple rounds of XYZ coordinate and individual B-factor refinement with phenix.refine [57] were interspersed with model building/rebuilding via Coot [58] using 2|Fo|-|Fc| and |Fo|-|Fc| electron density maps. The progress of refinement was monitored by following the Rwork/Rfree ratio (Rfree consisting of 5-10% of reflections). Waters were selected and refined using the automated water picking feature of phenix.refine and ligands built into the electron density maps following the first round of rebuilding and refinement. Simulated annealing OMIT maps were constructed by removing the ligand from the final refined structure file, performing three macro-cycles of refinement with simulated annealing, and calculating a |Fo|-|Fc| map with phenix.maps.
Occupancy refinement was carried out using phenix.refine for the GlcNAc-Man structure, which appeared to contain a mixture of the sugar and glycerol in the binding pocket. The two ligands were defined as alternate conformers and refined to a combined occupancy of 1.0 (final occupancies = 0.65 GlcNAc-Man, 0.35 glycerol).
To construct the RbmC2/pentasaccharide model, superimposed mannotriose and GlcNAc(β1-2)Man structures (which share an overlapping mannose) were used as a template to build the Man3GlcNAc core and a second GlcNAc residue built in from scratch. Five macro-cycles of geometry minimization and regularization were carried out using the phenix.geometry_minimization feature of Phenix [59]. Buried surface area calculations were performed using PISA [37] as implemented in Coot [58]. Contacts were determined using LigPlot+ v. 1.4.5 [60] with hydrogen bonds filtered using a cutoff of 3.35 Å. Model quality statistics were calculated using the MolProbity server accessed at http://molprobity.biochem.duke.edu.

Circular dichroism (CD)
Trypsin-cleaved RbmC2 WT and RbmC2 D853A domains were separated from the GFPUV fusion protein on Superose 6 10/300 and Superdex 200 Increase 10/300 size exclusion columns, respectively, in TBS. Samples were concentrated using a Vivaspin 5 kDa-cutoff centrifugal concentrator to a concentration of 340 μM and diluted to a final concentration of 12.5 μM with 10 mM sodium phosphate buffer pH 7.4. Samples were loaded into 0.2 cm quartz cuvettes (Cole-Parmer, Staffordshire, UK). CD data were collected on a Jasco J-810 spectrometer (Jasco Inc., Easton, MD). Thermal denaturation was monitored using the change in molar extinction at 222 nm while changing the temperature in 2˚C increments from 20˚C to 96˚C. Tm values were calculated from the mid-point of sigmoidal fits to the temperature data. Secondary structure composition was estimated using the β-structure selection server (BeStSel, http://bestsel.elte.hu) using CD data from 200-250 nm [40].
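To illustrate what taking Tm from the mid-point of a sigmoidal fit involves, the Python sketch below fits a two-state (Boltzmann) curve to a melt by a coarse least-squares grid search. It is not the fitting procedure used in this study. The two-state model, the baselines fixed to the first and last readings, the grid ranges, and the synthetic data are all assumptions made for illustration.

```python
import math


def sigmoid(temp, folded, unfolded, tm, width):
    """Two-state melting curve: signal moves from folded to unfolded around tm."""
    return folded + (unfolded - folded) / (1.0 + math.exp((tm - temp) / width))


def fit_tm(temps, signal):
    """Estimate Tm by grid search over midpoint and transition width.

    Simplification: baselines are fixed to the first and last readings.
    """
    folded, unfolded = signal[0], signal[-1]
    t_lo, t_hi = min(temps), max(temps)
    steps = int(round((t_hi - t_lo) / 0.1))  # scan Tm in 0.1-degree steps
    best_sse, best_tm = None, None
    for k in range(steps + 1):
        tm = t_lo + 0.1 * k
        for i in range(39):
            width = 0.5 + 0.25 * i  # transition widths from 0.5 to 10 degrees
            sse = sum((sigmoid(t, folded, unfolded, tm, width) - y) ** 2
                      for t, y in zip(temps, signal))
            if best_sse is None or sse < best_sse:
                best_sse, best_tm = sse, tm
    return best_tm


# Synthetic melt sampled every 2 degrees from 20 to 96, as in the protocol
temps = list(range(20, 98, 2))
signal = [sigmoid(t, -10.0, -2.0, 55.0, 2.0) for t in temps]
print(round(fit_tm(temps, signal), 1))
```

A real analysis would normally fit sloping baselines as well and use a proper nonlinear least-squares routine. The grid search is used here only to keep the example free of dependencies.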

Fluorescence microscopy
Defibrinated whole rabbit blood (Remel, ThermoFisher Scientific) was washed once with PBS buffer and the resuspended cells were incubated with 0.5 and 2.5 μM concentrations of purified RbmC2-GFPUV for 5 minutes on ice, or a 2.5 μM concentration of RbmC1-GFPUV. As a control for non-specific binding, a point mutant shown to significantly reduce glycan binding was used (D853A in RbmC2 and D539A in RbmC1). Cells were pelleted by centrifugation at 500 x g for 5 minutes in a microcentrifuge tube, the supernatant was removed, and the cells were gently resuspended in 25 μL of PBS buffer. Bright-field and fluorescence microscopy images were acquired using a DeltaVision RT imaging system (Applied Precision) adapted to an Olympus (IX71) microscope. Images were acquired with a fixed exposure time (3 s for brightfield and 4 s for fluorescence) so that samples could be compared on the same intensity scales. Z-stack sections of 0.5 μm were collected using a 60X or 100X objective and images put on identical scales and superimposed using the Fiji distribution of ImageJ [61]. For GFP fluorescence images, excitation and emission filter wavelengths were 490 and 528 nm, respectively.
Positions conserved in all four V. cholerae β-prism domains (VCC, RbmC1, RbmC2, Bap1) are in red type based on multiple sequence alignments. Positions with asterisks were mutated in this study. The type of contact made with mannotriose or GlcNAc-Man structures is noted with dashes representing residues not contacting ligands. Contacts were determined using LigPlot+ v. 1.4.5 with hydrogen bonds filtered using a cutoff of 3.35 Å and hydrophobic interactions filtered at 3.9 Å. Also listed is whether the interaction primarily involves a peptide backbone hydrogen-bonding interaction, side-chain hydrogen-bonding interaction, or hydrophobic van der Waals (vdW) interaction. The fold loss in VCC hemolytic activity when mutated to the amino acid shown in parentheses is also noted (see Fig 5A).
(B) Table illustrating percentages of secondary structure elements for each protein at 20˚C and 96˚C as calculated by BeStSel. Calculation of secondary structure from the RbmC2 apo crystal structure using DSSP [65] indicates 2.2% α-helix, 52.9% β-sheet, and 44.9% other. The antiparallel β-sheet category is broken down into left-twisted, relaxed, and right-twisted sheets. (C) Plot showing unfolding of RbmC2 WT and RbmC2 D853A proteins based on the CD melt data from 34 to 76˚C. The mutant data are presented both as % unfolded based on the absolute shift of the CD signal and on the cumulative CD signal change from folded to unfolded states (a dip and then rise in the CD signal at 222 nm). From a sigmoidal fit of the curves, the Tm is estimated to be 55˚C for WT and 54˚C for the mutant. (TIF)