Structure and functional analysis of a bacterial adhesin sugar-binding domain

Bacterial adhesins attach their hosts to surfaces through one or more ligand-binding domains. In RTX adhesins, which are localized to the outer membrane of many Gram-negative bacteria via the type I secretion system, we see several examples of a putative sugar-binding domain. Here we have recombinantly expressed one such ~20-kDa domain, MhPA14, from the ~340-kDa adhesin found in Marinobacter hydrocarbonoclasticus, an oil-degrading bacterium. The sugar-binding domain was purified from E. coli with a yield of 100 mg/L of culture. Circular dichroism analysis showed that the protein was rich in beta-structure, was moderately heat resistant, and required Ca2+ for proper folding. A crystal structure was obtained in Ca2+ at 1.2-Å resolution, which showed the presence of three Ca2+ ions, two of which were needed for structural integrity and one for binding sugars. Glucose was soaked into the crystal, where it bound to the sugar-binding Ca2+ through its two vicinal hydroxyl groups attached to the first and second (C1 and C2) carbons in the pyranose ring. This attraction to glucose caused the protein to bind certain polysaccharide-based column matrices and was used in a simple competitive binding assay to assess the relative affinity of sugars for the protein's ligand-binding site. Fucose, glucose and N-acetylglucosamine bound most tightly, and N-acetylgalactosamine hardly bound at all. Isothermal titration calorimetry was used to determine specific binding affinities, which lie in the 100-μM range. Glycan arrays were tested to expand the range of ligand sugars assayed, and showed that MhPA14 bound preferentially to branched polymers containing terminal sugars highlighted as strong binders in the competitive binding assay. Some of these binders have vicinal hydroxyl groups attached to the C3 and C4 carbons that are sterically equivalent to those presented by the C1 and C2 carbons of glucose.


Introduction
Many bacterial survival strategies rely on the organisms' ability to adhere to surfaces [1][2][3]. In this state, they can remain in favourable, high-nutrient zones of their environment for an extended time, without substantial energy costs. Adhesion proteins (or adhesins) [...] Many of the C-terminal segments contain sequences of unknown structure and/or function that might include novel adhesion domains. The diversity of adhesion proteins has attracted increasing interest for further research, as examples of protein-mediated interactions between bacteria and vastly different substrates continue to emerge [11]. One adhesion domain that appears often within RTX adhesins is the PA14 domain. This domain is named after its position in the Protective Antigen (PA) from the human pathogen Bacillus anthracis, where it was first discovered [24]. It is now known that the PA14 domain is a component of many proteins, both prokaryotic and eukaryotic, where it appears to consistently serve a role in carbohydrate binding [25]. One of the best-studied examples is the epithelial adhesin (EpA) family of proteins from the pathogenic yeast Candida glabrata. The N-terminal domains of these yeast adhesins are PA14 domains that take on a characteristic beta sandwich-like fold. Structures for the PA14 domains from EpA adhesins 1, 6, and 9 (PDB: 4ASL, 4COY, & 4CP0) show a conserved calcium ion, coordinated via dual aspartate residues tandemly connected by a cis-peptide bond (D-cis-D motif) [26,27]. The calcium ion, in turn, coordinates carbohydrate hydroxyl groups, thus facilitating yeast colonization of mammalian surfaces and subsequent biofilm formation [28][29][30].
Given the PA14 domain's known role in the pathogenesis of yeast, its presence in bacteria at the distal end of many RTX adhesins is of interest. At present, the only RTX adhesin PA14 domain to have been structurally and functionally characterized is that of MpIBP, where it was found to foster contacts to the microalgal species Chaetoceros neogracile [11]. With only a single characterized example, little can be inferred about these domains' affinities for specific sugars in polysaccharides, or about how these affinities could factor into the adhesin's biological function. The following study was undertaken to characterize another RTX adhesin's PA14 domain, namely that from the Marinobacter hydrocarbonoclasticus long adhesion protein (MhLap) [31,32]. M. hydrocarbonoclasticus is an oil-eating bacterium that forms biofilms at the oil-water interface to improve the bioavailability of its preferred carbon source: medium- to long-chain alkanes [33][34][35]. Considering the species' potential as a bioremediation source, attaining a better understanding of the MhLap adhesin's ability to interact with sugars, possibly for the benefit of its oleolytic biofilms, is of particular interest [36,37]. Through our study, we have elucidated the structure of recombinant MhPA14 and shown it to be a calcium-dependent sugar-binding domain with affinity for dextran-based polymers. We have exploited this affinity to construct a resin-based competitive binding assay for comparing affinities to different sugars, which shows that the domain has a broad range of monosaccharide binding partners (such as fucose, glucose, N-acetylglucosamine and mannose), but is unable to bind strongly to sugars like galactose or N-acetylgalactosamine.

Materials and methods
Proteins referenced heavily throughout the manuscript are detailed in Table 1.

Molecular cloning and protein expression of MhPA14
A gene encoding MhPA14 was synthesized with optimal codon usage for Escherichia coli (GeneArt). The gene was cut with the restriction enzymes NdeI (5' end) and XhoI (3' end) for ligation into the pET28a expression vector, which provides an N-terminal His tag. The plasmid was then transformed into Top10 cells for plasmid amplification, and then electroporated into BL21(DE3) cells for protein expression.
MhPA14 protein (sequence shown in S1 Fig) was expressed in the following manner. Single colonies were picked into 25-mL cultures of LB Broth + 0.1 mg/mL kanamycin and grown overnight at 37˚C. Overnight cultures were used to inoculate 1-L cultures, which were then grown at 37˚C until an OD600 of 0.9 was attained. IPTG was then added to a final concentration of 1 mM to induce protein expression at 23˚C, and the culture was kept growing overnight.

Purification of MhPA14
E. coli cultures expressing MhPA14 were spun down at 4500 × g in a JS-4.2 rotor (Beckman Coulter). The medium was discarded, and the pellet resuspended in Ni Buffer (50 mM Tris-HCl pH 9.0, 500 mM NaCl, 2 mM CaCl2, 5 mM imidazole) along with a protease inhibitor cocktail tablet (Roche). Cells were then lysed by sonication, and the resulting cell debris removed via centrifugation at ~30,000 × g in a JA-25.5 rotor (Beckman Coulter). The lysate supernatant was incubated with Ni-NTA Agarose Resin (Qiagen) and washed with ~100 mL of Ni Buffer. The bound MhPA14 was then eluted with Ni Buffer supplemented with 400 mM imidazole. Eluted fractions were pooled and subjected to anion-exchange chromatography on a Q Sepharose Fast Flow column (GE Healthcare), with a running buffer containing 20 mM Tris-HCl pH 9.0 and 2 mM CaCl2. Proteins were eluted from the column with an increasing NaCl gradient. Fractions containing purified MhPA14 were then pooled and buffer exchanged into Protein Buffer (20 mM Tris-HCl pH 9.0, 150 mM NaCl, and 2 mM CaCl2). Protein purity was assessed using SDS-PAGE.

Size-exclusion Superdex 200 affinity tests
A Superdex 200 10/300 GL column (GE Healthcare) was used to probe the resin-binding ability of MhPA14. Approximately 1 mg of MhPA14 was loaded onto the column and washed with 3 column volumes of Protein Buffer. An increasing gradient of either glucose or EDTA was then applied, rising from 0 to 55 mM glucose or from 0 to 5 mM EDTA over 45 min.

Calcium titration and thermal denaturation by Circular Dichroism (CD) spectroscopy
For calcium titrations, MhPA14 (25 μM) was dialysed against Protein Buffer containing no calcium and 5 mM EDTA to ensure calcium removal, before being re-dialysed into 0.01 mM EDTA. Calcium was then titrated into the protein solution in 0.5 mM increments and mixed thoroughly. For each addition of calcium, eight spectra ranging from 180 nm to 260 nm were taken and averaged using a Chirascan CD Spectrometer (Applied Photophysics). Solutions were maintained at 20˚C throughout.
For thermal denaturation experiments, MhPA14 (25 μM) in either 1 mM CaCl2 or 0.01 mM EDTA was heated from 20˚C to 65˚C in 5-˚C increments. Multiple scans ranging from 180 nm to 260 nm were taken at each temperature until the spectra stabilized, at which point eight scans were taken and averaged to indicate the equilibrated structural state at that temperature. A refolding procedure, where the temperature was slowly dropped back to 20˚C, was also attempted.
All spectra were subjected to three-point smoothing using PROVIEWER software.
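As an illustration of what three-point smoothing does to a spectrum, the sketch below applies an unweighted moving average over each point and its two neighbours. The exact algorithm used by the PROVIEWER software is not stated here, so the unweighted window, the untouched endpoints, the function name, and the example readings are all assumptions made for illustration.

```python
def three_point_smooth(values):
    """Unweighted three-point moving average (assumed form of the smoothing).

    Each interior point is replaced by the mean of itself and its two
    neighbours; the first and last points are returned unchanged.
    """
    if len(values) < 3:
        return list(values)
    smoothed = [values[0]]
    for i in range(1, len(values) - 1):
        smoothed.append((values[i - 1] + values[i] + values[i + 1]) / 3)
    smoothed.append(values[-1])
    return smoothed


# Hypothetical ellipticity readings, not data from this study.
noisy = [0.0, 3.0, 0.0, 3.0, 0.0]
print(three_point_smooth(noisy))
```

A window this narrow damps point-to-point noise while leaving broad features, such as the ~220 nm trough, essentially in place.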

X-ray crystallography and structure solution of MhPA14
The microbatch method was used to screen several crystallization condition suites (Qiagen). MhPA14 at 26 mg/mL was buffer exchanged into 20 mM Tris-HCl pH 9.0 and 10 mM CaCl2. The protein solution and varying crystallization conditions were mixed (1 μL:1 μL) in 96-well plates, covered with 100% paraffin oil, and left to incubate at 23˚C. The PACT suite yielded many crystals of varying morphologies. These crystals were screened for diffraction using a home source diffractometer (Rigaku MicroMax-007HF X-ray source coupled to an R-AXIS IV++ detector) in the lab of Dr. John Allingham. Crystals with diffraction around or above 3 Å were either stored or soaked in varying concentrations of sugars, before being analyzed at the Canadian Light Source (CLS) synchrotron. The condition that yielded a glucose-bound structure of MhPA14 contained 0.2 M potassium thiocyanate, 0.1 M Bis-tris Propane pH 7.5 and 20% w/v PEG 3350. The crystal had undergone thirty seconds of soaking in a 30% w/v glucose solution.
Data collected at CLS were indexed and integrated with XDS [38] and scaled with CCP4-Aimless [39]. The structure was solved using molecular replacement with Phenix-Phaser [40]; the sugar-less MpIBP PA14 served as a search model (PDB: 5J6Y). The initial model was made by Phenix-Autobuild [41] and manually corrected in Coot [42]. Subsequent refinement was done using Phenix-Refine [43].

Competitive binding assay
Superdex 200 (S200) resin stored in 20% ethanol was washed twice in Protein Buffer using 6600 × g centrifugation. A millilitre of 1 mg/mL MhPA14 fused N-terminally to GFP, named GFP_MhPA14 (S1 Fig), was incubated with 300 μL of equilibrated S200 resin. Following 30 sec of vortexing and 2 min of incubation with slight mixing (nutation), the solution was centrifuged at 6600 × g for 2.5 min and the supernatant discarded. Protein Buffer (1 mL) was added to the tube, and the process of mixing followed by centrifugation was repeated. The A280 of the resulting supernatant was taken to account for the minimal amount of non-sugar-related protein dissociation from the resin, and was used as a baseline reading. The supernatant was then returned into its original resin mixture tube with the addition of 1.67 μmoles (3 μL of 555 mM) of saccharide. This was followed by 30 sec of vortexing, 2 min of nutation, and centrifugation at 6600 × g for 2.5 min. After reading the A280 of the supernatant, the above process was repeated for seven tandem saccharide additions of 1.67 μmoles, as well as a final 5-μmole addition. This process is shown as a schematic in S2 Fig. Data from the dextran-affinity assay were plotted using GraphPad Prism 7.03 and the A280 of non-sugar-related protein dissociation was subtracted. Next, the data were fitted by nonlinear regression to a one-site specific binding model, Y/Bmax = X/(Kd + X), with Bmax as the maximum specific binding and Kd as the equilibrium dissociation constant.
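The one-site model and the addition schedule described above can be sketched in a few lines. This is a minimal illustration, not the Prism procedure: a coarse grid search over Kd, with Bmax solved in closed form at each grid point, stands in for Prism's nonlinear regression so that the example needs no external libraries. The function names, the grid, and the synthetic data are hypothetical, and the schedule assumes eight 1.67-μmole additions (the first plus seven repeats) before the final 5-μmole addition.

```python
def one_site(x, bmax, kd):
    """One-site specific binding: Y = Bmax * X / (Kd + X)."""
    return bmax * x / (kd + x)


def cumulative_sugar_umol(n_small=8, small=1.67, final=5.0):
    """Running total of saccharide (in micromoles) after each addition."""
    totals, total = [], 0.0
    for _ in range(n_small):
        total += small
        totals.append(total)
    totals.append(total + final)
    return totals


def fit_one_site(xs, ys, kd_grid):
    """Least-squares fit by grid search over Kd (illustrative stand-in).

    For a fixed Kd the model is linear in Bmax, so the best Bmax is
    sum(y*f) / sum(f*f) with f = x / (Kd + x).
    """
    best = None
    for kd in kd_grid:
        f = [x / (kd + x) for x in xs]
        denom = sum(v * v for v in f)
        if denom == 0:
            continue
        bmax = sum(y * v for y, v in zip(ys, f)) / denom
        sse = sum((y - bmax * v) ** 2 for y, v in zip(ys, f))
        if best is None or sse < best[0]:
            best = (sse, bmax, kd)
    return best[1], best[2]


# Synthetic, noise-free example data (not measurements from this study).
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [one_site(x, 1.2, 4.0) for x in xs]
bmax_fit, kd_fit = fit_one_site(xs, ys, [0.5 * i for i in range(1, 41)])
print(bmax_fit, kd_fit)
```

Because the resin competes with the free sugar, a Kd obtained this way is an apparent value that is useful mainly for ranking sugars against one another.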

Isothermal titration calorimetry (ITC)
Isothermal titration calorimetry (ITC) measurements were performed using a MicroCal VP-ITC calorimeter (Malvern) set at 30˚C. MhPA14 in 2 mL of Protein Buffer at a concentration of 400 μM was mixed with 5-μL aliquots from one of four different 8-mM sugar solutions (fucose, glucose, 2-deoxy-D-glucose, and galactose). Sugars were added from the computer-controlled rotating syringe (400 RPM) at 5-min intervals into the MhPA14 solution for a total of 50 injections. The data were analyzed by Origin software Version 5.0 (Malvern).
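For orientation, the titration parameters above fix the sugar-to-protein molar ratio reached after any given injection. The sketch below works this out under two simplifying assumptions that are not from the text: all 2 mL of protein solution is treated as taking part in the titration, and dilution or displacement of cell contents by the injected volume is ignored. The function name is hypothetical.

```python
def molar_ratio_after(n_injections, injection_ul=5, sugar_mm=8,
                      protein_um=400, protein_ml=2):
    """Sugar:protein molar ratio after n injections (no volume correction)."""
    sugar_nmol = n_injections * injection_ul * sugar_mm   # uL x mM = nmol
    protein_nmol = protein_um * protein_ml                # uM x mL = nmol
    return sugar_nmol / protein_nmol


print(molar_ratio_after(20))  # equimolar point under these assumptions
print(molar_ratio_after(50))  # end of the titration
```

Under these assumptions each injection delivers 40 nmol of sugar to 800 nmol of protein, so the equimolar point falls at injection 20 and the titration ends at a 2.5-fold molar excess of sugar.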

Glycan array
Two glycan arrays were probed. The first was done by the Consortium for Functional Glycomics (Harvard Medical School). Version 5.4 of their printed Mammalian glycan array, containing 585 glycans, was incubated with 5 μg/mL, 50 μg/mL, and 200 μg/mL of GFP_MhPA14. The green fluorescence of the fusion protein was used to measure the relative fluorescence units (RFU) of the bound protein. Each glycan was present in six replicates on the array; the highest and lowest value from each set was omitted to avoid false hits, and an average of the remaining four replicates was used.
The second array was screened at the Carbohydrate Microarray Facility (Glycosciences Laboratory, Imperial College). GFP_MhPA14 (50 μg/mL) was exposed to the 'Fungal, bacterial and plant polysaccharide array set 2', which contained duplicates of 20 saccharide probes extracted from a variety of organisms. An Alexa Fluor 647-tagged anti-GFP antibody was used for detection, and the duplicates were averaged to give the final RFU values.
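The replicate handling described for the first (CFG) array, in which the highest and lowest of six readings are dropped and the remaining four averaged, amounts to a trimmed mean. A minimal sketch follows; the function name and the example readings are hypothetical and are not taken from the array data.

```python
def trimmed_replicate_mean(rfu_values):
    """Average six replicate RFU readings after dropping the min and max."""
    if len(rfu_values) != 6:
        raise ValueError("expected six replicate readings per glycan")
    kept = sorted(rfu_values)[1:-1]  # discard lowest and highest value
    return sum(kept) / len(kept)


# Hypothetical replicate set containing one spuriously bright spot.
print(trimmed_replicate_mean([100, 120, 110, 5000, 90, 105]))
```

Dropping the extremes keeps a single outlying spot from producing a false hit, at the cost of using only four of the six replicates.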

Recombinant MhPA14 is highly expressed and easily purified
Full-length MhLap is a 3443-residue protein (NCBI: WP_014422746) that follows the conserved domain architecture of most RTX adhesins (Fig 1). To extract the PA14 domain for recombinant expression, the Pfam database was used to locate the general N- and C-terminal ends of the domain. This was followed by a series of BLAST searches, multiple sequence alignments, and Phyre2 modelling sessions to designate specific start and stop sequences for the recombinant MhPA14 construct (S1 Fig). Alignment of several RTX adhesin PA14 domains from different Gammaproteobacteria species (Fig 2) shows that MhPA14 shares ~50% sequence identity with domains from the two Pseudomonas species RTX adhesins, and ~40% with the adhesins of both Vibrio cholerae and M. primoryensis, which agrees with the phylogenetic relationship of the bacteria's genera [44]. The residues that make up definitive calcium-binding sites in MpIBP [11] are highly conserved amongst the other RTX adhesins, as seen by the red boxed and bolded residues in Fig 2. However, when more distantly related proteins like Epa1 and PA are added to the alignment, only small areas of overlap remain, specifically the sequence surrounding the D-cis-D motif, shown in dark blue (Fig 2).
The codon-optimized version of the 22.5-kDa MhPA14 was well-expressed, and preferentially partitioned into the soluble fraction upon cell lysis (Fig 3A). A two-step purification process of nickel-affinity chromatography, followed by Q-Sepharose anion-exchange chromatography, produced a relatively pure protein devoid of major contaminants (Fig 3A and 3B) that migrated at the expected position for its molecular weight. The yield was ~90 mg of protein per 1 L of E. coli culture. However, MhPA14 eluted from the anion-exchange column over five 3-mL fractions as a doublet peak. When run on SDS-PAGE, both peaks of the doublet contained indistinguishable MhPA14 bands, along with a minuscule amount of contaminant at 18 kDa (Fig 3B). When size-exclusion chromatography was attempted, MhPA14 failed to elute from the column, even after several column volumes of buffer were passed through. An affinity for a component of the polysaccharide-based size-exclusion resin was suspected due to the proposed sugar-binding nature of the PA14 domain. The glucose polymer, dextran, was singled out as the most likely ligand, as the protein was retained by both Superdex (dextran and agarose) and Sephadex (dextran), but not Sepharose (agarose). To test this suspicion, a known quantity of purified MhPA14 was bound to a Superdex S200 column and subjected to an increasing glucose gradient. The protein began eluting at a glucose concentration of 25 mM and did so over the course of 20 mL, producing a broad, dispersed peak of MhPA14 (Fig 3C, top). SDS-PAGE was used to confirm that the pooled fractions that eluted off the S200 were, in fact, MhPA14, comparable in size and purity to the Q-Sepharose-purified protein (Fig 3B).
Since the alignment in Fig 2 demonstrated that the D-cis-D motif, which uses calcium for sugar recognition, was conserved in MhPA14, the divalent ion-chelator EDTA was also used as a potential eluting agent (Fig 3C, middle). Indeed, 3.5 mM of EDTA led to the elution of MhPA14 as a sharper peak over < 10 mL. Both elution methods were compared to a control run where no gradient was added (Fig 3C, bottom), which failed to elute any protein. Interestingly, both sub-peaks from the anion exchange-purified doublet peak were able to bind to the resin, indicating that these peaks likely contain different conformers of MhPA14, each retaining a common functionality.

MhPA14 requires calcium for proper folding and stability
The elution of MhPA14 from the dextran-based resin via EDTA made it clear that a divalent cation, presumably calcium, was important for the domain's sugar affinity. However, it was unclear if this was simply due to the D-cis-D calcium-binding motif previously mentioned, or if there was also a calcium requirement of MhPA14 for proper folding, as there is in previously studied domains taken from RTX adhesins [21,46]. Circular dichroism (CD) spectroscopy was used to assay the effect of calcium on the secondary structure of the domain (Fig 4, top). Purified MhPA14 was dialysed against excess EDTA to remove Ca2+, and then transferred to a buffer with minimal (0.01 mM) EDTA in which to titrate CaCl2. The CD spectra of MhPA14 in both 1 mM and minimal EDTA solutions took on the same shape, composed of two peaks at 187 nm and 199 nm and a trough at ~220 nm. Titration of CaCl2 into the sample with minimal EDTA caused a drastic change in the spectrum. Following the addition of 1 mM CaCl2, the two peaks coalesced into one maximum near 195 nm, while the minimum at ~220 nm was retained. Such spectra are characteristic of β-sheet dominated structures [47]. This spectral shape was maintained upon further additions of CaCl2, indicating that the MhPA14 calcium interaction was saturated by 1 mM of titrant. The addition of EDTA back into the sample led to an incremental return to the former shape.
Fig 2. [...] Candida glabrata (EpA1), and Bacillus anthracis (PA20) were aligned using the MergeAlign software [45]. Sequences with alignment scores above 60% are coloured dark blue, while sequences with alignment scores between 30 and 40% are shown in light blue. Proposed Ca2+-binding residues are boxed with red squares. Residue conservation is indicated as follows: * = 100%, : = ~80%, . = ~70%. The conserved residues are bolded. Residue numbers for the MhPA14 construct are given. Secondary structure taken from the solved structure of MpIBP PA14 domain is shown above the alignment, with beta-strands coloured dark blue, alpha-helices coloured green, and loops coloured black. Bold lines sit above the sequence for three loops that make up the supposed sugar-binding site of the bacterial PA14 domains. Note: sequences with previously solved protein structures are marked with an asterisk. https://doi.org/10.1371/journal.pone.0220045.g002
Thermal denaturation studies were undertaken in both the presence (Fig 4, middle) and absence (Fig 4, bottom) of Ca2+ to test the stability of the structures seen at the beginning and end of the calcium titration. In the presence of 1 mM CaCl2, the PA14 domain maintained its spectral shape until 45˚C, at which point a shift towards a characteristic "unfolded" spectrum was observed. By 55˚C, MhPA14 appeared entirely unfolded. In contrast, the dual-peak shape of the protein's CD spectrum in EDTA began to transition after only a 5-˚C temperature increase, and the protein was almost completely unfolded at 30˚C. In both conditions, MhPA14 was unable to fully regain its former structure upon returning to 20˚C, though the refolding was more successful in the presence of Ca2+ (Fig 4).

MhPA14 crystal structure reveals three bound Ca2+ and one glucose
Purified MhPA14 was crystallized under several conditions, all of them sugar-free, with the best condition producing thick, prismatic crystals that diffracted to ~1.2 Å. Crystal soaking experiments were conducted using readily available mono- and disaccharides. While many crystals remained devoid of obvious density in which to place a bound sugar, a glucose-bound dataset was eventually collected and solved using molecular replacement with the MpIBP PA14 domain as a template (Table 2).
MhPA14 takes on a beta-sandwich-type fold, composed primarily of two anti-parallel beta-sheets held together by a hydrophobic core (Fig 5). Two small alpha helices contribute to the hydrophobic core, and a third helix lies across the outer surface of the larger beta-sheet. A series of long loops lacking defined secondary structure protrude from the top and bottom of the beta-sheets, as shown in Fig 5. The atoms within the bottom loops have higher B factors compared to the rest of the structure, possibly indicating that these loops are in a relative state of disorder. In contrast, the top loops have similar B factors to the core secondary structure elements, indicating a more ordered region. This is likely a result of the three calcium ions coordinated within these loops (Fig 5).
Calcium 2 and 3 appear to have purely structural roles by forming stabilizing contacts between otherwise disordered portions of the protein. Calcium 2 coordinates residues from two loop regions (using Gln47 and Asn53 from one, and Gln182 from another) and pins them to residue Asp113, which is part of a beta strand (Fig 5, inset Calcium 2). Meanwhile, calcium 3 is involved in a more local interaction, coordinating three neighbouring residues (Asp107, Pro108 and Asp110) into a sharp turn (Fig 5, inset Calcium 3). These interactions likely help facilitate the more thermo-stable, beta-strand-rich, calcium-dependent tertiary structure observed in Fig 4. Calcium 1 coordination is also likely to have structural ramifications, pinching two of the top loops together through contacts between the expected D-cis-D motif (Asp136 and Asp137) in one loop, as well as an additional aspartate and two backbone carbonyl groups from another loop (Asp181, Gly183 and Ala185) (Fig 5, inset Calcium 1). However, calcium 1, together with the three loops that surround it (Fig 6A), also makes up the sugar-binding site, where the structure of MhPA14 reveals a beta-D-glucose molecule to be bound. The sugar is coordinated to calcium 1 through the two vicinal hydroxyl groups attached to the first and second (C1 and C2) carbons in the pyranose ring (Fig 6B). Polar contacts directly between the glucose molecule and the protein are also made via the C1-C2 hydroxyl pair, with the C1 hydroxyl forming contacts with the sidechains of Asp136, Asp137 and Gln182, and the C2 hydroxyl connecting to Asp137 and the backbone of Ala185 (Fig 6B). The sugar ring also makes several non-polar contacts with the backbone of Loop 1. There is no evidence for an alternative coordination pattern within the distinct electron density (Fig 6B), nor is there any sign of the C1 anomer alpha-glucose.

MhPA14 crystal structure and sugar orientation compared to other known PA14 domains
Structures from other PA14 domains, including MpIBP PA14 (PDB: 5J6Y) and EpA1 (PDB: 4A3X) [48], have previously implicated calcium 1 as the prospective cofactor for sugar binding. Indeed, aligning the structure of MhPA14 against both MpIBP PA14 (RMSD = 0.897) and EpA1 (RMSD = 2.372) (Fig 7A and 7B) demonstrated the strong overlap of Loops 1, 2 and 3 that make up the sugar-binding site. As foreshadowed by the amino-acid sequence identities, the PA14 domains of MhLap and MpIBP are more similar in structure to each other than they are to EpA1, yielding a common ligand-binding site for both structures with an open, flat topology in which their shared ligand, glucose, can reside (Fig 7C). While both structures bind glucose, they do so in different manners, with the glucose in MpIBP interacting with the calcium ion through the C3 and C4 hydroxyl groups, rather than the beta-C1 and C2 hydroxyl groups seen in MhLap. That said, both sets of hydroxyls are similarly oriented relative to one another (both being vicinal, trans isomeric, and equatorial) and therefore both appear capable of satisfying the calcium coordination sphere (Fig 7D).

In contrast, the EpA1 PA14 specifically binds galactose by coordinating its C3-C4 hydroxyl pair, which are cis isomers, one equatorial and the other axially oriented. This orientation would appear to break the coordination scheme used by the other PA14 domains mentioned thus far. However, the EpA1 sugar-binding region differs greatly from that of MhPA14 and MpIBP, as seen in Fig 7C. The calcium is held in a narrower binding cleft, with large, bulky residues in Loops 1 and 3 restricting the accessibility of the site. The same loops in the other PA14 structures are composed of residues with small sidechains, predominantly alanine and glycine. Due to the restricted access to the calcium in EpA1, a binding saccharide must be oriented at a different angle than in the other two structures, an angle that would not allow trans, equatorial hydroxyl pairs to properly coordinate to the calcium. However, this orientation does allow for the galactose C3-C4 hydroxyl pair to satisfy the same coordination sphere, as seen in Fig 7D.
Fig 5. The structure for MhPA14 presented as a cartoon diagram from two viewpoints, rotated 180˚ relative to each other. The protein is coloured by progression of primary structure from the N terminus in blue, to the C terminus in red. Calcium ions are shown as grey spheres, coordinated water molecules are shown as smaller cyan spheres, and the glucose molecule is shown as a stick structure with white for carbon and red for oxygen. Inset panels detailing the coordination for each calcium ion are shown, with red dashed lines indicating coordinate bonds between the calcium and the labelled residue or water molecule. https://doi.org/10.1371/journal.pone.0220045.g005

MhPA14 has varying affinity for different sugars, shown by both competitive binding assay and ITC
Attempts to solve MhPA14 structures with other sugars bound, whether by soaking or co-crystallization, were fruitless. Therefore, an assay was developed to test the relative affinity of MhPA14 for different sugars. This simple, competitive binding assay was done in 1.5-mL microcentrifuge tubes with MhPA14 bound to Superdex 200 (dextran-agarose) resin and released upon addition of free sugar (S2 Fig). The stronger the binding between MhPA14 and the free sugar (relative to that of the resin), the less sugar it took to completely elute the MhPA14 from the resin. The amount of GFP-tagged MhPA14 released from the resin upon each sugar addition was measured using absorbance at 280 nm, which was plotted against concentration of free sugar to produce semi-quantitative binding curves (Fig 8). A range of sugars, including monosaccharide hexoses and disaccharides, were used for these assays (S3 Fig). The data for each sugar were fitted using non-linear regression to a simple ligand-binding equation assuming a single binding site, which allowed for the calculation of an apparent dissociation constant (Kd app): a semi-quantitative measure of a sugar's ability to displace the MhPA14 from the resin (Table 3).
The data from Fig 8 and Table 3 show that the monosaccharides cluster into four groups, in terms of binding affinity. The highest affinity group is populated solely by fucose, also known as 6-deoxy-L-galactose. This sugar has yet to be seen bound to a PA14 domain, though certain C-type lectins, such as LecB (PDB: 1OXC), have been shown to bind fucose through calcium coordination via the C2 and C3 hydroxyls (Fig 7E). The second highest grouping contains the hexoses glucose, N-acetylglucosamine, and mannose, all with close to overlapping affinities. The third group holds only the intermediate binder 2-deoxy-D-glucose, with an apparent affinity two-fold lower than that of fully-oxygenated glucose. The final group consists of the weak binder galactose, and its derivative N-acetylgalactosamine, the latter being unable to bind MhPA14 at all. The disaccharides are more clustered together, with maltose (an α1-4 dimer of glucose) showing the highest affinity, followed closely by both trehalose (α1-1α dimer of glucose) and melibiose (galactose α1-6 glucose). Both lactose (galactose β1-4 glucose) and sucrose (glucose α1-β fructose) show comparably lower affinities.
Due to the many unknown variables at play in this assay (i.e. strength of binding to the dextran-based resin, number of closely spaced binding moieties presented by the resin, the disparity in diffusion between the resin and free sugars) these Kd app values are not comparable to those attained through more quantitative methodologies, such as isothermal titration calorimetry (ITC) or surface plasmon resonance (SPR). However, they can be compared relative to each other, much like IC50 values, as long as the amounts of resin and protein are kept consistent.
To ensure that the ordering of sugar affinities is reproducible between the competitive binding assay and more tried-and-true methods, four sugars of apparently different affinities (fucose, glucose, 2-deoxy-glucose, and galactose) were chosen with which to perform ITC (Fig 9). Fucose was the only sugar whose titration showed a definitive sigmoidal curve, with the other sugars showing gradual inclines that indicate weak affinity. This makes the inflection point (and therefore the stoichiometry) more difficult to interpret, although it appears to take place at a 1:1 molar ratio of ligand to protein. The fitted curves show a definitive decrease in affinity (that is, an increase in Kd) from fucose to glucose to 2-deoxy-glucose to galactose, with values ranging from 58 μM for fucose to above 1 mM for galactose. This supports the order presented by the competitive binding assay, although, as expected, the values themselves are very different. To confirm the quantitative affinity for the strongest binders, fucose and glucose, the ITC experiments were run three times, and the Kd values shown are an average of the triplicate. Interestingly, the Kd of 137 μM for glucose is comparable to the dissociation constant between EpA1 and its sugar of choice, galactose (Kd = 115 μM) [48].

Glycan arrays show MhPA14's preference for branched glycans with terminal glucose, glucosamine and fucose
GFP_MhPA14 was incubated with a broad-spectrum glycan array from the Consortium for Functional Glycomics (CFG), in order to expand the assayed sugars beyond simple mono-and Analysis of a bacterial sugar-binding domain disaccharides. In keeping with the 0.1-1 mM affinity seen during ITC, MhPA14 failed to bind to any of the proposed glycans with visible intensity at 5 ug/mL of protein (S4 Fig). Even at 50 ug/mL, only a few potential binders were identified, necessitating a 200 ug/mL incubation to fully capture MhPA14's glycan binding profile (Fig 10A, left). The four glycans with substantially higher apparent affinity (8000 RFU<) for MhPA14 are all oligomers that contain terminal N-acetylglucosamine moieties: a monosaccharide denoted as a high-affinity binder during the competitive binding assay (Fig 10A, right). In fact, of the twenty glycans that bind with an RFU above 4000, eleven possess terminal N-acetylglucosamines. The nine glycans in the top twenty binders that do not contain terminal N-acetylglucosamine instead contain a terminal fucose moiety, the supposed strongest binding sugar. Indeed, fucose is also present on glycans 499 and 467, with an apparent role in affinity as the non-fucosylated version of these glycans is bound with less intensity (though still above 4000 RFU). Inversely, the vast majority of glycans that bind with an RFU below 500 lack terminal N-acetylglucosamine or fucose, instead sporting galactose, N-acetylgalactosamine, and neuraminic acid. Unfortunately, this particular array has few glycans containing terminal glucose, though those that are present surprisingly have middling to poor binding.
A more directed array, containing glycans extracted from fungal, bacterial, and plant sources, was probed at the Imperial College Glycosciences Laboratory (Fig 10B). While this array contains significantly fewer glycans to sample, the glycans are explicitly related to biologically relevant polysaccharides, and the array has various examples of large glucose oligomers missing from the CFG array. Indeed, MhPA14 showed a preference for several glucose oligomers, including modest affinities for dextran (an alpha 1-6 oligomer of glucose, and a component of size-exclusion resin) and pullulan (an alpha 1-6 / alpha 1-4 oligomer of glucose). But the strongest affinity was for two fungal beta 1-3 glucans: lentinan and grifolan (chains of beta 1-3 linked glucose molecules, with beta 1-6 branches at varying regularities). The strongest binder, lentinan, a polysaccharide from Lentinula edodes, has a predicted structure of two glucose branches every five repeats of the main chain [49]. At a predicted molecular weight of ~1,000,000 Da [50], this seven-glucose unit (Fig 10B, right) could be repeated almost 800 times, leaving roughly 1600 terminal glucose molecules open for interaction. The second-strongest binder, grifolan, is predicted to have a similar structure. Meanwhile, the unbranched beta 1-3 glucan, curdlan [51], has almost no affinity for MhPA14, supporting the importance of these terminal sugars for binding.
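The lentinan estimate can be reproduced with simple arithmetic. The sketch below is illustrative only: the per-glucose mass used for the estimate is not stated in the text, so both the free-glucose mass (~180 Da) and the anhydroglucose residue mass found in a polymer (~162 Da) are shown as assumptions, and the function name is hypothetical.

```python
POLYMER_DA = 1_000_000        # predicted lentinan molecular weight [50]
GLUCOSES_PER_UNIT = 7         # five main-chain glucoses plus two branch glucoses
BRANCH_TERMINI_PER_UNIT = 2   # each single-glucose branch ends in a terminal glucose

FREE_GLUCOSE_DA = 180.16      # C6H12O6
ANHYDROGLUCOSE_DA = 162.14    # glucose minus one water, as in a glycosidic polymer

def repeat_units(polymer_da, per_glucose_da, glucoses_per_unit=GLUCOSES_PER_UNIT):
    """Number of repeating units that fit into the polymer mass."""
    return polymer_da / (glucoses_per_unit * per_glucose_da)

for label, mass in (("free glucose", FREE_GLUCOSE_DA), ("anhydroglucose", ANHYDROGLUCOSE_DA)):
    units = repeat_units(POLYMER_DA, mass)
    termini = units * BRANCH_TERMINI_PER_UNIT
    print(f"{label}: about {units:.0f} repeats, about {termini:.0f} terminal glucoses")
```

Using the free-glucose mass gives values close to the figures quoted in the text; using the residue mass gives somewhat more repeats. Either way, a single lentinan molecule presents on the order of a thousand or more terminal glucoses.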

MhPA14 affinity for monosaccharides
The crystal structure of MhPA14 revealed a bound beta-D-glucose, despite minimal contacts with the protein itself. The sugar molecule used two equatorial hydroxyl groups to coordinate a calcium ion, namely the hydroxyls attached to beta-C1 and C2 of the glucose. The orientation of these two hydroxyls appears necessary, both to satisfy the calcium ion's coordination sphere and to take part in several other hydrogen bonds with residues Asp136, Asp137, Gln182, and Ala185. Comparison of MhPA14 and MpIBP's PA14 shows that the C3-C4 hydroxyl pairs are similarly oriented and can also coordinate calcium in a comparable manner, while hydroxyl pairs that are not similarly vicinal, trans, and equatorial are unlikely to be able to satisfy these requirements. For instance, it is likely that the alpha-C1 anomer, formed during the spontaneous cyclization reactions that saccharides like glucose undergo, could not bind to MhPA14 in this way. Similarly, the C2-C3 hydroxyls in glucose, while being a pair of tandem, equatorial, trans hydroxyls, are oriented opposite to each other in 3-D space relative to the other pairs and will not be able to satisfy the same coordination sphere.
The competitive binding assay was used to compare MhPA14's affinity for different monosaccharides. Considering that glucose was the only sugar that bound to MhPA14 during the crystal soaking experiments, it was not surprising to see that glucose showed one of the strongest affinities for the protein, with a Kd app of 1.57 mM. However, it was surprising to see that N-acetylglucosamine binding rivalled that of glucose (Kd app = 1.69 mM), despite the C2 hydroxyl being replaced with an acetamido group. In addition, 2-deoxy-D-glucose was able to bind with an affinity only slightly below that of glucose, while galactose (which contains the same equatorial beta-C1 and C2 hydroxyls as glucose) bound rather weakly. This ordering of glucose, 2-deoxy-glucose, and galactose is corroborated by the ITC data, and is therefore not an artifact of the dextran resin-based technique. Additionally, the difference in glucose vs. galactose binding cannot be explained by a difference in the ratio of alpha to beta anomer in solution, as Angyal et al. observed consistent ratios for both sugars (alpha 36%, beta 64%) [52].
Fig 10. A) [...] (200 μg/mL). The fluorescence measured for each glycan is an average of four replicate spots. The four glycan spots that fluoresced above 8000 RFU are labelled, and their structures are presented on the right. Blue squares = N-acetylglucosamine, blue circles = glucose, yellow squares = N-acetylgalactosamine, yellow circles = galactose, green circles = mannose, red triangle = fucose. Terminal sugars proposed to be strong binders via the competitive assay are outlined in red. B) Fluorescence measurements from a second array, containing eighteen glycans extracted from biological sources, following incubation with GFP_MhPA14 (50 μg/mL) and detection through anti-GFP antibody. The fluorescence measured for each glycan is an average of two replicate spots. The highest fluorescing glycan is coloured blue, and its repeating structure is shown on the right using the same colour scheme as in A). https://doi.org/10.1371/journal.pone.0220045.g010
What most of the strong-binding monosaccharides (glucose, mannose, N-acetylglucosamine) do have in common is a pair of trans, vicinal, equatorial hydroxyls on the C3 and C4 carbons. This same orientation is lacking in the weak-binding galactose and N-acetylgalactosamine. Therefore, it is hypothesized that both the beta-C1-C2 and C3-C4 hydroxyl pairs are capable of binding to calcium 1 in MhPA14, but that the C3-C4 pair is predominantly responsible for binding in solution, while the C1-C2 pair, though preferred in the artificial environment of a crystal, is less often available because of anomerization. The validity of this hypothesis can be tested by its ability to explain the affinity order seen for the hexose sugars. The 2-D sugar drawings in S3 Fig indicate properly oriented calcium-binding hydroxyl pairs by circling the hydroxyls and colouring them green. The monosaccharides can be split into four groups: 1) those whose beta-C1-C2 and C3-C4 hydroxyl pairs are both properly oriented (glucose, mannose); 2) those where only the C3-C4 hydroxyl pair is properly oriented or available (N-acetylglucosamine, 2-deoxy-glucose); 3) those where only the beta-C1-C2 hydroxyl pair is properly oriented (galactose); and 4) those that lack both properly oriented pairs (N-acetylgalactosamine). The competitive binding assay shows that all hexoses that fall into the first two categories are strong binders, indicating the importance of the C3-C4 pair and the apparent irrelevance of the C1-C2 pair. However, while the last group, consisting solely of N-acetylgalactosamine, is practically unable to bind MhPA14, the penultimate group still manages modest binding, which is likely due to the C1-C2 pair.
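The four-group scheme above can be written out as a small lookup, which makes the logic of the hypothesis explicit. This is a minimal sketch: the pair assignments for each sugar are taken from the groups listed in the text, while the class, field, and function names are hypothetical and the binding labels are only the qualitative descriptions used above.

```python
from typing import NamedTuple

class Hexose(NamedTuple):
    name: str
    beta_c1_c2_ok: bool  # beta-C1-C2 hydroxyl pair properly oriented and available
    c3_c4_ok: bool       # C3-C4 hydroxyl pair properly oriented and available

# Pair assignments as stated in the text for each of the four groups.
HEXOSES = [
    Hexose("glucose", True, True),
    Hexose("mannose", True, True),
    Hexose("N-acetylglucosamine", False, True),
    Hexose("2-deoxy-glucose", False, True),
    Hexose("galactose", True, False),
    Hexose("N-acetylgalactosamine", False, False),
]

# Qualitative outcome reported for each group in the competitive binding assay.
BINDING_BY_GROUP = {1: "strong", 2: "strong", 3: "modest", 4: "negligible"}

def hydroxyl_group(sugar: Hexose) -> int:
    """Return the group number (1-4) used in the text."""
    if sugar.c3_c4_ok:
        return 1 if sugar.beta_c1_c2_ok else 2
    return 3 if sugar.beta_c1_c2_ok else 4

for sugar in HEXOSES:
    group = hydroxyl_group(sugar)
    print(f"{sugar.name}: group {group}, {BINDING_BY_GROUP[group]} binding")
```

Written this way, the C3-C4 flag alone separates strong binders from the rest, and the C1-C2 flag only matters when the C3-C4 pair is missing.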
The sugar with the most divergent structure, and therefore the most difficult to explain, also happens to be the strongest-binding ligand recorded here: fucose. Due to the L-configuration of fucose, its hydroxyls are oriented differently from those of the other hexose sugars. Because of this, the C2 and C3 hydroxyls are actually in the correct orientation for calcium coordination, as seen in Fig 7E. These same hydroxyls coordinate calcium in the C-type lectin, LecB, though this structure uses a dual calcium motif to further coordinate the sugar to the protein [53], possibly explaining LecB's enhanced affinity for fucose (Kd = 58 μM) relative to MhPA14 [54]. The improved binding of fucose relative to glucose is likely a result of the C4 axial hydroxyl; manual docking of fucose into the ligand-binding site of MhPA14 shows how this oxygen could form favourable hydrogen bonds with Gln182 and Asn154.

MhPA14 affinity for di-and oligosaccharides
Both EpA1 and the homologous Flocculin 5 (Flo5) from Saccharomyces cerevisiae show preferential binding to disaccharides over monosaccharides, due to extra contacts with the second sugar moiety [26,55]. However, none of the disaccharides tested against MhPA14 in the competition-based sugar-binding assay had higher affinity for the protein than glucose; the best of them, maltose (an alpha 1-4 connected dimer of glucose), had the same affinity as glucose. It is likely that all other disaccharides bind through their glucose-based hydroxyl pairs but are hampered in binding by steric interference between the second sugar component and the protein.
The CFG glycan array results support this reading of the competitive binding assay. Terminal N-acetylglucosamine and/or fucose moieties are present in all the top binders, with no strong preference for a single linkage type that attaches these terminal sugars to the glycan. Indeed, the top four binders contain combinations of beta 1-4, beta 1-2, and beta 1-6 linkages to galactose or mannose secondary sugars. That said, certain linkage types do appear to be detrimental to proper binding. For instance, an exact copy of glycan 188, but with an alpha 1-4 linkage between the terminal N-acetylglucosamine and the inner galactose instead of the beta 1-4 linkage, binds very poorly. Once again, the implication is that steric hindrance between the secondary sugars and the protein structure can hamper binding to the terminal sugars of interest.
It is possible that MhPA14 makes additional contacts with oligosaccharides that were not available during these experiments. However, unlike EpA1, which has a more cluttered topology in its sugar-binding region, or the flocculin proteins, which contain an additional subdomain for contacting the second sugar, the sugar-binding regions of MhPA14 and MpIBP are far more open and possibly unable to facilitate additional favourable contacts. It appears likely that MhPA14 relies mostly, if not entirely, on contacts with a single terminal sugar.

What is the function of MhPA14 in the long adhesion protein?
The domain architecture of MhLap places the PA14 domain in an optimal position for interaction with the extracellular environment. In such a role, the adhesin's sugar binding could facilitate contacts between cells in the Marinobacter biofilm by adhering directly to membrane-associated glycans, as occurs in yeast flocculation [55], or perhaps through communal binding to the many secreted extracellular polysaccharides known to be vital for many bacterial biofilms [56,57]. As an alternative or complementary function, such sugar-binding domains can also forge connections between species, bringing multiple biological skillsets together in symbiotic communities. Such was the case for the previously studied PA14 from MpIBP, which connected its bacterial host to phototrophic diatoms to improve the availability of oxygen [11]. Indeed, examples of Marinobacter species forming consortia with different diatom species have been reported and characterized [58][59][60][61][62].
Identifying the ligand(s) that MhPA14 can bind in its environment is not an easy task, though the binding data and structure support the notion that a single sugar moiety can dictate the recognition and binding of the protein to its ligands. Such a broad system for ligand determination is reminiscent of the C-type lectin domains, which bind oligosaccharides using a calcium-based interaction with terminal sugars [63]. As such, any glycans that contain terminal C3-C4 hydroxyl pairs, while avoiding major steric clashes through the second sugar moiety, could be potential ligands for MhPA14. It then stands to reason that highly branched sugars would serve as better-binding ligands, providing more termini with which to interact. The glycan arrays provide evidence for this preference, as most of the top-binding glycans in the CFG array contain multiple terminal sugars with open C3-C4 hydroxyl pairs, as do the beta glucans lentinan and grifolan in the Imperial College glycan array. Indeed, the moderate binding affinity of MhPA14 may play a part in ligand specificity, as only ligands that provide many potential binding sites for several MhLap adhesins (such as the dextran-based resins) will properly facilitate a cell-to-glycan interaction via this multivalent effect. Another possibility that cannot be ruled out is that additional adhesion domains in the MhLap adhesin contribute to ligand recognition.

Conclusions
The PA14 domain found at the distal end of the RTX adhesin from M. hydrocarbonoclasticus is a confirmed sugar-binding domain that uses a coordinated calcium ion to preferentially bind hexose sugars like fucose, glucose, mannose and N-acetylglucosamine. The MhPA14's natural affinity for dextran-based size-exclusion resin allowed for the design and development of a competitive binding assay, where MhPA14 was competed off the resin through titration of free sugars. This methodology is a quick and cost-effective way to assay many sugars for relative binding affinity and could be a useful tool for studying the many PA14 domains that reside in RTX adhesins, or lectins in general.