Crystal Structures Reveal the Multi-Ligand Binding Mechanism of Staphylococcus aureus ClfB

Staphylococcus aureus (S. aureus) pathogenesis is a complex process involving a diverse array of extracellular and cell wall components. ClfB, an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules) family surface protein, described as a fibrinogen-binding clumping factor, is a key determinant of S. aureus nasal colonization, but the molecular basis for ClfB-ligand recognition remains unknown. In this study, we solved the crystal structures of apo-ClfB and its complexes with fibrinogen α (Fg α) and cytokeratin 10 (CK10) peptides. Structural comparison revealed a conserved glycine-serine-rich (GSR) ClfB binding motif (GSSGXGXXG) within the ligands, which was also found in other human proteins such as Engrailed protein, TCF20 and Dermokine proteins. Interaction between Dermokine and ClfB was confirmed by subsequent binding assays. The crystal structure of ClfB complexed with a 15-residue peptide derived from Dermokine revealed the same peptide binding mode of ClfB as identified in the crystal structures of ClfB-Fg α and ClfB-CK10. The results presented here highlight the multi-ligand binding property of ClfB, which distinguishes it from the other MSCRAMMs characterized to date. The binding of multiple peptides carrying the GSR motif in the same pocket of ClfB is reminiscent of MHC molecules. Our results provide a template for the identification of other molecules targeted by S. aureus during its colonization and infection. We propose that other MSCRAMMs like ClfA and SdrG also possess multi-ligand binding properties.


Introduction
Staphylococcus aureus (S. aureus), an important opportunistic pathogen, is a major threat to humans and animals, causing high morbidity and mortality worldwide. It is responsible for a variety of infections ranging from mild superficial infections to severe infections such as infective endocarditis, septic arthritis, osteomyelitis and sepsis [1]. Such infections are of growing concern because of the increasing antibiotic resistance of S. aureus [2,3]. Multiple sites within the body can be colonized, including the perineum and the axilla, but the most frequent site of carriage is the moist squamous epithelium of the anterior nares. Moreover, the organism can be disseminated from a superficial site via the bloodstream to internal organs where it can set up a metastatic focus of infection. Approximately 80% of invasive S. aureus infections are autologous in that they are caused by strains carried in the patient's nose prior to illness [4,5].
The ability of S. aureus to cause disease has been generally attributed to two classes of virulence determinants: cell wall-associated proteins and extracellular protein toxins. The initial step in pathogenesis is often cell adhesion, mediated by surface adhesins called MSCRAMMs (Microbial Surface Components Recognizing Adhesive Matrix Molecules) [6,7]. To date, S. aureus is known to express more than 20 different potential MSCRAMMs [8,9].
SD-repeat-containing (Sdr) proteins are members of the MSCRAMM family, including clumping factor A (ClfA), ClfB, SdrC, SdrD and SdrE of S. aureus and SdrF and SdrG of S. epidermidis. The Sdr proteins are characterized by the presence of an R region composed largely of repeated SD dipeptides [10]. They exhibit a comparable structural organization including an N-terminal secretory signal sequence followed by a ligand-binding A region and a dipeptide repeat region (R) composed mainly of aspartate and serine residues. The LPXTG cell wall-anchoring motif (W) immediately follows the SD-repeat region and is followed by a hydrophobic membrane-spanning domain (M) and a short positively charged cytoplasmic tail (C). Despite their conserved structural organization, the Sdr proteins are not closely related in sequence, with only 20 to 30% identical amino acid residues in the ligand-binding A domain. This suggests that different Sdr proteins might play different roles in S. aureus pathogenesis [11].
ClfB has been one of the best-characterized surface proteins of S. aureus over the past decade [12-18]. Its multi-functional character is unusual among these adhesins, in contrast to ClfA and SdrG, which have been shown to bind only to fibrinogen [19-21]. ClfB plays a key role in establishing human nasal colonization by binding to the human type I cytokeratin 10 (CK10) expressed on squamous epithelial cells [17,18,22,23]. Consistently, recent studies have shown that the immunization of mice with ClfB reduces nasal colonization [24]. As a bifunctional MSCRAMM, ClfB also binds to fibrinogen α (Fg α), which is assumed to be significant in platelet activation and aggregation and has been shown to contribute to the pathogenesis of experimental endocarditis in rats [7,17,25,26]. Unlike ClfA, FnBPA and FnBPB, which bind to the γ chain of fibrinogen, ClfB binds to repeat 5 (NSGSSGTGSTGNQ) of the flexible region of its α chain [15,16,27]. The repeat may form a loop, similar to the Tyr-(Gly/Ser)n Ω loops present in the C-terminus of CK10, to which ClfB also binds [18]. Fg α and CK10 harbor the same or overlapping binding sites on ClfB [18], but the detailed mechanism of ClfB recognition of Fg α and CK10 is unclear.
Structural studies suggest that ClfA and SdrG have different ligand binding characteristics and mechanisms [21,28], although the structural organizations of the adhesion domains of these two MSCRAMMs are very similar. A "dock, lock, and latch" (DLL) model was proposed for SdrG-ligand recognition, where SdrG adopts an open conformation that allows the Fg ligand to access a binding trench between the N2 and N3 domains [21]. In ClfA, however, the cavity is preformed in a stabilized closed configuration, into which the C-terminus of the γ chain of fibrinogen threads. Therefore, the ClfA-Fg binding mechanism was proposed to be "Latch and Dock" [28].
Here we solved the crystal structures of the apo-ClfB adhesive domain and its complexes with peptides derived from Fg α and CK10. Our structures showed that ClfB recognizes its ligands in a manner consistent with the DLL model. A previous study on the structures of ClfB complexed with Fg α and CK10 peptides suggested that the conserved peptide-derived motif (GSSGXG) is required for their binding to ClfB [29]. The data presented in the present study, however, support a minimal nine-amino-acid Gly-Ser-Rich (GSR) motif that is necessary and sufficient for binding to ClfB. Human genome mining using the motif as a template identified several candidates including Engrailed protein, TCF20 and Dermokine as potential ClfB-binding proteins. Interaction of Dermokine with ClfB was confirmed by biochemical and structural studies, which demonstrate that nearly identical mechanisms are utilized by ClfB to recognize its binding partners. Our data not only provide insights into the ligand binding mechanism of ClfB but also raise the possibility that ClfB targets multiple substrates during S. aureus infections. These results would be valuable for the development of new therapeutic strategies.

Results
Structure of apo-ClfB
Previous studies indicated that a segment of ClfB containing the N2 and N3 regions (Figure 1A) is sufficient for recognition of Fg α and CK10 [18,27,29]. We therefore cloned the segment encoding the two regions (amino acids 197 to 542) of the ClfB protein from S. aureus and purified the protein from E. coli for our structural studies. The structure of the ClfB(208-540)-Fg α(316-328) complex was solved using Se-Met-derivatized protein and was used as a starting model for determination of the other structures by the molecular replacement method (Table 1).
The apo-ClfB structure was solved at 2.5 Å resolution and comprises residues Ser197-Ala534 (Figure 1B). No electron density was observed for the C-terminal eight residues in the apo-ClfB structure. The polypeptide chain of apo-ClfB is composed of two distinct domains, N2 and N3, as previously described for other MSCRAMMs in S. aureus (Figure 1A) [30]. The N-terminal N2 domain contains 146 residues (amino acids 213-358) and the N3 domain 170 residues (amino acids 359-528). In the crystal structure, both N2 and N3 have two layers of β-sheets that pack tightly against each other (Figure 1B). In contrast, packing between the two domains is much looser, resulting in the formation of a large groove between them where ligands presumably bind. In the N3 domain, strands A, B, E, and D form one of the two principal sheets, while strands D′, D″, C, F, and G on the opposite face form the other. Similar to the structures of other Fg-binding MSCRAMMs [21,28,29,31], the structures of N2 and N3 display a typical DEv-IgG fold, characterized by the additional strands D′ and D″ as compared to the C-type IgG fold [30]. The structures of the N2 and N3 domains can be well superposed, with an rms deviation of 0.98 Å for all Cα atoms. One structural difference between them, however, is the three-stranded β-sheet (A, B and E) on one side of N2 in comparison with a four-stranded β-sheet (D, F, C and G″) on the corresponding side in N3, as described in the structures of ClfA, SdrG and ClfB [21,28,29].
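The rms deviation quoted above is the root-mean-square distance between paired Cα positions after the two domains have been superposed. The following is a minimal sketch of that final calculation only; it assumes the coordinates are already superposed and paired atom by atom, and the toy coordinates and the function name `rmsd` are illustrative, not taken from the deposited structures or from the software used in this study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two coordinate sets are already superposed and paired atom by atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates in Å (not real ClfB Cα positions):
# the second set is the first shifted by 1 Å along y.
domain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
domain_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(round(rmsd(domain_a, domain_b), 2))
```

In practice the superposition step (for example a least-squares fit) precedes this calculation and is handled by structure-analysis software.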
ClfB Ser197 and ClfB Leu198, or even a short N-terminally extended segment such as the unrelated His-tag, were shown to be necessary to maintain the Fg binding activity of ClfB [16], though the mechanism by which the N-terminal segment of N2 participates in substrate binding is unclear. In the crystal structure of apo-ClfB, the N-terminus (Ser197-Ala201) of one ClfB molecule binds to the N3 domain of a symmetry-related ClfB molecule, forming a β-sheet together with strand G (Figure 1C and Figure S1) mediated by two pairs of main-chain hydrogen bonds. Additional hydrogen bonds involving ClfB Gln235 and ClfB Val200 further contribute to the N-terminus-mediated interaction between ClfB molecules (Figure 1D). These interactions may act together to stabilize the G strand of the N3 domain, thus maintaining its DEv-IgG fold and mimicking the transition state of ligand binding.
(All structural figures in this paper were generated with PyMOL [32]).

Author Summary
Staphylococcus aureus (S. aureus), an important opportunistic pathogen, is a major threat to humans and animals, causing high morbidity and mortality worldwide. It is responsible for a variety of infections ranging from mild superficial infections to severe infections such as infective endocarditis, septic arthritis, osteomyelitis and sepsis. Such infections are of growing concern due to the increasing antibiotic resistance of S. aureus. In order to understand the mechanism of S. aureus pathogenesis, we studied a bacterial surface protein, clumping factor B (ClfB), bound to peptides from fibrinogen α (Fg α) and cytokeratin 10 (CK10). From analyses of the high-resolution crystal structures we found that the ClfB-binding peptides harbor a stretch with the consensus sequence GSSGXGXXG, which is also conserved in Engrailed protein, TCF20 and Dermokines. The interaction between ClfB and a Dermokine-derived peptide was demonstrated using binding assays. Consistent with a role of ClfB in the inflammatory responses induced by S. aureus, expression of Dermokines is predominant in epithelial tissues and upregulated in inflammatory diseases. The data presented in this study raise the possibility that multiple human proteins are targeted by ClfB during S. aureus infection. The multi-ligand binding feature of ClfB would be valuable for developing new therapeutic strategies.
Figure 1. A. Domain organization of ClfB. The numbers of the amino acid residues identifying the boundaries between adjacent domains are indicated below. S, signal sequence; N1-3, N-terminal fibrinogen binding region; R, serine-aspartate repeat region; W, wall-spanning domain; M, membrane anchor; C, cytoplasmic positively charged tail. The N2 and N3 domains were used in crystallization of the ClfB(197-542)-peptide complexes. B. Ribbon representation of the structure of apo-ClfB, with its N- and C-termini indicated. The N2 and N3 domains are shown in orange and magenta, respectively. The strands and loops are marked. C. Ribbon representation of the two symmetry-related molecules in the unit cell, shown in orange and magenta, respectively. The N- and C-termini of both molecules are indicated. D. Closer view of the interaction between the two symmetry-related molecules. The N-terminus of one molecule (amino acids 196-201) is shown as sticks and the other one is colored in magenta as in (B). The amino acids from both molecules are marked in red and black characters, respectively. The hydrogen bonds are shown as red dashed lines. doi:10.1371/journal.ppat.1002751.g001
Structural comparison of apo-ClfB and the two complexes shows that the RMSDs of the Cα atoms in ClfB are 0.46 Å and 0.49 Å, respectively, indicating that the overall structure of ClfB remains unchanged upon binding of the ligands (Figure 2A). Marked conformational changes, however, occur at the C-terminus of ClfB (499-512) in both complexes. In ClfB-Fg α(316-328), the residues ClfB Arg529-Ser542, which are disordered in the structure of apo-ClfB, become well defined following Fg α(316-328) binding. The distal C-terminus of ClfB(197-542) forms a short β-strand G′, which forms a parallel β-sheet with β-strand E from the N2 domain. The formation of the β-sheet is mediated by several main-chain and side-chain hydrogen bonds (Figure 2B).
The ligand-induced stabilization of the C-terminal peptide of ClfB allows it to run across the top of Fg α(316-328). This binding mode is consistent with the DLL model as demonstrated in the SdrG-Fg β complex [21,28]. In contrast with Fg α(316-328), the peptide CK10(499-512) did not induce formation of the β-strand G′ in ClfB (Figure 2A). Nonetheless, the C-terminal portion of strand G that interacts with Fg α(316-328) also becomes well defined and caps the CK10(499-512) peptide.
While we were preparing this manuscript, the structures of apo- and ligand-bound ClfB were reported by Ganesh et al. [29]. Interestingly, the structural features we observed here are noticeably distinct from those of the Fg α/CK10-ClfB complexes solved by them [29]. In both of their structures, particularly in the Fg α-ClfB complex, although the peptide adopts a conformation similar to that in our structure, the C-terminus of the G strand exhibits a different orientation and is not inserted into the N2 domain to form an extra strand G′ with strand E, and thus the peptide is not locked in the groove between N2 and N3 (Figure 2C). In this way, their structures do not support the DLL model proposed based on the SdrG protein structure [21]. In addition, on peptide binding no rearrangement occurs in the loop between D and D′ in N2 (Figure 2C). Although the C-terminus of ClfB in the CK10-ClfB complex has a conformation similar to that in our structure, the D-D′ loop in the N2 domain shows no rearrangement either (Figure 2D). The differences between the conformations observed in our work and that of Ganesh et al. could be attributed to the methodologies adopted in crystallization. While we co-purified ClfB with the peptides to form a complex prior to crystallization, Ganesh et al. reported that they soaked the peptides into the apo-ClfB crystals [29]. In their structures, the conformational changes observed in our study to accommodate the peptide and then to lock it in place could have been hindered by crystal packing. In all, our structures strongly support the DLL model for ClfB-ligand binding. Briefly, "Dock" of the peptide triggers the rearrangement of the C-terminus of the N3 domain, allowing ClfB Arg529 to form a hydrogen bond with ClfB Asn238 from the N2 domain.
This would result in "Lock" of the peptide into the substrate binding groove, whereas the strong interaction between G′ and the E strand of N2 can "Latch" the peptide (Figure 2B).

Structural comparison of ClfB with ClfA and SdrG
In spite of the low identities in their amino acid sequences, the structures of ClfB, ClfA and SdrG exhibit high similarities (Figure 3). The most conserved residues are mainly located in their loop regions (Figure 3B). Although the adherence domain organizations of ClfB, ClfA and SdrG and their ligand binding sites are conserved, the ligand binding specificities of the three MSCRAMMs vary (Figure 3D) [18,21,28]. All the bound peptides form a β-strand paired with the G strand and pass through the tunnel formed by N2, N3 and the end of the G strand (Figure S3). In the ClfB-Fg α(316-328)/CK10(499-512) structures, one peptide is bound to one ClfB, in the same orientation as the Fg γ-chain peptide in ClfA and in the reverse orientation compared to the Fg β-chain peptide in SdrG (Figure 3D) [21,28].
In both the ClfA-Fg γ and SdrG-Fg β structures, the C-terminus of the N3 domain forms a β-strand G′ (Figure 3D). ClfA Tyr338, which is conserved in the structures of SdrE and SdrD (data not shown), forms a hydrogen bond with the amino acid at the end of the G strand (Asn530 in ClfA), thus stabilizing the conformation of the G′ strand (Figure S4). In ClfB, the amino acid at the corresponding position is substituted with phenylalanine (ClfB Phe328) (Figure 3A). Comparison of the apo and ligand-bound structures of ClfB indicates that the interactions between the ligands and the G strand of N3 play a vital role in the redirection of the C-terminus of N3. ClfB Arg529, the last residue in the C-terminus of the G strand in ClfB, interacts with the ligand peptides in both complex structures. ClfB Asn238 and ClfB Arg529 form a stable hydrogen bond to lock the peptides into the GG′-covered tunnel. Interestingly, although the G′ strand appears disordered in the ClfB-CK10 structure, the ClfB Asn238-Arg529 hydrogen bond also exists (Figure 2B), consistent with the DLL model. Taken together, our structures strongly support the DLL model for ClfB-ligand binding.

Peptide recognition by ClfB
In the crystal structures of the ClfB-Fg α(316-328)/CK10(499-512) complexes, both peptides lie in a tunnel between N2 and N3. The peptides are covered by the C-terminal end of β-strand G (Figure S2). The C-termini of the two peptides have nearly identical conformations, with a turn formed at Fg α Gly326 and CK10 Gly510 (Figure S5). In contrast, the N-termini of the peptides are notably different. A sharp twist at Fg α Gly318 allows the N-terminal portion of the peptide to exit the tunnel and point upward. Unlike Fg α(316-328), CK10(499-512) adopts a more extended conformation.
Numerous contacts with distances of less than 4 Å between the protein and the peptides are observed (Figure 4 and Figure S6). The interactions between ClfB and the peptides are primarily mediated through a number of hydrogen bonds. Conserved hydrogen bonds are observed between ClfB and the middle region of the two peptides (Figure 4). Hydrophobic contacts of the middle region of both peptides with the G strand of ClfB(208-542), the loop between β-strands A and B, and the loop between β-strands C and D of the N2 domain also contribute to peptide-protein interactions.
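A distance-cutoff contact list of the kind summarized above can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the function name `contacts_within`, the atom labels and all coordinates are hypothetical placeholders, and a real analysis would read atoms from the refined coordinate files.

```python
import math

def contacts_within(protein_atoms, peptide_atoms, cutoff=4.0):
    """Return (protein_label, peptide_label, distance) for atom pairs closer than cutoff (Å)."""
    pairs = []
    for p_label, p_xyz in protein_atoms:
        for l_label, l_xyz in peptide_atoms:
            d = math.dist(p_xyz, l_xyz)  # Euclidean distance (Python 3.8+)
            if d < cutoff:
                pairs.append((p_label, l_label, round(d, 2)))
    return pairs

# Hypothetical labels and coordinates in Å, chosen only to exercise the function.
protein = [("protein_atom_1", (0.0, 0.0, 0.0)), ("protein_atom_2", (10.0, 0.0, 0.0))]
peptide = [("peptide_atom_1", (2.9, 0.0, 0.0)), ("peptide_atom_2", (10.0, 5.0, 0.0))]
print(contacts_within(protein, peptide))
```

Only the first pair falls under the 4 Å cutoff in this toy input; the other three pairs are 5 Å or more apart.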
In the ClfB(208-531)-CK10(499-512) complex structure, four pairs of hydrogen bonds were observed between the main chains of the peptide and the G strand of the N3 domain, resulting in the formation of a parallel β-sheet. Polar groups in the side chains of ClfB Trp522 and Asn524 in the N3 domain form two hydrogen bonds with the hydroxyl group of CK10 Ser503. The hydroxyl group of CK10 Ser503 forms a third hydrogen bond with ClfB Ser376 of N3, and the side-chain hydroxyl group of CK10 Ser504 forms another hydrogen bond with ClfB Ser236 of N2.
Residues from the middle region of CK10 interact with ClfB Ser236, Asp270 and Asn526 via main chain-main chain hydrogen bonds (Figure 4A and Figure S6A). Hydrogen bonds were also observed between the amino groups of CK10 Ser504 and Gly506 and the side-chain hydroxyl or carbonyl groups of ClfB Asn234 and Asp270 in the loop region of N2. The carbonyl groups of the C-terminal residues CK10 Ser508 and Ser509 interact with the side-chain hydroxyl group of ClfB Tyr273 in the CD loop of N2 (Figure 4A and Figures S6A,B). The aromatic ring of ClfB Trp522 of the G strand of N3 plays an important role in anchoring the N-terminus of the CK10 peptide through hydrophobic interactions with CK10 Gly501 and CK10 Gly502. The C-terminal segment of the peptide lies in the hydrophobic trench formed by residues of the loop region of N2 and is covered by the G strand of N3 (Figure 4A and Figures S6A,B).
Foster's study demonstrated that substitution of CK10 Gly507 with the bulky residue tyrosine resulted in loss of interaction of CK10 with ClfB [33]. Structural analysis showed that the space surrounding CK10 Gly507 is significantly circumscribed by its neighboring residues ClfB Val528, Gly269, Val271 and Phe328. Modeling studies (data not shown) indicated that any residue with a side chain would generate steric hindrance and could not be accommodated in the pocket defined by these four ClfB residues (Figure 4B and Figures S6C,D).

Mechanism of specific recognition of repeat 5 of Fg α (Fg α5) by ClfB
The Fg α C-terminal domain (amino acids 221-610) of human Fg contains ten 13-residue tandem repeats, within which up to eight residues are glycines or serines [34]. Despite the similar sequences among the repeats, only Fg α5 was shown to be recognized by ClfB [18]. The reason for this was proposed to be the presence of proline or arginine residues in the center of the putative Ω loops in the other repeats, though the precise underlying mechanism remains unknown [27]. The crystal structures presented here offer an explanation for this observation. Structural comparison of the two complexes revealed that interactions of the peptides with ClfB are primarily mediated through a conserved motif in the peptides: G-S-S-G-S/T-G-S-X-G (Figure 5A). Sequence alignment of the repeats indicates that Fg α5 differs from the other repeats at the 5th, 7th and 9th positions (Figure 5B). The hydroxyl group of S/T at the 5th position is involved in hydrogen bonding interactions. At the same time, the size of the residue at this position is limited by its neighboring residues. Thus, residues other than S/T at this position are expected to compromise the interactions between the repeat and ClfB, either through loss of the hydrogen bonding interaction or through steric hindrance. The 7th position appears to play a role in maintaining the local conformation of the peptide by forming a γ-turn with the 9th position. In the structure of the CK10-ClfB complex solved by Ganesh et al., the 7th position is occupied by a histidine residue, suggesting that the residue at this position can vary (Figure S6). The G9 residue points toward the end of β-strand D, where ClfB Met280 and ClfB Pro281 in N2 exclude any residue with a side chain, which would clash with them. In addition, a turn at G9 is required to let the peptide exit the tunnel, explaining why repeat 2, with an alanine at this position, cannot bind to ClfB (Figure 5B) [18].

Importance of the GSR motif in recognition by ClfB
After carefully analyzing the sequences and the peptide binding specificities of ClfB, we propose that a small motif, G1-S2-S3-G4-(G/S/T)5-G6-X7-X8-G9, is responsible for ligand binding to the ClfB adhesive domains. Taking the Fg α(316-328)-ClfB complex as an example, within this motif G1 is sterically constrained by the side chain of ClfB W522 and is also required for the Fg α peptide to make a turn and exit the tunnel. S2 is the most critical residue because it not only forms two hydrogen bonds with the side chains of ClfB W522 and ClfB Q377 but also binds to the main chain of ClfB S376. Similar to S2, S3 forms two hydrogen bonds, with the side chains of ClfB Q235 and ClfB S236 in the N2 domain; it could be replaced by a smaller residue such as alanine. The following residues, especially G4, (G/S/T)5 and G6, are necessary for the formation of the stable protein-peptide complex because they form hydrogen bonds with ClfB and the tunnel covered by β-strand G does not accommodate residues with larger side chains. S7 might play a role in maintaining the local conformation of the peptide by forming a γ-turn with G9. The space around S8 appears to be sufficient for residues with larger side chains (Figure 5B and Figure S6). Finally, G9 needs to form a turn to allow the peptide to exit the tunnel. Thus, the somewhat flexible binding trench of ClfB would be able to bind a series of peptides with this feature.
To further confirm our hypothesis regarding the importance of the nine-amino-acid GSR motif, we performed an alanine scan using the SPR (Surface Plasmon Resonance) system with a synthetic 9-residue peptide derived from the GSR motif (GSSGSGSNG). The results are highly consistent with our structural observations and clearly show that the nine-amino-acid peptide is necessary and sufficient for binding to ClfB in vitro (Figure 6).
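For illustration, the set of single-alanine variants implied by an alanine scan of the 9-residue peptide can be enumerated as below. This is only a sketch of how such a peptide panel might be listed; the function name `alanine_scan` is hypothetical, and the exact peptides synthesized for the SPR experiments are not specified beyond the parent sequence.

```python
def alanine_scan(peptide):
    """Yield (position, original_residue, variant) for each single alanine substitution."""
    for i, residue in enumerate(peptide):
        if residue == "A":
            continue  # already alanine; nothing to substitute
        yield i + 1, residue, peptide[:i] + "A" + peptide[i + 1:]

# Parent peptide from the text; positions are numbered 1-9 within the motif.
for pos, res, variant in alanine_scan("GSSGSGSNG"):
    print(f"{res}{pos}A  {variant}")
```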

Dermokine is a potential ligand of ClfB
Our results suggest that proteins carrying the GSR motif are able to bind to ClfB. To find other potential ligands of ClfB, we searched the NCBI protein database for additional proteins containing the sequence G1-S2-S3-G4-(G/S/T)5-G6-X7-X8-G9. Three proteins, TCF20, Engrailed protein and Dermokine (Derm), were identified as hits, of which Dermokine was evaluated in more detail in this study (Figures 5 and 6). Dermokine is expressed in many epithelial tissues, localized to intracellular or pericellular spaces and overexpressed in inflammatory diseases. The two major isoforms, α and β, are transcribed from different promoters at the same locus. Recently, additional transcript variants γ, δ and ε have been identified [35,36].
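The motif search described above can be approximated on any protein sequence with a regular expression. The sketch below is an assumption about how such a scan could be written, not the database query actually used; the names `GSR_MOTIF` and `find_gsr_motifs` are hypothetical, X is treated as any residue, and reported positions are relative to the input string rather than to full-length protein numbering.

```python
import re

# G1-S2-S3-G4-(G/S/T)5-G6-X7-X8-G9, with X as any residue.
# The lookahead lets overlapping occurrences be reported as well.
GSR_MOTIF = re.compile(r"(?=(GSSG[GST]G..G))")

def find_gsr_motifs(sequence):
    """Return (1-based start, matched 9-mer) for every GSR motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in GSR_MOTIF.finditer(sequence.upper())]

print(find_gsr_motifs("NSGSSGTGSTGNQ"))    # Fg α repeat 5, as quoted in the Introduction
print(find_gsr_motifs("GQSGSSGSGSNGDNN"))  # Dermokine-derived 15-mer (Derm15)
```

Both peptides quoted in the text contain a single match, while a sequence with a bulky residue at position 5 (for example a proline) returns an empty list.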
First, Derm was tested for its interaction with ClfB. To this end, we synthesized a 15-amino-acid peptide (250-264; GQSGSSGSGSNGDNN, designated as Derm15 hereafter) derived from Derm and then characterized its binding to ClfB using the SPR (Surface Plasmon Resonance) assay. In the assay, ClfB bound to the peptide with a dissociation constant of 2.37 μM (Figure 7A). Interestingly, the results also showed that the Derm peptide interacted with ClfB with slow kinetics, further supporting the DLL model (Figure 7A). To understand the molecular mechanism underlying this interaction, we solved the crystal structure of ClfB(208-542) bound to the peptide at 2.5 Å resolution. As expected, Derm15 interacts with ClfB in a manner nearly identical to that of Fg α(316-328) and CK10(499-512) (Figure 7B). ClfB Arg529 forms a hydrogen bond with ClfB Asn238 and the C-terminus of N3 forms an extra strand, similar to that in the ClfB-Fg α(316-328) and ClfB-CK10(499-512) complexes (Figure 7C). Mutagenesis studies were conducted to further verify the binding of Derm15 to ClfB. We individually replaced the residues ClfB S236 and W522, which participate in interactions with the peptide, with alanine. The mutant proteins were purified to homogeneity and tested for their interaction with the Derm peptide using SPR. While the wild-type ClfB bound tightly to Derm15, the mutant proteins ClfB(197-542) S236A and W522A exhibited much lower binding affinities for the peptide, in the mM range (Figure S8). Interestingly, besides the low binding affinities, both mutant proteins exhibited rapid association and dissociation in the experiments, as compared to the slow association and scarcely any dissociation observed for the wild-type protein. These results indicate that the residues ClfB S236 and W522 are not only involved in binding the peptide, but also participate in stabilizing or "locking" the peptide in place.
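As a reminder of how a dissociation constant relates to the kinetic behavior seen in SPR, the sketch below assumes the simplest 1:1 binding model, in which K_D is the ratio of the dissociation and association rate constants. The model choice, the function names and the rate constants are illustrative assumptions; the fitting model used for the sensorgrams is not stated in the text.

```python
def dissociation_constant(k_on, k_off):
    """K_D (M) from association rate k_on (1/(M*s)) and dissociation rate k_off (1/s)."""
    return k_off / k_on

def fraction_bound(concentration, k_d):
    """Equilibrium fractional occupancy for simple 1:1 binding (same units for both)."""
    return concentration / (k_d + concentration)

# Hypothetical rate constants chosen only for illustration (not measured values).
k_d = dissociation_constant(k_on=1.0e3, k_off=2.0e-3)
print(k_d)                       # K_D in M
print(fraction_bound(k_d, k_d))  # half of the sites are occupied when [ligand] = K_D
```

Under this model, a tight binder with slow dissociation and a weak binder with fast exchange can be distinguished by their rate constants even before affinities are compared, which is the contrast drawn above between wild-type and mutant ClfB.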
Collectively, our results strongly support the interaction between ClfB and Derm in vitro and suggest that Derm may be involved in the infection process and pathogenesis caused by S. aureus in vivo.

Discussion
The colonization of the host nares by the Gram-positive bacterium S. aureus is mediated by a family of cell surface proteins that promote its adhesion to the extracellular matrix, namely the MSCRAMMs [13,18]. ClfB, a member of this protein family, has been studied for the past decade and is unique in its multi-functional characteristics, as compared to ClfA and SdrG, which bind only to fibrinogen [18,19,21,29].
Consistent with the studies of the SdrG-fibrinogen complex [21], data from this study support the DLL binding mechanism of ClfB with the Fg α/CK10-derived peptides, but not the mechanism suggested in the previous study by Ganesh et al. [29]. In their work, because the "Latch" step was not observed in the crystal structure, the binding mechanism was ascribed to a "DL" model. However, the structures of the ClfB-peptide complexes solved in this study, together with the SPR data, indicate that the DLL model is the mechanism utilized by ClfB to bind to its ligands. Our results also indicate that the DLL model may be the principal mechanism of MSCRAMM-ligand complexes.
In their study of the ClfB complexes, Ganesh et al. proposed a common GSSGXG motif constituting the ClfB binding site [29], which is inconsistent with previous studies on ClfB. For example, within the ten tandem Fg α repeats, repeats 2, 3, 4 and 5 all contain the GSSGXG motif but only repeat 5 can bind to ClfB (Figure 5B) [18]. Our structural and alanine-scanning analyses demonstrate that a 9-residue peptide, G1-S2-S3-G4-(G/S/T)5-G6-X7-X8-G9, is necessary and sufficient for binding to ClfB in vitro. It is therefore predicted that a protein incorporating such a motif is able to interact with ClfB. Indeed, our biochemical assays showed that a Dermokine-derived peptide containing the ClfB binding motif interacted with ClfB (Figures 7B, 7C). Further supporting this prediction, our structural studies revealed that the binding mode of the Dermokine-derived peptide to ClfB is nearly identical to that of the Fg α/CK10-derived peptides (Figure 7B). Collectively, these findings raise the provocative possibility that ClfB might act on multiple targets during S. aureus infections. Given that ClfB acts as a key determinant of S. aureus nasal colonization, this may not be totally surprising.
Interestingly, Dermokine was first identified as a gene expressed in the suprabasal layers of the epidermis, and more recently, other isoforms of this gene besides its α and β isoforms have also been found. This gene is expressed in various cells and epithelial tissues and over-expressed in inflammatory conditions [35,36], suggesting that Dermokine might play a role in inflammatory processes, since over-expression of the mediators of immune cell activation characterizes many inflammatory diseases. ClfB is involved not only in S. aureus colonization of the human nares but also in the diseases caused by this bacterium. Additionally, S. aureus has been implicated in several inflammatory processes, including corneal inflammation. The binding of ClfB to Dermokine raises the possibility that ClfB might play a role in S. aureus-induced inflammation and that over-expressed Dermokine might serve as a biological marker whose gene products could bind to ClfB and participate in this process. Clearly, more investigations are needed to verify the ClfB-Dermokine interaction during S. aureus infections as well as the biological significance of the interaction.
The characterization of ClfB as a multi-ligand binding protein will be meaningful for the identification of putative substrates and for furthering our understanding of the S. aureus infection pathway. Our findings also provide important leads towards the development of new therapeutic agents capable of eradicating S. aureus carriage in individuals and efficiently interfering with staphylococcal infection. This is particularly important since new antibacterial strategies are urgently needed to combat the drug-resistant bacteria that continue to emerge [37,38].

Cloning, expression and purification of the recombinant proteins
The fragment of the ClfB gene (corresponding to residues 197-542) was amplified by PCR from S. aureus Newman genomic DNA. After digestion with BamHI and HindIII (NEB), the amplified fragment was cloned into the prokaryotic expression vector pQE32 (GE Healthcare Life Sciences) to produce a His-tagged fusion protein, and the construct was confirmed by DNA sequencing. The expected protein was expressed in E. coli strain BL21 with a high yield. Recombinant His-tagged protein was purified by Ni-affinity column chromatography and ion exchange chromatography. For the purification of protein-peptide complexes, the synthesized peptides were added to the concentrated protein samples at a 10:1 ratio and further subjected to gel filtration chromatography (Superdex-75 column) in buffer (10 mM Tris-HCl pH 8.0, 150 mM NaCl, 2 mM DTT) on the FPLC system (GE Healthcare Life Sciences). The proteins from different stages of purification (i.e. affinity and gel filtration chromatography) were monitored by SDS-PAGE. The selenomethionine (Se-Met)-substituted ClfB derivative was expressed and purified similarly.

Crystallization and structure determination
Apo-ClfB and its complexes with different peptides were concentrated to 30 mg/ml in 10 mM Tris-HCl pH 8.0, 150 mM NaCl and 2 mM DTT. Crystals were produced by the hanging-drop vapor diffusion method [39] using sparse-matrix screen kits from Hampton Research (Crystal Screen reagent kits I and II), followed by refinement of the conditions through variation of precipitants, pH, protein concentrations and additives.
Crystals were grown at 18 °C by mixing 1.1 μl of protein with 1.1 μl of reservoir solution and equilibrating against 200 μl of reservoir solution. The apo-ClfB crystals were grown in 0.2 M Li2SO4, 0.1 M Tris-HCl pH 8.5 and 30% polyethylene glycol 4000, and all the complexes with peptides were grown in 0.1 M sodium citrate tribasic dihydrate pH 5.6, 20% 2-propanol and 20% polyethylene glycol 4000. Similar conditions were used to generate crystals of Se-Met-substituted ClfB. Native and Se-SAD data were collected at the Shanghai Synchrotron Radiation Facility (SSRF) at wavelengths of 0.919 Å and 0.979 Å, respectively, using a MAR225 (MAR Research, Hamburg) CCD detector at 100 K, and were processed with HKL2000 [40]. Further processing was carried out using programs from the CCP4 suite (Collaborative Computational Project, 1994).
The selenium sites were located using SHELXS [41] from the Bijvoet differences in the Se-SAD data. Heavy atom positions were refined and phases were calculated with the SAD experimental phasing module of PHASER [42]. Real-space constraints were applied to the electron density map in DM [43]. The resulting map was of sufficient quality for model building of the ClfB molecules in COOT [44]. The structures of the complexes with the other peptides were solved by molecular replacement in CCP4, and all the structures were refined with the PHENIX [45] package. Data collection and structure statistics are summarized in Table 1.

Synthetic peptides
The synthesis and purification of the peptides were described previously [18,27]. The peptides and their sequences are as follows: the peptide from repeat 5 of the C terminus of the α-chain of Fg (NSGSSGTGSTGNQ); a peptide from the tail region of CK10 (YGGGSSGGGSSGG); peptide 15 from the Dermokine protein (SQSGSSGSGSNGDNN); peptide 9, the GSR motif (GSSGSGSNG), and its alanine-scan mutants; and the six-amino-acid peptide (GSSGSG).

Surface Plasmon Resonance spectroscopy
Binding of ClfB to peptide 15 was assessed by SPR using the ProteOn XPR36 instrument (Bio-Rad Laboratories, Inc.). Each SPR experiment used multichannel detection. The system was equilibrated with buffer (10 mM HEPES pH 7.2, 150 mM NaCl). In each channel, peptide was captured on a ProteOn NLC Sensor Chip (Bio-Rad) at 25 °C, using a flow rate of 100 μl/min. This resulted in peptide coupled at response levels of 460 RU. For binding measurements, ClfB was injected simultaneously at different concentrations at a flow rate of 100 μl/min. The experiments were repeated three times.
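SPR sensorgrams of this kind are commonly interpreted with a 1:1 Langmuir interaction model, in which the equilibrium dissociation constant is KD = kd/ka. The following is a minimal sketch of that textbook model, not the fitting routine of the instrument software; all function names and the rate constants in the usage lines are hypothetical and chosen only for illustration.

```python
import math


def dissociation_constant(ka, kd):
    """Equilibrium dissociation constant KD (M) from ka (1/(M*s)) and kd (1/s)."""
    return kd / ka


def equilibrium_response(conc, kd_eq, rmax):
    """Steady-state response (RU) at analyte concentration conc (M)."""
    return rmax * conc / (kd_eq + conc)


def association_response(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) after injection starts, from zero baseline."""
    kobs = ka * conc + kd              # observed rate constant (1/s)
    req = rmax * ka * conc / kobs      # plateau response at this concentration
    return req * (1.0 - math.exp(-kobs * t))


def dissociation_response(t, r0, kd):
    """Response (RU) at time t (s) after injection ends, starting from r0."""
    return r0 * math.exp(-kd * t)


# Hypothetical rate constants, for illustration only.
ka, kd, rmax = 1.0e4, 1.0e-2, 100.0
kd_eq = dissociation_constant(ka, kd)
print(kd_eq, equilibrium_response(kd_eq, kd_eq, rmax))
```

At an analyte concentration equal to KD the model gives half of the maximal response, which is a quick sanity check on any fitted parameter set.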
The binding affinities between ClfB and the ten 9-amino-acid peptides and the 6-amino-acid peptide were determined by surface plasmon resonance (SPR) using a BIAcore T200 instrument (GE Healthcare) at 10 °C. The ClfB protein was immobilized to about 5300 response units (RU) on a research-grade CM5 sensor chip in 10 mM sodium acetate, pH 5.0, by the standard amine coupling method. Flow cell 1 was left blank as a reference. To collect data for affinity analyses, the 11 peptides, in a buffer of 10 mM HEPES pH 7.4 and 150 mM NaCl plus 0.005% (v/v) Tween 20, were injected over the flow cells at various concentrations at a flow rate of 30 μl/min. The ligands were allowed to associate for 60 s and dissociate for 120 s. Data were analyzed with the BIAcore T200 evaluation software by fitting to a 1:1 Langmuir binding model.
Figure S1 The two symmetry-related molecules in the unit cell. A. Ribbon representation of the two symmetry-related molecules in the unit cell. The two molecules are shown in orange and cyan, respectively. B. Electron densities showing the interaction between the N terminus of one molecule and the G strand of the other molecule in the unit cell. S197 and L198 of the N terminus, and the F and G strands of the other molecule, are marked. (TIF)
Figure S2 The electron density of Fg α and CK10 peptides. A. Ribbon representation of the ClfB (208-542) -Fg α (316-328) complex. The peptide is shown in sticks and the 2Fo-Fc map around the peptide contoured at 1.5σ is also shown. The color scheme is the same as in Figure 1B. The N and C termini of both the protein and the peptide are designated. B. Ribbon representation of the ClfB (208-531) -CK10 (499-512) complex. The peptide is shown in sticks and the 2Fo-Fc map around the peptide contoured at 1.5σ is also shown. The color scheme is the same as in Figure 1B.
Figure S5 Closer view of the ligand binding tunnel of ClfB. A. The N termini of the peptides. ClfB is represented as an electrostatic surface model with negative and positive charges indicated by red and blue, respectively. The Fg α peptide was superposed onto the CK10 peptide and they are shown as sticks in blue and yellow, respectively. B. The C termini of the peptides. The color scheme is the same as in Figure S5A.
Figure S7 The electron density of Derm15 peptide. Ribbon representation of the structure of ClfB (208-542) bound to the Derm15 peptide from Dermokine. The peptide is shown in sticks and the 2Fo-Fc map around the peptide contoured at 1.5σ is also shown. The color schemes of both the protein and the peptide are the same as in Figures S2 A and B. (TIF)
Figure S8 The ClfB S236A and ClfB W522A single mutants cannot bind Derm15 peptide. A and B. Surface plasmon resonance (SPR) shows the binding of different concentrations of synthetic Derm15 peptide to the ClfB (197-542) S236A or W522A single mutants immobilized on a GLH Sensor Chip. Red, 4 μM; green, 2 μM; blue, 1 μM; pink, 0.5 μM; orange, 0.25 μM. C. Kinetic and affinity binding values of the ClfB (197-542) mutants S236A or W522A with Derm15 peptide. (TIF)