Crystal Structure of the Lamprey Variable Lymphocyte Receptor C Reveals an Unusual Feature in Its N-Terminal Capping Module

Jawless vertebrates represented by lampreys and hagfish use variable lymphocyte receptors (VLRs) as antigen receptors to mount adaptive immune responses. VLRs generate diversity that is comparable to immunoglobulins and T-cell receptors by a gene conversion-like mechanism, which is mediated by cytosine deaminases. Currently, three types of VLRs, VLRA, VLRB, and VLRC, have been identified in lampreys. Crystal structures of VLRA and VLRB in complex with antigens have been reported recently, but no structural information is available for VLRC. Here, we present the first crystal structure of VLRC from the Japanese lamprey (Lethenteron japonicum). Similar to VLRA and VLRB, VLRC forms a typical horseshoe-like solenoid structure with a variable concave surface. Strikingly, its N-terminal cap has a long loop with limited sequence variability that protrudes toward the concave surface, which is the putative antigen-binding surface. Furthermore, as predicted previously, its C-terminal cap lacks a highly variable protruding loop that plays an important role in antigen recognition by lamprey VLRA and VLRB. Recent work suggests that VLRC+ lymphocytes in jawless vertebrates might be akin to γδ T cells in jawed vertebrates. Structural features of lamprey VLRC described here suggest that it may recognize antigens in a unique manner.


Introduction
All classes of jawed vertebrates (gnathostomes), ranging from mammals to cartilaginous fish, employ immunoglobulins (Igs) and T-cell receptors (TCRs) as antigen receptors [1]. In contrast, jawless vertebrates (cyclostomes), such as lampreys and hagfish, have neither Igs nor TCRs. Evidence indicates that variable lymphocyte receptors (VLRs) play roles equivalent to those of Igs and TCRs in the adaptive immune system of cyclostomes [2,3]. VLRs acquire their diversity through rearrangement of leucine-rich repeat (LRR) modules [4,5], whereas the recombination of variable (V), diversity (D) and joining (J) gene segments is responsible for the generation of diversity in both B-cell receptors and TCRs [6,7]. VLRs consist of an N-terminal cap (LRRNT), the first LRR (LRR1), multiple (usually up to seven) 24-residue variable LRRs (LRRVs), a terminal or end LRRV (LRRVe), a connecting peptide (CP) and a C-terminal cap (LRRCT), followed by an invariant 3′-terminal region [4]. Sequence diversity of mature VLRs is generated by the step-wise insertion of an LRR-encoding module into the immature, incomplete VLR gene during the development of lymphocytes [8,9]. This assembly of VLR genes is thought to occur through a gene conversion-like mechanism mediated by cytosine deaminases of the AID-APOBEC family [10,11].
Previously, two types of VLRs, VLRA and VLRB, were identified in lampreys and hagfish [5,10,12]. Recently, a third VLR, designated VLRC, was identified in lampreys [13,14]. VLRA and VLRC are believed to be type I membrane proteins with a C-terminal transmembrane region. In contrast, VLRB is attached to the cell membrane via a glycosyl-phosphatidylinositol (GPI) anchor [4]. Like Igs, VLRB can be secreted by plasma cells as pentamers or tetramers of dimers, which function as strong agglutinins [15]. VLRA, VLRB and VLRC are expressed on distinct populations of lymphocytes [4,13,16], and functional VLR gene assembly occurs monoallelically, enabling expression of a single VLR on each lymphocyte. Interestingly, VLRA+ cells express orthologs of genes typically expressed in gnathostome T-lineage cells, whereas VLRB+ cells show gene expression patterns similar to gnathostome B-lineage cells [16], indicating that T-like and B-like lymphocytes emerged before the divergence of gnathostomes and cyclostomes. VLRC sequences are more closely related to VLRA than to VLRB sequences, leading to the speculation that VLRC+ cells are T-lineage cells and that, like jawed vertebrates equipped with two lineages of T-cells (αβ and γδ T-cells) and one lineage of B-cells, lampreys may have two lineages of T-like cells (analogous to αβ and γδ T-cells) and one lineage of B-like cells [13]. Recent work demonstrated that lamprey VLRA+ and VLRC+ cells are distinct lineages of T-like cells and that they might be functionally akin to αβ and γδ T-cells of jawed vertebrates, respectively [17].
Crystallographic analysis of hagfish VLRA and VLRB revealed that VLRs have a horseshoe-like solenoid structure typical of the LRR family of proteins, such as Toll-like receptors [18]. Subsequently, antigen recognition mechanisms of VLRs were unveiled from the structures of lamprey VLRB in complex with H-trisaccharide [19], hen egg white lysozyme (HEL) [20], or the immunodominant glycoprotein of Bacillus anthracis spores [21]. Surprisingly, lampreys immunized with HEL produced not only specific VLRBs, but also specific VLRAs exhibiting higher affinity than VLRBs [12]. The crystal structure of lamprey VLRA in complex with HEL revealed that VLRA can recognize antigens directly [22], which was suggested to be analogous to direct recognition of the non-classical major histocompatibility complex (MHC) class I molecule T22 by the mouse γδ TCR [23,24]. Here we present the first crystal structure of VLRC from the Japanese lamprey Lethenteron japonicum. Interestingly, the N-terminal cap of VLRC has a long loop with limited sequence variability that protrudes toward the concave, putative antigen-binding surface.
Very recently, Li et al. reported the presence of a third VLR in hagfish and suggested that the newly identified VLR is the counterpart of lamprey VLRA, whereas the hagfish VLR molecule previously known as VLRA is the counterpart of lamprey VLRC [25].

Production of the VLRC ectodomain
The DNA fragment encoding the ectodomain of VLRC including LRRNT, LRR1, LRRV1, LRRV2, LRRV3, LRRVe, CP and LRRCT (amino acid residues 25–246; accession no. AB507271) was amplified from cDNA [13] using 5′-GGGAATTCCATATGGCTTGCCTTGCGGTCGGCAAGGATGAC-3′ as a forward primer and 5′-CCGGAATTCTTAATTGCAAGTCACATTCTTGATTTTTTC-3′ as a reverse primer (underlined bases indicate restriction sites NdeI and EcoRI introduced for cloning purposes). The "stalk domain" was not included in the construct because its amino acid sequence is invariant. The resultant fragment was digested with NdeI and EcoRI, and ligated into the modified expression plasmid pET-26(b) (Novagen, USA) in fusion with an N-terminal 10× His tag. Modified pET-26(b) contained the downstream box sequence (ATGAATCATA) [26] before the start codon. The protein was overproduced in E. coli strain C43 (DE3). Single colonies were selected and grown overnight at 310 K in preculture media containing Luria broth with 25 µg ml⁻¹ kanamycin. The precultures were then transferred into flasks containing 1 l of 2× YT medium with 25 µg ml⁻¹ kanamycin. When the cell density reached an OD600 of 0.6, isopropyl 1-thio-β-D-galactopyranoside (IPTG) was added to the media to a final concentration of 1 mM to induce protein expression. Cells were cultured for a further 6 h at 310 K, harvested by centrifugation at 5000 × g for 30 min at 277 K, and washed with a buffer containing 50 mM Tris-HCl (pH 8.0) and 150 mM NaCl.
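As a quick check on the primer design described above, the following Python sketch locates the NdeI (CATATG) and EcoRI (GAATTC) recognition sequences in the two primers. It is an illustration only and not part of the original protocol: the function name `find_sites` is hypothetical, and the primer strings assume that the hyphens inside the printed sequences are line-break artifacts and have been removed.

```python
# Recognition sequences of the two enzymes named in the text.
# Both are palindromic, so scanning one strand is sufficient.
SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC"}

# Primer sequences as given in the text (5' to 3'), with internal
# hyphens removed on the assumption that they are typesetting artifacts.
FORWARD = "GGGAATTCCATATGGCTTGCCTTGCGGTCGGCAAGGATGAC"
REVERSE = "CCGGAATTCTTAATTGCAAGTCACATTCTTGATTTTTTC"


def find_sites(primer):
    """Return {enzyme: [0-based start positions]} for each site found."""
    hits = {}
    for enzyme, site in SITES.items():
        positions = [
            i
            for i in range(len(primer) - len(site) + 1)
            if primer[i:i + len(site)] == site
        ]
        if positions:
            hits[enzyme] = positions
    return hits


forward_hits = find_sites(FORWARD)
reverse_hits = find_sites(REVERSE)
print("forward:", forward_hits)
print("reverse:", reverse_hits)
```

Under these assumptions the forward primer carries the NdeI site, whose ATG can serve as the start codon, and the reverse primer carries the EcoRI site.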
Inclusion bodies were isolated from cell pellets by sonication and washed repeatedly with a wash solution containing 0.5% Triton X-100. Purified VLRC inclusion bodies were solubilized in a denaturant solution that included 6 M guanidine hydrochloride. The solubilized protein solution was diluted slowly (2 ml min⁻¹) into the refolding buffer (0.1 M Tris-HCl (pH 8.5), 0.6 M L-arginine, 2 mM EDTA, 3.73 mM cystamine, 6.73 mM cysteamine) to a final concentration of 1–2 µM and stirred for 72 h at 4 °C. The solution, which included refolded VLRC, was then concentrated with a VIVAFLOW50 system (Sartorius, USA) followed by gel filtration with a HiLoad 26/60 Superdex 75 prep grade column (GE Healthcare, USA). The purity of the protein was assessed by 15% SDS-PAGE. A single band with a molecular mass of ~25 kDa was observed, corresponding to the molecular mass of the VLRC ectodomain.

Crystallization
Prior to crystallization trials, VLRC was concentrated to a final concentration of 10 mg ml⁻¹ in a buffer containing 10 mM Tris-HCl (pH 8.0) and 50 mM NaCl. Concentration was carried out using a Millipore centrifugal filter device (Amicon Ultra-4, 10 kDa cutoff; Millipore, USA). Screening for crystallization was per-

Data collection and structure determination
X-ray diffraction data were collected at beamline NW12 of the Photon Factory Advanced Ring (PF-AR, Tsukuba, Japan) using an ADSC CCD detector Q210. Prior to diffraction data collection, crystals were cryoprotected by transfer into a solution containing 25% (v/v) glycerol for a few seconds and flash-cooled. The data set was integrated, merged and scaled using HKL-2000 [27]; the crystal diffracted up to 2.1 Å. The VLRC crystal belonged to space group P2₁2₁2, with unit-cell parameters a = 102.2, b = 37.2, c = 55.1 Å. Based on the value of the Matthews coefficient (V_M) [28], it was estimated that there was one molecule in the asymmetric unit with V_M = 2.21 Å³/Da (Vsolv = 44.5%). Details of the data collection and processing statistics are given in Table 1.
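The Matthews coefficient quoted above can be reproduced approximately from the unit-cell parameters. The sketch below is illustrative only: the function names are hypothetical, the molecular mass of 23,700 Da is an assumed value chosen to be consistent with the reported V_M (the text states only that the ectodomain runs at ~25 kDa on SDS-PAGE), Z = 4 is the number of asymmetric units per cell in space group P2₁2₁2, and the solvent fraction uses the common approximation 1 − 1.23/V_M, which may differ slightly from the formula used by the authors' software.

```python
def matthews_coefficient(a, b, c, z, mw_da, n_mol=1):
    """V_M in cubic angstroms per dalton for an orthorhombic cell.

    a, b, c : unit-cell edges in angstroms (all angles 90 degrees)
    z       : number of asymmetric units per unit cell
    mw_da   : molecular mass of one molecule in daltons
    n_mol   : molecules per asymmetric unit
    """
    cell_volume = a * b * c  # valid for orthorhombic cells only
    return cell_volume / (z * n_mol * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (common 1.23/V_M rule)."""
    return 1.0 - 1.23 / vm


# Unit-cell parameters from the text; the molecular mass is an ASSUMED value.
ASSUMED_MW_DA = 23_700
vm = matthews_coefficient(102.2, 37.2, 55.1, z=4, mw_da=ASSUMED_MW_DA)
solvent = solvent_fraction(vm)
print(f"V_M = {vm:.2f} A^3/Da, solvent fraction = {solvent:.1%}")
```

With two molecules per asymmetric unit (`n_mol=2`) the same calculation gives a V_M of about 1.1 Å³/Da, which is implausibly low for a protein crystal and is consistent with the conclusion that the asymmetric unit holds a single molecule.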
The structure was solved by the molecular replacement method using the program Molrep [29]. The crystal structure of VLRA from hagfish (PDB ID: 2o6q) was used as a search model. The sequence identity between lamprey VLRC and hagfish VLRA is 53.9%. Structure refinement was carried out using Refmac5 [30] and Phenix [31]. The final model was refined to an R-free factor of 19.0% and an R factor of 21.6% with a root mean square deviation of 0.008 Å in bond length and 1.16° in bond angle for all reflections between 50 and 2.3 Å resolution. Table 1 also presents a summary of the statistics for structure refinement. The stereochemical properties of the structure were assessed by Molprobity [32] and COOT [33], and showed no residues in the outlier region of the Ramachandran plot. The final model comprises His25 to Cys247 out of a total of 320 residues of the VLRC molecule. The N-terminal residues with the extra 10× His-tag and the C-terminal residues are missing. Structure comparison with other VLR molecules was carried out using Secondary-Structure Matching (PDBeFold) [34].

Sequence variability analysis
All VLR sequences used here were derived from previous studies [4,5,8,10,11,13]. For analysis of LRRNT or LRRCT regions, multiple sequence alignments were carried out using ClustalW [35]. Shannon entropy (H) was computed on the HIV sequence database server (http://www.hiv.lanl.gov/content/sequence/ENTROPY/entropy_one.html). The Shannon entropy (H) for each aligned position was calculated by the following equation.
$$H = -\sum_{i} P_{i} \log_{2} P_{i}$$
where P_i is the fraction of residues of amino acid type i at the aligned position. Gap openings were excluded from the calculation. The logarithm was calculated to base 2. The Shannon entropy was calculated using inshore hagfish (Eptatretus burgeri) VLRA or VLRB, sea lamprey (Petromyzon marinus) VLRA or VLRB, or Japanese lamprey VLRC. VLRC sequences with an equal number of LRRV modules were chosen for alignment of LRR modules. All multiple alignments were checked by eye and, if needed, corrected to maximize the extent of similarity using MEGA 5 [36].
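The per-position entropy calculation can be sketched in a few lines of Python. This is a minimal illustration of the formula, not the code of the web server used in the study; the function names are hypothetical, and the toy alignment is invented for demonstration (only the first sequence, NKTDSSPE, appears in the text).

```python
import math
from collections import Counter

GAP_CHARS = {"-", "."}  # characters treated as gaps and ignored


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, gaps excluded."""
    residues = [ch.upper() for ch in column if ch not in GAP_CHARS]
    if not residues:
        return 0.0  # all-gap column: no information to score
    total = len(residues)
    counts = Counter(residues)
    # p * log2(1/p) is the same as -p * log2(p), summed over residue types
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Entropy for every column of an alignment of equal-length strings."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(col) for col in zip(*sequences)]


# Toy alignment: hypothetical variants of the conserved loop sequence.
toy_alignment = [
    "NKTDSSPE",
    "NKTDSSPE",
    "NKTDSAPE",
    "NKT-SSPE",
]
entropies = alignment_entropy(toy_alignment)
for position, h in enumerate(entropies, start=1):
    print(position, round(h, 3))
```

An invariant column scores 0 bits, and a column in which all 20 amino acids are equally frequent scores log2(20), about 4.32 bits, so higher values indicate more variable positions.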

Overexpression and purification of the VLRC ectodomain
Protein expression was not detected when the VLRC ectodomain (residues 25–246) construct was introduced into the pET-26(b) vector. However, high-level expression of recombinant protein was obtained when a 10× His-tagged VLRC ectodomain was expressed with the modified pET-26(b) vector, in which the downstream box sequence was inserted before the start codon. The recombinant VLRC protein was obtained as inclusion bodies and, following the refolding step, eluted at approximately the volume expected for the monomeric species in gel filtration chromatography.

Overall structure of the VLRC ectodomain
The crystal structure of the lamprey VLRC ectodomain (residues from His25 to Cys247) was determined by molecular replacement at a resolution of 2.3 Å. The structure has a horseshoe-like solenoid conformation, which is typical of proteins that belong to the LRR protein family (Figure 1). Lamprey VLRC is composed of eight modules: LRRNT, LRR1, LRRV1, LRRV2, LRRV3, LRRVe, CP and LRRCT, in this order. The concave surface of lamprey VLRC is formed by the β-sheet made up of eight β-strands (two from LRRNT, five from LRRs and one from CP); the β1-strand is anti-parallel whereas the remaining β-strands are arranged in a parallel fashion (Figure 1). These β-strands are the only secondary structures assigned by DSSP [37].
We calculated the Shannon entropy [20,38,39] for each position based on the multiple alignments of 102 Japanese lamprey VLRC sequences [13] and mapped variable residues onto the VLRC crystal structure (Figures 3 and 4). This revealed that variable residues are located predominantly on the concave surface, in particular in the region around the core β-sheet (Figure 3), suggesting that, similar to VLRA and VLRB, VLRC recognizes its ligands through the concave surface [19,20,21,22] (Figure 2). In the known structures in complex with ligands, both hydrophilic and hydrophobic residues are involved in ligand recognition [19,20,21,22]. The distribution of hydrophilic and hydrophobic residues on the concave surface of our crystal structure suggests that this is also the case with VLRC.
The ectodomain of lamprey VLRC contains three potential N-glycosylation sites, one in the LRRNT (Asn41) and two in the LRRCT (Asn228 and Asn244). However, glycosylation at these residues is unlikely to affect ligand recognition, because these residues are located outside of the concave region (Figure 1, panel B).

A protruding loop in the N-terminal cap
A striking feature of lamprey VLRC is that its N-terminal cap has a long loop protruding toward the concave surface (Figure 1). This protrusion is formed by amino acid residues 41–48 located in the region connecting the two β-strands β1 and β2 (Figure 5). These eight residues show only low sequence variability (Figure 4, panel E). Indeed, ~80% of known VLRC sequences from Lethenteron japonicum have the same sequence, NKTDSSPE, and the remaining ~20% have closely related sequences (Table 2). Essentially the same observation was made with the VLRC sequences from two other lamprey species, Petromyzon marinus and Lampetra planeri (Table 2).
Flexibility in the LRRCT loop is important for VLRB molecules to bind antigens [20,22]. However, the atomic B-factors in the LRRNT loop of the VLRC structure are around the same values as in other regions, suggesting that the flexibility in this region is not particularly high. This is probably because the loop is packed closely against adjoining molecules in the crystal lattice.

Discussion
The most striking observation made in this study is that the N-terminal cap of lamprey VLRC has a long loop protruding toward the concave surface (Figure 1). Importantly, the stretch of residues constituting the LRRNT protrusion is highly conserved in length and amino acid composition in lamprey VLRC molecules (Table 2 and Figure 5). Therefore, the protrusion in LRRNT appears to be a shared feature of all lamprey VLRC molecules. In contrast, sequence alignment suggests that all VLRA molecules and >90% of VLRB molecules lack the potential to form comparable protrusions in their N-terminal caps (Figure 5).
Among the members of the LRR family of proteins, platelet-receptor glycoprotein Ibα has a similar long protrusion in the LRRNT that extends toward the concave surface [40]. In Ibα, its ligand, von Willebrand factor, is recognized by both N- and C-terminal protrusions. Therefore, it is reasonable to postulate that the protrusion in LRRNT of lamprey VLRC may be involved in antigen recognition. In lamprey VLRC, neither the 5′-LRRCT nor the protrusion in LRRNT exhibits high levels of variability. Hence, as suggested earlier [13], the N- and C-terminal caps of VLRC might interact with conserved epitopes of restricted sets of antigens or invariant regions of molecules involved in antigen presentation.
In hagfish, only two types of VLRs, VLRA and VLRB, had been known until a third VLR was identified quite recently [25]. Detailed phylogenetic analysis of this newly identified VLR indicated that it is the counterpart of lamprey VLRA and that, in reality, the hagfish VLR molecule previously known as VLRA is the counterpart of lamprey VLRC, thus necessitating the change in nomenclature [25]. Consistent with this is the observation that, like lamprey VLRC, the hagfish VLRC (formerly known as VLRA) lacks a hypervariable insert, and hence a prominent protrusion, in the LRRCT [20] (Figure 4). There are, however, some structural differences between lamprey and hagfish VLRC molecules. First, unlike lamprey VLRC, hagfish VLRC appears to lack the capacity to form a protrusion in the LRRNT (Figures 4 and 5). Second, unlike lamprey VLRC and other VLR molecules, hagfish VLRC has a 3₁₀-helix upstream of the β1 strand. Whether these structural differences bear any functional significance is an issue that warrants further investigation.

Figure 5. Amino acid sequences of LRRNT in known VLR molecules. Representative VLR sequences from lampreys and hagfish were aligned, and the number of residues between β-strands β1 and β2 was counted. Frequency of sequences with a given number of residues between β1 and β2 (shown on the right side of the figure) was calculated using publicly available sequences. Note that all known lamprey VLRC sequences have eight residues between β1 and β2 (highlighted in yellow). The number of corresponding residues is four in VLRA, and >90% of VLRB sequences have two residues (highlighted in green). In hagfish VLRC, amino acid residues which form 3₁₀-helices are highlighted in orange. VLR sequences are from sea lamprey (Pm, Petromyzon marinus), Japanese lamprey (Lj, Lethenteron japonicum), brook lamprey (Lp, Lampetra planeri), inshore hagfish (Eb, Eptatretus burgeri), and Pacific hagfish (Es, Eptatretus stoutii). VLR molecules were named according to the new nomenclature. Thus, EbVLRC and EsVLRC correspond to the previously reported EbVLRA and EsVLRA molecules, respectively. EsVLRA represents the newly described VLR [25]. doi:10.1371/journal.pone.0085875.g005

Accumulating evidence indicates that the lymphocyte-based adaptive immune systems of gnathostomes and cyclostomes show remarkable similarity despite the fact that they use structurally unrelated molecules as antigen receptors [41]. Particularly striking is the occurrence of two major populations of agnathan lymphocytes, one involved in humoral immunity and another presumably involved in cell-mediated immunity. VLRB can be anchored to the cell membrane via a GPI linkage [4] or secreted like antibodies as pentamers or tetramers of dimers [15]. Polymerization of VLRB increases its avidity in a manner similar to multimer formation of IgM and IgA molecules. Overall, VLRB+ cells resemble B cells in that they both secrete antibodies in response to an antigen challenge. On the other hand, VLRA+ cells resemble T cells; not only do they undergo blastoid transformation in response to a T-cell mitogen, but they also express IL-17, GATA2/3 and NOTCH, whose jawed vertebrate counterparts are expressed in T cells and are involved in their development and differentiation. The similarity between the adaptive immune systems of gnathostomes and cyclostomes was further bolstered by the recent proposal that VLRC+ cells constitute a second lineage of T-like cells and might be akin to γδ T cells [17]. In addition, a highly polymorphic hagfish membrane protein, NICIR3 (also called ALA), was identified as a predominant allogeneic leukocyte antigen recognized by VLRB antibodies [42], raising the possibility that NICIR3 might play a role comparable to that of MHC molecules of jawed vertebrates. It would be important to understand whether VLRC can recognize antigens directly like mammalian γδ TCRs or requires a functional MHC analog for antigen recognition.