Identifying Patient-Specific Epstein-Barr Nuclear Antigen-1 Genetic Variation and Potential Autoreactive Targets Relevant to Multiple Sclerosis Pathogenesis

Background
Epstein-Barr virus (EBV) infection represents a major environmental risk factor for multiple sclerosis (MS), with evidence of selective expansion of Epstein-Barr Nuclear Antigen-1 (EBNA1)-specific CD4+ T cells that cross-recognize MS-associated myelin antigens in MS patients. HLA-DRB1*15-restricted antigen presentation also appears to determine susceptibility given its role as a dominant risk allele. In this study, we have utilised standard and next-generation sequencing techniques to investigate EBNA-1 sequence variation and its relationship to HLA-DR15 binding affinity, as well as examining potential cross-reactive immune targets within the central nervous system proteome.

Methods
Sanger sequencing was performed on DNA isolated from peripheral blood samples from 73 Western Australian MS cases, without requirement for primary culture, with additional FLX 454 Roche sequencing in 23 samples to identify low-frequency variants. Patient-derived viral sequences were used to predict HLA-DRB1*1501 epitopes (NetMHCII, NetMHCIIpan) and candidates were evaluated for cross recognition with human brain proteins.

Results
EBNA-1 sequence variation was limited, with no evidence of multiple viral strains and only low levels of variation identified by FLX technology (8.3% nucleotide positions at a 1% cut-off). In silico epitope mapping revealed two known HLA-DRB1*1501-restricted epitopes (‘AEG’: aa 481–496 and ‘MVF’: aa 562–577), and two putative epitopes between positions 502–543. We identified potential cross-reactive targets involving a number of major myelin antigens including experimentally confirmed HLA-DRB1*15-restricted epitopes as well as novel candidate antigens within myelin and paranodal assembly proteins that may be relevant to MS pathogenesis.

Conclusions
This study demonstrates the feasibility of obtaining autologous EBNA-1 sequences directly from buffy coat samples, and confirms divergence of these sequences from standard laboratory strains. This approach has identified a number of immunogenic regions of EBNA-1 as well as known and novel targets for autoreactive HLA-DRB1*15-restricted T cells within the central nervous system that could arise as a result of cross-reactivity with EBNA-1-specific immune responses.


Introduction
Epstein-Barr virus (EBV) is the only human-adapted member of the Lymphocryptovirus genus, belonging to a lineage of Old World primate gamma-1 herpesviruses that was transferred to a hominid ancestor approximately twelve million years ago, and which is now responsible for near-universal and lifelong human infection [1,2]. Viral transmission is generally via saliva, with evidence that age of infection is associated with cultural and socioeconomic factors [3]. Uniquely, chronic infection is established within 'immortalised' B-lymphocytes that are transformed by an array of viral proteins that functionally mimic host proteins to create long-lived memory cells [4,5]. Viral persistence is then promoted through mechanisms that reduce antigen presentation to the adaptive immune system [6], including the involvement of latency programs that limit viral protein expression to a minimal subset critical for replication; most notably Epstein-Barr Nuclear Antigen-1 (EBNA-1), which maintains host chromosomal attachment of viral episomal DNA thus linking viral and cellular replication cycles [7].
These mechanisms of viral persistence would predict limited viral sequence diversity, in keeping with the relatively slow evolutionary rate of EBV and other gamma-1 herpesviruses [2] and evidence of geographically-defined viral subtypes [8]. Nevertheless, evidence of diversifying selection involving latency genes including EBNA-1 has been identified [9], including preferential variation within human leukocyte antigen (HLA) binding sites (viral epitopes) suggesting that antigen presentation can promote HLA-specific viral escape mutations [10,11]. Thus, EBNA-1 is not immunologically 'silent' as once thought [12] but is an antigenic target for both CD4 and CD8 T-cell responses [12][13][14] as well as antibodies [15], in keeping with finely tuned immune surveillance mechanisms that generally maintain persistent but stable cycles of EBV infection involving both epithelial and B-lymphocyte compartments [5]. Within this paradigm, mechanisms of viral antigen display [13,14] and the general hierarchy of EBV-specific immune responses including regulatory as well as effector T cell responses are being examined [14,16-18]. These have particular relevance to the therapeutic application of EBV-specific T-cell adoptive immunotherapy against EBV-related malignancies including Burkitt's and Hodgkin's lymphoma and nasopharyngeal carcinoma [19], now supported by positive findings in clinical trials [20,21]. This strategy is underpinned by knowledge of EBV sequence diversity in tissue samples [9,22,23] and its utilisation to predict viral epitope targets [24].
Our own investigations have focused on multiple sclerosis, an inflammatory demyelinating disease of the central nervous system that often leads to neurodegeneration and long-term disability despite current treatment strategies [25]. While a comprehensive explanation of multiple sclerosis pathogenesis remains incomplete, it is clear that the major component of genetic risk is associated with the HLA-DR locus [26][27][28][29], thus implicating HLA-restricted antigen binding and presentation [30], as well as genetic determinants that predominantly relate to T-cell activation [29]. Several lines of evidence link Epstein-Barr virus-specific immunity to multiple sclerosis risk. Both serological [27,31,32] and CD4 T cell responses [33] directed against EBNA-1 have been associated with multiple sclerosis, with further evidence that EBNA-1-specific antibodies differentiate disease-discordant identical twins [34]. Several groups have demonstrated higher EBV seroprevalence in MS patients compared to controls and it has further been demonstrated that EBV infection late in life, in particular if manifested as infectious mononucleosis, increases a person's MS risk [27,35,36]. A recent study has also explored the use of Epstein-Barr virus-specific adoptive immunotherapy for progressive multiple sclerosis, with promising preliminary results [37]. Further observations include that cerebrospinal fluid oligoclonal bands that are a hallmark of MS specifically can target EBNA-1 [38] and one group has additionally identified the presence of EBV-infected B cells within white matter MS lesions at all disease stages [39], although this result has not been replicated in other studies [40].
In this study, we have utilised DNA obtained from buffy coat samples of patients with multiple sclerosis to analyse EBNA-1 sequence variation using both Sanger and FLX 'next-generation' sequencing technologies, without any requirement for primary culture techniques or the creation of cell lines through ex vivo EBV transformation. We have then sought to identify potential HLA-DRB1*1501-restricted viral epitopes within the EBNA-1 protein sequence using standard HLA binding algorithms [41], and investigated potential homology with similarly HLA-restricted antigens in a dataset of human central nervous system proteins [42]. The results of this study, which follow from our previous investigations of the contributions made by HLA alleles and Epstein-Barr virus immunity to multiple sclerosis risk [27,28], highlight the divergence of autologous 'wild-type' EBNA-1 sequences from those of laboratory strains commonly used for experimental purposes, and suggest possible avenues of investigation that acknowledge both host and viral genetic diversity in higher-resolution analyses of the role of host-pathogen interactions in autoimmunity [30,43,44].

Research participants
A total of 79 study participants in the Perth Demyelinating Disease Database (PDDD) were included in the study. The study protocol was approved by the Sir Charles Gairdner Hospital Human Research Ethics Committee, and written informed consent was obtained from all participants.

DNA extraction
DNA was isolated from buffy coats (stored at -80°C) using an automated robotic setup using Genfind according to the manufacturer's instructions. Briefly, 100 μl of buffy coat were lysed and Proteinase K added to rupture cell membranes and digest protein. DNA was then immobilized on magnetic particles by the addition of a magnetic bead binding reagent. DNA was separated from contaminants using a magnetic field and washing steps. DNA was eluted in 125 μl from the magnetic particles. A minority of samples were manually extracted using Qiagen with the provided protocol. 200 μl of buffy coat were lysed and Proteinase K added to remove protein and other contaminants. DNA was adsorbed on to the silica-gel membrane during centrifugation of columns and then washed twice to ensure complete removal of any residual contaminants. Finally, samples were recovered from the membrane using 200 μl elution buffer. Concentration of all eluted samples was determined using Nanodrop and 1 μl of each sample was loaded on a 1% agarose gel to test for presence and integrity of DNA. All samples were stored at 4°C until further use.

HLA typing
PCR and sequencing based HLA genotyping of the MS cohort resolved to at least the 4-digit level was performed as previously described using heterozygous ambiguity resolving primers where applicable [27,28].

EBV amplification
The N- and C-terminal ends of the EBNA-1 gene, excluding the glycine-alanine rich regions, were amplified using semi-nested PCRs and a fully automated setup utilising Biomek FX robots. EBV reference strain B95-8 was extracted from a B95-8 transformed cell line and was diluted and used as a control in each EBV PCR. All PCR reactions were performed using Roche High Fidelity Taq in 25 μl reactions with forward and reverse primers at a concentration of 25 pmol/μl. All primers used for amplification have been previously published and named according to the position in the reference strain B95-8 [11,45,46], as summarised below. For amplification of the N-terminal end, the first round primer pair 107754F-109135R (PCR1) was used, resulting in a 1381 base pair (bp) fragment. Detailed information about size and location of EBNA-1 PCRs with reference to the EBV strain B95-8 can be found in S1 Fig. Second round amplification was then performed using either 107881F-109135R (PCR2) or 107754F-108160R (PCR3), resulting in 1254 bp or 406 bp fragments respectively. C-terminal EBNA-1 PCR was performed as described previously [27]. Briefly, first round amplification with the primer pair EBV109111F-EBV109970R (PCR4) resulted in an 859 bp fragment. A semi-nested PCR followed using the primer combination 109111F-109869R (PCR5), resulting in a final 758 bp product. Alternatively, shorter semi-nested PCRs were performed using the primer combinations 109111F-109759R (PCR6, 648 bp) and 109111F-109459R (PCR7, 348 bp) respectively. In some cases PCRs with alternative primers were performed. For an overview of primer pairs used in each PCR, nucleotide coordinates of primers within the B95-8 reference strain, as well as primer melting temperatures, elongation times and product sizes, see S1 Table. Successful PCR samples were purified using magnetic particles with AMPure (Beckman Coulter) on Biomek FX robots and stored at 4°C until further use.

Sanger sequencing and analysis
Samples were directly sequenced on an automated 96 capillary ABI 373 DNA Sequencer, followed by purification of sequencing products with magnetic particles using CleanSEQ (Beckman Coulter) on Biomek FX robots. Analysis of electropherograms was performed using the ASSIGN V4.0.1.36 software (Conexio Genomics). The threshold for mixture detection in Sanger sequencing has been established to be approximately 30%. For construction of the phylogenetic tree, 53 MS sequences and reference strains B95-8, AG876, GD1 and HKNPC1 covering the majority of nucleotide positions B95-8: 109135-109815 (EBNA-1: 1186-1866) were included. Genetic distance was visualized using the Neighbour-joining method based on the p-distance model with pairwise deletion within the PHYLIP (Phylogeny Inference Package) version 3.695 [47].
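The p-distance with pairwise deletion used for the tree can be illustrated with a minimal Python sketch. This is not the PHYLIP implementation; the function names, the set of characters treated as missing, and the toy sequences are assumptions made for illustration only.

```python
def p_distance(seq_a, seq_b, missing="-N?"):
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is skipped only for this pair when either
    sequence carries a gap or ambiguity character at that site.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # pairwise deletion of this site
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def distance_matrix(aligned):
    """Pairwise p-distances for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {(x, y): p_distance(aligned[x], aligned[y]) for x in names for y in names}


# Toy aligned sequences (hypothetical, not study data)
toy = {"ref": "ACGTAC", "s1": "ACGAAC", "s2": "AC-TAC"}
matrix = distance_matrix(toy)
```

A matrix of this kind is the input a neighbour-joining routine would take; the tree-building step itself is left to dedicated software.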

FLX sequencing
For the 454 FLX sequencing strategy, 24 samples were pooled in a single FLX lane. 20 MS samples were selected based on successful Sanger sequencing for three epitopes of interest (EBNA-1 aa PPP: 401-416, AEG: 482-496 and MVF: 563-577). Three additional MS samples were included that displayed a band of correct size on a 1% agarose gel but were not successfully sequenced by Sanger methods. The B95-8 strain of EBV was used as a control in each EBV PCR and was also selected as a control for FLX sequencing.
First round C-terminal EBV PCR products with the primer combination EBV109111F-EBV109970R (PCR4) were used as templates to generate shorter second round PCRs for the selected deep sequencing samples. For PCR length and location please refer to S1 Fig. This nested PCR was performed with molecular barcoded primers. These tags consisted of a unique eleven-nucleotide extension at the 5' end of the second round primers EBV109111F and EBV109869R (PCR5b), resulting in 780 bp amplicons. This method facilitates sample multiplexing while also increasing the ability to accurately assign reads back to the sample. Resulting PCR amplicons were pooled at equimolar ratios (3 × 10^11 copies each) to achieve a similar number of reads. A standard library was constructed using the 454 Roche Titanium Chemistry protocol. The denatured DNA library was immobilized onto beads, emulsified with the amplification reagents in a water-in-oil mixture and clonally amplified (emPCR). Following emPCR, the capture beads with bound DNA were enriched according to the 454 FLX Titanium manual and used for pyrosequencing on one lane of an eight lane 454 FLX sequencing run according to the 454 sequencing manual.
The reads obtained from the sequencing were separated according to the unique tags and linked back to the original samples, using the NextGENe software package from SoftGenetics, Inc (State College, PA, USA). Further analysis was performed with in-house software.

FLX data analysis
The NextGENe software package version 2.3.0 (SoftGenetics, Inc, USA) was used to create a consensus sequence present at >45% for each sample based on the B95-8 EBV reference strain (GenBank accession number: V01555.2). For each sample, all reads were aligned to the consensus using a pairwise alignment. The pairwise alignments were combined into a multiple alignment by matching the reference positions for all aligned pairs. The aligned files were used to detect homopolymers, which are known to occur in FLX sequencing as artefacts. During analysis, all homopolymers not present in the respective consensus sequence were excluded. Minorities present at <1% were not taken into consideration, and insertions and deletions were also excluded except for a strain-specific in-frame insertion of three amino acids (glycine (GGA), aspartic acid (GAT), aspartic acid (GAC)) at position 2367 of EBNA-1 (position 109818/109819 of reference strain B95-8) in several samples, which has been described previously [11]. All mutations detected were additionally manually analysed in the raw FLX data file of each sample to exclude any contribution of homopolymer errors. A nucleotide change was considered a real mutation if the mutation could not have been caused by a nucleotide insertion/deletion before or after the mutation. Additionally, mutations were only taken into consideration if they were present in at least three sequences, independent of the total number of individual reads per sample. Mutations within three nucleotides from the beginning or end of a read were excluded. Furthermore, unresolved nucleotide mixtures within reads were few, but indicated low signal quality at the position concerned and were not taken into consideration.
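Three of the filtering rules above (the 1% cut-off, the requirement for at least three supporting reads, and the exclusion of calls within three nucleotides of a read end) can be expressed as a short Python sketch. It is not the in-house software used in the study. The `Obs` record, the function names, and the choice of total read depth at the position as the denominator are assumptions; homopolymer and indel handling are not modelled.

```python
from collections import Counter, namedtuple

# Hypothetical record for one read's base call at a given reference position:
# base = called nucleotide, offset = 0-based position within the read,
# read_len = length of that read.
Obs = namedtuple("Obs", "base offset read_len")

MIN_FRACTION = 0.01  # minorities below 1% are ignored
MIN_READS = 3        # variant must be seen in at least three reads
END_MARGIN = 3       # calls within three nucleotides of a read end are ignored


def near_read_end(obs, margin=END_MARGIN):
    """True if the call lies in the first or last `margin` bases of its read."""
    return obs.offset < margin or obs.offset >= obs.read_len - margin


def minority_variants(observations, consensus_base):
    """Return {base: fraction} for non-consensus bases passing all filters.

    Assumption: the fraction is computed against all reads covering the position.
    """
    depth = len(observations)
    if depth == 0:
        return {}
    counts = Counter(
        o.base
        for o in observations
        if o.base != consensus_base and not near_read_end(o)
    )
    return {
        base: n / depth
        for base, n in counts.items()
        if n >= MIN_READS and n / depth >= MIN_FRACTION
    }
```

Applied position by position across an amplicon, a filter of this kind yields the set of candidate minority variants that would then be inspected manually in the raw data, as described above.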

Epitope predictions
HLA binding algorithms (NetMHCII, NetMHCIIpan) were utilized to identify potential HLA-DRB1*1501 class II HLA epitopes within the EBNA-1 Sanger derived sequences. Additionally, predictions were performed for two known HLA-DRB1*1501 class II EBNA-1 epitopes (denoted AEG and MVF) using all FLX consensus and Sanger sequences generated. All predicted 15mer peptides which resulted in HLA-DRB1*1501 strong (<50 nM) and weak binders (50-500 nM) were selected and tested for potential cross-reactivity on a dataset of CNS proteins enriched for axoglial proteins (Human Protein Reference Database (HPRD.org) and [42]), as well as a selection of brain proteins derived from NCBI (S2 Table). Amongst these potential cross-reactive epitopes, a subset of epitopes sharing the majority of peptide amino acid residues within the epitope core HLA-binding sequence of nine amino acids was identified.
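The peptide enumeration and binder thresholds described in this section can be sketched in Python as follows. The sketch does not call NetMHCII or NetMHCIIpan; predicted affinities are assumed to be supplied from elsewhere. The function names and the inclusive treatment of the 500 nM boundary are assumptions.

```python
def overlapping_peptides(protein_seq, length=15):
    """All overlapping peptides of the given length, with 1-based start positions."""
    return [
        (start + 1, protein_seq[start:start + length])
        for start in range(len(protein_seq) - length + 1)
    ]


def classify_binder(affinity_nm):
    """Classify a predicted affinity (nM) using the thresholds stated in the text.

    <50 nM: strong binder; 50-500 nM: weak binder; otherwise non-binder.
    Treating exactly 500 nM as a weak binder is an assumption.
    """
    if affinity_nm < 50:
        return "strong"
    if affinity_nm <= 500:
        return "weak"
    return "non-binder"


# The B95-8 'AEG' 15mer quoted in the text, extended by one arbitrary residue
# so that the window slides once (the extra residue is illustrative only).
windows = overlapping_peptides("AEGLRALLARSHVERT")
```

Only peptides classified as strong or weak binders would be carried forward to the cross-reactivity search.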

Statistical analysis
We assessed whether EBNA-1 nucleotide polymorphism at each position was significantly associated with MS risk alleles by grouping alleles according to previous genetic analysis by our group [27]. Samples were categorized as carrying high MS risk, low MS risk or neutral risk alleles, respectively, if they carried at least one risk allele (HLA-DRB1*08, *15, *16); at least one protective allele (HLA-DRB1*04, *07, *09) and no risk allele; or two neutral risk alleles. Tables of nucleotide frequencies by risk groups were created at each nucleotide position and associations assessed by Fisher exact tests. We assessed clustering based on all 62 C-terminal Sanger sequences using the "partitioning around medoids" (PAM) method [48]. All nucleotide positions demonstrating some nucleotide variation were included and clusters displayed via a plot of the first two principal components [48]. Associations of HLA risk groups with viral clusters found in the cluster analysis were also assessed using Fisher exact tests. Analyses were carried out using TIBCO Spotfire S+ 8.2 (Somerville, MA).

primary culture to enrich for EBV episomes. All samples were obtained from participants in the Perth Demyelinating Disease Database (PDDD) with confirmed MS, reflecting a wide range of age and disease severity (Fig 1). Females were more prevalent (73%) than males (27%) in the study population, with a slightly higher age (median: 53) compared to males (median: 49). Within this dataset, 62 samples were successfully sequenced across the EBNA-1 C-terminal region, which is known to contain the majority of MHC Class II-restricted T-cell epitopes [49][50][51], and a subset of 37 samples was additionally sequenced in the N-terminal EBNA-1 region. Sequences for both the N- and C-terminal end were obtained from 23 samples.

Wild-type EBNA-1 sequence variation
The C-terminal region of EBNA-1 demonstrated sequence variation within four major clusters with additional minor variation (Fig 2).
In keeping with previous analyses of wild-type EBNA-1 sequences [9,11], wild-type sequences showed strong similarity to the EBV reference strain B95-8 in only a minority of cases (9/53, 16.9%) and none clustered with the type 2 AG876 strain. No nucleotide mixtures were identified by Sanger sequencing methods that would indicate the presence of multiple EBNA-1 populations, either as a result of mutation or superinfection with multiple strains. N-terminal sequence analysis showed high conservation and identified only three different variants: 19 samples demonstrated 100% sequence similarity with the N-terminal sequence of the EBV reference strain B95-8, while 12 sequences matched the previously identified type 2 EBV strain AG876, which differs from B95-8 at the positions Q16E, E18G, D24E, S27G and A85T [52]. Three of these positions (Q16E, E18G, S27G) have previously been described to occur in combination [44]. The third variant occurred in seven samples and aligned well with AG876 but contained two mismatches to it (EBNA-1 amino acid positions V70A and Q74P). Cluster analysis based on the C-terminal Sanger sequences described in Methods revealed two distinct populations (Fig 3), a main cluster (A) of 47 cases and a smaller cluster (B) of 15 cases. Across the nine positions with significant HLA association there were 373 consensus nucleotides and seven non-consensus among the 47 cases in cluster A and just 10 consensus nucleotides and 109 non-consensus in cluster B. Hence at these nine positions the two clusters were almost mutually exclusive.

Association of HLA and viral sequence variation
When assessed for HLA MS risk group, cluster A contained 5, 9 and 33 individuals classified as MS risk neutral, protective and high, respectively, compared with the smaller cluster (B) which contained 4, 6 and 5 individuals with neutral, protective and high MS risk. The high MS risk group was thus significantly over-represented in the large cluster (p = 0.016).

increasing to 5.0% at a 2% cut-off (37/749 nucleotides) and 8.3% at a 1% cut-off (62/749 nucleotides). As noted in Table 1, most of the mutations that defined minority variants were unique to individual samples (42 individual positions).
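A Fisher exact test of the kind applied to the cluster-by-risk-group counts can be sketched in Python using only the standard library. The analysis in this study was run in TIBCO Spotfire S+ on the full table; the version below handles only a 2x2 table, so the three risk groups are collapsed here into high risk versus other. That collapsing is an assumption made for illustration, and the resulting p-value need not equal the reported one.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


# Counts taken from the text: cluster A has 33 high-risk and 14 other
# (5 neutral + 9 protective); cluster B has 5 high-risk and 10 other.
p_value = fisher_exact_2x2(33, 14, 5, 10)
```

The same function could be applied at each polymorphic nucleotide position once the allele-by-risk-group counts are reduced to a 2x2 table; larger tables require a generalised routine.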

Sequence conservation and relevance to HLA-DRB1*15 binding
As shown in Fig 3, EBNA-1 sequences were mapped against the standard reference strain B95-8 in order to illustrate known and putative HLA-DRB1*15-restricted epitopes and their relationship with EBNA-1 sequence conservation at the amino acid level. This approach, which utilises the ConSeq server [53], reflects amino acid conservation in terms of the influence of physicochemical properties of amino acid substitution as well as the frequency of sequence variation. Hence, 'dips' in the conservation plot (highlighted in red) reflect sites of variation that are likely to influence protein structure. Above these plots, predicted HLA-DRB1*15 binding sites are denoted along with the core binding regions, derived from NetMHCIIPan analysis [41]. As shown, this approach revealed two known HLA-DRB1*1501-restricted epitopes (AEGLRALLARSHVER ('AEG': aa 481-496) and MVFLQTHIFAEVLKD ('MVF': aa 562-577)), as well as two overlapping putative epitopes covering a region between positions 502-543. In contrast, no HLA-DRB1*15 epitopes were identified within the N-terminal region of EBNA-1. The most frequent sequences at these epitope sites were AEGLRTLLARCHVER and IVFLQTHIFAEGLKD (differing from the B95-8 reference at two residues each: T for A and C for S in 'AEG'; I for M and G for V in 'MVF'). Variant sequences within these epitope regions are described in Table 2, along with comparisons of HLA-DRB1*15 binding affinity. As noted, there are no wild-type EBNA-1 variants within these known epitopes that would be predicted to abrogate HLA-DRB1*15 binding completely, although further studies will be required to establish if minor variations in binding affinity could influence the nature of the T cell response, noting that a previous study has demonstrated both regulatory and effector EBNA-1-specific CD4+ T cells with identical epitope specificity [18].
Interestingly, eight of the nine polymorphic sites from the cluster analysis fell within our predicted HLA-DRB1 epitopes including two changes in the previously described 'AEG' and 'MVF' epitopes, respectively.

Identification of HLA-DRB1*15 epitopes within brain proteins homologous to EBNA-1
For this analysis we utilised three datasets enriched for CNS proteins [42], to identify candidate cross-reactive proteins that shared a propensity for HLA-DRB1*15 binding as well as homology to natural EBNA-1 sequences at these sites. This analysis was predicated on the hypothesis that (1) EBNA-1-specific T cell immunity reflects a standard model of HLA-restricted binding and antigen presentation, providing a long-term stimulus for T cell responses that could then (2) cross-react with CNS-specific antigens in a manner that requires HLA-restricted presentation but which may be less predictably associated with HLA binding affinity given the constraints of negative selection against high-affinity autoantigens, and the known altered topology of many autoreactive HLA-peptide-TCR interactions [54]. This approach is also informed by the previous demonstration of EBNA-1-specific CD4+ T cells capable of producing pro-inflammatory responses against myelin antigens in a seminal study by Lunemann and colleagues [33].
Thus, we initially selected EBNA-1 epitopes of interest based on HLA binding affinity, and then identified candidate CNS protein epitopes that would be predicted to bind HLA-DRB1*15 with sufficient affinity to allow antigen presentation (strong and weak binders with affinity threshold 500 nM) and which exhibited homology with the EBNA-1 epitope (threshold ≥3 residues within the 9 amino acid core binding region identified by NetMHCII and NetMHCIIPan analysis). Applying this approach in the first instance to myelin proteins of known interest in MS research (reviewed in [55]), we identified potential cross-reactive responses involving a number of major myelin antigens (Table 3) including experimentally confirmed HLA-DRB1*15-restricted epitopes associated with encephalitogenic T cell responses (asterisked) including 2',3'-cyclic-nucleotide 3'-phosphodiesterase, alpha B crystallin, myelin basic protein and oligodendrocyte-specific protein. Additionally, we identified several novel candidate antigens within glial fibrillary acidic protein, myelin proteolipid protein, neurofilament heavy polypeptide and myelin-oligodendrocyte glycoprotein. We then extended this analysis to a larger set of central nervous system antigens enriched for axoglial proteins that maintain myelinated nerves and nodes of Ranvier critical for saltatory conduction (reviewed in [61]), noting recent evidence that the axoglial apparatus may be targeted in the earliest phases of multiple sclerosis lesion development [62,63]. As described in Table 4, which presents a subset of results based on optimal EBNA-1 epitope binding and core match values ≥3, this analysis identified a larger set of potentially cross-reactive CNS proteins including neurofascin [62] as well as a number of other proteins involved in actin organisation and paranodal assembly like ankyrins, contactin-associated proteins as well as gelsolin [64].

Table 2. EBNA-1 sequence variation identified with next-generation sequencing technology, and impact on HLA-DRB1*15 binding affinity within known epitopes.
The two most frequent variants for the A) AEG epitope and B) MVF epitope are printed in bold, the most frequent variant is underlined. All epitopes are predicted to be weak binders (affinity 50-500 nM).
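The selection criteria used for candidate cross-reactive epitopes (predicted affinity within the 500 nM threshold and at least three shared residues within the nine-residue binding core) can be written as a short Python sketch. Counting position-by-position identities between the two cores is an assumed reading of the homology criterion, and the function names and the second example core are hypothetical.

```python
CORE_LENGTH = 9


def core_matches(core_a, core_b):
    """Number of positions at which two 9-residue binding cores are identical."""
    if len(core_a) != CORE_LENGTH or len(core_b) != CORE_LENGTH:
        raise ValueError("binding cores must be nine residues long")
    return sum(a == b for a, b in zip(core_a, core_b))


def is_cross_reactive_candidate(ebna_core, cns_core, cns_affinity_nm,
                                min_matches=3, max_affinity_nm=500.0):
    """Apply both criteria: CNS peptide binds (<= 500 nM) and cores share >= 3 residues."""
    return (
        cns_affinity_nm <= max_affinity_nm
        and core_matches(ebna_core, cns_core) >= min_matches
    )


# Illustrative only: a 9mer taken from within the 'AEG' peptide (not necessarily
# its predicted core) compared with a made-up core sharing its first three residues.
shared = core_matches("LRALLARSH", "LRAGGGGGG")
```

Ranking candidates by the number of shared core residues would then reproduce the ordering by "core match value" used to select the subset presented in the tables.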

Discussion
In this study we have demonstrated the feasibility of obtaining EBNA-1 sequences directly from buffy coat samples, without any requirement for primary cultures that could theoretically be associated with preferential selection of viral sequence variants through ex vivo expansion. In this regard our findings are in keeping with those of Burrows and colleagues [11], who demonstrated similar patterns of EBNA-1 sequence variation predominantly within the C-terminal region in both MS cases and controls, in a study that did involve primary B lymphocyte cultures. Both studies, as well as a more recent analysis of spontaneously outgrown human lymphoblastoid cell lines [65], are in agreement in demonstrating that the majority of autologous sequences do not align closely with the widely used B95-8 laboratory strain, a result that is perhaps not surprising given that this strain was originally identified following transfusion-associated EBV in an elderly woman and subsequently selected for its ability to efficiently immortalise B lymphocytes [66].
Our results are also in agreement with other studies that have identified similar patterns of EBNA-1 [67] and EBNA-2 [68] sequence variation when comparing MS cases and controls, albeit at low resolution in these cases, suggesting that MS susceptibility is not likely to be readily explained by an 'encephalitogenic strain' of EBV. We have also explored sequence variation within individual samples using next-generation sequencing techniques, to investigate if the presence of multiple viral sequence variants could indicate sites of immune selection pressure, as has been suggested previously for EBNA-1 [11], and/or that infection with multiple EBV strains could represent a risk factor for MS disease as has been previously proposed [69]. We identified nucleotide mixtures present at 10 percent in two samples. Mixtures were primarily caused by point mutations leading to amino acid changes in 12 different positions compared to silent mutations in five positions only. This could indicate EBV superinfection or viral immune escape. While we were able to identify low-level EBNA-1 sequence variation in these samples (involving 8.3% of nucleotides at a 1% threshold), our results do not support a strong influence of intraindividual EBV sequence variation in MS disease risk and we cannot exclude that some of these point mutations are due to technical artefacts. It is however interesting to note that EBNA-1 sequence conservation described in Fig 4 (reflecting genetic variation as well as the impact this has on amino acid properties) does appear to map to HLA-DRB1 binding regions, although we were unable to identify natural sequence variants that were associated with abrogation of HLA binding using the NetMHCIIPan prediction algorithm. However, in the HLA-viral sequence variation association analyses, we could identify eight out of nine EBNA-1 polymorphic nucleotide positions significantly associated with MS risk alleles within these HLA-DRB1 binding regions, including two in the previously described HLA-DRB1*15 'AEG' and 'MVF' epitopes respectively, noting in each case that the more common (wild-type) viral sequence was favoured in the presence of disease-associated HLA-DR alleles. These differences will be explored further in terms of their impact on epitope-specific CD4+ T-cell immune responses, acknowledging in relation to MS pathogenesis that important differences may relate to the selection of regulatory versus effector EBNA-1-specific CD4+ T cells [18], rather than simply reflecting immune evasion. We have also explored the potential for autoantigens to be selected by cross-reactive EBNA-1-specific T cells, according to a shared propensity for HLA-DRB1*15-restricted antigen presentation as well as evidence of sequence homology. This concept is in keeping with previous experimentally-proven examples of this phenomenon (albeit without identification of specific epitopes involved) [35], with additional support from observations that HLA-DRB1*15-restricted immune responses are characterised by a relatively high level of TCR degeneracy that would favour cross-reactivity [70].

Table 3 notes: Underlined is the core of the peptide. Binding score (nM) prediction using NetMHCII and NetMHCIIpan. Binding score <50: strong binder; binding score 50-500: weak binder. *Known CNP epitope [56]. **Known aB-crystallin epitope [57]. ***Known MBP epitope [58]. †Known MOG epitope [59]. ††Known OSP epitope [60]. doi:10.1371/journal.pone.0147567.t003

Table 4. Extended axoglial brain protein dataset with HLA-DRB1*1501 predicted brain epitopes overlapping with predicted EBV binders.
These results are preliminary and require experimental confirmation of their functional validity. The main purpose of this analysis was to create a platform for experimental design that acknowledges natural patient-derived EBNA-1 sequence variants as the basis for epitope selection, while also expanding the possibilities of identifying novel candidate CNS antigens that may have a role in MS pathogenesis. As noted by Ben-Nun and colleagues [55] and Lassmann and colleagues [71], MS research is increasingly moving away from reductionist experimental models towards an interest in a wide array of myelin and axoglial antigen targets, which would be in keeping with a model of MS pathogenesis that allows for cross-reactive T cell (as well as humoral) responses that are initially driven by viral-specific responses, with EBNA-1 representing a legitimate candidate target based on previous work [27,31-39].
These observations, along with continuing evidence of patient-specific heterogeneity of MS lesion pathology [72] and oligoclonal TCR repertoire [73] would support a model of MS disease pathogenesis in which virus-specific immunity, which is oligoclonal in nature as determined by viral sequence variation seen in this and other studies [9][10][11] as well as by polymorphic HLA-restricted antigen presentation, could then trigger cross-reactive autoimmune responses. We now hope to investigate these possibilities further, with a particular focus on the roles of both antigen-presenting B cells as well as antigen-specific T cells in provoking inflammatory immune responses. In this respect, we would anticipate that targeted T-cell immunotherapy is likely to require a patient-specific approach as recently performed by Pender and colleagues [37], while targeting EBV-infected B cells may have the potential to provide a more universal treatment strategy, particularly in light of recent evidence that antigen-experienced B cells within the central nervous system in MS cases are likely to be derived from the peripheral blood and lymph nodes [74,75].