Genetic Diversity in the Major Capsid L1 Protein of HPV-16 and HPV-18 in the Netherlands

Objectives Intratypic molecular variants of human papillomavirus (HPV) type-16 and -18 exist. In the Netherlands, a bivalent vaccine, composed of recombinant L1 proteins from HPV-16 and -18, has been used to prevent cervical cancer since 2009. Long-term vaccination could lead to changes in the HPV-16 and -18 virus population, thereby hampering vaccination strategies. We determined the genetic diversity of the L1 gene in HPV-16 and -18 viral strains circulating in the Netherlands at the start of vaccination in order to understand the baseline genetic diversity in the Dutch population. Methods DNA sequences of the L1 gene were determined in HPV-16 (n = 241) and HPV-18 (n = 108) positive anogenital samples collected in 2009 and 2011 among Dutch 16- to 24-year-old female and male attendees of sexually transmitted infection (STI) clinics. Phylogenetic analysis was performed and sequences were compared to reference sequences HPV-16 (AF536179) and HPV-18 (X05015) using BioNumerics 7.1. Results For HPV-16, ninety-five single nucleotide polymorphisms (SNPs) were identified, of which twenty-seven (28%) were non-synonymous variations. For HPV-18, seventy-one SNPs were identified, of which twenty-nine (41%) were non-synonymous. The majority of the non-silent variations were located in sequences encoding alpha helix, beta sheet or surface loops, in particular in the immunodominant FG loop, and may influence the protein secondary structure and immune recognition. Conclusions This study provides unique pre-vaccination/baseline data on the genetic L1 diversity of HPV-16 and -18 viruses circulating in the Netherlands among adolescents and young adults.


Introduction
Human papillomaviruses (HPVs) are small double-stranded DNA viruses that have the potential to infect mucosal and cutaneous human epithelial cells. Over 160 HPV types have been identified based on the DNA sequence of the L1 open reading frame, of which about 60 types infect the human anogenital tract [1] and are sexually transmitted. A persistent infection with any of at least 12 high-risk HPV types is associated with the development of human cervical cancer [2]. Of these high-risk types, HPV-16 and HPV-18 are responsible for about 70% of cervical cancers [3].
Naturally occurring intratypic molecular variants of HPV-16 and -18 have been shown to be specific to, or more prevalent in, certain parts of the world [4]. Intratypic molecular variants of HPV-16 and -18 are characterized as isolates of the same HPV type that differ in the nucleotide sequence of L1 by less than 10%. Despite their phylogenetic relatedness, HPV intratypic variants can differ in pathogenicity and in oncogenic potential [5][6][7][8][9][10]. For HPV-16, phylogenetic analyses revealed 4 major intratypic variant lineages: 1. European (E/A), 2. Asian-American (AA/D), 3. African-1 (Af-1/B) and 4. African-2 (Af-2/C) [1]. These variants have been shown to differ in persistence and progression to cancer. Intratypic variants of HPV-18 have also been described, but the evidence for differences in persistence or progression to (pre)cancer is less clear [1].
HPV vaccination of sexually naïve girls has been introduced in many countries in order to prevent cervical cancers caused by persistent infection with HPV-16 or -18. To what extent HPV vaccination will affect the distribution of HPV types and variants in a (partially) vaccinated population is not yet known. Therefore, monitoring of type-specific HPV prevalence before and after the start of HPV vaccination is of great importance. Previous HPV monitoring studies have shown that the three most prevalent high-risk HPV types (among women) at the introduction of HPV vaccination in the Netherlands were HPV-16, HPV-51 and HPV-52 [11][12][13].
Since 2009, a bivalent vaccine containing virus-like particles (VLPs) composed of pentamers of recombinant major capsid protein L1 from HPV-16 and -18 has been used for vaccination of young girls in the Netherlands. This HPV vaccine is known to elicit high-titre neutralizing antibodies directed against the L1 protein and to confer type-specific and long-lasting protection against persistent infection and cervical abnormalities caused by HPV-16 and -18 [14]. Additional protective vaccine efficacy against infections associated with HPV-31, -33 and -45 has been shown [15]. On the surface of the pentamers, specific loop structures of the L1 protein containing hypervariable immunodominant regions are exposed [16]. Polymorphism within these loops is likely to result in the generation of neutralizing antibodies of different binding affinities [17]. At present, it is not known whether immunity elicited against one HPV-16 or -18 variant can protect against infection with another variant with equal efficiency. Long-term vaccination with one HPV-16 or -18 variant of the L1 proteins could therefore lead to changes in the HPV-16 and -18 virus population induced by selective pressures.
In order to detect a long-term effect of L1 vaccination on the HPV-16 and/or -18 virus population, knowledge of the genetic diversity of the L1 gene in HPV-16 and -18 viruses circulating in the Netherlands at the start of vaccination is required. The aim of this study was to determine, at baseline, the genetic diversity in the L1 protein and the HPV variant distribution of circulating HPV-16 and -18 viral isolates found in young adolescents before the start of HPV-16/-18 vaccination.

Study population and design
Between February and April 2009, a cross-sectional study was started in 12 sexually transmitted infection (STI) clinics spread throughout the Netherlands prior to the introduction of vaccination. Follow-up surveys in this setting have been conducted every two years since 2009. For the results presented here only samples obtained in 2009 and 2011 were analyzed. The study population and methods have been described in detail previously [12,13]. The study was approved by the Medical Ethical Committee of the University of Utrecht, the Netherlands. This committee has confirmed in writing that they have waived the need for separate ethical approval and the need for written consent.

Clinical specimens, HPV DNA detection and Sample selection
Samples were obtained from anogenital (i.e. vaginal/penile/anal) swabs collected in 2009 and the follow-up year 2011 within the PASSYON (PApillomavirus Surveillance among Sti clinic YOungsters Netherlands) study among Dutch 16- to 24-year-old male and female attendees of STI clinics. Details of the PASSYON study have been described elsewhere [12,13]. All swabs were suspended in 1 ml universal transport medium buffer and stored at -20°C until processing. After thawing, swabs were vortexed and 200 μl of the sample was spiked with phocine herpes virus-1. DNA was subsequently extracted using the MagnaPure platform (Total Nucleic Acid Isolation Kit, Roche) and eluted in 100 μl elution buffer. HPV-DNA was amplified using the SPF10 primer set according to the manufacturer's instructions (DDL Diagnostic Laboratory, the Netherlands). HPV-specific amplicons were detected using the DNA enzyme-linked immunoassay (HPV-DEIA, DDL Diagnostic Laboratory, the Netherlands). Amplicons of HPV-positive samples were subsequently analyzed in the Line probe assay (HPV-LiPA, DDL Diagnostic Laboratory, the Netherlands) in order to determine the specific HPV type present in the sample. HPV isolates for sequencing were selected, based on successful L1 gene PCR results, from a large set of samples previously typed for HPV. These samples were analyzed further in order to determine the genetic diversity in the L1 gene.

HPV-16 L1 genetic diversity
L1 HPV-16 sequences were determined and analyzed by aligning the entire L1 gene from 241 HPV-16 positive samples collected in the PASSYON study rounds 1 (2009) and 2 (2011) [13]. Forty-six percent (46%) of all HPV-16 positive samples in these two rounds (Table 1), which are considered pre-vaccination samples, were sequenced in the present study. The characteristics of the total study population and the strains selected for sequencing are shown in Table 1. Phylogenetic analysis of all L1 sequences clustered the HPV-16 variants in two groups. The majority of the HPV-16 strains (223/241, 93%) had high similarity to the HPV-16 European reference (GenBank reference AF536179, lineage E/A) and other lineage A reference strains (Fig 1). A small subset (18/241, 7%) clustered with the African types (GenBank references AF472508 (Af-1/B), AF472509 (Af-2/C)) and Asian-American type D3 (AF402678). Of the 18 non-European HPV-16 variants, 11% were collected from persons who self-reported belonging to the Surinamese or Antillean population group, 50% from the Dutch population group and 33% from an unknown population group (not shown). The most common lineage E/A variant was detected in 31% (75/241, Fig 1) of the samples and differed from the HPV-16 European reference (GenBank reference AF536179) by 2 silent variations and from the HPV-16 vaccine strain (GenBank reference AF043286) by 3 silent variations.
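The grouping of strains with lineage references can be illustrated by a much simpler rule than the phylogenetic analysis performed in this study: assign each aligned sequence to the reference with the smallest proportion of differing sites. The sketch below is an illustration under that assumption only; the function names and the 12-nucleotide toy sequences are hypothetical and are not L1 data from this study.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def nearest_reference(query, references):
    """Return the name of the reference with the smallest p-distance to query."""
    return min(references, key=lambda name: p_distance(query, references[name]))


# Hypothetical toy references standing in for lineage reference sequences.
toy_references = {
    "A": "ATGTCTCTTTGG",
    "B": "ATGTCACTATGA",
    "D": "ATGAGTCTCTGG",
}
toy_query = "ATGTCACTATGG"

assigned = nearest_reference(toy_query, toy_references)
print(assigned, p_distance(toy_query, toy_references[assigned]))
```

A nearest-reference rule ignores tree topology and gives no measure of support, so it should be read as a rough screening step rather than a substitute for phylogenetic clustering.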
In total we identified ninety-five single nucleotide polymorphisms (SNPs) among the observed Dutch HPV-16 strains compared to the reference strain AF536179. Of these SNPs, 68/95 (72%) were synonymous variations and 27/95 (28%) were non-synonymous variations. The majority of the non-synonymous variations (21/27, 78%) were located in the L1 region encoding alpha helix, beta sheets, surface loops or connecting loops, in particular in the immunodominant FG loop, and may therefore influence the protein secondary structure. The positions of the non-synonymous variations are shown in Table 2. Twelve of these non-synonymous variations were located in structured areas of the L1 protein and, to our knowledge, have not been described previously. Ten of these twelve variations were each identified in one strain only; eight were found in strains similar to European strains and two in strains more similar to the non-European lineages. Resequencing confirmed the presence of the variations found infrequently (in one, two or three strains), as indicated in Table 2.
We calculated the non-synonymous/synonymous substitution rates for HPV-16 L1 amino acid residues. Based on the HPV-16 sequences analyzed here, the dN/dS ratio was 0.176, so no evidence for positive selection was found for the entire L1 protein (P<0.01).
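How a single nucleotide change is scored as synonymous or non-synonymous can be sketched as follows, assuming the standard genetic code and a known reading frame. The helper name `classify_snp` and the example changes in codon CCC are illustrative assumptions, not variants reported for HPV-16 in this study.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, position_in_codon, alt_base):
    """Classify one substitution inside a codon.

    position_in_codon is 1, 2 or 3. Returns (kind, reference amino acid,
    variant amino acid), where kind is 'synonymous' or 'non-synonymous'.
    """
    if position_in_codon not in (1, 2, 3):
        raise ValueError("position_in_codon must be 1, 2 or 3")
    i = position_in_codon - 1
    alt_codon = ref_codon[:i] + alt_base + ref_codon[i + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return kind, ref_aa, alt_aa


# Illustrative changes in a proline codon (hypothetical examples).
print(classify_snp("CCC", 3, "T"))  # third-base change, proline retained
print(classify_snp("CCC", 2, "G"))  # second-base change, proline to arginine
```

Applying such a rule to every SNP relative to the reference sequence yields the kind of synonymous and non-synonymous tallies reported above.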
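The estimator used to obtain dN/dS is not described in this section. As an illustration only, the sketch below applies a simplified form of the Nei-Gojobori counting method with a Jukes-Cantor correction to one pair of aligned, gap-free coding sequences. The toy sequences and all function names are hypothetical, changes to stop codons are counted as non-synonymous (one of several conventions), and a population-level estimate would combine many such comparisons.

```python
import math
from itertools import permutations, product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons(seq):
    """Split an in-frame coding sequence into codons."""
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    total = 0.0
    for i in range(3):
        for base in BASES:
            if base == codon[i]:
                continue
            mutant = codon[:i] + base + codon[i + 1:]
            if CODE[mutant] == CODE[codon]:
                total += 1 / 3
    return total


def count_differences(codon_a, codon_b):
    """Synonymous and non-synonymous changes, averaged over mutation orders."""
    differing = [i for i in range(3) if codon_a[i] != codon_b[i]]
    if not differing:
        return 0.0, 0.0
    paths = list(permutations(differing))
    syn = non = 0.0
    for path in paths:
        current = codon_a
        for i in path:
            nxt = current[:i] + codon_b[i] + current[i + 1:]
            if CODE[current] == CODE[nxt]:
                syn += 1
            else:
                non += 1
            current = nxt
    return syn / len(paths), non / len(paths)


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(seq_a, seq_b):
    """Return (dN, dS) for two aligned coding sequences of equal length."""
    ca, cb = codons(seq_a), codons(seq_b)
    if len(ca) != len(cb):
        raise ValueError("sequences must have the same length")
    s_sites = sum(synonymous_sites(c) for c in ca + cb) / 2
    n_sites = 3 * len(ca) - s_sites
    sd = nd = 0.0
    for a, b in zip(ca, cb):
        s, n = count_differences(a, b)
        sd += s
        nd += n
    return jukes_cantor(nd / n_sites), jukes_cantor(sd / s_sites)


# Hypothetical toy pair: one synonymous (CCC/CCT) and one
# non-synonymous (AAA/AGA) difference.
dn, ds = dn_ds("ATGCCCAAAGGT", "ATGCCTAGAGGT")
ratio = dn / ds
print(round(dn, 3), round(ds, 3), round(ratio, 3))
```

For the toy pair the ratio falls below one because non-synonymous sites are far more numerous than synonymous sites while the counted differences are equal; a ratio below one is the pattern interpreted in this study as absence of positive selection.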

HPV-18 L1 genetic diversity
L1 HPV-18 sequences were determined and analyzed by aligning the L1 contigs for a total of 108 HPV-18 positive samples from the PASSYON study, which is 35% of the HPV-18 positive samples (Table 1). Table 1 shows the characteristics of the total study population and the strains selected for sequencing. Phylogenetic analysis of all L1 sequences showed that the majority (93/108, 86%) of the HPV-18 viral strains were similar to the HPV-18 European reference (GenBank reference X05015, lineage E/A) (Fig 2). A smaller subset (15/108, 14%) clustered with the African types (GenBank references EF202154, EF202153, lineage Af/B). Of the fifteen non-European variants, 73% were isolated from persons with a self-reported non-Dutch origin, mostly Surinamese or Antillean; 27% were from persons who reported being Dutch, and for 0.3% the origin was unknown (not shown). The most common L1 sequence type was detected in 31/108 (29%) of the samples and differed from the HPV-18 European reference (GenBank reference X05015) by 8 (2 silent and 6 non-silent) variations and from the vaccine strain (GenBank reference AY863161) by 4 (1 silent and 3 non-silent) variations.
Reference codon: TCT, AAT, AAT, AAT, AAA, ACT, ACA, TCC, GCA, TTG, TCA, ACT, AAA, AAA
Amino acid change: S>P282, N>H285, N>T285, N>S290, K>R309, T>P353, T>S389, S>P396, A>V435, L>F474, S>P492, T>A497, K>T501
Seventy-one (71) single nucleotide polymorphisms (SNPs) were identified among the 108 Dutch HPV-18 strains relative to the reference strain X05015. Forty-two of these (42/71, 59%) were synonymous variations and 29/71 (41%) were non-synonymous variations. Fourteen of the non-synonymous variations (14/29, 48%) were located in sequences encoding the alpha helix, beta sheet, surface loops or connecting loops. The positions of the non-synonymous variations are shown in Table 3. To our knowledge, only 6 of the 29 non-synonymous variations have been described before.
Novel variations were mostly found in multiple viral strains (Table 3), in strains similar to both European and non-European lineages. All variations found in one, two or three strains were confirmed by resequencing.
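Nucleotide positions counted from the start of the L1 gene map onto codon (amino acid) numbers by integer division. The sketch below shows this arithmetic and checks it against three position pairs tabulated for the HPV-18 variations (nucleotide 89 with residue 30, 818 with 273, and 1474 with 492); the function name `codon_number` is a hypothetical helper, and 1-based numbering from the first nucleotide of the start codon is assumed.

```python
def codon_number(nt_position):
    """Map a 1-based nucleotide position to (codon number, position in codon).

    Both returned values are 1-based; position in codon is 1, 2 or 3.
    """
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    index = nt_position - 1
    return index // 3 + 1, index % 3 + 1


for nt in (89, 818, 1474):
    codon, pos = codon_number(nt)
    print(f"nucleotide {nt}: codon {codon}, base {pos} of the codon")
```

Knowing which base of the codon is affected makes it possible to check a reported amino acid change against the reference codon; for example, a change at the second base of CCC (proline) can yield CGC (arginine), consistent with P>R30.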
Non-synonymous and synonymous substitution rates for HPV-18 L1 amino acid residues were calculated based on the HPV-18 L1 gene sequences analyzed here and were shown to be significantly different (P<0.01). The dN/dS ratio was 0.172, which provides no evidence for positive selection across the entire L1 gene and suggests that it was subject to purifying selection.

Nucleotide position from start: 89, 163, 263, 308, 818, 829, 848, 967, 969, 1013, 1014, 1309, 1312, 1313, 1337, 1474
Amino acid position: 30, 55, 88, 103, 273, 277, 283, 323, 323, 338, 338, 437, 438, 438, 446, 492
Reference codon: CCC, GGT, ACT, GCT, CAA, ATT, CCT, GTT, GTT, CCC, CCC, GAA, AAT, AAT, AAG, ACT
Amino acid change: P>R30, G>S55, T>N88, A>V103, Q>P273, I>L277, P>R283, V>I323, V>I323, P>R338, P>R338, E>K437, N>H438, N>T438, K>R446, T>A492
Variation frequency (n): 3*, 1*, 1*, 6, 107, 10, 1*, 1*, 2*, 9, 17, 15, 59
Variation frequency (%): 2.8, 0.9, 0.9, 5.5, 99.1, 9.2, 0.9, 0.9, 1.8, 8.3, 15.7, 13.9, 54.6

Discussion
Phenotypic variants within HPV types have been recognized, with some evidence for geographic association of this diversity and differences in progression to cervical intraepithelial neoplasia grade 3 (CIN3). At present it is not known whether immunity to viruses within an HPV type is equally efficient for all variants, and there is some evidence that this may not be the case: although studies have suggested that immunization with L1 VLPs of a European variant induces antibodies able to neutralize different HPV-16 variants [21], other studies have shown evidence for variant-specific neutralizing epitopes [22]. Understanding such diversity is important, as the selective immunological pressure in a (fully) vaccinated population could lead to selective displacement of variants within the HPV types covered by the vaccines. Here we demonstrated the presence of HPV-16 and -18 L1 genetic diversity at the introduction of vaccination, which is important as a baseline for post-vaccination surveillance. In the past, several studies have addressed the intratypic variation of the major capsid protein from HPV-16 and/or HPV-18 [23][24][25][26]. Most of these studies focused on small fragments of the L1 gene or were performed with limited numbers of samples collected in Europe. Furthermore, no isolates collected in the Netherlands have been described previously in this regard.
In this study we have evaluated the HPV-16 and HPV-18 L1 diversity based on 241 and 108 full-length HPV-16 and -18 L1 sequences, respectively, obtained from HPV isolates collected in the Netherlands around the introduction of HPV vaccination. The sequenced strains were representative of the total group in the PASSYON study, since their characteristics did not differ significantly from those of the total group, except for gender. A relatively low number of male samples was sequenced, but we expect no gender-specific sequences. To our knowledge this is the largest study of the genetic diversity of the L1 gene in HPV-16 and -18 isolates collected in Europe.
As HPV variants have been shown to differ in geographic origin [4], it was to be expected that European HPV-16 and -18 variants (lineage E/A) would be identified most frequently in the Dutch isolates. Indeed, only a minority of our isolates, 7% and 14% for HPV-16 and -18 respectively, belonged to non-European variant lineages; these were similar to variants from (sub)lineages B1 and B2 (African 1a and 1b), C (African 2a), D1 (North-American), D2 and D3. We found that all persons with a self-reported Surinamese or Antillean origin whose HPV-18 variant was analyzed carried a non-European HPV-18 variant. An earlier study described that the distribution and persistence of HPV-16 and -18 variants may be related to the racial composition of individuals [27]. The present study supports this observation: the distribution of HPV-18 variants in the Netherlands seems to be ethnicity related, although the number of isolates from non-European variants was very small and the information on population group is based entirely on self-reported data. For HPV-16 variants, this ethnicity-related distribution was not seen in our Dutch isolates. It is not known whether the ethnicity-associated variant distribution is caused by (long-term) mixing patterns in the population or whether genetic factors that preferentially predispose persons are involved [27].
Our study identified known and new variants of HPV-16 and -18, with amino-acid variations in or close to epitopes. The new variants of HPV-16 and -18 were found in both the European and the non-European groups of viral strains. Whether these changed L1 proteins are less susceptible to vaccine-induced immunity is unclear at this time. In general, human papillomaviruses mutate slowly [28] because they are double-stranded DNA viruses and use the proofreading DNA polymerase of their host. Nevertheless, nucleotide polymorphisms can occur and become established in the population. Based on what is known about the evolution of the HPV genome, it is considered unlikely that HPV will show significant change in response to human vaccination [29]. Accordingly, in our study population, in which HPV isolates collected around the introduction of HPV vaccination were studied, we found no evidence for positive selective pressure on the HPV-16 L1 gene (dN/dS ratio 0.176) or on the HPV-18 L1 gene (dN/dS ratio 0.172). Both ratios were less than one, indicating that these sequences were under purifying selective pressure in the pre-vaccination period.
Some study limitations should be noted. First, we sequenced a relatively small number of HPV strains; in particular, the numbers of strains sequenced from males and from persons in the non-Dutch population group were low. Specific variants are not expected in males but have been seen in persons from the non-Dutch population group. A second limitation is that we selected swabs collected in 2009 and 2011 for studying the pre-vaccination genetic diversity. However, vaccination was introduced in the Netherlands in 2009 in a catch-up campaign for girls aged 14 to 16 years, before its introduction for 12-year-old girls in 2010. It is therefore possible that samples collected in 2011 from 16- to 18-year-old girls came from vaccinated girls. Since the number of strains collected in 2011 from 16- to 18-year-old girls is very low and the vaccination uptake at that time was about 50%, we believe that vaccination effects have caused no (or very little) dilution of the pre-vaccination genetic diversity.
Whether we will in the future observe the emergence or expansion of (new) HPV-16 and/or HPV-18 variants caused by the selective pressure induced by mass vaccination will remain unclear until investigated. Knowledge of the genetic diversity of HPV-16 and -18 at the start of vaccination (baseline), as presented in this study, is essential in order to understand the genetic variability of these proteins over time.

Supporting Information
S1 Table. Primers used for the molecular characterization of HPV-16 and HPV-18 L1 gene. (XLSX)