Mucosal and Cutaneous Human Papillomaviruses Detected in Raw Sewages

Epitheliotropic viruses can find their way into sewage. The aim of the present study was to investigate the occurrence, distribution, and genetic diversity of Human Papillomaviruses (HPVs) in urban wastewaters. Sewage samples were collected from treatment plants distributed throughout Italy. The DNA extracted from these samples was analyzed by PCR using five PV-specific sets of primers targeting the L1 (GP5/GP6, MY09/MY11, FAP59/64, SKF/SKR) and E1 regions (PM-A/PM-B), according to the protocols previously validated for the detection of mucosal and cutaneous HPV genotypes. PCR products underwent sequencing analysis and the sequences were aligned to reference genomes from the Papillomavirus Episteme database. Phylogenetic analysis was then performed to assess the genetic relationships among the different sequences and between the sequences of the samples and those of the prototype strains. A broad spectrum of sequences related to mucosal and cutaneous HPV types was detected in 81% of the sewage samples analyzed. Surprisingly, sequences related to the anogenital HPV6 and 11 were detected in 19% of the samples, and sequences related to the “high risk” oncogenic HPV16 were identified in two samples. Sequences related to HPV9, HPV20, HPV25, HPV76, HPV80, HPV104, HPV110, HPV111, HPV120 and HPV145 beta Papillomaviruses were detected in 76% of the samples. In addition, similarity searches and phylogenetic analysis of some sequences suggest that they could belong to putative new genotypes of the beta genus. In this study, for the first time, the presence of HPV viruses strongly related to human cancer is reported in sewage samples. Our data increases the knowledge of HPV genomic diversity and suggests that virological analysis of urban sewage can provide key information useful in supporting epidemiological studies.


Introduction
Papillomaviridae (PV) is a family of small epitheliotropic viruses of approximately 50-60 nm, with circular double stranded DNA genome 7-8 kb long, detected in all vertebrates. This family contains 16 genera named with the letters of the Greek alphabet. Human Papillomavirus (HPV) strains are classified into 5 genera: alpha (a), beta (b), gamma (c), mu (m), and nu (n). The HPV members of the a genus primarily infect oral and genital mucosal surfaces and external genitalia, while HPVs belonging to the b, c, m, and n genera infect non-genital mucosa and skin. Papillomaviridae is a rapidly growing family of viruses. In fact, most of the sequences of new viruses from humans and other vertebrates, that are uploaded on to databases, belong to this family [1]. Among the 120 HPV genotypes detected so far in the a genus, 30 infect anogenital epithelia and are the cause of sexually transmitted diseases (STDs). Of these, 15 have oncogenic potential and are called high-risk (HR). Women and men involved in the transmission of HPVs can be both asymptomatic vectors and victims of these infections. The HR genotypes HPV16, 18,31,33,35,39,45,51,52,56,58, and 59 have been recognized as causal agents of cervical cancer (CC), the second most common cancer among women worldwide [2]. The genotype HPV16 is detected in 61% of CC clinical samples [4]. HPV16 and 18 have also been found to cause vaginal, vulval, anal and penile cancers. Moreover, half of oro-pharyngeal cancers are linked to HPV16 [2,3]. Seven genotypes, HPV26, 53, 66, 67, 70, 73, and 82, could also be considered as probable carcinogenic candidates, while HPV6, 11,40,42,43,44, 54, 61, 72 and 81, causing anogenital warts, are considered low risk genotypes (LR-HPVs) [4]. It is worthwhile to note that genital warts represent a heavy burden among the female population; these symptoms are usually the impetus for the initial presentation by patients in consulting gynecologists. HPV6 and 11 are the most common genotypes detected in oro-pharyngeal cancer after HPV16, suggesting that these anogenital LR HPVs may indeed be malignant for the oral mucosa [5].
Skin HPVs are ubiquitous viruses involved in a variety of skin pathologies [6] but are also detected at a high prevalence in the normal skin of healthy subjects [7,8]. Skin warts are caused most frequently by HPV1, 63 (m genus), HPV2, 3, 4, 7, 10, 27, 57, 66 (a genus), and HPV4, 60, 64, 65, 75-77 (c genus) [4,6]. The association between b-HPVs and skin cancer was first recognized in patients with epidermodysplasia verruciformis (EV), a rare genetic disease. The EV-HPVs (HPV5, 8, 12, 14, 15, 19-25, 28, 29, 36-38) have been linked to non-melanoma skin cancer (NMSC), a pathology particularly frequent in immunosuppressed patients such as HIV-infected individuals and people receiving organ transplantation. Although some genotypes (HPV5, 8 and 38) display transforming activities in several experimental models [9,10], epidemiological studies do not yet support any single genotype causing skin cancer in the general population [4]. HPV colonization of healthy skin occurs very early in life, and increases with age, as well as the prevalence of b and c HPV species [11]. Conversely, the acquisition of specific immunological competence increases for the HPVs linked to NMSC while it decreases for those causing skin warts, as monitored by serological studies [12,13]. Cutaneous HPVs persist in hair follicles, suggesting these sites as possible reservoirs. In contrast to CC, where HPV DNA is always present [4], no HPV DNA genome, or only few copies of it, persist in skin cancers. Interestingly, precursor lesions of skin cancer such as actinic keratosis show high viral loads compatible with a carcinogenic role of b-HPVs in the early events of cancer development [11].
HPV infections occur due to a failure of the innate or adaptive immune system. Immunosuppressed and HIV-infected individuals have a high prevalence of HPV infections, considered as opportunistic infections in AIDS. These infections do not appear to be diminished by HAART in HIV-positive individuals [14].
WHO recognizes the importance of CC and other HPV-related diseases as global public-health problems and recommends adopting and implementing routine vaccination in national immunization programs. Recent results indicate that the elimination of the genotypes included in Gardasil (HPV16, 18, 6 and 11) or Cervarix (HPV16, 18) vaccines may alter the evolutionary trajectory of circulating viruses and promote the evolution of new genotypes [15]; therefore, identification of the genotypes circulating in the population is of great importance for the development of both preventive and therapeutic programs of the HPV-associated diseases.
In the past few years, several studies have demonstrated the benefit of environmental surveillance as an additional tool in determining the epidemiology of different viruses circulating in a given community [16][17][18][19][20][21][22][23]. Sewer systems collect pathogens excreted in a range of body fluids from a wide area into a central facility. Therefore, the monitoring of centralized wastewater allows the detection of natural, accidental, or intentional contamination events [24]. Moreover, untreated wastewater provides a rich matrix in which novel viruses could be identified at the same time as studying virus diversity.
Recently the presence of human Polyomaviruses (HPy) has been described in urban sewage systems, enabling the assessment of the excretion levels and the potential risks of waterborne transmission by these viruses [25]. Moreover, the exploration of viral diversity by deep sequencing nucleic acids obtained from raw sewage, allowed the identification of 234 known viruses, including the newly discovered HPV112 and the HPy6 [26]. Both these viruses are skin-related, suggesting that viruses of human skin as well as those of stools can find their way into sewage.
The objective of the present study was to investigate whether DNA sequences of HPVs can be recovered from urban wastewaters, as has already been reported for other non-enteric viruses, and to assess the occurrence and distribution of HPV genotypes.

Sewage samples, concentration and DNA extraction
Forty-two inflow grab samples were collected from March 2011 to October 2011 at 14 Wastewater Treatment Plants (WTPs) of 8 cities from north to south of Italy: Torino, Genova, Venezia, Bologna, Roma, Cagliari, Bari, and Palermo. All the necessary permits were obtained for the described field study. Sample collection was performed in collaboration with the Italian Regional Agencies for Environmental Protection and Prevention (ARPA). Upon arrival each sample was named by a Sample ID (1528 to 1771), a unique individual identifier. Twenty ml of each sample were purified with glycine/chloroform and 10 ml used for DNA extraction by the NucliSens miniMAG nucleic acid isolation kit (BioMerieux). DNA was then eluted in 100 ml buffer and stored in aliquots at 280uC until use. A PostgreSQL database was previously created to keep track of all the primers, PCR products and samples used; an internet connection to this database is available for registered users at https://cosmos.bio.uniroma1.it. A murine Norovirus was added to the samples as an internal control to both calculate viral recovery efficiency and check for potential inhibitors by quantitative and qualitative PCR (data not shown), in case of negative PCR results.

PCR assays and sequencing
Samples were analyzed using different sets of previously described primers, listed in Table 1, designed to detect both mucosal and cutaneous PVs. Each of the four diagnostic PCR assays, here called Method A, B, C and D, was performed using a 2-step amplification to either increase sensitivity or mitigate the inhibitory effect of substances possibly present in the samples. PCR conditions were as previously described, with some modification.
Method A. the MY11/MY09 [27] primers were used for the first cycle of amplification and GP5+/GP6+ [28] for the nested reaction. The amplicon of 138 bp maps in the L1 major capsid protein-coding region. We also performed a hemi-nested assay (GP5+/MY09 primers) obtaining a 406 bp fragment. For each PCR reaction, 3 ml of the extracted DNA and 22 pmol of each primer were used in a final mixture of 25 ml, with 12.5 ml of GoTaq Green Master Mix 26 (Promega). Two ml of this mixture were subjected to a second round of PCR with 40 cycles of amplification. Samples positive to this method are reported in Table 2.
Method B. the FAP59 and FAP64 degenerate primers were designed to target the L1 major capsid protein-coding region [29]. The first cycle of PCR was performed as a touchdown PCR to reduce nonspecific amplification. Two ml of this mixture were subjected to a second PCR round with annealing at 50uC, using the same primers to further amplify the fragment of interest, 478 bp long.
Method C. two pairs of degenerate primers SKF1/SKR1 and SKF2/SKR2 were used in a multiplex PCR obtaining a 210-238 bp fragment of the L1 region. The assay is able to detect b, c and m HPVs causing the cutaneous warts. A touchdown PCR was used in the first cycle like in method B. Two ml of this mixture were subjected to a second round of PCR with annealing at 45uC [30].
Method D. the PM-A/PM-B primers designed to detect 25 b-PVs were used [31]. The PCR assay amplifies a 117 bp fragment of the E1 helicase-coding region. The samples positive to method D are listed in Table 3.
All standard precautions adhering to strict laboratory practices were followed to prevent any PCR contamination. PCR reactions were purified with Montage SEQ96 Sequencing Reaction Cleanup kit (Millipore Corporation) and labeled using ABI-PRISM BigDye Terminator Cycle Sequencing kit (Applied Biosystems). Unincorporated dye terminators were removed using a Montage SEQ96 Sequencing Reaction Cleanup Kit (Millipore Corporation). Sequencing analyses were performed in a capillary automatic sequencer (ABI-PRISM 310 Genetic Analyser, Applied Biosystems).
In order to identify the PV genotypes, the PCR products of the expected size were both directly sequenced and cloned into the pGEM-T vector (Promega). Recombinant DNA plasmids were extracted by FastPlasmid Mini kit (Eppendorf) and sequenced (3 clones per sample) by vector-specific primers. The sequences obtained from different clones are marked with an alphabetical code (ID sample ''a'', ''b'') in Tables 2 and 3

Genotyping and phylogenetic analysis
The sequences obtained from PCR amplicons were aligned to prototype sequences of the Papillomavirus Episteme database Phylogenetic analysis was performed using MEGA software version 4.0 [32]. Nucleotide sequences were aligned using the Clustal W algorithm. The phylogenetic tree was constructed using the Maximum Likelihood method based on the Kimura 2parameter model, integrated into the MEGA software. The

Results
Inflow grab samples were analyzed by PCR using four PVspecific molecular methods employing the MY09/MY11 and GP5/GP6 (Method A), FAP59/64 (Method B), SKF/SKR (Method C), and PM-A/PM-B (Method D) set of primers as described in Materials and Methods. Nucleic acid extraction efficiency, calculated on randomly selected samples, showed an average recovery exceeding 35%. PCR inhibitors were not detected in the negative samples. The amplicons were both directly sequenced and sequenced after cloning. In order to identify the more closely related PV genotypes, the sequences were submitted to BLAST analysis and compared to both the interactive Papillomavirus Episteme database (PaVE), a database for the Papillomaviridae family containing 241 reference complete genomes, and to the NCBI GenBank database.
Only methods A and D gave positive results, summarized in Tables 2 and 3, which report sample ID numbers, sites of the WTPs, PCR results, the more closely related prototypes and percentage of nucleotide (nt) identity. Eighty-one percent of the samples (34/42) were HPV-positive using method A or D, or both. PCR assays using methods C and B resulted in multiple, nonspecific amplification products. In Table 2, thirty-eight % (16/42) of the samples were positive by method A. Three sequences related to the a-types HPV6, 11, and 16, and 2 sequences related to the btypes, HPV25 and 120, showing more than 97% of nucleotide (nt) identity with the PAVE sequence database, were identified by sequence analysis. In some cases, different results were obtained from the same sewage sample by either GP5+/GP6+ in nested or GP5+/MY09 in hemi-nested PCR assays ( Table 2). Out of the 16 positive samples, 9 were positive by both the assays, 6 were positive only by hemi-nested and one was positive by nested PCR. The 406 bp amplicons detected by hemi-nested PCR in the samples ID 1552, 1568 and 1570, showed less than 89% of nt identity with prototypes of the PAVE database (see Table 2), related to HPV107 and HPV120. The comparison of these sequences with the GenBank database, showed greater matches for samples 1552 and 1568: the first showed 96% nt identity with a HPV partial sequence, previously detected in an esophageal carcinoma sample from China, submitted as an unclassified sequence (AJ000150); the second showed 100% of nt identity with an unclassified sequence of a human clinical oral sample from USA (AF048746), thus making it impossible to assign these sequences to known genotypes. As for sample ID 1570, similarity search against PaVE or GenBank gave an identical result which was of 88% nt identity with the reference strain HPV120 (see Table 2). Since the L1 sequences of the 1552, 1568, and 1570 samples differ by more than 10% with prototype genomes, we speculate that they may belong to putative new genotypes [1,33].
Different results were obtained by sequencing the nested and hemi-nested products. In one case (sample ID 1568, Table 2), a sequence related to HPV25 and a putative novel genotype was detected by nested and hemi-nested assay, respectively. In samples 1530 and 1545 (Table 2, nested PCR results) different sequences belonging to the same genotype were detected by direct sequencing or sequencing after cloning of the PCR products. These results suggest the co-presence of several strains in a single sewage sample.
The results of the phylogenetic study, performed on the sequences obtained by method A (L1 region), are presented in Figure 1. The tree includes the sequences obtained in this study, in addition to the prototype sequences from the PaVE database and from GeneBank (see figure legend). Sequences from sewage samples are grouped into two different clusters corresponding to alpha and beta HPVs, according to the genotyping results by BLAST analysis. The alpha cluster is subdivided into two main groups, referred to as a9 (HPV16, 98% bootstrap) and a10 (HPV6 and 11, 91% bootstrap) species. Sewage samples form wellsupported groups with their corresponding prototype strains: HPV6 (ID 1539, 1545, 1547, 1550 and 1565), HPV11 (ID 1530, 1542 and 1549), and HPV16 (ID 1543 and 1556). In the b cluster, samples 1528, 1546 and 1535 are grouped with the prototype HPV120 (73% bootstrap); sample 1568 clusters with HPV25 (71% bootstrap). The HPV sequences detected by hemi-nested PCR in ID 1552 and 1568, showing percentages of nt identity lower than 89% with the HPV107 and HPV120 prototypes from PaVE databank, cannot be clustered in supported groups with these genotypes, but are grouped with the unclassified partial sequences AJ000150, and AF048746 from GeneBank, mentioned above (bootstrap 100% and 96% respectively). Not even the partial L1 sequence of sample ID 1570 can be placed with certainty in a definite cluster. This leads us to believe that in all these cases the phylogenetic analysis supports the hypothesis that these sequences could belong to putative new HPV genotypes.
The results of the phylogenetic study performed with the sequences obtained by Method D, are presented in Figure 2. All b-HPV prototype sequences from the PaVE database were included into the tree to make an accurate comparison (see figure legend). The prototype sequences segregate into separated branches of the tree, thus indicating that partial sequencing of the E1 region permits to distinguishing among different genotypes. All sequences showing more than 94% of nt identity with the PaVE prototypes can be clustered with the corresponding reference genomes, confirming the results obtained by BLAST analysis, even if the bootstrap values in several branches are low (see Figure 2). Of note, the 32 sequences sharing a percentage of nt identity lower than 90% with HPV9, HPV104, and HPV113 (unassigned 1-7, Table 3), formed a well-supported group (91% bootstrap) but did not cluster with any of these prototypes (Figure 2, top). The genetic variability within the unassigned sequences, assessed by computation of pairwise distances, was 0.016 (number of base substitutions per site), attesting that these sequences may belong to the same genotype, possibly a not yet discovered b-PV.
The sequences detected in this study have been submitted to GeneBank with the following accession numbers: HE80564 to HE805665.

Discussion
The objective of the present study was to investigate the occurrence and genetic diversity of HPVs in urban wastewaters through molecular screening of raw sewage samples. The results showed the presence of sequences related to a-HPVs (HPV6, 11,16) and b-HPVs (HPV9, 20, 25, 76, 80, 104, 110, 111, 120, and 145). In our study, human Papillomavirus belonging to c, m or n genera were not detected.
The different assays used allowed us to establish that multiple virus genotypes and multiple virus strains belonging to the same genotype (up to three different sequences) were present in a single sewage sample, confirming that urban wastewaters represent a rich matrix for studying viral diversity.
Our data show that the MY11/MY09 and GP5+/GP6+ primer sets, mapping in the major capsid L1 protein (method A) and validated on clinical samples, are powerful tools for HPV detection even in sewage samples. The use of hemi-nested primers in addition to nested primers, allowed us to increase the number of genotypes detectable and to detect a higher percentage of positive samples, in agreement with data obtained by other authors [34]. Surprisingly, the low risk HPV6 and 11 were the most prevalent a genotypes detected in sewage samples. The high risk HPV16 genotype was detected in two samples. It is noteworthy that HPV16, 6 and 11 are the genotypes most frequently detected in cancers of the oral cavity [2,4,5,11]. In addition to viruses of the a genus, the L1-primers were also able to detect viruses belonging to the b-group species such as HPV25 (b1) and HPV120 (b2). The 405 bp sequences found in the ID samples 1552, 1568 and 1570 showed a percentage of identity lower than 88% with HPV107 and HPV120 prototypes of PaVE database and could not be clustered in supported groups with these reference genomes in the phylogenetic tree. On the basis of the results obtained by BLAST and phylogenetic analysis, these sequences could belong to putative novel genotypes. In fact, according to the guidelines for PV nomenclature (Study Group of Papillomaviruses of the International Committee on Taxonomy of Viruses), a PV genotype species differs in the L1 gene sequence by at least 10% from the genes of other known genotypes [1,33]. However, in order to be officially recognized as a unique genotype, the complete genome must be sequenced and deposited in the form of clones to the Reference Centre for Papillomaviruses in Heidelberg, Germany [1,33]. However, sequencing of the full genome will not be an easy task due to the complexity of the urban wastewater matrix. Putative new HPV genotypes are continuously being discovered within the Papillomaviridae family, and an even larger number is expected to exist. Therefore a great effort is required to characterize these viruses and elucidate their implication in human health.
Method B and C, based on FAP and SK couples of primers, did not give any positive result. In fact, the PCR amplification products consisted of a mixture of fragments differing in length that hampered the identification of HPV amplicons of the expected size. These tests use degenerated primers which, in a complex matrix such as wastewater, may amplify non-specific products. Wastewater indeed represents the most challenging matrix for molecular identification of pathogens, due to the presence of a spectrum of microorganisms, cell debris, proteins, lipids, and various potential inhibitors.
The phylogenetic tree constructed in the E1 region allowed us to cluster some sequences with the reference genotypes with a high degree of reliability, confirming BLAST results, even if the bootstrap values in several branches were low. The E1 region targeted by the PM-A/PM-B primers codes for the helicase gene, involved in viral replication. Since this region is highly conserved in PVs, it has a lower discrimination power compared to the major capsid protein L1, which is indeed the region of choice for HPV genotyping and classification.
Here we demonstrate that this primer pair has an even broader range of action, being able to detect also the recently classified beta HPV 104, 110, 111 and HPV145 [1].
The majority (78%, 32 sequences) of E1 amplicons obtained by Method D, were not assigned to any HPV genotype by both sequence similarity searches and phylogenetic analysis. These unassigned sequences showed a very low genetic diversity from each other, indicating that they may belong to the same genotype, still unknown. This virus could represent a beta genotype ubiquitously present on human skin, which explains its presence in the vast majority of our WTPs.
The unexpectedly high HPV DNA prevalence in raw sewage samples probably reflects a high diffusion of HPVs in the human population. Cutaneous and mucosal viruses can find their way into sewage through both healthy and torn skin, and washing of mucous secretions. It is well known that the treatment procedures cannot completely eliminate viruses present in wastewater [19,35,36], thus permitting their spread in the environment. However, the presence of HPV-DNAs in sewage samples does not necessarily imply the infectivity of these samples. Unfortunately, the absence of conventional in vitro systems able to support the HPV replication prevents the study of infectivity for any HPV isolated from any source. Since DNA viruses show a high resistance to inactivating treatments and a slow loss of infectivity [37], we cannot exclude the possibility that infectious HPVs spread in the environment even after wastewater treatments and cause infections through contact with skin or mucosa. The HPV16, 6 and 11 are known causes of sexually transmitted infections; however, current evidences suggest that these HPVs can also be transmitted non-sexually. In fact, cases of vertical and horizontal transmission of HPVs and autoinoculation through contact of the body have been clearly documented for HPV16, 6 and 11. Therefore, the HPV presence in environmental samples opens up the possibility of a fomite or waterborne transmission. To our knowledge, no evidence has been reported that contaminated toilet seats, doorknobs, towels, soaps, swimming pools, or recreational waters, can transmit anogenital HPVs. Nevertheless, these modalities of infection could explain some cases of anogenital HPV transmission in children, virgins and nuns, none of them being involved in sex-abuse or sexual intercourse [4,38]. Certainly, understanding the implications in human health of the presence in the environment of oncogenic viruses including HPVs, and the potential viral waterborne transmission are great challenges for future studies [39].
Regarding the genotypes detected in our study, the presence of HPV16, 6 and 11 in sewage samples probably reflect the high pathogenicity of these viruses. On the other hand, the epidemiology of the b-HPVs is less known. Several genotypes detected in our study are EV-HPVs (HPV9, 20, 25, 80) that have been recently linked to precursor lesions of skin cancer with HPV110, recently classified [40]. We noticed that, with the exclusion of HPV76, the b-HPVs found in our study have been detected also in mucosal tissues; consequently, the majority of the HPVs found in wastewaters were of mucosal origin [4,41]. We were not able to determine whether this was due to screening a relatively small number of samples or whether it reflected higher viral load and greater resistance of mucosal HPVs compared to cutaneous HPVs, in wastewaters.
The present study is the first report on the occurrence of HPV viruses, strongly related to human cancer, in sewage samples, and increases our knowledge on the HPV genomic diversity.
In conclusion, our results suggest that the surveillance of sewer systems, successfully employed for monitoring of enteric viruses, could be also applied to epitheliotropic viruses such as the oncogenic HPVs. Although studies on a larger-scale are required to obtain a well-defined picture of the HPVs present in environmental samples, our study contributes significantly to this effort. The environmental data, integrated and compared to published clinical data, can be used in statewide surveillance programs to monitor prevalent, novel and emerging genotypes circulating in the community and to assess any discrepancies with genotypes included in the current vaccines.