Novel Bovine Papillomavirus Type Discovered by Rolling-Circle Amplification Coupled with Next-Generation Sequencing

Currently, fifteen bovine papillomavirus (BPV) types have been identified and classified into four genera: Deltapapillomavirus, Epsilonpapillomavirus, Dyoxipapillomavirus, and Xipapillomavirus. Here, the complete genome sequence of a new BPV type (BPV 04AC14) recovered from a papillomatous lesion is reported. The genome is 7,282 bp in length and exhibits the classic genetic organization and motifs of the members of Papillomaviridae. Maximum likelihood phylogenetic analyses revealed that BPV 04AC14 clusters with members of the Xipapillomavirus genus. The nucleotide sequence of the L1 capsid protein of the novel BPV is closely related to its counterpart, BPV3, with which it shares 79% similarity. These findings suggest that this virus is a new BPV type of the Xipapillomavirus genus.


Introduction
Papillomaviruses (PVs) are small viruses whose genomes consist of double-stranded DNA molecules of approximately 8 kb; PVs are widely distributed and probably infect all amniotes [1]. Most PVs are part of the skin microbiota; however, in some cases, infections by certain types manifest in distinct clinical presentations, from highly productive, self-limited warts to invasive cancers [2]. In cattle, bovine papillomavirus (BPV) infections are probably primarily asymptomatic, although on occasion, certain BPV types can induce skin warts or neoplasias in the mucosa of the urinary bladder and upper digestive tract [3,4].
The multiply primed rolling-circle amplification (RCA) strategy has been successfully used for the identification of the circular genomes of a number of viruses [8], e.g., anelloviruses [9], circoviruses [10] and papillomaviruses [11]. The method utilizes bacteriophage ϕ29 DNA polymerase for the selective amplification of circular DNA [8]. Unlike PCR, the primers used in the amplification reaction are exo-resistant random. Therefore, as the technique does not need any specific primer, previous knowledge of the nucleotide sequences is not necessary. Furthermore, ϕ29 DNA polymerase has linear kinetics at 30°C, eliminating the need for thermal cycling. By strand displacement synthesis, repeated copies of the complete genome are synthesized, leading to a high molecular mass double-stranded DNA.
In 2014, a specimen consisting of papillomatous-like lesions was received in the laboratory for the production of a BPV autogenous vaccine. Confirmation of the presumptive diagnosis and typing of the BPV involved was performed by PCR with consensus PV primers and Sanger sequencing [12]. Nucleotide sequencing indicated a BPV type not previously described. The full genome of this novel BPV type was recovered directly from the papillomatous lesions by multiply primed rolling-circle amplification (RCA) followed by NGS. The genome sequence was characterized and the phylogenetic relationship between this novel BPV and the other BPVs was determined.

Sample
A specimen consisting of~20 grams of papillomatous lesions (skin warts; sample 04AC14) was received in the lab. The specimens were derived from one animal in Acre State (within the Brazilian Amazon region), clinically diagnosed as papillomatosis, and collected for the production of a BPV autogenous vaccine. The lesions were removed using scalpels after local anaesthesia (performed with 2% lidocaine). The sample was individually wrapped and stored at −20°C for DNA extraction and in 10% buffered formaldehyde for histopathological analyses. DNA extraction, PV consensus PCR and Sanger sequencing was present in S1 File.

Histopathology
Tissue were fixed in 10% buffered formalin, trimmed, and processed routinely for histopathology. Tissue sections were cut at 3 μm and stained with haematoxylin and eosin (HE).

Rolling-circle amplification (RCA)
Multiply primed rolling-circle amplification (RCA) was performed as previously described [10,13]. Briefly, 100 ng of total DNA (2 μL of 50 ng/μL solution) from papillomatous tissue was denatured at 95°C for 5 minutes and immediately cooled on ice. Twenty-three microlitres of a previously prepared solution containing 1.5 mM of each dNTP (Invitrogen), 6.2 mM random exonuclease-resistant hexanucleotides (Thermo), 2 U of ϕ29 DNA polymerase (Thermo) and 2.5 μL of reaction buffer [50 mM Tris/HCl pH 7.5, 10 mM MgCl 2 , 10 mM (NH 4 ) 2 SO 4 , 4 mM dithiothreitol] were added to denatured DNA. The amplification solution was incubated for 18 hours at 30°C, followed by 10 min at 65°C to inactivate the enzyme. The amplicon was electrophoresed in a 0.8% agarose gel and visualized on a UV light source after ethidium bromide staining. The RCA products were purified with a commercial kit (GFX™ Purification Kit; Amersham Biosciences).
High-throughput sequencing and sequence analysis RCA DNA was tagged and fragmented using the Nextera DNA Library Prep Kit (Illumina) according to the manufacturer's instructions for the preparation of DNA libraries. After amplification via a limited-cycle PCR program, PCR cleanup was performed with Agencourt AMPure XP beads (Beckman Coulter). The library was sequenced in a MiSeq System (Illumina) using a MiSeq Reagent Kit V2 (2x150 cycles).
The data were de novo assembled using SPAdes genome assembler (version 3.6) [14]. Open reading frame (ORF) predictions and genome annotations of the 04AC14 genome were performed with the aid of Geneious software (version 8.1.4). Gene and protein comparisons were performed in the programmes BLASTn and BLASTp. Sequence of the BPV 04AC14 was deposited in GenBank under accession number KX098515.

Phylogenetic inferences
Local sequence alignments were constructed to determine the sequence identity with BLASTn [15]. Representative PV sequences were retrieved from GenBank. Nucleotide alignments were performed using MUSCLE software [16].
The best selection model to generate the phylogenetic trees was selected with the programme Modeltest 3.7 [17]. A phylogenetic tree with 1000 bootstrap resamples of the alignment data sets was generated using the Maximum Likelihood method in MEGA5 [18]. Bootstrap values (based on 1000 replicates) for each node are given.

Results and Discussion
De novo sequencing and genome assembly Illumina MiSeq (2×150 cycles run) generated a total of 110,820 high quality paired-end reads. These sequences were de novo assembled into 3,885 contigs using SPAdes 3.6 assembler. These contigs were analysed using BLASTn/BLASTx with the National Center for Biotechnology Information (NCBI) databases. One contig (named Node_1) related to Papillomaviridae with a circular genome of 7,282 nucleotides (nt) was identified. This contig was composed by 857 reads (mean coverage~17). The remaining contigs were either related to the bovine genome or to unknown sources.

Genomic characterization of a new BPV type
The PV recovered genome is 7,282 bp in length, arranged in a circular DNA molecule with a GC content of 42.5%. Seven BPV ORFs were identified in the positive strand: five of those corresponded to the genes coding for early proteins (E8, E7, E1, E2 and E4) and two genes (L1 and L2) coded for the late capsid proteins. A long control region (LCR) of 399 bp was recognized between the L1 and E8 gene, located at nt positions 6,884-7,282. The major genome features of 04AC14 BPV are shown in Fig 1. Whole-genome sequence alignments revealed that the closest related PVs were BPV3 (AF486184; 77% of identity to 04AC12), BPV6 (AJ620208; 74%) and BPV4 (X05817; 73%). When each ORF was compared with other PVs, the degree of nucleotide identity varied between 74% and 81% ( Table 1).
The LCR of BPV 04AC14 contained typical PV features [19] that hold regulatory elements for virus replication and control the transcription of transforming genes. The LCRs of mucosal epitheliotropic PVs possess a similar genome organization, which typically includes a promoter region, an enhancer region and a highly conserved distribution of E2 DNA binding sites [20].  BPV 04AC14 lacks a second LCR, similar to the majority of BPVs. Both E1 and E2 bind to the origin of virus replication, located in the LCR, and activate genomic DNA replication. Most PVs possess one E1-binding site (E1BS) and at least two E2-binding sites (E2BS) [21]. Nevertheless, the BPV 04AC14 LCR shows only one copy of E2BS (ACCN 6 GGT). In addition, the BPV 04AC14 LCR possesses one poly-A site (AATAAA) and one TATA box (TATAAA), both of which are important transcription and replication regulatory elements. The E2BS, poly-A site, and TATA box are located at positions 7,220-7,231, 6,966-6,971 and 7,160-7,165, respectively (Fig 2A). The early region of PV genomes encodes the non-structural viral proteins involved in viral DNA replication, transcription and cell transformation [22]. The early region of BPV 04AC14 encodes 5 ORFs: E8 (204 bp), E7 (297 bp), E1 (1,839 bp), E2 (1,248 bp) and E4 (459 bp).
The BPV 04AC14 genome encompasses a putative E8 gene. This gene encodes a protein that is chemically and functionally similar to the protein encoded by the HPV E5 gene and may also substitute for the function of E6 as an oncogene, primarily by activating cell growthpromoting signalling [23]. Therefore, E8 may play the role of E5 or E6 because it is located in a similar position in the genome (nt 1-204). Yet, BPV 04AC14 E8 shares a low degree of amino acid identity with BPV3 E8 (34.3% identity and 47.8% similarity).
The putative E1 protein (with helicase function) has an ATP-dependent helicase GX 4 GK[T/ S] (GPPNTGKS in BPV 04AC14) domain at amino acid positions 441-448 [33]. The novel PV E1 has a cyclin A interaction motif (RXL), which is thought to be important for the initiation of papillomavirus replication [34].
The late regions, L1 and L2, were predicted to encode the major and minor capsid proteins, respectively. Both capsid proteins contain a nuclear localization sequence (NLS), which consists of a high proportion of positively charged residues (K and R) in their C-terminal ends that are likely to play a role in the nuclear translocation of L1 and L2 during the viral life cycle. Furthermore, the L1 gene has been chosen as yardstick for building PV comparisons, and taxonomic categories are based on the percentages of identity at the nucleotide level in this gene [5].
A new PV type can be proposed if the L1 gene sequence shares less than 90% identity with the closest known PV type [5,7,41]. Here, it can be observed that the BPV 04AC14 L1 gene diverges 21% in relation to the nucleotide sequence of the closest BPV type (BPV3). Based on this criterion, the PV strain identified in this study should be designated as a novel PV type. Different PV genera share less than 60% nucleotide sequence identity in the L1 ORF. Our results indicate that 04AC14 BPV should be regarded as a new PV type within the genus Xipapillomavirus.
Additionally, a phylogenetic tree was reconstructed with optimized alignments based on the nucleotide sequence of the L1 gene. Using a set of genus-representative sequences of PV, BPV 04AC14 clustered in the Xipapillomavirus genus along with most known bovine PVs (Fig 3A).

Histopathological findings
Based on the histopathological findings, the diagnosis of epidermal papillomatosis was confirmed for specimen 04AC14. Neoplastic tissue of sample 04AC14 consisted of exophytic papillomatous lesions, epithelium proliferation, and well-differentiated cells, showing marked acanthosis (Fig 4A and 4B). This pattern has also been observed in other BPV-associated lesions [21,42,43]. Xipapillomaviruses infect only epithelial cells to induce true epithelial papillomas [3,32,38]. Some Xipapillomavirus types have been reported to cause teat and udder papillomatosis in dairy cattle worldwide [42][43][44]. Such lesions inflict serious economic losses on the dairy industry. The identification of distinct types/species of BPVs may be highly important not only to improve knowledge on BPV biology but also to aid in defining the appropriate antigens for candidate vaccines.

Conclusion
A new putative BPV type-BPV 04AC14 -is introduced. The L1 gene shares 90% identity with previously described BPVs (Table 1). The reconstructed phylogenetic tree with members of the Xipapillomavirus genus reveals that BPV 04AC14 is clearly a new member of this genus (Fig 3A). The genome of BPV 04AC14 aligned close to the Xipapillomavirus 1 branch, which also contains BPV3, the representative species of the genus (Fig 3B). These findings add to the expanding genetic diversity of bovine papillomaviruses. Additionally, because there was no evidence of BPV co-infection in the sample, we can infer that BPV 04AC14 is implicated in the aetiology of bovine papillomatosis.