Genome analysis of a novel Group I alphabaculovirus obtained from Oxyplax ochracea

Oxyplax ochracea (Moore) is a pest that causes severe damage to a wide range of crops, forests and fruit trees. The complete genome sequence of Oxyplax ochracea nucleopolyhedrovirus (OxocNPV) was determined using a Roche 454 pyrosequencing system. OxocNPV has a double-stranded DNA (dsDNA) genome of 113,971 bp with a G+C content of 31.1%. One hundred and twenty-four putative open reading frames (ORFs) encoding proteins of >50 amino acids in length and with minimal overlap were predicted, covering 92% of the whole genome. Six typical baculoviral homologous regions (hrs) were identified. Phylogenetic analysis and gene parity plot analysis showed that OxocNPV belongs to clade "a" of Group I alphabaculoviruses, and it seems to be close to the most recent common ancestor of Group I alphabaculoviruses. Three unique ORFs (with no homologs in the National Center for Biotechnology Information database) were identified. Interestingly, OxocNPV lacks three auxiliary genes (lef7, ie-2 and pcna) related to viral DNA replication and RNA transcription. In addition, OxocNPV has significantly different sequences for several genes (including ie-1 and odv-e66) in comparison with those of other baculoviruses. However, three-dimensional structure prediction showed that OxocNPV ODV-E66 contains the conserved catalytic residues, implying that it might possess polysaccharide lyase activity like AcMNPV ODV-E66. All these unique features suggest that OxocNPV represents a novel species of the Group I alphabaculovirus lineage.


Introduction
Baculoviruses are known to infect a wide variety of insect hosts and play important roles in regulating many insect populations in nature. They have been widely used as environmentally safe agents for pest control. In addition, baculoviruses provide efficient expression systems for the production of recombinant proteins in insect cells, as well as promising vectors for gene therapy [1,2]. These applications greatly facilitate fundamental studies of baculoviruses.

Sequencing and characterization of OxocNPV genome
The genome of OxocNPV was assembled from 71,240 high-quality Roche 454 sequencing reads with an average coverage of 249X. Uncertain regions were confirmed by PCR amplification and Sanger sequencing. The complete genome sequence and annotation information were submitted to GenBank (accession number: MF143631). In summary, the complete circular OxocNPV genome is 113,971 bp in length, with a G+C content of 31.1%. In total, 124 putative open reading frames (ORFs) that potentially encode proteins of >50 amino acids (aa) in length were predicted, covering 92% of the whole genome. Among them, 61 ORFs were in the forward orientation and 63 were in the reverse orientation. The polyhedrin gene was assigned as the first ORF according to tradition. The 38 baculovirus core genes (Fig 1, red), 22 lepidopteran baculovirus conserved genes (Fig 1, blue), 10 Group I alphabaculovirus unique genes (Fig 1, green) and 51 baculovirus common genes (Fig 1, gray) were annotated using Basic Local Alignment Search Tool (BLAST) comparisons. In addition, three genes were classified as OxocNPV unique genes as no homologs were found in the National Center for Biotechnology Information (NCBI) database (Fig 1, open arrows).
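The reported figures can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that average coverage equals (number of reads × mean read length) / genome length, which ignores unassembled or trimmed reads, and the helper names are hypothetical rather than part of the authors' pipeline. It also confirms that the annotated gene classes and strand counts each sum to the 124 predicted ORFs.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction of a DNA sequence (non-ACGT symbols ignored)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def implied_mean_read_length(n_reads: int, coverage: float, genome_len: int) -> float:
    """Mean read length implied by coverage = n_reads * mean_len / genome_len."""
    return coverage * genome_len / n_reads


# Figures reported in the text.
GENOME_LEN = 113_971
N_READS = 71_240
COVERAGE = 249

mean_len = implied_mean_read_length(N_READS, COVERAGE, GENOME_LEN)
print(f"implied mean read length: {mean_len:.0f} bp")

# Gene classes: core + lepidopteran conserved + Group I unique + common + OxocNPV unique.
assert 38 + 22 + 10 + 51 + 3 == 124
# Strand orientation: forward + reverse.
assert 61 + 63 == 124

# Toy sequence (not viral data) to show the G+C calculation.
print(f"G+C of toy sequence: {gc_content('ATGCATTTAA'):.1%}")
```

Under this assumption the implied mean read length is roughly 400 bp.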

Phylogenetic analysis of OxocNPV
A phylogenetic tree based on 38 concatenated core genes from 88 whole-genome sequenced baculoviruses (including OxocNPV) was generated (Fig 2). According to the tree, OxocNPV [...] 18.3% with the above nine viruses, respectively (S1 Table). When compared with the five selected Group I alphabaculoviruses (AcMNPV, BmNPV, ThorNPV, CapoNPV and OpMNPV), OxocNPV shares an average aa identity of 58.1%, 51.8%, and 38.3% for the core genes, lepidopteran baculovirus conserved genes and other baculoviral genes, respectively (S2 Table). Eight OxocNPV ORFs share high homology (>75% aa identity on average) with their counterparts in the other five selected Group I alphabaculoviruses, and all of them except ubiquitin are core genes or lepidopteran baculovirus conserved genes (S2 Table). In contrast, among the 14 less conserved ORFs (<30% aa identity on average), only one core gene (desmoplakin) and one lepidopteran baculovirus conserved gene (lef6) were found (S2 Table). Gene parity plots of OxocNPV against the above nine selected baculoviruses are shown in Fig 3. The OxocNPV gene order is substantially collinear with representatives of both Group I and Group II alphabaculoviruses, with a small region that is collinear with betabaculoviruses; however, the gene order is significantly different from that of gamma- and deltabaculoviruses (Fig 3). The previously identified collinear region, which is conserved among lepidopteran baculoviruses and characterized by highly similar gene content (harboring ~20 core genes) and gene order [19], is also conserved in OxocNPV. In OxocNPV, this region contains 20 core genes, 5 lepidopteran baculovirus conserved genes and 8 other baculoviral genes (Fig 1).
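A gene parity plot pairs the position of each ORF in one genome with the position of its homolog in another, so collinear regions appear as diagonal runs. The following is a minimal sketch of that idea, not the published procedure [66]: the gene labels are placeholders, homologs are assumed to share a label, and `longest_collinear_run` is a hypothetical helper that reports the longest stretch of shared genes whose order is preserved or inverted.

```python
from typing import Dict, List, Tuple


def parity_points(query_order: List[str], target_order: List[str]) -> List[Tuple[int, int]]:
    """Pair the 1-based ORF index of each shared gene in the two genomes."""
    target_index: Dict[str, int] = {g: i for i, g in enumerate(target_order, start=1)}
    return [
        (i, target_index[g])
        for i, g in enumerate(query_order, start=1)
        if g in target_index
    ]


def longest_collinear_run(points: List[Tuple[int, int]]) -> int:
    """Longest run of consecutive shared genes whose target index steps by a constant +1 or -1."""
    if not points:
        return 0
    best = run = 1
    prev_step = 0
    for (_, y0), (_, y1) in zip(points, points[1:]):
        step = y1 - y0
        if step in (1, -1) and (run == 1 or step == prev_step):
            run += 1          # extend the current diagonal
        elif step in (1, -1):
            run = 2           # direction flipped: start a new diagonal
        else:
            run = 1           # break in collinearity
        prev_step = step
        best = max(best, run)
    return best


# Placeholder gene orders (not real annotation data).
query = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF"]
target = ["geneA", "geneD", "geneC", "geneB", "geneF", "geneX"]

points = parity_points(query, target)
print(points)
print(longest_collinear_run(points))
```

In this toy case the block geneB-geneC-geneD is inverted in the target genome, which shows up as an anti-diagonal run of three points.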

Homologous regions
A typical characteristic of baculovirus genomes is the presence of interspersed homologous regions (hrs) with high A+T content, tandem repeats and imperfect palindromes, although they do not necessarily exist in all baculoviruses [20,21]. Hrs have been implicated both as origins of DNA replication and as transcriptional enhancers in a number of baculoviruses [22,23]. Six hrs were found in the OxocNPV genome, and they had an A+T content of 54.3% (Fig 1, pink, and Fig 4). Hr2 and hr4 are positioned in a counterclockwise direction and the rest are positioned in a clockwise direction in the genome. The OxocNPV hrs range from 230 to 780 bp in length, and each hr consists of tandem repeats of about 80 bp (Fig 4). Secondary structure prediction revealed that each tandem repeat contains two imperfect palindromes (Fig 4, red and blue, respectively).
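To make the notion of an imperfect palindrome concrete: a DNA palindrome reads the same as its reverse complement, and an imperfect one does so with a few mismatched positions. The sketch below is only an illustration on toy sequences; the function names are hypothetical and it is not the secondary structure prediction used for Fig 4.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatched_pairs(seq: str) -> int:
    """Count base pairs that fail to match between the two arms of a candidate palindrome.

    The central base of an odd-length sequence is ignored.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    half = len(seq) // 2
    return sum(a != b for a, b in zip(seq[:half], rc[:half]))


def at_content(seq: str) -> float:
    """A+T fraction of a DNA sequence (non-ACGT symbols ignored)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return at / (at + gc) if (at + gc) else 0.0


print(mismatched_pairs("GAATTC"))  # perfect palindrome
print(mismatched_pairs("GAATTA"))  # one mismatched pair
print(f"{at_content('GAATTC'):.2f}")
```

A sequence with zero mismatched pairs is a perfect palindrome; small non-zero counts relative to the arm length indicate an imperfect one.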

Gene content of OxocNPV
Annotation of the OxocNPV genome revealed that it contains 11 replication-associated genes, 12 transcription-associated genes, 34 structure-related genes, 10 genes essential for oral infection, and 20 auxiliary genes (Table 1). In addition, 37 genes of unknown function including three hypothetical unique OxocNPV genes were predicted (Table 1).

DNA replication and RNA transcription genes
So far, six genes have been found to be essential for baculovirus DNA replication, and all are present in the OxocNPV genome: immediate early gene-1 (ie-1, oxoc6), DNA polymerase (DNA-pol, oxoc72), helicase (oxoc48), late expression factor 1 (lef1, oxoc95), lef2 (oxoc124) and lef3 (oxoc70) (Table 1) [24,25]. Among them, IE-1, a major transcriptional activator of early genes, is significantly longer in OxocNPV (~714 aa) than in most other Group I alphabaculoviruses (~550 aa). Functional analysis showed that the N-terminal half of AcMNPV IE-1 contains two independent transcription stimulatory (transactivation) domains (M1-N125 and A168-G222) interrupted by a basic region, while the C-terminal half contains putative DNA-binding and oligomerization domains (Fig 5) [26,27]. Sequence alignment showed that transactivation domain I of OxocNPV IE-1 is quite divergent from its homologs in other clade "a" Group I alphabaculoviruses in that the OxocNPV domain contains several discontinuous insertions (Fig 5). Interestingly, the IE-1 proteins of CapoNPV and Lonomia obliqua multiple nucleopolyhedrovirus (LoobNPV), which are also closely related to the ancestral Group I alphabaculovirus (Fig 2), also exhibit obvious differences in this domain compared to other Group I alphabaculoviruses (Fig 5) [27]. Whether transactivation domain I of OxocNPV, CapoNPV or LoobNPV IE-1 is sufficient to activate transcription (like its counterpart in AcMNPV IE-1) remains to be investigated, and this may provide useful information regarding the function-evolution relationship of the baculovirus IE-1 protein.
Additional genes that influence DNA replication were found in OxocNPV: DNA binding protein (dbp, oxoc107), lef11 (oxoc120), me53 (oxoc16), alkaline exonuclease (alk-exo, oxoc20) and ac79 (oxoc59) [16] (Table 1). However, homologs of lef7 (ac125) and proliferating cell nuclear antigen (pcna, ac49) were absent. Eukaryotic PCNA plays a role in DNA synthesis, DNA repair and cell cycle progression. In AcMNPV, PCNA was not found to play an obvious role in transient DNA replication [28]. However, it was found to accelerate expression of late genes [29]. LEF7 is a stimulating factor for viral DNA replication and it has been proposed to be a single-stranded DNA-binding protein [30,31]. Deletion of AcMNPV lef7 resulted in a >90% reduction in viral DNA replication in Sf21 and SE1c cells, but not in Tn368 cells [32]. Deletion of lef7 from the BmNPV genome also led to impairment of viral DNA synthesis [33]. A recent study suggested that AcMNPV LEF7 promotes efficient virus replication, most likely by hijacking host factors that regulate the DNA damage response [34]. So far, homologs of lef7 have been found in all sequenced Group I alphabaculoviruses except OxocNPV, CapoNPV and LoobNPV [35].
Fig 3. Gene parity plot analysis. Gene parity plots of OxocNPV were constructed against representative baculoviruses: AcMNPV, BmNPV, ThorNPV and CapoNPV (Group I clade "a"); OpMNPV (Group I clade "b"); HearNPV (Group II); CpGV (a betabaculovirus); NeseNPV (a gammabaculovirus); and CuniNPV (a deltabaculovirus). OxocNPV ORFs are on the x-axes. The accession numbers of these genomes are listed in S1 [...]
Early baculovirus genes are transcribed by the host cell RNA polymerase II, but after the onset of DNA replication, the transcription of late and very late genes depends on a virus-encoded RNA polymerase, a 560-kDa protein complex composed of LEF-4, LEF-8, LEF-9 and P47 [36]. In the OxocNPV genome, six core genes (the four components of RNA polymerase plus lef5 [oxoc44] and very late factor 1 [vlf-1, oxoc61]), three lepidopteran baculovirus conserved genes and two other baculoviral genes related to viral late gene transcription were identified (Table 1) [36,37]. OxocNPV lef6, CapoNPV lef6 and LoobNPV lef6 share low similarity (~20-40% aa identity) with other members of Group I clade "a". LEF6 is required for late gene transcription and may function as an mRNA exporter. Deletion of AcMNPV lef6 leads to a ~90% reduction in infectious budded virus (BV) production [38]. The homolog of ie-2, a gene specific to Group I alphabaculoviruses, is absent from the genome of OxocNPV. IE-2 contains a predicted really interesting new gene (RING) finger domain and has been found to enhance transactivation when acting synergistically with IE-1 [39-41]. Deletion of ie-2 reduced the plasmid replication level by 3-fold in Sf21 cells [42]. OxocNPV is the first reported Group I alphabaculovirus to lack ie-2.

Structural genes
Eighteen core genes and six lepidopteran conserved genes that encode structural proteins were identified in the OxocNPV genome (Table 1) [43-45]. In addition, nine other baculoviral structural genes were identified in the OxocNPV genome (Table 1). Desmoplakin (ac66) is one of the 38 core genes. Knockout of ac66 led to a >99% reduction in BV yield compared to the wild-type virus, as well as the elimination of occlusion-derived virus (ODV) and occlusion body (OB) formation [46]. Certain baculoviruses harbor two or three copies of desmoplakin. In the OxocNPV genome, only one desmoplakin gene (oxoc71) is present, and its protein (683 aa) is much shorter than those of other Group I homologs (766-953 aa) because of many deletions in the middle region (S1 Fig, only clade "a" members are shown). Cg30 is present in the genomes of most sequenced alphabaculoviruses and certain betabaculoviruses (such as SpliGV). Among Group I alphabaculoviruses, cg30 is missing in only two cases, OxocNPV and Maruca vitrata multiple nucleopolyhedrovirus (MaviNPV). CG30 contains putative RING finger and leucine zipper domains. It is not essential for AcMNPV replication, as deletion of cg30 resulted in only a subtle reduction in the BV titer [47]. In a study of BmNPV, CG30 was found to be required for maximum BV production and OB formation [48]. Therefore, the acquisition of cg30 may represent a selective advantage during evolution.

Proteins involved in primary infection
Per os infectivity factors (PIFs) are a group of ODV-specific envelope proteins that are required for the establishment of primary infection [49,50]. All ten PIF genes recognized so far have been found in the OxocNPV genome, including p74 (oxoc17), pif1 (oxoc29), pif2 (oxoc102), pif7 (ac110, oxoc35), pif8 (vp91/p95, oxoc55) and sf58 (oxoc37) (Table 1; S1 Table) [51-54]. Except for sf58, which is a lepidopteran baculovirus conserved gene, the PIF genes are core genes [55]. Besides PIFs, other ODV envelope proteins also play important roles in per os infection. ODV-E66 (ac46) is a major component of the ODV envelope. Homologs of odv-e66 are found in the genomes of most alpha- and betabaculoviruses, but not in gamma- or deltabaculoviruses. Deletion of AcMNPV odv-e66 resulted in a 1000-fold increase in the dose that kills 50% of test larvae (LD50) compared to wild-type virus when larvae were infected per os, but there was no difference when the virus was injected into the hemolymph [56]. Recently, ODV-E66 was shown to have chondroitinase activity, and it has been suggested that it facilitates the primary infection of ODV by digesting chondroitin sulfate in the insect midgut peritrophic membrane [57].
Among the 23 sequenced Group I alphabaculoviruses, all encode odv-e66 except MaviNPV, CapoNPV and Condylorrhiza vestigialis MNPV (CoveNPV). Interestingly, sequence alignment showed that OxocNPV ODV-E66 (oxoc113) shares low amino acid sequence identity (~25%) with the ODV-E66 sequences of other Group I members, whereas Group I ODV-E66 homologs normally exhibit a high degree of sequence conservation (>70% identity) (S2 Fig). It appears that the N-terminus of OxocNPV ODV-E66, which contains a polysaccharide lyase family sequence with homology to bacterial chondroitinases, is more conserved than its C-terminus, which is a baculovirus ODV-E66 superfamily domain of unknown function (S2 Fig) [58,59]. Five residues were identified as being essential for the catalytic activity of AcMNPV ODV-E66 [58], and these residues are also conserved in the OxocNPV protein (Fig 6A).
These findings suggest that OxocNPV may encode an active ODV-E66, although further investigation is required. The significant difference in ODV-E66 between OxocNPV and other Group I members also suggests a more ancient origin of OxocNPV during the evolution of the Group I lineage.

Viral DNA extraction
OxocNPV-infected O. ochracea larvae have been preserved at the Chinese General Virus Collection Center (CGVCC) under collection number IVCAS 1.0235. The viral OBs were purified from larval body homogenate by differential centrifugation [61]. Viral genomic DNA was isolated according to a previously reported method [62,63].

Genomic DNA sequencing and bioinformatics analysis
Genomic DNA sequencing of OxocNPV was performed using the Roche 454 GS FLX pyrosequencing system. The sequenced reads were assembled with 454 Newbler software version 2.7. Low-quality regions or ambiguous bases were further verified by PCR and Sanger sequencing.
After the full genome sequence of OxocNPV was established, ORFs and repeat regions were predicted. The hrs were identified using Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html) [64] and the NCBI BLAST server (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Putative ORFs were predicted using FGENESV0 (http://linux1.softberry.com/berry.phtml) [65] and the NCBI ORF finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html), using the criteria of protein length >50 aa and minimal overlap. The predicted ORFs were annotated according to homology using NCBI BLAST. The complete genome sequence and annotation information were submitted to GenBank (accession number: MF143631). Gene parity plots were constructed to compare ORF organization, as previously described [66].
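The length criterion for putative ORFs can be illustrated with a simple six-frame scan. This is a minimal stand-in, not the algorithm used by FGENESV0 or the NCBI ORF finder: it assumes ATG start codons and the standard stop codons, treats the sequence as linear (ORFs spanning the origin of the circular genome are missed), and does not apply the minimal-overlap filter. The function name and the toy sequences are hypothetical.

```python
from typing import List, Tuple

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_aa: int = 51) -> List[Tuple[str, int, int]]:
    """Return (strand, start, end) for ATG..stop ORFs encoding at least min_aa residues.

    Coordinates are 0-based, end-exclusive, relative to the scanned strand,
    and include the stop codon. min_aa=51 corresponds to ">50 aa".
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                  # first in-frame start codon
                elif start is not None and codon in STOPS:
                    n_aa = (i - start) // 3    # residues, stop codon excluded
                    if n_aa >= min_aa:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


# Toy sequences (not viral data): 61 and 11 codons before the stop codon.
long_orf = "ATG" + "GCT" * 60 + "TAA"
short_orf = "ATG" + "GCT" * 10 + "TAA"

print(find_orfs(long_orf))
print(find_orfs(short_orf))
```

Only the longer toy ORF passes the >50 aa threshold; the shorter one is discarded.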

Phylogenetic analysis
The concatenated protein sequences encoded by the 38 core genes of OxocNPV and the other 87 sequenced baculovirus genomes were aligned using ClustalW [67]. A phylogenetic tree was reconstructed in MEGA6 [68] using the Maximum Likelihood method based on the Jones-Taylor-Thornton (JTT) model. The reliability of the tree was assessed by bootstrap analysis with 1000 replicates [69].
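Two steps of this workflow, concatenating per-gene alignments and drawing a bootstrap replicate, can be sketched as follows. This is an illustration under stated assumptions rather than the MEGA6 implementation: the per-gene alignments are assumed to share the same taxa, the toy sequences and function names are hypothetical, and tree inference itself is not shown. Each bootstrap replicate resamples alignment columns with replacement; a tree would then be inferred from every replicate.

```python
import random
from typing import Dict, List


def concatenate(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments taxon by taxon, in the order given."""
    taxa = list(alignments[0].keys())
    return {t: "".join(a[t] for a in alignments) for t in taxa}


def bootstrap_replicate(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {t: "".join(seq[c] for c in cols) for t, seq in alignment.items()}


# Toy per-gene protein alignments for two placeholder taxa.
gene1 = {"virusA": "MKV", "virusB": "MRV"}
gene2 = {"virusA": "LL-", "virusB": "LIA"}

concat = concatenate([gene1, gene2])
rep = bootstrap_replicate(concat, random.Random(0))

print(concat)
print({t: len(s) for t, s in rep.items()})
```

Because whole columns are resampled, the pairing of residues across taxa is preserved in every replicate.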