Comparative genomic analysis of three geographical isolates from China reveals high genetic stability of Plutella xylostella granulovirus

In this study, the genomes of three Plutella xylostella granulovirus (PlxyGV) isolates, PlxyGV-W and PlxyGV-Wn from near Wuhan and PlxyGV-B from near Beijing, China, were completely sequenced and comparatively analyzed to investigate the genetic stability and diversity of PlxyGV. The genomes of PlxyGV-W, PlxyGV-B and PlxyGV-Wn are 100,941 bp, 100,972 bp and 100,999 bp in length, respectively, with G + C contents of 40.71–40.73%, and share nucleotide sequence identities of 99.5–99.8%. The three isolates contain 118 putative protein-encoding ORFs in common. PlxyGV-W, PlxyGV-B and PlxyGV-Wn have ten, nineteen and six nonsynonymous intra-isolate nucleotide polymorphisms (NPs) in six, fourteen and five ORFs, respectively, including homologs of five DNA replication/late expression factors and two per os infectivity factors. There are seventeen nonsynonymous inter-isolate NPs in seven ORFs between PlxyGV-W and PlxyGV-B, seventy-three nonsynonymous NPs in forty-seven ORFs between PlxyGV-W and PlxyGV-Wn, and seventy-seven nonsynonymous NPs in forty-six ORFs between PlxyGV-B and PlxyGV-Wn. Alignment of the genome sequences of the nine PlxyGV isolates sequenced to date shows that the sequence identity between the genomes is over 99.4%, with the exception of the genome of PlxyGV-SA from South Africa, which shares a sequence identity of 98.6–98.7% with the others. No gene gain/loss or translocation events were observed. These results suggest that the PlxyGV genome is fairly stable in nature. In addition, the transcription start sites and polyadenylation sites of thirteen PlxyGV-specific ORFs, conserved in all PlxyGV isolates, were identified by RACE analysis using mRNAs purified from larvae infected with PlxyGV-Wn, indicating that the PlxyGV-specific ORFs are all genuine genes.


Introduction
Baculoviruses have long been explored as biological control agents of agricultural and forest pests owing to their pathogenicity, which is highly specific for insects, mainly Lepidoptera, Hymenoptera and Diptera [1]. Genotypic variation in baculovirus populations was widely detected between isolates from different geographical regions and within virus isolates by genomic restriction endonuclease (REN) profiling in the 1970s–1990s [2]. Differences in phenotype were also revealed between different isolates and between genotypes derived from single isolates by in vitro or in vivo techniques [2,3]. More recently, hundreds of baculovirus genomes have been completely sequenced, including the genomes of multiple geographical isolates of some virus species. Nucleotide polymorphisms (NPs) have been documented between different virus isolates and within the same isolate [3][4][5][6][7][8]. Comparative analysis of these genome sequences may make it possible to determine the genetic basis for phenotypic variation in populations of the same viral species. This may facilitate the improvement of baculovirus pesticides by mixing different virus genotypes. For example, the codling moth, Cydia pomonella, was reported to be resistant to the Cydia pomonella granulovirus isolate CpGV-M. However, the resistance can be overcome by several other CpGV isolates. Whole-genome sequencing and phylogenetic analyses of these geographic CpGV variants revealed that the resistance is the consequence of a mutation in viral gene pe38 [9,10]. Plutella xylostella granulovirus (PlxyGV) belongs to the genus Betabaculovirus. It is pathogenic for the diamondback moth, Plutella xylostella, a major destructive pest of cruciferous crops worldwide [11]. The virus has been isolated in several countries including Japan, China, India, Kenya, and South Africa [12][13][14][15][16][17]. In China, PlxyGV was first isolated in Wuhan and studied in the 1970s [18]. 
Subsequently, it was isolated in other districts [19,20]. Some strains of the diamondback moth have developed resistance to the chemical pesticides used for its control and have also become resistant to the bacterial insecticide Bacillus thuringiensis [21,22]. As an alternative, PlxyGV has been tested for the control of the pest [16,19,20,23]. A registered PlxyGV biopesticide has been commercialized and used on a large scale for the control of diamondback moth in China and Malaysia since 2008 [24][25][26][27][28][29]. Laboratory experiments have also been done to characterize PlxyGV morphology, histopathology, in vitro replication in cell culture, and molecular biology [17][30][31][32][33][34][35][36].
PlxyGV has a single circular double-stranded DNA genome. The complete genome sequence of a PlxyGV isolate from Japan (PlxyGV-K1) was first reported in 2000; it consists of 100,999 bp and contains 120 putative protein-coding open reading frames (ORFs) [37]. Subsequently, the complete genome sequences of five additional isolates from mainland China (PlxyGV-C), Taiwan (PlxyGV-K and PlxyGV-T), Malaysia (PlxyGV-M), and South Africa (PlxyGV-SA) were published in 2016 [38,39].
In this study, the genomes of three PlxyGV isolates named PlxyGV-W, PlxyGV-B and PlxyGV-Wn were completely sequenced. Intra-isolate and inter-isolate NPs in the genomes were detected, and the insecticidal activity of the isolates against diamondback moth larvae was also evaluated. To investigate the genetic diversity and stability of PlxyGV, the genome sequences of these three isolates and six previously reported PlxyGV genome sequences were compared in terms of nucleotide sequence variation, nonsynonymous sequence polymorphisms, gene content and phylogeny.

Virus and insects
PlxyGV-W and PlxyGV-Wn were isolated from diseased P. xylostella larvae in cabbage fields near Wuhan in 1979 and 2018, respectively [18]. PlxyGV-B is from a commercialized biopesticide and was originally isolated in Beijing in the 1980s. The isolates were propagated by feeding third-instar laboratory-reared diamondback moth larvae an artificial diet contaminated with viral occlusion bodies (OBs).
Purification of OBs and extraction of viral DNA were carried out as described by Hashimoto et al. [40], with modifications. Approximately 100 infected larvae were homogenized with a blender. Larval tissue fragments were removed from the homogenate by differential centrifugation at 750 rpm for 15 min and 8,500 rpm for 25 min, and this cycle was repeated twice. The pellet was suspended in 3–4 ml of ddH2O. The suspension was layered onto a 30, 40, 50 and 60% (wt/vol) discontinuous sucrose gradient and centrifuged at 4,000 rpm for 1 h. The OB fraction was collected and washed twice by suspension in H2O and centrifugation. The OBs were dissolved in an equal volume of alkaline solution (100 mmol/L NaCl, 100 mmol/L Na2CO3, 5 mmol/L EDTA) and incubated at room temperature with stirring, then mixed with an equal volume of protein digestion solution [1 mmol/L EDTA, 1% SDS (w/v), 10 mmol/L Tris-HCl (pH 7.4), 0.5 mmol/L NaCl, 0.2 mg/L proteinase K] and incubated in a bath at 58°C overnight. Viral DNA was extracted twice with an equal volume of phenol-chloroform, precipitated with ethanol and collected by centrifugation, then dissolved in TE buffer and stored at 4°C.

Genome sequencing
The genomes of the three PlxyGV isolates were sequenced using the Illumina HiSeq X Ten system. Sequence assembly was done with SOAPdenovo (Version 2.04) software (BGI), using the first published PlxyGV genome sequence, that of the PlxyGV-K1 isolate from Japan (NC_002593), as a reference. PCR was performed to synthesize DNA fragments bridging the gaps between contigs, using the genomic DNA of the individual PlxyGV isolates as templates. PCR products were sequenced from both ends. The sequences were assembled with the initial contigs into a single, circular contig. Sequences were analyzed with Lasergene programs (DNASTAR). Homology searches were carried out against the GenBank/EMBL, SWISSPROT and PIR databases using the BLAST algorithm. Multiple sequence alignments were performed using CLUSTAL W. The PlxyGV genome sequence accession numbers are MN099284 for PlxyGV-W, MN099285 for PlxyGV-B and MN099286 for PlxyGV-Wn.

RNA purification and RACE analysis of PlxyGV-specific genes
Third-instar P. xylostella larvae were infected with PlxyGV-Wn by feeding them an OB-contaminated diet and were collected at 12 h, 24 h, 48 h, 72 h and 96 h post infection. Twenty-five infected larvae (five larvae from each time point) were immersed and homogenized in 1,000 μl of Trizol and incubated on ice for 10 min, then centrifuged at 11,400 rpm and 4°C for 10 min. The supernatant was mixed with 200 μl of chloroform with shaking for 15 s and incubated on ice for 15 min, then centrifuged at 11,400 rpm and 4°C for 15 min. A 400 μl aliquot of the upper phase was taken and mixed with 500 μl of isopropyl alcohol, incubated at room temperature, then centrifuged at 11,400 rpm for 10 min. The pellet was rinsed with 200 μl of 75% ethanol in DEPC water by centrifugation at 11,400 rpm for 5 min, air dried, then dissolved in x μl of DEPC water. An 8 μl RNA sample was mixed with 1 μl of DNA digestion buffer and 1 μl of DNase I and incubated at 37°C for 30 min. Then 2 μl of 50 mM EDTA was added, and DNase I was inactivated by incubation at 65°C for 10 min.

Bioassays
Bioassays on the infectivity of PlxyGV isolates were performed as previously described [41].
To determine the median lethal concentration (LC50), virus suspensions at concentrations of 0, 1×10⁷, 2×10⁷, 5×10⁷ and 1×10⁸ OBs/ml were prepared by suspending the viral OBs in 4% sucrose in double-distilled water containing 0.05% food blue dye. PlxyGV was quantified by a qPCR method. The virus suspensions were fed to newly molted third-instar P. xylostella larvae that had been starved for twelve hours. Larvae that had swallowed the virus suspension were picked, transferred into the wells of twelve-well plates and fed with fresh artificial diet for the duration of the bioassay. Mortality was recorded daily after infection until the larvae died or pupated. Forty-eight larvae were used per concentration, and the experiments were performed in triplicate. The LC50 values were determined by probit analysis and compared by the relative median potency method. To determine the median survival time (ST50), newly molted third-instar P. xylostella larvae were orally infected in the same way as above, using virus suspensions of 1×10⁹ OBs/ml. Mortality was recorded every 6 h after infection until all larvae died or pupated. The ST50 values were calculated with the Kaplan-Meier estimator and compared by the log-rank test.
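As an illustration of the survival-time estimate named above, the following minimal sketch computes a Kaplan-Meier median from per-larva observations. It is not the authors' analysis code: the function name, the treatment of pupated larvae as censored observations, and the toy data are all assumptions made for the example, and the log-rank comparison between isolates is not covered.

```python
from collections import Counter
from fractions import Fraction


def kaplan_meier_median(times, events):
    """Return the earliest time at which the Kaplan-Meier survival
    estimate drops to 0.5 or below, or None if it never does.

    times  -- observation time of each larva (hours post infection)
    events -- True if the larva died at that time, False if it was
              censored (assumed here: it pupated instead of dying)
    """
    deaths = Counter(t for t, died in zip(times, events) if died)
    leaving = Counter(times)      # deaths plus censored larvae per time point
    at_risk = len(times)
    survival = Fraction(1)        # exact arithmetic avoids rounding at 0.5
    for t in sorted(leaving):
        d = deaths[t]             # Counter yields 0 for time points without deaths
        if d:
            survival *= Fraction(at_risk - d, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= leaving[t]
    return None


# Hypothetical data: six larvae scored every 6 h; the last one pupated.
times = [72, 78, 78, 84, 90, 96]
events = [True, True, True, True, True, False]
print(kaplan_meier_median(times, events))
```

With these toy values the survival estimate is 5/6 after 72 h and exactly 1/2 after 78 h, so the sketch reports 78 h as the median.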

Genome sequences of PlxyGV-W, PlxyGV-B and PlxyGV-Wn
The genomes of PlxyGV-W, PlxyGV-B and PlxyGV-Wn are 100,941 bp, 100,972 bp and 100,999 bp in length, with G + C contents of 40.73%, 40.71% and 40.71%, respectively. A complete sequence alignment showed a sequence identity of 99.8% between PlxyGV-W and PlxyGV-B, 99.6% between PlxyGV-W and PlxyGV-Wn, and 99.5% between PlxyGV-B and PlxyGV-Wn. The gene content, genome organization and variations of the three isolates are shown in Fig 1. All three PlxyGV isolates contain 118 ORFs in common, each being 150 bp or longer, starting with an ATG and having minimal overlap with adjacent ORFs or homologous repeat regions (hrs). All the homologous ORFs and hrs are completely collinear in the three isolates.
Intra-isolate NPs are detected in all three virus isolates. In the PlxyGV-W genome, thirty-two single nucleotide polymorphisms (SNPs) and four NPs involving two or more nucleotide alterations are identified. Although the majority of the NPs are located in ORFs, only eight SNPs cause amino acid alterations, occurring in ORF26 (f), ORF32 (lef2), ORF61 (dbp), ORF99 (lef9), ORF104 (fgf), ORF109 (lef8) and ORF113. In addition, ORF73 (ac91) contains an NP with an insertion/deletion (InDel) of twelve nucleotides after AA68 and a deletion of a single nucleotide causing a frameshift after AA101 (Table 1).
In the PlxyGV-B genome, there are 105 SNPs and six NPs involving multiple nucleotide alterations. The majority of the NPs identified in PlxyGV-W are also found in PlxyGV-B. Nineteen NPs cause amino acid changes in fourteen ORFs of the PlxyGV-B genome, namely ORF2 (p10), ORF20, f, lef2, ORF35 (mmp), dbp, ORF68 (ac145), ORF72 (hel-1), ORF73 (ac91), ORF84 (pif8), lef9, fgf, lef8 and ORF112 (ac53). lef2, lef9 and fgf each contain two nonsynonymous NPs, and ORF73 contains three (Table 1). As in PlxyGV-W, ORF73 contains an NP with an InDel of twelve nucleotides and an InDel of a single nucleotide causing a frameshift. The difference is that in PlxyGV-W ORF73 the sequencing reads missing the twelve nucleotides outnumber the reads containing them, whereas in PlxyGV-B ORF73 the reads missing the twelve nucleotides are far fewer than the reads containing them. In PlxyGV-B ORF20, there is an NP involving an InDel of eighteen nucleotides. The PlxyGV-Wn genome contains sixty-nine SNP sites. Only six SNPs in five ORFs cause amino acid alterations: ORF16 (pif5), ORF52 (ac38), ORF70 (38k), ORF73 (two SNPs) and ORF113 (Table 1). Most SNPs in the PlxyGV-Wn genome differ from those in the PlxyGV-W and PlxyGV-B genomes. The majority of the SNPs in all three PlxyGV isolates are nucleotide transitions.
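The two summary statistics reported above, G + C content and pairwise sequence identity, can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: the function names are hypothetical, and the choice to count a gap opposite a base as a mismatch is an assumption, since the text does not state how gaps were treated.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases A, C, G and T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / total if total else 0.0


def percent_identity(aln_a, aln_b):
    """Percent identity between two rows of a pairwise alignment.

    Assumption: columns that are gaps in both rows are ignored, and a gap
    opposite a base counts as a mismatch.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aln_a.upper(), aln_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy sequences, not PlxyGV data.
print(gc_content("ATGC"))
print(percent_identity("ACGT", "ACGA"))
```

Applied to whole-genome alignments, the same column-wise comparison would yield figures of the kind quoted above, although the exact values depend on the gap convention chosen.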
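The distinction between transitions and transversions used above can be made concrete with a short sketch. A transition exchanges two purines (A, G) or two pyrimidines (C, T); any other single-base change is a transversion. The function names and the toy substitution list are assumptions made for illustration, not data from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_counts(substitutions):
    """Count (transitions, transversions) in an iterable of (ref, alt) pairs."""
    ts = tv = 0
    for ref, alt in substitutions:
        if classify_substitution(ref, alt) == "transition":
            ts += 1
        else:
            tv += 1
    return ts, tv


# Hypothetical SNP list: two transitions and one transversion.
print(ts_tv_counts([("A", "G"), ("C", "T"), ("A", "T")]))
```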

Comparison of the genomes of PlxyGV-W and PlxyGV-B
The PlxyGV-B genome is 31 bp longer than the PlxyGV-W genome. The difference is mainly in the hr regions. The hr1, hr2, hr3 and hr4 of PlxyGV-B are 40 bp, 2 bp, 3 bp and 15 bp longer than those of PlxyGV-W, respectively. The sizes of all putative protein-coding ORFs of PlxyGV-B are the same as those of PlxyGV-W except for ORF20 and ORF73 (Table 2). Relative to PlxyGV-W ORF20, PlxyGV-B ORF20 has six amino acids deleted after AA119, and there are twelve additional amino acid variations between these two ORF20 homologs. As mentioned above, both the PlxyGV-W and PlxyGV-B ORF73 homologs have two NPs, at the AA68 and AA101 (PlxyGV-W)/94 (PlxyGV-B) sites. In PlxyGV-W ORF73, most sequencing reads have the extra twelve nucleotides encoding "TPPP" after AA68, and a small proportion of reads lack them. In contrast, in PlxyGV-B ORF73 most sequencing reads lack the twelve nucleotides and a small proportion of reads contain them. At the AA101/97 site of PlxyGV-W ORF73, most sequencing reads contain an "A" and a small proportion contain "AC", which causes a frameshift. In contrast, in PlxyGV-B ORF73 most sequencing reads contain "AC" and a small proportion contain "A". There are seven additional nonsynonymous variations in six ORFs between PlxyGV-B and PlxyGV-W (Fig 1 and Table 2).

Comparison of PlxyGV-Wn genome with the genomes of PlxyGV-W and PlxyGV-B
The PlxyGV-Wn genome is 58 bp and 27 bp longer than the genomes of PlxyGV-W and PlxyGV-B, respectively. As before, the differences are mainly due to differences in the lengths of the hrs. There are 486 nucleotide substitutions in total between the genomes of PlxyGV-Wn and PlxyGV-W, including seventy-three nonsynonymous point mutations in forty-seven ORFs (Table 1 (Table 2 and Fig 1). Relative to PlxyGV-B, PlxyGV-Wn ORF73 has four codons inserted after AA68 and a cluster of eight proline codons deleted after AA95. Unlike PlxyGV-B ORF20, which contains five nonsynonymous mutations relative to PlxyGV-Wn ORF20, PlxyGV-W ORF20 encodes the same amino acid sequence as PlxyGV-Wn ORF20. ORF61 and ORF99 contain one and two nonsynonymous variations, respectively, between PlxyGV-Wn and PlxyGV-W, whereas there is no difference in these two ORFs between PlxyGV-Wn and PlxyGV-B.
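The length differences quoted in this and the preceding section follow directly from the three reported genome sizes, as the short check below shows. The dictionary and function names are illustrative choices; only the genome lengths come from the text.

```python
from itertools import combinations

# Genome lengths in bp as reported for the three isolates.
GENOME_LENGTHS = {
    "PlxyGV-W": 100941,
    "PlxyGV-B": 100972,
    "PlxyGV-Wn": 100999,
}


def length_differences(lengths):
    """Absolute length difference in bp for every pair of isolates."""
    return {
        (a, b): abs(lengths[a] - lengths[b])
        for a, b in combinations(lengths, 2)
    }


for pair, diff in length_differences(GENOME_LENGTHS).items():
    print(pair, diff)
```

The three pairwise differences are 31 bp (PlxyGV-W versus PlxyGV-B), 58 bp (PlxyGV-W versus PlxyGV-Wn) and 27 bp (PlxyGV-B versus PlxyGV-Wn), matching the values stated in the text.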

PlxyGV-W and PlxyGV-B demonstrate higher insecticidal activity than PlxyGV-Wn against P. xylostella larvae
The infectivity of PlxyGV-W, PlxyGV-B and PlxyGV-Wn was tested in newly molted third-instar P. xylostella larvae by feeding the larvae viral OBs and determining the LC50 and ST50 in bioassays. As shown in Table 3, the LC50 of PlxyGV-Wn is about twice that of the other two virus isolates, while there is no significant difference between PlxyGV-W and PlxyGV-B. No significant difference in ST50 is detected among the three isolates at a concentration of 1×10⁹ OBs/ml (Table 4).

Comparison of the genome sequences of nine PlxyGV isolates
To investigate the diversity of PlxyGV isolates from different areas, the genome sequences of PlxyGV-W, PlxyGV-B and PlxyGV-Wn are compared with six additional complete PlxyGV   Table). Similar variation frequencies in hrs, other noncoding sequences and ORFs are also observed between the genomes of the other viral isolates. Base transitions account for most variations between the genomes of the virus isolates. Transitions are two to three times as frequent as transversions between the PlxyGV-W genome and the PlxyGV-B, -Wn, -K1, -C, -K, -M and -T genomes, and nine times as frequent between the PlxyGV-W and PlxyGV-SA genomes. Except for PlxyGV-K1, all eight other PlxyGV isolates have 118 putative protein-coding ORFs in common, and ORF organization is completely collinear among their genomes. The PlxyGV-K1 genome was reported to have 120 putative protein-coding ORFs. Sequence alignment shows that ORF38 and ORF39 of PlxyGV-K1 match the upstream and downstream sequences, respectively, of ORF38 in the other isolates. The difference results from a frameshift induced by a single nucleotide deletion in the genome of PlxyGV-K1 relative to the other PlxyGV isolates. Similarly, the sequences of PlxyGV-K1 ORF48 and ORF49 match the downstream and upstream sequences of ORF49 (p74) of the other PlxyGV isolates. This also results from a single nucleotide insertion/deletion between PlxyGV-K1 and the other isolates. Frameshift variations caused by single nucleotide changes between PlxyGV-K1 and the other isolates are also found in ORF9, ORF13 (odv-e18), ORF26 (ac23), ORF95 (lef3) and ORF108, resulting in changes in ORF size and in the predicted amino acid sequences. ORF9 in PlxyGV-K1 and PlxyGV-SA has thirteen extra codons at the N-terminus relative to its homologs in the other isolates, resulting from a C/T substitution at nt38 upstream of the first ATG of ORF9 in the other isolates, which creates a new start codon. 
A single nucleotide missing in the middle of PlxyGV-K1 ORF13 relative to its homologs in the other isolates results in a frameshift after aa48. An A/T substitution in ORF26 creates a stop codon immediately upstream of the second ATG relative to the other PlxyGV isolates, so that nine codons are missing at the N-terminus of PlxyGV-K1 ORF26. A C/A substitution converts the cysteine codon at aa298 in ORF95 of the other PlxyGV isolates into a stop codon in PlxyGV-K1 ORF95, making PlxyGV-K1 ORF95 forty aa shorter than its homologs in the other isolates. In addition, PlxyGV-K1 has a cluster of seventeen nucleotides missing in ORF108 after aa137 relative to the other isolates, which causes a frameshift and creates a stop codon immediately downstream of the deletion. Whether these differences between PlxyGV-K1 and the other virus isolates result from evolution or from sequencing error needs further verification. In addition, there is a cluster of eleven codons inserted within the C-terminal region of PlxyGV-SA ORF50 (p10) relative to its homologs in the other PlxyGV isolates. Apart from the ORFs described above, ORF73 and ORF20 are the most variable among the ORFs of the PlxyGV isolates (Fig 3). ORF73 is homologous to AC91. Homologs of this gene are found in the genomes of all Group I alphabaculoviruses and CpGV in addition to PlxyGV [42]. It is rich in proline and serine/threonine residues. The amino acid sequence from AA69 to AA74 of PlxyGV-SA ORF73 differs from that of the other isolates, while PlxyGV-B ORF73 lacks four amino acids in this region. There is a glutamine residue at position AA96 (AA92 for PlxyGV-B) in the ORF73 homologs of PlxyGV-W, -B, -Wn, -T and -K1. Following AA96 is a long cluster of repetitive proline residues that varies in number among the virus isolates. PlxyGV-K ORF73 lacks all the C-terminal sequence after AA102. The C-terminus of PlxyGV-W ORF73 is totally different from that of the other isolates due to the frameshift mentioned above. 
The ORF20 homologs consist of 223–235 amino acids. There are 4–6 repeats of "RCPSPR" and 4 "RC/SP/Q/S/ESPR/H" repeats in the middle region. PlxyGV-B and PlxyGV-SA ORF20 have two fewer copies of "RCPSPR", and PlxyGV-W and PlxyGV-Wn one fewer copy, than the other isolates.

Phylogenetic tree of PlxyGV isolates
A phylogenetic tree of the nine virus isolates was constructed using MEGA6 software with the neighbor-joining method, based on the concatenated amino acid sequences encoded by the thirty-eight baculovirus core genes [42], using Hyphantria cunea granulovirus (HycuGV) as an outgroup. HycuGV was shown to be the closest relative of PlxyGV [44]. In the process, PlxyGV-K1 ORF48 and ORF49 were merged into one ORF by filling in the single base missing relative to their homologs in the other viral isolates. The tree shows that PlxyGV-C clusters with PlxyGV-K. This cluster is near PlxyGV-M and PlxyGV-T. PlxyGV-W and PlxyGV-B are located in the same cluster. PlxyGV-Wn is more distant from PlxyGV-W and PlxyGV-B than the other isolates are, except for PlxyGV-SA. PlxyGV-K1 is closer to PlxyGV-T, -M, -C and -K than PlxyGV-Wn is, although they are in the same clade. PlxyGV-SA is relatively distant from the other isolates.

Discussion
In this study, we describe the genome sequencing and analysis of three PlxyGV isolates. PlxyGV-W was first isolated from a diamondback moth larva from cabbage fields in Wuhan city in the early 1980s. PlxyGV-B was originally isolated near Beijing in the early 1980s. PlxyGV-Wn was isolated in Wuhan in April 2018. The sequence data show that the genomes of PlxyGV-W and PlxyGV-B share a sequence identity of 99.8%, and the amino acid sequences encoded by these two viral genomes are almost identical except for variations in ORF20 and ORF73. Surprisingly, although PlxyGV-Wn was isolated from the same area as PlxyGV-W, it shares higher sequence identity with PlxyGV-K1, -T, -C, -K and -M than with PlxyGV-W and -B (Fig 2), and is more closely related to PlxyGV-K1, -T, -C, -K and -M than to PlxyGV-W and -B, as demonstrated by the phylogenetic tree (Fig 6). This implies that PlxyGV-Wn and PlxyGV-W originated from different populations, which may have emerged in Wuhan at different times. Located at the junction of the Yangtze and Han rivers, Wuhan is a transportation hub, which facilitates the introduction of species from different regions. In addition, PlxyGV-W and -B demonstrated higher insecticidal activity against diamondback moth larvae than PlxyGV-Wn. How the genomic sequence variations determine the differential insecticidal activity between PlxyGV-Wn and the other two virus isolates will require further investigation. Notably, the forty-eight ORFs containing nonsynonymous variations between PlxyGV-Wn and PlxyGV-W and/or PlxyGV-B include homologs of egt, the six per os infectivity factor genes pif0, pif1, pif2, pif5, pif6 and pif8, and odv-e66, an additional possible per os infectivity factor gene. egt encodes ecdysteroid UDP-glucosyltransferase, which blocks molting and pupation in infected larvae and thereby prolongs their feeding stage [45,46]. Per os infectivity factors are required for oral infection of insects [47][48][49][50]. 
Intra-isolate genetic diversity has previously been reported in many baculoviruses, and phenotypic differences have been observed between genotypes. For instance, twenty-five genotypic variants of a nucleopolyhedrovirus were identified and purified from a single Panolis flammea larva. Four of the genotypic variants were found to differ significantly in pathogenicity, speed of killing and yield [2]. Genome sequencing makes it possible to characterize the inter- and intra-isolate diversity of a single species. The complete set of NPs in a baculovirus genome was first identified in Mamestra configurata nucleopolyhedrovirus v90/4 [4]. It is considered that the presence of a pool of polymorphisms may provide an advantage in adapting to a changing environment. In this study, intra-isolate NPs are identified in the genomes of all three PlxyGV isolates. The NPs occurring in PlxyGV-W almost completely overlap with those in PlxyGV-B, although PlxyGV-B has more nucleotide polymorphisms than PlxyGV-W. Notably, the ORFs containing nonsynonymous polymorphisms include homologs of three DNA replication factors, HEL1, LEF5 and DBP, two late expression factors, LEF8 and LEF9, and the per os infectivity factor PIF9. The NP profile of the PlxyGV-Wn genome is totally different from those of the other two isolates. The ORFs with nonsynonymous polymorphisms in the PlxyGV-Wn genome also include a homolog of a per os infectivity factor, PIF5.
Genome sequence comparison of the nine PlxyGV isolates reveals the high genetic stability of PlxyGV. These isolates are from five areas in four countries, but they show limited variation in genome size and nucleotide sequence. The maximal length difference is only sixty-three base pairs, between PlxyGV-W/PlxyGV-SA (100,941 bp) and PlxyGV-T (101,004 bp). The minimum sequence identity is 98.6 percent, between PlxyGV-SA and the four isolates from mainland China and the one from Japan. No gain or loss of putative protein-coding ORFs was identified among the viral isolates. The high genetic stability of PlxyGV supports the stability and specificity of its control effect on the diamondback moth and is helpful for the commercialization of PlxyGV insecticides. In addition, it facilitates the construction of recombinant PlxyGVs with enhanced insecticidal activity through genetic manipulation, since the superior properties of engineered viruses are less likely to be lost or changed.
Previously reported genomes of different geographic isolates of the same baculovirus species usually vary in gene content, frequently in regions associated with bro genes [4,51]. PlxyGV lacks bro homologs. Similarly, seven Erinnyis ello granulovirus (ErelGV) field isolates also have common ORF content and organization, but all of them were isolated in Brazil [7]. As in other baculoviruses, NPs in the PlxyGV genome are mainly present in hrs and in two ORFs containing repetitive sequences. All the ORFs with the highest levels of nonsynonymous mutations have unknown functions. The paralogous genes p10 and the ac145/ac150 homologs show relatively high levels of nonsynonymous mutations. Baculoviruses have thirty-eight core genes whose homologs are present in all baculovirus genomes sequenced to date [42]. In general, the PlxyGV core gene homologs contain low levels of nonsynonymous variation among the nine viral isolates. A similar phenomenon was also observed in ErelGV isolates [7].
Thirteen ORFs specific to PlxyGV are conserved in all the PlxyGV isolates. The transcription start sites (TSS) and polyadenylation sites (PAS) of these ORFs were identified by RACE analysis. The data suggest that all these PlxyGV-specific ORFs are transcribed during infection. Seven of these PlxyGV-unique ORFs have no nonsynonymous variation among the PlxyGV isolates, suggesting that these genes play important roles in the replication and infection of PlxyGV. Notably, none of the PlxyGV-specific genes was found to start transcription from late promoter motifs. We are not sure whether these results reflect the real situation: if the levels of some transcripts starting from TAAG motifs were very low, they might not have been detected.
The PlxyGV isolates analyzed in this study are from five geographically separate areas: mainland China, Taiwan, Japan, Malaysia and South Africa. Phylogenetic analysis shows that PlxyGV-SA is distantly related to the other isolates, which may reflect its geographic distance from them. PlxyGV-M from Malaysia, PlxyGV-C from mainland China and the two isolates from Taiwan, PlxyGV-K and -T, are closely related. However, PlxyGV-C is distantly related to the other three isolates from mainland China. It is likely that some isolates recently migrated from one area to another.
Supporting information
S1 Table. Mutation frequency of eight PlxyGV isolates relative to the PlxyGV-W genome sequence in coding, noncoding and hr regions (×10⁻³). (DOCX)
S2 Table. The transcription start sites and polyadenylation sites of PlxyGV-specific genes. (DOCX)