Genetic Diversity and Molecular Evolution of Plum bark necrosis stem pitting-associated virus from China

Plum bark necrosis stem pitting-associated virus (PBNSPaV), a member of the genus Ampelovirus in the family Closteroviridae, infects different Prunus species and has a worldwide distribution. Yet the population structure and genetic diversity of the virus are still unclear. In this study, sequence analyses of a partial heat shock protein 70 homolog (HSP70h) gene and the coat protein (CP) gene of PBNSPaV isolates from seven Prunus species grown in China revealed a highly divergent Chinese PBNSPaV population, sharing nucleotide similarities of 73.1–100% in the HSP70h gene and 83.9–98.6% in the CP gene. Phylogenetic analysis of HSP70h and CP sequences revealed segregation of global PBNSPaV isolates into four phylo-groups (I–IV), of which two newly identified groups, II and IV, comprised solely Chinese isolates. Complete genome sequences of three PBNSPaV isolates, Pch-WH-1 and Pch-GS-3 from peaches, and Plm-WH-3 from a plum tree, were determined. The three isolates showed overall nucleotide identities of 90.0% (Pch-GS-3) and 96.4% (Pch-WH-1) with the type isolate PL186, and the lowest identity of 70.2–71.2% with isolate Nanjing. For the first time, to the best of our knowledge, we report evidence of significant recombination in the HSP70h gene of PBNSPaV variant Pch2, detected using five programs implemented in RDP3; in addition, five codon positions in the CP gene (3, 8, 44, 57, and 88) appeared to be under positive selection. Collectively, these results indicate a divergent Chinese PBNSPaV population and provide a foundation for elucidating the epidemiological characteristics of the virus population.


Introduction
Plum bark necrosis stem pitting-associated virus (PBNSPaV) is a member of the genus Ampelovirus in the family Closteroviridae [1]. The stem-pitting disease of sweet cherry trees was described in the 1960s in North America [2,3]. Subsequently, the disease has been observed in many cultivated and ornamental Prunus species, including Japanese plum (Prunus salicina), apricot (P. armeniaca), peach (P. persica), cherry (P. avium), and almond (P. dulcis) [4][5][6][7][8][9]. To date, the disease has been reported in Italy [4], Morocco [10], Serbia [11,12], Jordan [13], Egypt [14], Turkey [15], France [16], and China [17]. Some diseased trees show decline, gummosis, flattening of scaffold branches, severe necrosis of bark tissues, and necrotic pitting on the woody cylinders [18,19]. The plum bark necrosis-stem pitting disease was first confirmed to be graft-transmissible on a Japanese plum cv. Black Beaut grown in the United States [20]. High-molecular-weight double-stranded RNAs (dsRNAs) were first recovered from symptomatic cherry trees in California [21]. The first successful amplification of cDNA fragments, using degenerate primers targeting the HSP70h gene (a homolog of the heat shock protein 70 gene) of viruses in the family Closteroviridae, identified the virus as a closterovirus, and it was named Plum bark necrosis stem pitting-associated virus (PBNSPaV) [7]. The association of the virus with the PBNSP disease of stone fruit trees was further confirmed by using degenerate primers, and primers designed within the fragment amplified by the degenerate primers [4,22]. The first complete genome sequence of a PBNSPaV isolate, PL186, was determined by Al Rwahnih et al. [23]. Its positive-sense, single-stranded RNA genome of 14,214 nucleotides (nts) consists of seven open reading frames (ORFs) and two untranslated regions (UTRs) at the 5′ and 3′ termini.
ORFs 1a and 1b encode a large polyprotein with a molecular mass (Mr) of 259.6 kDa, containing conserved domains characteristic of a papain-like protease, a methyltransferase, and a helicase (ORF1a), and a 64.1-kDa RNA-dependent RNA polymerase (RdRp) (ORF1b), respectively. ORF2 and ORF4 encode a 6.3-kDa hydrophobic protein and a 61.6-kDa protein of unknown function, respectively. ORF3 encodes a 57.4-kDa heat shock protein homolog (HSP70h). ORF5 and ORF6 encode a 35.9-kDa capsid protein (CP) and a 25.2-kDa minor capsid protein (CPm), respectively. The molecular characterization of the virus supported its classification in the genus Ampelovirus [23]. Recently, the complete genome sequences of four PBNSPaV isolates have been determined by pyrosequencing, and genome sequence comparison  [16].
China is an important country for fruit production, and stone fruit trees and ornamental trees in the genus Prunus are grown in almost all regions of China. However, the incidence and distribution of PBNSPaV in the various Prunus species grown in China are unknown. Recently, during field investigations of the incidence of viral diseases in stone fruit trees in China, some cultivated and ornamental Prunus species were found to show trunk gummosis and stem pitting or grooving, and some plum trees were found to have died from the disease. The presence and divergence of PBNSPaV isolates in China were first confirmed by reverse transcription (RT)-PCR and sequencing of a 590-nt fragment of the HSP70h gene [17]. The origins and evolutionary status of PBNSPaV isolates are poorly understood. Given the apparent effect of the disease associated with PBNSPaV on fruit production, it is imperative to better understand the genetic variation and population structure of PBNSPaV. In this study, PBNSPaV isolates from different Prunus species grown in China were characterized by sequencing their CP and HSP70h genes; the complete genomes of three representative isolates were also sequenced. The objectives of this study were (i) to determine the incidence of PBNSPaV in Prunus trees grown in China, and (ii) to characterize the genetic variability of the PBNSPaV isolates obtained from different Prunus species and from various regions of China. This study provides information for understanding molecular evolution within the global PBNSPaV population, which will be helpful for epidemiological investigations and for developing more efficient molecular detection methods for the diagnosis of the viral disease.

RT-PCR analysis shows widespread PBNSPaV infection in stone fruit trees
RT-PCR using the primer pair PBN-HSP-P1/PBN-HSP-P2 showed that of the 256 samples analyzed, 52 samples (20.3%), including 25 from peach, 5 from flowering peach, 1 from nectarine, 10 from plum, 1 from apricot, 7 from cherry, and 3 from flowering cherry, were positive for PBNSPaV. In peach, flowering peach, and plum trees, PBNSPaV was usually associated with symptoms such as stem pitting and bark cracks on the trunks (Table 1, Fig. 1), whereas other trees were symptomless or showed only dark-colored gummosis.
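The incidence figures can be re-derived from the counts given in the text. The following is a minimal sketch that pairs the per-host positives reported above with the per-host sample sizes listed under "Sample collection"; the dictionary names and the helper `incidence` are illustrative and not part of the original analysis.

```python
# Samples collected per host (from "Sample collection") and RT-PCR positives
# per host (from the results paragraph). Variable names are illustrative.
collected = {
    "peach": 98, "flowering peach": 24, "nectarine": 7, "plum": 49,
    "apricot": 18, "sweet cherry": 50, "flowering cherry": 10,
}
positive = {
    "peach": 25, "flowering peach": 5, "nectarine": 1, "plum": 10,
    "apricot": 1, "sweet cherry": 7, "flowering cherry": 3,
}


def incidence(pos: int, total: int) -> float:
    """Return the percentage of positive samples."""
    return 100.0 * pos / total


total_collected = sum(collected.values())
total_positive = sum(positive.values())
overall = incidence(total_positive, total_collected)
per_host = {host: incidence(positive[host], collected[host]) for host in collected}

print(f"overall: {total_positive}/{total_collected} = {overall:.1f}%")
for host, pct in sorted(per_host.items(), key=lambda kv: -kv[1]):
    print(f"{host:>16}: {pct:.1f}%")
```

Breaking the overall figure down by host in this way shows which species contribute most to the reported incidence.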

Phylogenetic analysis for HSP70h and CP sequences of PBNSPaV isolates reveals existence of divergent variants
The sequences of 87 HSP70h clones from 31 isolates and 61 CP clones from 38 isolates were determined (Table 1). The product sizes for the HSP70h and CP genes were 590 bp and 1041 bp (including 978 bp of the complete CP gene and 48-63 bp of the intergenic region preceding the CP gene), respectively. The GenBank accession numbers for the determined sequences are KJ792814 to KJ792828 for the CP gene and KJ792829 to KJ792851 for the HSP70h gene. Sequence comparison showed that PBNSPaV isolates from China were highly divergent. Phylogenetic analysis of the HSP70h nucleotide sequences from Chinese isolates determined in the present study, those previously reported by our group, and eight isolates available in GenBank (Table S1 in File S1) using the neighbor-joining method revealed four major groups (designated as groups I-IV) with strong (100%) bootstrap support values (Fig. 2A). Among those groups, group I and group II are separated by a bootstrap value of 81%, and group IV is separated from the isolate Nanjing (KC590347) by a bootstrap value of 100%. Group I was the largest and contained isolates or variants from all species sampled (81.3% of all obtained sequences) and four sequences (AJ305307, AF159901, EF546442, and KC590344) retrieved from GenBank. Group II contained only two Chinese isolates, one from a plum sample and one from a peach sample. Group III contained six sequences, including one sequence (Plm-WH-3-1) determined in the present study, three sequences (Fch-WH-2-1, Ch-WH-1-1, Ch-WH-1-2) previously reported by our group (Cui et al. [17]), and two sequences (KC590345 and KC590346) of isolates Pair-2 and PR258-2 from France, recently reported by Marais et al. [16]. Therefore, group III appears to correspond to Group 2 identified by Marais et al. [16], based on nucleotide sequences of PBNSPa-3f/2r (HSP70h).
Four isolates from peach grown in Gansu province formed a separate group, group IV, which was most closely related to isolate Nanjing (KC590347), sharing 81.6-82.4% nt and 92.3-92.9% aa similarities.
The phylogenetic tree inferred from the nucleotide sequences of the complete CP gene sequences showed a similar topology to the tree inferred from the HSP70h sequences (Fig. 2B); most sequences clustered in a large group with sequence EF546442, obtained from the plum PBNSPaV type isolate PL186. However, no group corresponding to the HSP70h group IV was identified in the analysis of the viral CP sequences.
Sequence alignment of the intergenic spacer (IS) region immediately preceding the start codon of the CP gene showed that the region was highly variable, rich in the bases A and T, and varied in size from 48 nts to 63 nts (Fig. S1 in File S2). In the IS-based phylogenetic tree, most isolates occupied positions similar to those they occupied in the CP-based phylogenetic tree, indicating that the IS sequence diversity of PBNSPaV isolates may reflect the phylogenetic relationships between viral variants.
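The nucleotide similarity ranges quoted in this section are pairwise comparisons over aligned sequences. As a minimal sketch, assuming pre-aligned sequences of equal length and skipping gapped columns (the text does not state how gaps were treated), percent identity can be computed as follows. The toy sequences are hypothetical and are not PBNSPaV data.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped, which is
    an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment used only to exercise the function.
toy = {
    "variant_A": "ATGGCTAGCTAGGATCC-TA",
    "variant_B": "ATGGCTAGTTAGGATCCATA",
    "variant_C": "ATGACTAGTTAGCATCC-TA",
}

pairwise = {
    (m, n): percent_identity(toy[m], toy[n]) for m, n in combinations(toy, 2)
}
lowest, highest = min(pairwise.values()), max(pairwise.values())
print(f"identity range: {lowest:.1f}-{highest:.1f}%")
```

Reporting the minimum and maximum over all pairs gives a range of the same kind as those cited for the HSP70h and CP genes.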

Genome sequences of three PBNSPaV isolates
The complete genome sequences of Pch-WH-1, Plm-WH-3, and Pch-GS-3 consisted of 14,208, 14,211, and 14,211 nts, respectively (Table 2). The sequences have been deposited in the GenBank database with accession numbers KJ792852, KJ792853, and KJ792854. The Pch-WH-1 and Plm-WH-3 isolates had highly similar genome sequences, with 94.8% identity at the nt level and over 98% identity at the aa level, except for the polymerase encoded by ORF1. The isolate Pch-GS-3 shared only 88.9-89.7% nt identity with isolates Plm-WH-3 and Pch-WH-1. The genome-wide nt sequence identities between these three isolates and the type PBNSPaV isolate PL186 (EF546442) were 90.0% (Pch-GS-3) and ~96% (Pch-WH-1 and Plm-WH-3), and these three isolates showed low identity (70.2-71.2%) to isolate Nanjing (KC590347). The genome structure of these three isolates was identical to that of the type PBNSPaV isolate (EF546442), and consisted of seven ORFs. Each ORF was the same size as the corresponding ORF in EF546442, except ORF1b, which was three nts smaller, and the gene encoding the minor coat protein (CPm), which was 16 nts longer than the corresponding genes of isolates Pair-2 and Nanjing. In general, the 3′-UTRs were highly conserved and were 99.0% identical to those of EF546442. ORF1 showed the highest variability, and the three isolates showed 88.2-95.8% nt and 86.5-96.4% aa sequence identities overall to EF546442, and only 66.3-67.6% nt and 67.3-70.6% aa sequence identities to the isolate from Nanjing. The CP and p6 proteins appeared to be the most highly conserved, sharing over 93% and 91.2% aa identities among these three isolates and the four reference PBNSPaV isolates. The Pch-GS-3 isolate was the most divergent of the three isolates sequenced here. The 5′-UTR of the Pch-GS-3 isolate was only 87.8% identical to that of EF546442, and its seven ORFs showed overall 88.2-94.3% nt identities to those of EF546442.
Phylogenetic analysis of the complete genomic sequences of the three isolates examined in the present study and the five isolates available in GenBank revealed four groups with 100% bootstrap support values (Fig. 3). Pch-WH-1 and Plm-WH-3 clustered with the type isolate, forming group I; Pch-GS-3 clustered with isolate Ta Tao 25, from a Chinese Ta Tao sample, in group II; two isolates (Pair-2 and PR258-2) from France formed group III; and one isolate from Nanjing, China [16], formed group IV.

Genetic parameters confirm the genetic diversity of the PBNSPaV population
The genetic distance within and between HSP70h groups was calculated using MEGA 5 [24]. The within-group genetic distance was highest in group I (0.052) and lowest in group III (0.009). The between-group genetic distance ranged from 0.167 to 0.342 (Table 3). Group IV, consisting of peach isolates from Gansu province, was found to have the greatest genetic distance (0.332-0.342) from the other three groups. The Nanjing isolate, which was not included in any of the groups, showed the greatest genetic distance (0.335) from group I and the lowest distance (0.211) from group IV. These data strongly supported the topology of the tree inferred from the HSP70h sequences, with four major groups.
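Within- and between-group distances of this kind are averages of pairwise distances. The sketch below uses the uncorrected p-distance (proportion of differing sites) as a stand-in, because the substitution model applied in MEGA 5 is not stated in the text; the group names and sequences are hypothetical.

```python
from itertools import combinations, product


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences (gaps skipped)."""
    sites = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in sites) / len(sites)


def mean_within(seqs: list[str]) -> float:
    """Mean pairwise distance among sequences of one group."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group_1: list[str], group_2: list[str]) -> float:
    """Mean distance over all pairs drawn from two different groups."""
    pairs = list(product(group_1, group_2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy groups, not real PBNSPaV sequences.
groups = {
    "group_X": ["ATGGCTAGCT", "ATGGCTAGTT", "ATGGCAAGCT"],
    "group_Y": ["ATGACTCGTA", "ATGACTCGCA"],
}

within_x = mean_within(groups["group_X"])
within_y = mean_within(groups["group_Y"])
between_xy = mean_between(groups["group_X"], groups["group_Y"])
print(f"within X: {within_x:.3f}, within Y: {within_y:.3f}, between: {between_xy:.3f}")
```

A between-group mean that is clearly larger than both within-group means is the pattern that supports treating the groups as distinct clusters.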
HSP70h and CP differed in diversity, polymorphic sites, and number of haplotypes (Table 4). An analysis of all available data showed that 47.8% of the nucleotides in HSP70h were polymorphic and 39.1% were parsimony-informative, with a diversity statistic of 0.10521. In the CP gene, 44.8% of the nucleotides were polymorphic and 28.7% were parsimony-informative, with a diversity statistic of 0.06582. These results confirmed that HSP70h was more variable than CP.
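The site categories reported above follow standard definitions: a site is polymorphic when at least two different nucleotides occur in the alignment column, and parsimony-informative when at least two nucleotides each occur in at least two sequences. A minimal sketch on a hypothetical toy alignment (not the data in Table 4):

```python
from collections import Counter


def classify_sites(alignment: list[str]) -> tuple[int, int]:
    """Return (polymorphic, parsimony_informative) site counts.

    Characters other than A, C, G, T (gaps, ambiguity codes) are ignored
    within each column; this treatment is an assumption of the sketch.
    """
    polymorphic = 0
    informative = 0
    for i in range(len(alignment[0])):
        counts = Counter(seq[i] for seq in alignment if seq[i] in "ACGT")
        if len(counts) >= 2:
            polymorphic += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return polymorphic, informative


# Hypothetical toy alignment of four sequences.
toy_alignment = ["ATGGCTAGCT", "ATGGCTAGTT", "ATGACTCGTT", "ATGACTAGCA"]

n_poly, n_info = classify_sites(toy_alignment)
n_sites = len(toy_alignment[0])
print(f"polymorphic: {100 * n_poly / n_sites:.1f}%")
print(f"parsimony-informative: {100 * n_info / n_sites:.1f}%")
```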
The population genetic parameters of the partial HSP70h (590 nt) and CP gene sequences obtained in this study and those available in GenBank were calculated using Dnasp5 (Table 5). The overall nucleotide diversity (π) values for HSP70h and CP were found to be less than 1 and the haplotype diversity (Hd) values were found to be 0.996 and 1.000, indicating low genetic diversity. Negative values were obtained for Fu and Li's D test, suggesting that the viral population may be in expansion; however, these values were not statistically significant. The ratios of non-synonymous (dN) to synonymous (dS) nucleotide substitutions (dN/dS) for HSP70h and CP were 0.083 and 0.099, respectively, much less than 1, suggesting that the two proteins have been under strong purifying selection.
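For reference, the two diversity statistics can be written out directly. This is a minimal sketch using Nei's standard estimators, with nucleotide diversity as the mean number of pairwise differences per site and haplotype diversity as n/(n-1) × (1 − Σ p_i²). It is not the DnaSP implementation, and the sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean number of pairwise nucleotide differences per site."""
    n_sites = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * n_sites)


def haplotype_diversity(seqs: list[str]) -> float:
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


# Hypothetical toy sample of four distinct sequences.
toy_sample = ["ATGGCTAGCT", "ATGGCTAGTT", "ATGACTCGTT", "ATGACTAGCA"]

pi = nucleotide_diversity(toy_sample)
hd = haplotype_diversity(toy_sample)
print(f"pi = {pi:.5f}, Hd = {hd:.3f}")
```

Under this estimator, Hd reaches 1 only when every sampled sequence is a distinct haplotype, whereas π depends on how many sites differ between them.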

Selective pressure analyses identified positively selected codons in the PBNSPaV CP gene
Furthermore, the strength of the selective forces acting on codons of the two genes was analyzed using three complementary methods, namely, SLAC, FEL, and IFEL [25]. The results showed that the distribution of codon positions under purifying, neutral, and positive selection differed between the two genes, indicating that the two genes had been exposed to different selective pressures (Table 6). The CP gene contained more neutral codons than .
Recco analysis showed that the recombination junction was between nt 313-331 (Fig. S2 in File S2), with Pch3 (JF810179) and Pch1 (JF810177) identified as the major and minor parents, sharing 98.73% and 93.82% nt, and 98.91% and 97.81% aa similarities, respectively. Moreover, two other HSP70h variant sequences, Apr-YT-1-3 from apricot and Pch2-3 from peach, showed potential evidence of recombination based on Recco analysis (Fig. S2 in File S2). Of those recombinants, variants Pch2-3 and Pch2 (JF810178) were obtained from the same peach plant. However, variant Pch2-3 showed evidence of a different recombination junction at nts 76-94, with Pch2-1 and Pch6-3 as its parents. The recombination junction of Apr-YT-1-3 was found to be between nts 233 and 245; Ch-YT-2-1 was identified as one of its parents, and the other parent was unknown.

Discussion
Of the 256 Prunus samples examined in the present study, 52 were found to be infected by PBNSPaV, representing an incidence of 20.3% in stone fruit trees, similar to results obtained for samples collected in Italy [26], but much higher than the incidence identified in other studies [16]. The wide range of stone fruit host species identified and the high incidence of PBNSPaV suggest that the virus may be efficiently transmitted in nature, similar to other closteroviruses.
To date, only a few partial sequences of the HSP70h gene of PBNSPaV are available, and no studies have been conducted on the phylogenetic characteristics of the CP gene, with the exception of the six fully sequenced isolates. In this study, sequence analysis of a 590-bp fragment of HSP70h from 31 isolates and the complete CP gene from 38 isolates shows that the Chinese PBNSPaV population consists of highly divergent isolates, which can be divided into four distinct HSP70h clusters. Two HSP70h groups, II and IV, consist solely of Chinese PBNSPaV isolates sequenced in the present study and are likely to represent two novel phylo-groups. In particular, group IV, consisting of molecular variants of four peach isolates from Gansu province, was the most divergent from the other groups, with genetic distances of 0.332-0.342, and a distance of 0.207 from the Nanjing isolate. Unfortunately, CP gene sequences of the isolates in HSP70h group IV could not be obtained, although several primer pairs were designed in attempts to amplify their CP gene. The failure of primers that successfully amplified most isolates to amplify the CP gene of isolates Pch-GS-1, 2, 3 and 4 suggests a high level of sequence divergence in the genomes of these isolates compared with the other isolates. Marais et al. experienced a similar problem [16]. The similar topologies of the phylogenetic trees constructed using all available CP and HSP70h sequences from PBNSPaV isolates indicate a co-evolutionary tendency between the two genes [27]. However, phylogenetic analyses did not show a clear relationship between the genetic variability of PBNSPaV isolates and their geographical or host origin, except that a few sequences (Pch-GS-1, 2, 3 and 4) from peach grown in Gansu formed a separate group. Furthermore, to the best of our knowledge, our study is the first to show the occurrence of mixed infections of highly divergent variants of PBNSPaV in individual host plants.
The Pch-GS-3 and Pch-XN-3 isolates contained HSP70h variants in groups I and IV and CP variants in groups I and II, respectively. The presence of divergent variants in a single host plant may result from separate infection events during horticultural operations or from vector transmission, if a vector exists. Mixed infections with divergent variants could increase viral genotypic complexity, with implications for phylogenetic analysis and the evolutionary history of the virus.
Analysis of the complete genome sequences of three Chinese PBNSPaV isolates determined in this study, combined with two Chinese PBNSPaV isolates (Ta Tao 25 and Nanjing) sequenced by other researchers [16], also showed the genetic complexity of the Chinese PBNSPaV population. The genomic sequences of Pch-WH-1 from a peach and Plm-WH-3 from a plum were closely related to the type isolate PL186, sharing 96% overall nucleotide identity. However, Pch-GS-3 was found to be closely related to Ta Tao 25 (92.7% overall nucleotide identity), but less closely related to Plm-WH-3 and Pch-WH-1. Although Pch-WH-1 and Plm-WH-3, recovered from the same region (Wuhan), share a high similarity of 94.8%, we cannot postulate that the sequence similarity is related to their shared geographic origin. This similarity could instead reflect the origin of their ancestral isolate, from which variants diverged during dissemination. In addition, although it has been observed that PBNSPaV infection can induce severe symptoms or be latent, more extensive studies are needed to reveal the relationships between host phenotypes and viral genotypes. Moreover, the clustering pattern of the complete genome sequences was substantially concordant with the topology of the HSP70h-derived trees, indicating that the evolutionary relationship of global PBNSPaV isolates can be reliably inferred using HSP70h sequences.
Table 3. Genetic distances between and within variant groups of PBNSPaV partial HSP70 gene and complete CP gene.
Recombination is a powerful driving force for generating new variants in RNA viruses [28][29][30]. Recombination events have been reported to be evolutionarily important in shaping the genomes of some viruses in the family Closteroviridae [27]. Multiple recombination events have been identified between sequence variants of Citrus tristeza virus [31][32][33]. As more sequences of PBNSPaV variants become available, we may be able to identify more potential recombination events between different variants. The presence of divergent PBNSPaV variants and co-infection with different variants enable the occurrence of recombination events. For the first time, we report evidence of significant recombination in the HSP70h gene of PBNSPaV variant Pch2. Another HSP70h variant, Pch2-3, also from the Pch2 isolate, and two other variants, Apr-YT-3-1 and Pch-XN-4-6, also showed clear crossover sites at regions having high sequence similarity with their parental variants [34], although these were not detected by the programs implemented in RDP3. However, no recombination events were detected in the CP gene. The results indicate that the HSP70h gene could be a hotspot of recombination in the PBNSPaV genome; our results contrast with those from GLRaV-3, which has been shown to contain recombination hotspots in its CP gene [35], although both viruses are in the same genus. The HSP70h genes of plant viruses in the family Closteroviridae have multiple biological functions and are used for phylogenetic classification of viruses in this family [1]. Most plant closteroviruses are insect-vector transmissible [27]. Thus, their capsid proteins play important roles in their ability to survive in both plant and insect environments. To date, no insect vector has been identified for PBNSPaV.
Although mealybugs were frequently observed on PBNSPaV-infected fruit trees during this investigation, no insects were found to be PBNSPaV-positive in RT-PCR tests (data not shown). Some studies have indicated that vector-borne plant viruses are subjected to greater purifying selection on their capsid proteins than non-vectored viruses [36]. The population genetic parameters of both the HSP70h and CP genes suggest the possibility of a PBNSPaV population in expansion. Given the low dN/dS ratios found for HSP70h and CP, combined with the large proportion of negatively selected sites identified in both genes, it appears that the virus may have undergone purifying selection, which is the primary evolutionary force acting on many plant viruses [37]. In addition, although HSP70h is more variable than CP, as shown in Table 4, the higher dN/dS value and the presence of positively selected sites in the CP gene indicate that the CP gene is under stronger selection pressure than the HSP70h gene. The large proportion of negatively selected sites in these two genes is in accordance with the expectation that most mutations are deleterious or lethal to plant RNA viruses due to the compactness of viral genomes [38]. The presence of a small number of positively selected sites in the CP gene might be necessary for adaptation to different biological conditions. Since the successful infection of hosts by a virus depends on multiple interactions between the virus and its host, beneficial mutations (positively selected sites) may affect interactions with host receptors and other host-specific molecules [39]. Extensive investigations will be necessary to generate detailed information about the viral population structure and the biological and epidemiological implications of its genetic diversity and possible transmission vectors.
Because the genetic diversity within global PBNSPaV populations has not been studied extensively, the results of this study provide important information on the genetic diversity of the Chinese PBNSPaV population, will improve understanding of the epidemiology of the virus on a global scale, and give insights into viral disease management [40]. From a practical perspective, the genetic diversity data generated in this study show that the HSP70h-specific primer pair is more robust, detecting more PBNSPaV isolates than the primers used to amplify the CP gene. Considering the economic impact of the virus on many stone fruits [41], it is necessary and urgent to establish an efficient scheme for reducing the global transmission of the virus via propagation materials.

Sample collection
In total, 256 samples from 7 species of Prunus, including 98 from peach (P. persica), 49 from plum (P. domestica), 50 from sweet cherry (P. avium), 10 from flowering cherry (P. serrulata), 24 from flowering peach (P. persica), 18 from apricot (P. armeniaca), and 7 from nectarine (P. persica var. nucipersica), were collected in China between 2009 and 2013. All sample collections were conducted with approval from local institutes, and no specific permissions were required for these locations/activities. The study did not involve endangered or protected species. Symptoms consisting of stem pitting on trunks with thick corky bark, dark-colored gummosis or spongy texture, and severe cracks in the bark were observed on the trunks of some peach and plum trees. Virus-free seedlings of peach GF305 were used as negative controls in all tests for the detection of the virus.
The complete genome sequences of three PBNSPaV isolates, namely, Pch-WH-1, Plm-WH-3, and Pch-GS-3, were determined. Isolates Pch-WH-1 and Plm-WH-3 were collected from a peach tree and a plum tree grown in Wuhan city in central China, and the isolate Pch-GS-3 was taken from a peach tree grown in Gansu province in western China. Based on our preliminary HSP70h sequencing results, the three isolates fell into three phylogenetic clades or sub-clades. Both the peach tree infected by Pch-WH-1 and the plum tree infected by Plm-WH-3 showed visible symptoms (Table 1), and they represent two important host species of the virus.

RT-PCR and sequencing of HSP70h and CP genes
Total RNA was extracted from leaves using a CTAB protocol, as described by Li et al. [42]. The presence of PBNSPaV was tested by reverse transcription (RT)-PCR using the primer set PBN-195-F/PBN-195-R (5′-CTGGTCTTCCTGCTACTCCTT-3′/5′-CGCTCTGAGATTGTGGGCTT-3′), designed for the detection of the coat protein (CP) gene of the virus [17].
The primer pairs PBN-HSP-F/PBN-HSP-R (5′-GGAATTGACTTCGGTACAAC-3′/5′-TTCGGTGGTGGTACTTTCGA-3′) and PBN-CP-F/PBN-CP-R (5′-TCTTGTTGGATCGGGGAATA-3′/5′-CATCTTCCACCGGACTGATT-3′), designed based on the corresponding sequences of the American PBNSPaV isolate PL186, were used for the amplification of partial HSP70h gene sequences and complete CP gene sequences, respectively. Reverse transcription was performed at 37 °C for 1.5 h using 3 µL of total RNA and 1 µL of random primer in a 20-µL reaction volume with Moloney murine leukemia virus (M-MLV) reverse transcriptase (Promega, Madison, WI, USA), according to the manufacturer's protocols. PCR was performed in a 25-µL reaction mixture containing 2.5 µL of 10× PCR buffer, 0.5 mM dNTPs, 0.5 µM of each primer, one unit of Taq DNA polymerase (TaKaRa, Dalian, China), and 3 µL of cDNA. PCR was performed using an iCycler Thermocycler (Bio-Rad, Hercules, CA). The PCR profile employed for all primer sets consisted of an initial denaturation at 95 °C for 3 min, followed by 35 cycles of 95 °C for 30 s, 52 °C for 30 s, and 72 °C for 1 min, and a final extension for 10 min at 72 °C. The PCR products were separated by electrophoresis on a 1.2% agarose gel, stained with ethidium bromide, and visualized under UV light.
PCR products were gel purified and ligated into the pMD18-T vector (TaKaRa, Dalian, China), following the manufacturer's instructions. The recombinant plasmids were identified after transformation into Escherichia coli DH5α. To obtain a view of the molecular composition within each isolate, at least three positive clones of each product were sequenced at Shanghai Sangon Biological Engineering & Technology and Service Co. Ltd, Shanghai, China.

Determination of complete genome sequences
For the amplification of PBNSPaV genomes, primer sets (Table S2 in File S1) were designed based on the CP and HSP70h gene sequences obtained, and then the amplifications were extended toward the 3′ and 5′ ends using primer sets designed based on the sequences obtained here and the genome sequence (EF546442) of the first PBNSPaV isolate PL186, available in GenBank. 5′- and 3′-RACE reactions were attempted using an Invitrogen GeneRacer Kit (Invitrogen, USA), according to the manufacturer's instructions. PCR solutions and conditions were similar to those mentioned above, except that 2 mM of each dNTP was used in a 25-µL reaction volume, with annealing for 45 s at 50-54 °C (depending on the primer set used in each reaction) and extension for 1-3 min (depending on the size of the PCR product) at 72 °C. All products were cloned and sequenced as mentioned above. To overcome the difficulties caused by intra-isolate sequence diversity and to avoid errors in sequence assembly, adjacent amplicons overlapped by >100 bp, and at least three clones of each PCR product were sequenced. The obtained sequences were assembled into a contiguous sequence, requiring over 99.9% similarity in each overlapping region.
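The assembly criteria stated above (an overlap of more than 100 bp and a similarity of at least 99.9% within the overlap) can be expressed as a simple check. The sketch below is not the authors' assembly procedure. It assumes the overlap length is already known, compares the end of one amplicon with the start of the next, and interprets "over 99.9%" as a threshold of 99.9% or higher. The function names and the synthetic amplicons are hypothetical.

```python
import random


def overlap_identity(left: str, right: str, overlap: int) -> float:
    """Percent identity between the last `overlap` nt of `left`
    and the first `overlap` nt of `right`."""
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit within both sequences")
    tail, head = left[-overlap:], right[:overlap]
    matches = sum(x == y for x, y in zip(tail, head))
    return 100.0 * matches / overlap


def can_join(left: str, right: str, overlap: int,
             min_overlap: int = 100, min_identity: float = 99.9) -> bool:
    """Apply the two criteria: overlap > min_overlap and identity >= min_identity."""
    return overlap > min_overlap and overlap_identity(left, right, overlap) >= min_identity


def join(left: str, right: str, overlap: int) -> str:
    """Merge two amplicons, keeping the overlapping region once."""
    return left + right[overlap:]


# Synthetic amplicons (random sequence, fixed seed) sharing a 120-nt overlap.
rng = random.Random(0)


def random_seq(n: int) -> str:
    return "".join(rng.choice("ACGT") for _ in range(n))


amplicon_1 = random_seq(300)
amplicon_2 = amplicon_1[-120:] + random_seq(200)

if can_join(amplicon_1, amplicon_2, 120):
    contig = join(amplicon_1, amplicon_2, 120)
    print(f"contig length: {len(contig)} nt")
```

With a 120-nt overlap, a single mismatch already lowers the identity to about 99.2%, so in practice this threshold demands near-perfect agreement between overlapping clones.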

Phylogenetic and recombination analyses
The sequence alignment (produced using Clustal W) was imported into the program MEGA5, and the phylogenetic tree was constructed using the neighbor-joining method with 1,000 bootstrap replicates [24,43]. The number of polymorphic sites, single variants, parsimony-informative sites, invariant sites, haplotypes, and diversity (h) were determined using the Dnasp5 software package. To obtain an overview of the Chinese PBNSPaV population, the eighteen HSP70h sequences reported by Cui et al. [17], the complete genome sequences of the five PBNSPaV isolates available in GenBank, and the two HSP70h sequences available in GenBank were included in the corresponding sequence analyses. The sequence sources and GenBank accession numbers of the PBNSPaV isolates used in the sequence analysis are listed in Table S1 in File S1.
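Each bootstrap replicate resamples the columns of the alignment with replacement and rebuilds the tree from the resampled data; the support value of a group is the percentage of replicates in which it is recovered. The sketch below illustrates only the resampling step, on a hypothetical toy alignment. Tree building is omitted, and the function name is illustrative rather than a MEGA5 routine.

```python
import random


def bootstrap_alignment(alignment: list[str], rng: random.Random) -> list[str]:
    """Return one pseudo-replicate: alignment columns resampled with replacement."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


# Hypothetical toy alignment of four sequences.
toy_alignment = ["ATGGCTAGCT", "ATGGCTAGTT", "ATGACTCGTT", "ATGACTAGCA"]

rng = random.Random(42)  # fixed seed so the run is repeatable
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(f"{len(replicates)} replicates of {len(replicates[0][0])} columns each")
```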
Finally, the sequence alignments for the HSP70h and CP genes were scanned for evidence of recombination using seven detection programs implemented in the software RDP v.3.27 [44], in order to determine putative recombination events, potential recombinants, and their parental sequences. Only recombination events detected by at least four methods were included in further analyses. In addition, recombination profiles were constructed using the software Recco [45].
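The acceptance rule (an event is retained only if at least four detection methods report it) amounts to a simple filter. In the sketch below, the method names are those commonly listed for RDP3 and are an assumption, since the text does not enumerate the seven programs; the events and variant names are hypothetical placeholders.

```python
from dataclasses import dataclass

# Assumed names of the seven detection methods; not enumerated in the text.
RDP3_METHODS = frozenset(
    {"RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq"}
)


@dataclass(frozen=True)
class Event:
    recombinant: str
    detected_by: frozenset


def supported(events: list[Event], min_methods: int = 4) -> list[Event]:
    """Keep events reported by at least `min_methods` of the recognised methods."""
    return [e for e in events if len(e.detected_by & RDP3_METHODS) >= min_methods]


# Hypothetical candidate events used only to exercise the filter.
candidates = [
    Event("variant_1", frozenset({"RDP", "GENECONV", "MaxChi", "Chimaera", "3Seq"})),
    Event("variant_2", frozenset({"RDP", "SiScan"})),
    Event("variant_3", frozenset({"BootScan", "MaxChi", "Chimaera", "SiScan"})),
]

accepted = supported(candidates)
print([e.recombinant for e in accepted])
```

Requiring agreement among several methods reduces false positives at the cost of missing weaker signals, which is consistent with the additional use of Recco profiles described above.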

Supporting Information
File S1 Supporting tables. Table S1, The sources of HSP70h gene and complete genome of PBNSPaV isolates referred from GenBank. Table S2, Primers used for the amplification of the genomes of PBNSPaV isolates Pch-WH-1, Plm-WH-3 and Pch-GS-3. (DOCX)
File S2 Supporting figures. Figure S1, Nucleotide sequence alignments (A) and phylogenetic analysis (B) of the intergenic spacer (IS) region. The tree was constructed using the maximum likelihood method. Figure S2, The Recco output for molecular variants Pch2, Pch2-3, Apr-YT-3-1, and Pch-XN-4-6, based on their HSP70h sequences. The possible crossover sequences for each variant are marked by two arrows. The possible recombinant sequences are shown in white, and sequences highly similar to recombinant sequences are marked in red. (PPTX)