Molecular Variability and Distribution of Sugarcane Mosaic Virus in Shanxi, China

Background
Sugarcane mosaic virus (SCMV) is responsible for large-scale economic losses in the global production of sugarcane, maize, sorghum, and some other graminaceous species. To understand the evolutionary mechanism of SCMV populations, this virus was studied in Shanxi, China. A total of 86 maize leaf samples (41 samples in 2012 and 45 samples in 2013) were collected from 4 regions of Shanxi.
Results
Double-antibody sandwich (DAS)-ELISA and RT-PCR showed 59 samples (30 samples in 2012 and 29 samples in 2013) to be positive for SCMV, from which 10 new isolates of SCMV were isolated and sequenced. The complete genomes of these isolates are 9610 nt long, including the 5′ and 3′ non-coding regions, and encode a 3063-amino acid polyprotein. Phylogenetic analyses revealed that 24 SCMV isolates could be divided on the basis of the whole genome into 2 divergent evolutionary groups, which were associated with the host species. Among the populations, 15 potential recombination events were identified. The selection pressure on the genes of these SCMV isolates was also calculated. The results confirmed that all the genes were under negative selection.
Conclusions
Negative selection and recombination appear to be important evolutionary factors shaping the genetic structure of these SCMV isolates. SCMV is distributed widely in China and exists as numerous strains with distinct genetic diversity. Our findings will provide a foundation for evaluating the epidemiological characteristics of SCMV in China and will be useful in designing long-term, sustainable management strategies for SCMV.


Introduction
Maize is one of the most important and widely cultivated food crops in the world [1][2]. The USA is the leading producer of maize, followed closely by China. China produces about 30% of the world's maize, amounting to 220 million tons in 2013. Within China, it is mainly grown in Jilin, Heilongjiang, Shanxi, Shandong, Hebei, Henan, Shaanxi, Sichuan, Hubei, and Hunan provinces [3][4]. In Shanxi alone, maize production was over 3 million tons in 2013 [5], valued at over $1.09 billion. Viral diseases pose a threat to maize production and cause economic losses [6]. Currently, three viruses have been reported to infect maize in Shanxi, among which Sugarcane mosaic virus (SCMV) poses a serious threat [7].
SCMV belongs to the genus Potyvirus within the family Potyviridae [8][9]. Potyviruses have a single-stranded positive-sense RNA genome. The genome of SCMV is approximately 9.6 kb long, is covalently linked to a virus genome-linked protein at its 5′ terminus, and carries a poly(A) tail at its 3′ terminus [10]. The genome encodes a single large polyprotein, which is subsequently cleaved into 10 mature proteins (P1, HC-Pro, P3, 6K1, CI, 6K2, NIa-VPg, NIa-Pro, NIb, CP) by 3 self-encoded proteinases [10][11]. SCMV mutates readily because of the weak proofreading activity of its RNA-dependent RNA polymerase, its short generation time, and its large population size [12][13][14]. As a consequence, the virus exists as numerous strains and replicates as complex and dynamic mutant swarms [14][15]. Understanding the genetic structure of SCMV and the factors driving its molecular variability is not only an important aspect of evolutionary biology but could also be useful for virus management.
In recent years, numerous studies have been performed on the biology and genome characterization of SCMV worldwide [14][15][16]. One hundred and seventy-three SCMV isolates were grouped into five groups (sugarcane, maize, Thailand groups, the noble sugarcane and Brazil groups) based on CP gene sequences [17]. In a further study, most of the codons of the CP gene proved to be under negative selection, and recombination was also found within the CP cistron [12]. Previous studies were based mainly on the CP gene because of the lack of whole genome sequences. In this study, SCMV isolates were collected from 4 regions (Xinzhou, Jinzhong, Linfen, and Yuncheng) in Shanxi during 2012 and 2013, and were tested by double-antibody sandwich (DAS)-ELISA and RT-PCR. The genomes of these SCMV isolates were sequenced and compared with those available from online databases. The whole genome of SCMV, including the 3′ and 5′ termini, is 9610 nt long. It contains a single large open reading frame (ORF) (Fig 1). The putative ORF starts at AUG (148-150 nt). It encodes a polyprotein of 3,063 amino acids with an estimated molecular weight of 346.13 kDa. The polyprotein is subsequently processed into ten proteins (P1, HC-Pro, P3, 6K1, CI, 6K2, NIa-VPg, NIa-Pro, NIb, and CP) (Fig 1).
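The genome coordinates quoted above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes that the ORF length includes one stop codon and that the 9610 nt figure excludes the poly(A) tail, neither of which is stated in the text, so the derived untranslated-region lengths are estimates rather than reported values.

```python
# Values stated in the text
genome_nt = 9610        # complete genome length
orf_start = 148         # first nucleotide of the AUG start codon
polyprotein_aa = 3063   # polyprotein length in amino acids

# Derived values (assumption: one stop codon terminates the ORF)
five_utr_nt = orf_start - 1            # nucleotides upstream of the AUG
orf_nt = (polyprotein_aa + 1) * 3      # coding codons plus the stop codon
orf_end = orf_start + orf_nt - 1       # last nucleotide of the stop codon
three_utr_nt = genome_nt - orf_end     # assumption: poly(A) tail not counted

print(five_utr_nt, orf_nt, orf_end, three_utr_nt)
```

Under these assumptions the three segments sum back to the full genome length, which is a quick consistency check on the stated coordinates.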

Nucleotide Sequence Similarities and Phylogenetic Analyses
To further understand the genetic relationships among the global SCMV isolates, 24 isolates (10 isolates from this study and 14 isolates from the GenBank database) were used for phylogenetic analysis. According to the phylogenetic tree, the 24 isolates clustered into two groups (Fig 2). Group I included 18 isolates, all of which were isolated from maize collected from different sites (16 isolates from China, 2 isolates from Mexico). Group II contained 6 isolates, which were isolated from sugarcane from different regions spanning three continents (3 isolates from China, 2 isolates from Argentina, and 1 isolate from Australia). The nineteen isolates from China fell into two different groups, while the isolates from maize grouped together. These results indicated that the molecular diversity of SCMV isolates was closely associated with host species and not with geography.
Genetic distances within and between groups were calculated to determine the molecular diversity of the 24 SCMV isolates. The within-group genetic distances of groups I and II were 0.0579 ± 0.0037 and 0.1907 ± 0.0115, respectively, and the inter-group genetic distance between groups I and II was 0.1504 ± 0.0091 (Table 3). The inter-group genetic distance was higher than the within-group genetic distances. This result suggested that it was host type rather than geography that played an important role in the genetic diversity of SCMV isolates.
The Fst values (the interpopulational component of genetic variation, or the standardized variance in allele frequencies across populations) were measured to test the degree of differentiation among populations. Since the Fst values within groups were less than 0, the isolates within groups were highly similar and showed little differentiation among populations (Table 3). The Fst value between groups I and II was as high as 0.25, which confirmed that the two groups were highly differentiated genetically (Table 3).
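The idea of within-group and between-group mean distance can be illustrated with a minimal sketch. It uses the simple p-distance (proportion of differing sites) in place of the maximum composite likelihood distance used in the study, and the sequences are toy data, not SCMV genomes; the function names are illustrative only.

```python
from itertools import combinations, product

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def mean_within(group):
    """Mean distance over all pairs of sequences inside one group."""
    dists = [p_distance(a, b) for a, b in combinations(group, 2)]
    return sum(dists) / len(dists)

def mean_between(g1, g2):
    """Mean distance over all pairs with one sequence from each group."""
    dists = [p_distance(a, b) for a, b in product(g1, g2)]
    return sum(dists) / len(dists)

# Toy aligned sequences (hypothetical, for illustration only)
group_i = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"]
group_ii = ["TCGTTCGTAC", "TCGTTCGTGC"]

print(mean_within(group_i), mean_within(group_ii), mean_between(group_i, group_ii))
```

A model-based distance such as maximum composite likelihood corrects for multiple substitutions at the same site, so real values would generally be somewhat larger than raw p-distances.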
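As a rough illustration of how an Fst value relates within-population and between-population diversity, the sketch below uses the sequence-based estimator of Hudson, Slatkin and Maddison, Fst = 1 - Hw/Hb. This is an assumption made for illustration; the text does not state which estimator was applied, and the input numbers are hypothetical.

```python
def fst_hudson(h_within, h_between):
    """Fst = 1 - Hw/Hb, where Hw and Hb are the mean pairwise differences
    within populations and between populations, respectively."""
    if h_between <= 0:
        raise ValueError("between-population diversity must be positive")
    return 1.0 - h_within / h_between

# Hypothetical inputs, not values from the study
print(fst_hudson(0.075, 0.10))  # within < between: positive Fst
print(fst_hudson(0.12, 0.10))   # within > between: negative Fst
```

With this estimator, a negative value simply means that sequences drawn from the same population differ at least as much as sequences drawn from different populations, which is usually read as no detectable differentiation.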

Recombination Analysis and Selection Analysis
The potential recombination events in the genome sequences of the 24 SCMV isolates were detected using the recombination detection program (RDP3) [18][19][20], and a total of 15 recombinant genomes resulting from 20 recombination events were identified (Fig 2, Table 4).
To further analyze selection pressure on the 24 SCMV isolates, the ratio between substitutions at non-synonymous and synonymous sites (dN/dS ratio) was calculated. The dN values for all genes of the 24 isolates were less than the dS values (dN/dS ratio < 1), which indicated that all the SCMV isolates were under negative selection (Table 5). In the polyprotein gene, 1085 sites were under negative selection (35.25%), with the maximum (282 sites) in CI and the minimum (7 sites) in 6K2 (Table 5).

Neutrality Tests and Population Demography
The mismatch distributions of SCMV were evaluated using concatenated sequences. The mismatch distributions for all groups were multimodal and ragged (S1 and S2 Figs), indicating that all these populations were stable. Tajima's D values for all SCMV populations were positive (Table 3), which suggested that these populations were contracting. However, the p-values were not significant in any population, so this inference is only weakly supported (Table 4).
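A mismatch distribution is simply the histogram of pairwise nucleotide differences among the sequences of a population. The following minimal sketch shows the computation on toy aligned sequences; the function name and data are illustrative and are not taken from the study.

```python
from collections import Counter
from itertools import combinations

def mismatch_distribution(seqs):
    """Return {number of differences: number of sequence pairs} for aligned sequences."""
    counts = Counter(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    return dict(sorted(counts.items()))

# Toy aligned sequences (hypothetical)
toy = ["AAAA", "AAAT", "AATT", "TTTT"]
print(mismatch_distribution(toy))
```

Plotting these counts against the number of differences gives the curve whose shape (smooth and unimodal, or ragged and multimodal) is interpreted in demographic terms.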

Discussion
SCMV is a major threat to maize production in China [12]. We collected SCMV populations in Shanxi in 2012 and 2013 and sequenced the positive isolates. These isolates showed different genotypes and formed different clades in the phylogenetic tree. Since the mismatch distributions of all SCMV isolates were multimodal and ragged (S1 and S2 Figs), all SCMV isolates had a trend of population diffusion. An increasing number of genetically differentiated isolates might be generated in the field. Because of negative selection and recombination, genetic differentiation might become increasingly pronounced, and high-virulence strains might be generated in the field. This may explain why SCMV exists in maize as populations with ever-increasing molecular variability. These potential high-virulence isolates are a threat to cultivated maize varieties, even those carrying resistance genes, because they have a high probability of overcoming the resistance genes. Considering these evolutionary mechanisms, integrated management measures are needed to control SCMV, which should include breeding resistant cultivars and controlling insect vectors (such as aphids) to prevent SCMV transmission to other crops. Our results show that SCMV isolated from maize differs greatly from SCMV isolated from sugarcane. In China, maize is mainly grown in the north, while sugarcane is grown in the south. The environmental differences between north and south may exert selection pressure on SCMV. Meanwhile, the host is also one of the most important selection pressures. The genetic diversity of SCMV mainly reflects adaptation to these complex and differing conditions. Negative selection on SCMV was also detected in this study. Under negative selection, SCMV constantly accumulated the available variations to adapt to differing conditions. For RNA viruses, recombination is a natural phenomenon and plays an important role in evolution.
Recombination events have been reported in SCMV. Based on the CP gene of SCMV, six recombinants were detected [12]. Although the field study was not carried out to test the potential recombination events, it at least showed that recombination was a natural phenomenon for SCMV. Recombination may play an important role in the evolution of SCMV, and be an important reason for SCMV genetic diversity.
The degree of conservation differed among the ten genes of SCMV. As a member of the genus Potyvirus, SCMV has a complete RNA genome that encodes a large polyprotein, which is processed into 10 mature proteins by 3 virus-encoded proteases after translation [10][11]. These 10 proteins play different roles in the life cycle of SCMV (infection, replication, movement, and transmission) [21][22][23][24]. We found that the nucleotide identity in the CP gene was the highest, followed by CI, HC-Pro, and P3; the lowest nucleotide identity was found in P1. This may be attributed to the various roles of these genes during the SCMV life cycle. The phylogenetic trees of the 24 SCMV isolates also differed depending on the gene used, which confirmed that it was more accurate to study the genetic diversity of SCMV based on the whole genome than on one gene or a partial sequence. In addition, the probability of recombination differed among genes. Most recombination events were found in the 6K2, NIa-VPg, NIa-Pro, and CI genes, while some were in the P1 and CP genes (Fig 1). These results showed that the genes near the 5′ and 3′ termini were more conserved, and recombination may be an important reason for this phenomenon.
Taken together, our results demonstrated that SCMV isolates formed two divergent evolutionary groups. Host, negative selection and recombination were found to be the important evolutionary factors shaping the genetic structure of these SCMV populations. Using infectious clones of SCMV should facilitate the study of gene function and biological characteristics. Our findings provide a foundation for evaluating the epidemiological characteristics of SCMV in China and will be useful in designing long-term, sustainable management strategies for SCMV.

RT-PCR, Cloning, and Sequencing
RNA was extracted from samples using the Universal Plant Total RNA Extraction Kit (DP405-02, BioTeke, China), and cDNA was synthesized using the PrimeScript RT reagent Kit (D6130, TaKaRa, Japan). PCR was carried out in a 25-μL PCR mixture including 2 μL of cDNA template, 2.5 μL of 25 mM Mg²⁺ (M2101, Promega, USA), 2.5 μL of a dNTP mixture with each dNTP at 5 mM, 2.5 μL of 10× polymerase buffer (M2101, Promega, USA), 0.5 μL of 5 U/μL Hot-start Taq polymerase (M2101, Promega, USA), and 2 μL sense and antisense primers (10 μM each). The reaction process was as follows: denaturation at 94°C for 3 min; 35 cycles of denaturation at 94°C for 30 s, primer annealing at 52°C for 1 min, and primer extension at 72°C for 2 min; and final extension at 72°C for 10 min. For the 5′-terminal and 3′-terminal sequences, 5′ RACE and 3′ RACE reactions were conducted using the 5′ RACE and 3′ RACE system (D315, TaKaRa, Japan). The size of PCR products was examined on a 2% agarose gel under UV light. The positive bands were purified from the agarose gel using a gel extraction kit (DP204-02, BioTeke, China). These fragments were inserted into a pGEM-T simple vector and cloned into Escherichia coli JM109. For each fragment, at least 3 clones from each ligation were sequenced. If there was any difference at any position of the sequences, at least 4 clones were sequenced to obtain the consensus sequence.
Specific PCR primers were designed for primer walking and for obtaining the fragment sequences. The complete nucleotide sequences of all SCMV isolates were assembled from the fragment sequences and SCMV genome sequences deposited in the GenBank database using the ClustalX program [25]. The GenBank accession numbers of all SCMV isolate genome sequences are listed in S1 Table.
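The final concentrations implied by this reaction mixture, and the nominal run time of the cycling program, follow from simple dilution arithmetic. The sketch below works them out from the stated volumes; it ignores any Mg²⁺ already present in the polymerase buffer and any ramping time between temperatures, both of which are assumptions, and the variable names are illustrative.

```python
TOTAL_UL = 25.0  # total reaction volume stated in the text

def final_conc(stock, volume_ul, total_ul=TOTAL_UL):
    """C_final = C_stock * V_added / V_total (result in the units of the stock)."""
    return stock * volume_ul / total_ul

mg_mM = final_conc(25.0, 2.5)    # 25 mM Mg2+ stock, 2.5 uL added
dntp_mM = final_conc(5.0, 2.5)   # each dNTP at 5 mM, 2.5 uL added
taq_units = 5.0 * 0.5            # 5 U/uL stock, 0.5 uL added

# Cycling program: 3 min + 35 x (30 s + 1 min + 2 min) + 10 min
cycles = 35
per_cycle_min = 0.5 + 1.0 + 2.0
program_min = 3.0 + cycles * per_cycle_min + 10.0

print(mg_mM, dntp_mM, taq_units, program_min)
```

The remaining volume up to 25 μL would normally be made up with water; it is not computed here because the text does not make clear whether 2 μL refers to each primer or to both together.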

Recombination and Phylogenetic Analysis
The high-similarity sequences of the SCMV isolates were selected for further analysis by BLAST (http://www.ncbi.nlm.nih.gov/BLAST/). Multiple alignments of nucleotide sequences and corresponding amino acid sequences were performed using MultAlin (http://bioinfo.genotoul.fr/multalin/multalin.html) [26]. The recombination analysis of these SCMV isolates based on the whole genome was carried out using the recombination detection program (RDP3) [18]. The 7 methods (RDP, GENECONV, BOOTSCAN, MAXCHI, CHIMAERA, SISCAN, and 3SEQ) implemented in RDP were used in the recombination analysis [18][19][20]. An event detected by at least 5 different methods and with p-values < 10⁻⁶ was considered to be a positive recombination event [18][19][20][27]. Phylogenetic relationships were determined by neighbor-joining in MEGA 5 [27]. Bootstrap analysis with 1,000 replicates was performed to evaluate the significance of the internal branches. Branches with less than 70% bootstrap value were collapsed.
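The acceptance rule for recombination events can be expressed as a small filter. The sketch below is one reading of the criterion, namely that at least 5 of the 7 methods must each report a p-value below 10⁻⁶; the dictionary layout and the example events are hypothetical and do not reflect the actual RDP output format.

```python
P_THRESHOLD = 1e-6
MIN_METHODS = 5
METHODS = ("RDP", "GENECONV", "BOOTSCAN", "MAXCHI", "CHIMAERA", "SISCAN", "3SEQ")

def is_accepted(p_values):
    """p_values maps method name -> p-value, or None when the method found nothing."""
    supporting = [
        m for m in METHODS
        if p_values.get(m) is not None and p_values[m] < P_THRESHOLD
    ]
    return len(supporting) >= MIN_METHODS

# Hypothetical events, for illustration only
event_a = {"RDP": 1e-12, "GENECONV": 3e-9, "BOOTSCAN": 2e-10,
           "MAXCHI": 5e-8, "CHIMAERA": 1e-7, "SISCAN": None, "3SEQ": 4e-3}
event_b = {"RDP": 1e-12, "GENECONV": 3e-9, "BOOTSCAN": 2e-4,
           "MAXCHI": 5e-8, "CHIMAERA": 1e-7, "SISCAN": None, "3SEQ": None}

print(is_accepted(event_a), is_accepted(event_b))
```

Requiring agreement among several methods with a strict p-value cutoff trades sensitivity for a lower false-positive rate, which suits a whole-genome scan over many isolates.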

Analysis of Genetic Distance and Selection Pressure
The genetic distances of SCMV isolates within and between groups were calculated by the maximum composite likelihood method in MEGA 5 [27]. The selection pressure was estimated by the dN/dS ratio, where dN represents the average number of non-synonymous substitutions per non-synonymous site and dS represents the average number of synonymous substitutions per synonymous site. The values of dN and dS were estimated using the PBL method in MEGA 5 [27]. A gene is under positive (or diversifying) selection when the dN/dS ratio is > 1, under neutral evolution when the dN/dS ratio = 1, and under negative (or purifying) selection when the dN/dS ratio is < 1 [28].
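The interpretation rule for the dN/dS ratio can be written as a short function. This is a minimal sketch with hypothetical input values; the optional tolerance around 1 is an added convenience and is not part of the method described in the text.

```python
def classify_selection(dn, ds, tol=0.0):
    """Return (dN/dS, label) using the thresholds > 1, = 1 and < 1."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega > 1 + tol:
        label = "positive"
    elif omega < 1 - tol:
        label = "negative"
    else:
        label = "neutral"
    return omega, label

# Hypothetical dN and dS values, not taken from Table 5
print(classify_selection(0.02, 0.40))
print(classify_selection(0.50, 0.25))
```

In practice an estimated ratio is almost never exactly 1, so a statistical test or a tolerance band is normally needed before calling a gene neutral.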

Demography Analyses
Tajima's D statistical test was performed to analyze the population changes in SCMV using DnaSP 5.0 [29][30]. Tajima's D measures the departure from neutrality for all mutations in a genomic region. The purpose of the test is to distinguish DNA sequences evolving randomly (neutrally) from those evolving under a non-random process. In the mismatch distribution, a smooth unimodal Poisson distribution indicates that the population has a star-like phylogeny due to the accumulation of low-frequency mutations during a recent expansion; ragged multimodal distributions indicate that the population is experiencing long-term demographic stability [28].
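For readers unfamiliar with the statistic, the following sketch computes Tajima's D from a set of aligned sequences using the standard formula, which compares the mean number of pairwise differences with the number of segregating sites scaled by a sample-size constant. It assumes gap-free alignments and toy data, and it is not the DnaSP implementation.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences of equal length."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero for n < 4)")
    # S: number of segregating (polymorphic) sites
    s = sum(len(set(col)) > 1 for col in zip(*seqs))
    if s == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Sample-size constants
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Toy data: rare variants only, then variants at intermediate frequency
singletons = ["AAAA", "TAAA", "ATAA", "AATA"]
balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]
print(tajimas_d(singletons), tajimas_d(balanced))
```

An excess of rare variants pulls D below zero, while an excess of intermediate-frequency variants pushes it above zero, which is why the sign of D is read as a hint about expansion or contraction.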