Genetic Diversity of Sitobion avenae (Homoptera: Aphididae) Populations from Different Geographic Regions in China

Sitobion avenae is a major agricultural pest of wheat in China. Using microsatellite markers, we studied the potential gene flow, genetic diversity, genetic differentiation, and genetic structure of seven S. avenae populations from different regions of China (Beijing, Hebei, Henan, Hubei, Jiangsu, Shandong, and Shanxi provinces). The populations from Henan, Shandong, and Jiangsu showed high levels of genic and genotypic diversity. By contrast, the genic diversity in the Beijing and Hebei populations was much lower. Despite this low genic diversity, the genotypic diversity of the Beijing population was higher than that of all of the other populations, except those from Jiangsu and Shandong. Overall, the genetic divergence among the seven S. avenae populations tested was high, though there was almost no differentiation between the Shandong and Henan populations. We observed significant negative correlation between the strength of gene flow and the geographic distances among populations. Based on genetic analysis, the seven S. avenae populations studied can be divided into four distinct clusters; (i) Hubei, (ii) Shanxi, (iii) Beijing and Hebei, and (iv) Shandong, Henan, and Jiangsu. The present results provide a basis for potentially optimizing integrated pest management (IPM) programs in China, through adapting control methods that target biological traits shared by various populations of the same genotype.


Introduction
Several wheat aphid species are major agricultural pests in China, notably Sitobion avenae, Rhopalosiphum padi (L.), Schizaphis graminum (Rondani), and Acyrthosiphon dirhodum Walker [1]. Of these, S. avenae is the most dominant and destructive wheat aphid in China [2,3]; it affects about 13 million hm 2 per year, and yield losses can be up to 40% [4]. Wheat aphids infest cereal crops and cause direct economic losses through sucking sap, and indirect losses by vectoring plant viruses [5].
Wheat aphids have complex life cycles that are known to be highly affected by climate [6]. Some aphid species exhibit great flexibility in the selection of their reproductive mode [7][8][9]. All lifecycle types reproduce parthenogenetically for most of the year; some species have an annual sexual phase [10]. The lifecycle of cyclic parthenogenic species have a sexual reproduction phase once a year, and overwinter as asexually produced eggs; this is a more reliable strategy when winters are harsh [11][12][13][14][15][16]. Cyclic parthenogenesis is the dominant mode of reproduction, especially in the regions with harsh winters [15,17]. However, if the climatic conditions allow, some clones are obligate parthenogenic, they do not respond to autumnal cues and have no sexual phase. There are also intermediate types which employ both sexual and parthenogenetic reproduction [6,9,15,18]. A study showed that the asexual genotypes of Rhopalosiphum padi had higher genetic variation in fitness compared to the sexual genotypes [19], indicating that the reproductive mode greatly impacts the fitness of wheat aphids.
S. avenae can survive on numerous plant species, including all of the cereals, many other monocots, and certain dicots [20]. Divergent selection on different host plants greatly influences the diversification of the aphids, and imposes considerable selective pressure on aphids' evolution [21]. Host-transfer experiments for clones of S. avenae collected from oat and barley showed that their fitness traits differed significantly, indicating a genetic basis for their differentiation [22].
Aphids have complex lifecycles and adaption to diverse hosts; this means that any given geographical population of aphids will not be homogeneous [23]. If the migration scale is large enough, the population genetic structure and evolutionary trajectory will be influenced [24]. As different kinds of aphids have different abilities to fly and face different pressures leading to migration, they act out different migration behaviors [25]. The flight ability of winged aphids is very weak [26,27], and they migrate for long distances by largely depending on wind forces [28].
Clonal selection is very important in the study of population genetic structures of wheat aphids. Ecological adaption may occur quickly because of the rapid propagation of asexual offspring [29]. There are many factors influencing clonal selection. These include climatic, host plant, microclimates, crop density, natural enemy pressure, and resistance to pesticides [15,30,31].
It is difficult to understand the ecology of wheat aphids, as they tend to migrate with the wind [32]. Additionally, wheat aphids are small and have short life spans and large population sizes. Populations can be diluted rapidly in the air and migrate frequently [26]. Therefore, it is difficult to monitor the migration of wheat aphids using ecological approaches. Currently, polymorphic genetic markers have been widely used in the studies of population ecology [25].
In the past 20 years, many studies addressing the flying behavior and genetic structure of aphids populations have been conducted using microsatellite markers [15,26,27,33,34]. As noted, wheat aphids have different lifecycles based on the climate conditions of various geographic regions; the most suitable clone lineages for a given area are selected. Due to lifecycle strategy and clonal selection, studies of the aphids in the same region, but in different years, may also present different population genetic structures [15,35]. However, the gene flow among different geographical populations can also be reflected in population genetic differentiation and population structure [34][35][36]. The management methods used to control wheat aphids can lead to new phenotypes emerging to overcome strong pressures such as pesticide applications, so quantification of the genetic diversity of populations is very important for the management of wheat aphids [37].
In the present study, we used five microsatellite loci (Sm10, Sm11, Sm12, Sm17, and Sag4), which had been used previously [15,25,38,39,40], to characterize the potential gene flow, genetic diversity, genetic differentiation, and genetic structure of seven S. avenae populations from seven different geographic regions in China. To interpret the population structure and dynamics of S. avenae, we need to better understand the importance of migration and clonal selection, which may have relevance in forecasting aphid outbreaks. Our results also provide a foundation for optimizing integrated pest management (IPM) programs through adapting control methods according to biological traits shared by various populations of the same genotype.

Aphid Sample Collection
Wingless adults of S. avenae were collected from seven sites across six provinces (Hebei, Shandong, Henan, Shanxi, Jiangsu, and Hubei province) and the Beijing municipality of China. These are the main wheat producing provinces of China. The site locations are presented in Figure 1. The fields, where the S. avenae collected from, are agricultural experiment fields belong to plant protection research stations at Cangxian (Hebei province, China), Liaocheng (Shandong province, China), Xihua (Henan province, China), the Yanhu distriction of Yuncheng (Shanxi province, China), Dongtai (Jiangsu province, China), Zaoyang (Hubei province, China) and the Agricultural Experiment Station of China Agricultural University (Beijing, China), respectively. 30,40 individuals were collected from each site, and each aphid was taken from a different field in order to minimize resampling of the same clone. Samples that were collected from the same region were considered as one 'population'. The collected specimens were taken back to the laboratory in 100% ethanol.

DNA Extraction and PCR of Microsatellite Loci
Genomic DNA was extracted from wingless adult aphid individuals using DNAVzol (Vigorous Biotechnology Beijing Co., Ltd.), according to the manufacturer's instructions.
Two hundred forty six aphids were examined at five microsatellite loci (Sm10, Sm11, Sm12, Sm17, and Sag4). The microsatellite loci of Sm10, and Sm11, Sm12, Sm17 were isolated from S. miscanthi [38], and Sag4 was isolated from S. avenae. Sm11 is xlinked, and the other three loci are autosomal [40]. The SaS4    locus is also autosomal [15].  [25]. They have verified before application. The PCR conditions were the same as those of Simon et al. (1999) [15]. PCR reactions were performed in a 20 mL reaction volume. Each reaction mixture contained 0.2 mL of 5 U/mL rTaq polymerase, 2.0 mL of each 2.5 umol dNTP and 2.0 mL of 610 buffer (TaKaRa Biotechnology (Dalian) Co.,Ltd.), 1 mL of 10 mM of each primer (SANGON), and 1 mL of DNA template (approximately 10 ng). The PCR products were examined using an ABI3730*1 instrument and the allele sizes were analyzed using Genemapper 3.0 software (Applied Biosystems).

Data Analysis
The genotype data is presented in File S1. Departure from Hardy-Weinberg equilibrium (HWE) was tested under the hypothesis of heterogeneity deficit and excess using Genepop [41,42]. The score test (U test) was used. Linkage disequilibrium among the microsatellite loci were also tested using Genepop [41,42]. The null hypothesis was ''genotypes at one loci are independent from genotypes at the other loci'', and the default test statistic was the log likelihood ratio statistic (G-test). During the processes of identification and isolation of microsatellite sequences using primers and amplification by PCR, the following errors can occur: 1. One or more alleles fail to amplify during PCR (null alleles); 2. Slight changes occur in the allele sizes during PCR (stuttering); 3. Large alleles do not amplify as efficiently as small alleles. We detected these errors using Micro-Checker v2.2.3 [43]. This application calculates the frequency of any null alleles detected, using the methods described by Chakraborty et al. (1992) [44] and Brookfield (1996) [45]. The indices of genetic diversity including the number of alleles (N a ), the richness of alleles (A r ), the observed heterozygosity (H O ), the expected heterozygosity (H E ), gene diversity (H S ), and the inbreeding index (F IS ) for each population were calculated using FSTAT v2.9.3 [46]. Na is the mean number of alleles in each sample. It can be calculated according to the following equation: N a~X r j~1 N j r N j is the number of alleles at microsatellite loci j in one sample. r is the number of all loci. The richness of alleles (A r ) is the mean of R s (allelic richness per locus and population) across all microsatellite loci. R s is a measure of the number of alleles and is independent of sample size, hence allowing one to compare R s between different sample sizes. However, the observed number of alleles in a sample is highly dependent on sample size. To bypass this problem, El Mousadik and Petit (1996) [47] suggested the adaption of the rarefaction index of Hurlbert (1971) [48] to population genetics. The principle is to estimate the expected number of alleles in a sub-sample of 2n genes, given that 2N genes have been sampled (N$n). In FSTAT, n is fixed as the smallest number of individuals typed for a locus in a sample. Allelic Richness is then calculated according to the following equation: Table 2. Indices of genetic diversity for the seven S. avenae populations of China sampled in 2012. Where N i is the number of alleles of type i among the 2N genes. Note that each term under the sum corresponds to the probability of sampling allele i at least once in a sample of size 2n. In FSTAT, estimates of gene diversity per locus and sample use an unbiased estimator [49]: Where n is the size of a sample. p i is the frequency of allele A i in a sample. H o is the observed proportion of heterozygotes in a sample. The multilocus genotypes (MLG) were classified based on the length of alleles, using software that we developed in house. The genotype diversity (K) of each population was calculated as follows: K = G/N, where G is equal to the number of MLGs, and N is the number of samples [25].
F-statistics (F IT , F IS , and F ST ) and pairwise F ST [50], were estimated using FASTAT v2.9.3 [46]. Nei's standard genetic distance (D S ) [51] and Nei's genetic distance (D A ) [52] were calculated using the Dispan program [53]. An indirect estimate of gene flow was calculated as [54]. To analyze the relationship between F ST and geographic distance, a Mantel test was carried out. The matrices of pairwise F ST and the geographic distance between two populations was transformed by Genepop (the geographic distances were transformed to the natural logarithm (Ln) of geographic distances in kilometers), and a regression was performed using SPSS 17.0.
To identify the population structure, two clustering methods were used. First, we constructed phylogenetic trees (dendrograms) using the unweighted pair group-method with arithmetic mean (UPGMA) [55], based on Nei's genetic distance (D A ) using the Dispan program. Second, factorial correspondence analysis (FCA) was implemented in Genetix v4.04 [56], to examine the threedimensional spatial distribution of genetic variation for each individual.
An analysis of molecular variance (AMOVA) was implemented in Arlequin v3.0 [57], to confirm population clusters and to examine the component variance of genetic differentiation among populations.

Genetic Diversity
We screened 246 aphids collected from six provinces and the Bejing municipality of China for five microsatellite loci. There was no large allele dropout or stuttering observed in the result. Of the 35 tests, there were 6 tests in which null alleles were found to exist; they appeared at the Sm11 and Sm17 microsatellite loci. However, no particular microsatellite loci had null alleles in all of the populations. The frequencies of the null alleles were not more than 0.08 (Table 1). In the 70 linkage disequilibrium tests, there were 12 tests with significant linkage (Table 1). However, no fixed combination of any two microsatellite loci showed significant linkage in all of the populations, and the 5 microsatellite loci selected here can estimate genetic mutation independently. Five populations deviated from HWE significantly because of heterozygote deficit; both Beijing and Shanxi populations were the exception to this. The Shanxi population deviated from HWE because of heterozygote excess ( Table 1). The F IS values of  The genic diversity level of the Beijing population was the lowest; the values of Na, Ar, and H S of the Beijing population were 6, 3.65, and 0.39, respectively. However, the genotypic diversity of Beijing population was higher (K = 0.925) than all of the other populations, except for the Jiangsu and Shandong populations.
We observed 215 multilocus genotypes from the 246 aphids tested. The same genotypes were observed within the Hubei, Henan, Shanxi, Hebei, and Beijing populations, but the same genotype was not found within the Jiangsu and Shandong populations. However, one individual in 31 aphids from Shandong was found to share the same genotype with one individual from Jiangsu province. There were not any genotypes shared among the other five populations.

Genetic Differentiation
Population differentiation was analyzed using pairwise F ST values ( Table 3) According to an analysis of molecular variance (AMOVA), the S. avenae populations sampled had high levels of genetic differentiation (F ST = 0.2547) ( Table 4). Genetic variation between populations accounted for 25.47% of the total genetic variation.
The migration number among different geographical populations has been proposed to be represented as gene flow (N m ) [54]. There is little divergence if N m .1. However, if N m ,1, it means that gene flow could not counteract the divergence caused by genetic drift [58]. In the present study, the level of genetic differentiation among the seven S. avenae populations was N m = 0.899 (Table 5), and the gene flow was not large enough to offset the genetic drift. The biggest gene flow (N m = 6.277) was observed between the Henan and Shandong populations ( Table 6). The values of N m between Beijing and the other populations were less than 1, except for the Hebei population. The gene flow between the Hubei population and the other populations was also low (N m ,1).
Overall, the genetic divergence of the populations had significant positive correlation (r 2 = 0.401, p = 0.002) with geographic distance, according to the results of the Mantel test.

Population Structure
We used two methods for clustering analysis, and a small divergence appeared between the two sets of results.
First, we classified the seven geographical populations into three clusters using the UPGMA method ( Figure 2): (i) Hubei, (ii) Shanxi, Beijing and Hebei, and (iii) Shandong, Henan, and Jiangsu.  Second, the FCA results divided the seven populations into four groups ( Figure 3); (i) Hubei, (ii) Shanxi, (iii) Beijing and Hebei, and (iv) Shandong, Henan, and Jiangsu. The only differentiation between results of the two clustering methods was that the Shanxi population formed its own cluster in the results of the FCA method.
However, the two kinds of genetic distance (D S and D A ) ( Table 7) between the Shanxi and Beijing populations were larger than between Shanxi and the other populations, except for Hubei. Therefore, the Shanxi and Beijing populations cannot be classified into a single group.
We classified the seven populations into 4 groups (i. Hubei; ii. Shanxi; iii. Beijing and Hebei; iv. Shandong, Henan, and Jiangsu.), and conducted AMOVA to detect the variance components. The results showed that the variances among groups and among populations within groups accounted for 18.34% and 9.53% of the total variance, respectively (Table 8), and that the variance between groups was significantly larger than the variance among populations within a particular group (F CT = 0.18399). This result indicates that gene flow among the four groups was strictly restricted. The AMOVA results thus confirmed the conclusions of the clustering analysis.

Discussion
S. avenae is a migratory pest insect that is widely distributed throughout the wheat growing regions of China [59]. In the present study, we analyzed the population genetic diversity, population genetic differentiation, and genetic structure of seven different S. avenae geographical populations in China using microsatellite marker technology. These research results provide a basis for efforts to optimize integrated pest management (IPM) programs through adapting control methods which target the biological traits shared by various populations from the same genotype.

High Level of Genetic Diversity
The genetic diversity of aphids is influenced by many factors such as climate, host plants, topography, and physiognomy [60]. Wheat aphid populations frequently exhibit high levels of genetic variation [61]. In the present study, a high level of genetic diversity was detected at all of the five microsatellite loci. There were 32, 41, 35, 60, and 13 alleles at the Sm10, Sm11, Sm12, Sm17, and Sag4 loci, respectively. The genetic diversity level in our study (average = 36.2 alleles/locus) was much higher than that detected by Guo et al. (2005) (average = 9.6 alleles/locus) [61]. In particular, the genic diversity of the Henan and Shandong populations was higher than the diversity of the other populations.  This may be due to the fact that their geographic location had more gene flow with the surrounding wheat production regions. In our results, genotypic diversity was also quite high. Two hundred fifteen multilocus genotypes were distinguished in the 246 individual samples. A given genotype was rarely present in the different geographic populations, which may result from a prevalence of sexual reproduction.
Many factors can affect the heterozygosity of populations. The expected heterozygosity (H E ) of the Hebei, Henan, and Shandong populations was lower than their observed heterozygosity (H O ). According to previous aphid genetic studies, factors like inbreeding, the Wahlund effect, and the presence of null alleles and/or clonal selection have been implicated in the generation of heterozygote deficits [39,62]. As S. avenae has the ability for long-range dispersal, and their host plants are common and widespread, strong inbreeding and the Wahlund effect might not occur in this aphid [39,63]. There are also many reasons for heterozygote excess, including rarely migrating between locations, asexual lineage expansion, and/or the accumulation of heterozygosity by mutation in longer-term parthenogenesis [64]. In our results, the greatest heterozygosity excess was present in the Shanxi population, and the F IS value of the Shanxi population was extremely negative. This might be associated with the geography of Yuncheng of Shanxi. Shanxi province is mountainous, which might restrict aphid migration.

Genetic Differentiation and Genetic Structure
The genetic structure of populations reflects the interaction of genetic drift, mutation, gene flow, migration, and natural selection [25]. There is absolutely no genetic differentiation among populations when F ST = 0, and when the F ST = 1, the populations can be recognized to have differentiated completely. The threshold of F ST value can be divided into 4 levels: (i). In the present study, there was a high level of genetic differentiation (F ST = 0.2547) present in the S. avenae populations of China. Various factors might affect the divergence of S. avenae populations in different sites, including reproductive mode, clonal selection, migration, and so on. Additionally, genetic differentiation might be accelerated by the high level of variation in S. avenae population size [66].
A study showed that the aphid clones on the winter host plant did not significantly contribute to the spring/summer population build-up in the cereal fields over short distances, but that the density of sources for early migrants, on a regional scale, is important for the establishment of the population [67]. Therefore, the migration of aphids plays an important role in genetic structure. Aphids can become suspended in the air, and thus fly for long distances with the wind [10,68]. This flight phenomenon can overcome the combination of the forces of selection, genetic drift, and mutation, if the scale of the migration of S. avenae is large enough [25]. The time we collected the wheat aphids samples was the peak time of their breeding. It is known that when the population density of S. avenae is high enough and if the host plant cannot supply adequate nutrition, winged aphids will be produced, and a large scale migration can occur. In our results, there was little differentiation between the Henan and Shandong populations (F ST = 0.0383), which might be attributed to large scale migration and/or the local pest management strategies. The Mantel test showed that the genetic divergence of the populations had a significant positive correlation with the geographic distances between the populations. This means that the gene flow between Table 7. The standard genetic distance (D S ) and Nei's unbiased genetic distance (D A ) of pairwise comparisons among the seven populations. populations was significantly associated with the distance among the locations, and that the gene exchange among the different geographic populations could be affected by geographical isolation. For example, the genetic divergence between Beijing (the most northern site in our study) and Hubei (the most southern site in our study) populations was the highest as compared to the others. However, geographical distance didn't explain the study results completely. According to the results published by Huang et al. (2013) [69], some vital life-history traits (the developmental time of the instars of nymphs, postreproductive timing, and total lifespan of adults, and so on) were different for aphid populations living North or South of the Qinling Mountains. The divergence and gene flow among S. avenae populations might be impacted by the barrier of the Qinling Mountains. In the present study, the Qinling mountains and Dabie Mountains separate Zaoyang city of Hubei from the other 6 sites. There was great genetic differentiation between Hubei and the other populations, which might result from a limited dispersal of aphids. Similar phenomena of genetic isolation resulting from by geographical barriers were also found in Eriosoma lanigerum [70], two tansy-feeding aphids (Macrosiphoniella tanacetaria and Metopeurum fuscoviride) [33], and Diuraphis noxia [34].
Populations of S. avenae in China might be classified into 4 clusters based on our research results. The population from Shanxi had less communication with other populations because of Taihang mountain, which may have obstructed the migration of the wheat aphids. However, the geographic location of Shanxi is adjacent to Henan, Hebei, and Shandong, and the location of sample collection for Shanxi was Yuncheng, which is at the southern limit of the Taihang mountain range, so gene flow was affected relatively little and there was overlap between Shanxi, Henan, and Shandong in the FCA analysis ( Figure 3).

Conclusions
In the present study, we analyzed the population genetics and predicted the gene flow among the seven different S. avenae geographical populations in China using microsatellite marker technology. The genetic diversity of the Henan and Shandong populations was higher than the others populations. The genic diversity of the Hebei and Beijing populations were the lowest. For genetic differentiation, there was high or significant differentiation among the Hubei population and the others. The genetic differentiation between Beijing and Hebei, and Henan and Shandong were both at very low levels, indicating that the gene flow occurs frequently between these populations. Finally, the 7 S. avenae geographic populations can be divided into 4 clusters: (i) Hubei, (ii) Shanxi, (iii) Beijing and Hebei, and (iv) Shandong, Henan, and Jiangsu. The present results provide a basis for efforts to optimize Integrated Pest Management programs through adapting control methods which target the biological traits shared by various populations from the same genotype.

Supporting Information
File S1 Microsatellite genotypes of the seven Sitobion avenae populations in China. ''individual ID'', the tag of each tested aphid individual. ''sampling location'', the corresponding location where each aphid individual was collected. The other ten columns were the corresponding allele sizes of each aphid individual at each microsatellite loci. (XLSX)