The Distribution and Host Shifts of Cotton-Melon Aphids in Northern China

Aphis gossypii Glover (Hemiptera: Aphididae) is a serious pest of cotton in northern China. A microsatellite analysis was used to characterize the genetic structure of A. gossypii populations from different geographic, host plant, and seasonal populations in 2014. Among 906 individuals, 507 multilocus genotypes were identified, with genotypic richness values of 0.07–1.00 for the populations. We observed moderate levels of genetic differentiation among geographic populations (FST = 0.103; 95% confidence interval: 0.065–0.145) and host plant populations (FST = 0.237; 95% confidence interval: 0.187–0.296). A Mantel test of isolation by distance revealed no significant correlations between Slatkin’s linearized FST and the natural logarithm of geographic distance. A Bayesian analysis of population genetic structures identified three clusters. An analysis of molecular variance revealed significant differences among the three clusters (F = 0.26596, P < 0.0001), among seasons (F = 0.04244, P = 0.00381), and among host populations (F = 0.12975, P = 0.0029). Thus, the A. gossypii populations in northern China exhibit considerable genotypic diversity. Additionally, our findings indicated that the 31 analyzed populations could be classified as one of three host biotypes (i.e., cotton, cucumber, and pomegranate biotypes). There were also clear seasonal effects on population genetic structure diversity among aphids collected from Anyang.


Introduction
The cotton-melon aphid, Aphis gossypii Glover, is a polyphagous species with a worldwide distribution. Economic losses caused by this aphid result from its feeding on agricultural crops, the production of honeydew secretions on lint (i.e., sticky cotton), and the fact it serves as a vector for viral diseases [1,2]. This pest colonizes more than 600 plant species, including many important crops, such as cotton, cucurbits, citrus, aubergine, potato, and okra [1]. Aphis gossypii is an important pest of cotton in northern China, causing yield losses during the seedling stage [3].
Aphis gossypii has a highly variable life cycle. The aphid is considered to be anholocyclic in most places where it is found, including in Europe and Africa, where it reproduces continuously by apomictic parthenogenesis [1,4]. However, this aphid can be holocyclic in regions with very harsh winters, such as Japan, China, Korea, India, and the United States of America, where it reproduces sexually, with a few woody plants serving as primary hosts. More than 10 of these primary hosts are extremely abundant in China, including Hibiscus syriacus, Zanthoxylum simulans, and Punica granatum [5][6][7]. In China, A. gossypii eggs hatch in March, and two to three generations are produced on primary hosts before alate adults develop because of overcrowding and a diminishing food supply. Alate adults migrate to cotton fields, where seedlings emerge in late April to mid-May. In autumn, these morphs return to the primary hosts, where the gynoparae produce oviparae, which mate and produce eggs [5].
The genetic structure of aphid populations is associated with the spatial patterns of selection pressures resulting from biotic and abiotic factors, such as climate, habitat distribution, dispersal abilities, insecticides, and life cycle [8][9][10]. Aphid biotypes are defined based on the ability to feed on specific hosts within the host range of the species [7]. Genetic diversity is correlated with host type, with host-specific A. gossypii biotypes detected in many countries and regions [11,12]. The existence of A. gossypii biotypes was determined by host transference experiments [13]. Polymorphisms in microsatellite or simple sequence repeat (i.e., tandem repeats of simple nucleotide sequences) loci have been used to analyze the population genetic structures of many organisms [14]. Eight A. gossypii microsatellite loci were described in 1999 [15], and have since been widely used in this species [10,13,14,[16][17][18][19].
In this study, eight microsatellite markers were used to characterize the population structure of A. gossypii aphids collected from cotton plants at 20 locations in the North China Plain in Henan, Hebei, and Shandong provinces. We also compared the genetic structures of A. gossypii populations from three host plants in mid-May and late August in Anyang, Henan Province.

Sample identification
The polymerase chain reaction (PCR) and sequencing results revealed that all samples collected in cotton fields were A. gossypii. Some samples collected on hibiscus, pomegranate, and Chinese prickly ash contained other species of aphids, with A. gossypii comprising 30-95% of all aphids [20].

Genetic diversity
The microsatellite locus Ago84 failed to amplify in 44.3% of individuals. Therefore, samples were genotyped at seven microsatellite loci. Among the 906 individuals, 507 multilocus genotypes (MLGs) were distinguished, with each population having 3-42 MLGs. The ratio of MLGs to individuals sampled (M/I) in a population was 0.097-1.000. In the populations SDWC, HBWX, HBQZ, HBLZ, SDSX, HNWS, HNSQ, HNLY, and HNYL, the M/I ratios were less than 0.500, and the lowest ratio was observed for HNSQ (0.097) ( Table 1). The allelic richness values were 1.45-3.23, with an average of 2.34. The heterozygosity deficit indicates the level of inbreeding, and we determined that six of seven loci exhibited heterozygote deficits. These deficits were observed in three of five Hebei populations, three of five Henan populations, and two of nine Shandong populations. Approximately half of the geographic populations met the Hardy-Weinberg equilibrium (HWE) criteria for panmictic sexual reproduction, while the other half underwent inbreeding. Heterozygote deficits were also observed in half of the Anyang populations, suggesting the prevalence of inbreeding in these populations. The observed proportion of heterozygotes (H O ) and the expected proportion of heterozygotes (H E )

Population genetic structure
There were fewer than five MLGs in populations HBLZ, HNSQ, and KIW-May, which were excluded from population genetic structure analyses. Distinct clusters were not observed in a neighbor-joining tree of 31 A. gossypii populations based on genetic distances (Fig 1). However, 18 geographic populations appeared to cluster together. Additionally, six melon populations were grouped together, while two other melon populations (i.e., ZUW-Sep and CUW-Sep) and COW-Sep (which were collected in September) were grouped together. The genetic structure among 18 geographic populations was examined using pairwise comparisons of multilocus F ST with and without ENA correction. The genetic structure among 13 Anyang populations was examined using pairwise comparisons of multilocus F ST with and without the ENA correction. The overall F ST value (F ST = 0.237; 95% CI: 0.187-0.296) indicated moderate genetic differentiation among the Anyang populations. Pairwise F ST estimates between pairs of Anyang populations indicated 50.5% of population differentiation tests were significant (  Table).
The Mantel test of isolation by distance (IBD) revealed there was no significant correlation between Slatkin's linearized F ST and the natural logarithm of geographic distance (r = −0.086, P = 0.291) for all 18 geographic populations (Fig 2A). The presence of null alleles only weakly affected the result (r = −0.107, P = 0.187) (Fig 2B), which indicated the distance between populations was not responsible for the differentiations between populations.
Results of the Bayesian analysis of population genetic structures indicated the best dataset partitioning involved three genetic clusters because the modal value of ΔK (Evanno method) occurred with K = 3 (S1 Fig). The pattern of the three clusters corresponded well with the geographical distributions of the populations and the A. gossypii aphid colors in the neighbor-joining tree (Fig 1). Cluster analyses of aphid populations were based on the mean proportion of members in a particular cluster. Twenty-five of thirty population proportions were higher than 0.900, with the highest proportion (0.994) observed in cluster 3 of CUW-May. Cluster 1   (Fig 3). A principal component analysis (PCA) was completed using Nei's genetic distance pairwise matrix. The PCA results divided the 31 populations into three groups (Fig 4), which were consistent with the cluster analysis results.
The proportions of Anyang populations from three hosts (i.e., cotton, cucumber, and zucchini) depended on the season. In May, the blue proportion of COW-May was 0.850, the red proportion of CUW-May was 0.994, and the blue and red proportions of ZUW-May were 0.360 and 0.636, respectively. In August, the blue proportion of COW-Aug was 0.986, the red proportion of CUW-Aug was 0.795, and the blue and red proportions of ZUW-May were 0.686 and 0.308, respectively. In September, the blue proportion of COW-Sep was 0.988, the red proportion of CUW-Sep was 0.164, the blue proportion of CUW-Sep was 0.829, and the blue and red proportions of ZUW-May were 0.790 and 0.204, respectively (Fig 5).
Analysis of molecular variance (AMOVA) revealed significant differences among the three clusters (F = 0.26596, P < 0.0001) ( Table 4), suggesting there were significant genetic differentiations among the clusters. Differential adaptations to hosts may have led to genetic differentiations. The occurrence of host-based genetic differentiation among Anyang populations was supported by AMOVA (F = 0.12975, P = 0.0029) ( Table 4). Under field conditions, population expansions can be interrupted several times by environmental factors, such as rain and high temperatures, after aphids migrate onto summer hosts. The AMOVA results for comparisons among seasons in Anyang (i.e., May vs. August vs. September) revealed significant differentiations between sampling periods (F = 0.04244, P = 0.00381) ( Table 4).

Discussion
The North China Plain, which is surrounded by mountainous and coastal regions, has long been used to grow cotton. The life cycle of A. gossypii aphids in this region is considered to be holocyclic [5]. In this study, A. gossypii samples were collected from cotton plants in late August at 20 locations in the North China Plain. At Anyang, in the middle of the plain, A. gossypii aphids were sampled from different host species. Of 31 populations, 16 did not meet the criteria for the HWE, and exhibited significant heterozygote deficiencies. Heterozygosity deficits were also observed for Sitobion avenae [9] and Myzus persicae [21]. These results suggest the presence of null alleles, along with the parthenogenetic life cycle, resulted in deviations from the HWE.
Our results indicate that the A. gossypii populations in northern China exhibit greater genotypic diversity in their microsatellite loci than other populations elsewhere in the world [10,13,15,22], except for Japan and Iran [14,18]. Despite the limited resolution provided by the seven  microsatellite loci, we distinguished 507 MLGs among the 906 analyzed individuals. Their M/I ratio of 0.61 was higher than in melon crops at three locations in France and one location in the Lesser Antilles [19], Australia [16], and Iran [18]. A holocyclic life cycle may lead to greater genetic diversity, as observed in aphid populations from southern China and Japan. The species is considered to be anholocyclic in other areas [1,14]. Similarly, green peach aphid populations are also genetically more diverse in areas where sexual reproduction occurs than in areas where asexual reproduction predominates [23]. Pairwise F ST values indicated there were moderate levels of genetic differentiation among geographic and Anyang populations. Results of the IBD analysis revealed that geographic distance had no effect on A. gossypii population structure. An AMOVA detected significant differences among the three identified clusters, indicating there was significant genetic differentiation among clusters. A. gossypii alates migrate to the North China Plain and influence aphid genetic structure. Given the significant genetic differentiation among geographic populations, the migration rate may be low [24], which is also indicated by the Bayesian clustering results.
In this study, the number of MLGs in the three Anyang populations collected in May and August were 245, 27, and 106, with M/I ratios of 0.69, 0.93, and 0.93, respectively. Genetic diversity was strongly influenced by sampling time in populations collected in France [19]. Low clonal diversity may result from heavy selective pressures resulting from repeated treatments of cotton fields with insecticides [13]. Our results indicate M/I ratios at later stages are higher than those of the initial stages in summer hosts. Therefore, weather may have influenced our results because aphids were collected from infested crops in August and September, but heavy summer rains disrupted the aphid reproductive cycle in the North China Plain in 2014. The aphids migrated into the fields in August from both primary and summer hosts. Differential adaptations to hosts may have led to genetic differentiation [19,22], with M/I ratios of 0.56-1.00 in Anyang populations from various hosts. The M/I ratios were different among geographic populations from cotton, perhaps owing to different selective pressures, such as insecticide treatments [13]. In three Anyang aphid populations from cotton that were not exposed to pesticides, the M/I ratios were above 0.92, which was higher than the ratios of most populations collected from cotton fields.
Many studies have reported the existence of host-specific A. gossypii biotypes. This specialization has occurred worldwide on the same crops. A microsatellite study of A. gossypii in northern Cameroon revealed that cotton and cucurbits were colonized by distinct groups of clonal genotypes [10]. A similar study in Tunisia determined that cultivated crops of the families Cucurbitaceae and Solanaceae were infested by specific A. gossypii genotypes [22].  Additionally, Carletto et al. [13] identified five host races using eight microsatellites. Previous studies classified A. gossypii aphids in southern China in two groups based on whether they colonized cotton or cucumber [25]. In this study, a Bayesian analysis of population genetic structures revealed three genetic clusters. This clustering was confirmed using PCA results, which divided the 31 populations into three groups consisting of the same populations as the genetic clusters. Our findings indicate the 31 populations could be classified into one of three biotypes according to host species (i.e., cotton, cucumber, and pomegranate biotypes). Many studies have described host-specific A. gossypii biotypes, but a lack of standardization has prevented comparisons of biotypes [7]. Microsatellite analyses may lead to a standardization protocol that enables comparisons among biotypes. In China, there are five summer hosts for A. gossypii, while hibiscus, pomegranate, and Chinese prickly ash serve as the main primary hosts. A previous study concluded A. gossypii from hibiscus plants preferred cotton over cucumber in southern China [25]. Bayesian analysis results revealed that populations from hibiscus and Chinese prickly ash plants were mainly blue, indicating these plants are the primary hosts of the cotton biotype of A. gossypii. Meanwhile, the red proportion of aphid populations from hibiscus and Chinese prickly ash plants was about 0.050, suggesting these plants are also the primary hosts of the cucumber biotype. The aphid populations from pomegranates were mainly green, and no summer hosts were identified.
Because of broad phenotypic plasticity, aphids can be opportunistic and colonize different plant hosts [12], with differential adaptations to hosts resulting in genetic differentiation. In Anyang populations, clonal diversity depended on the plant hosts [18]. Because A. gossypii completes a holocyclic life cycle in Anyang, the seasons were another factor affecting genetic diversity. Results of the AMOVA between seasons revealed significant differences between sampling periods. Bayesian analysis also detected seasonal effects. For the three summer hosts (i.e., cotton, cucumber, and zucchini), the seasons obviously affected population genetic structures and host biotypes. The aphid populations may not have undergone host plant-dependent selective filtering, which may have been responsible for the differences between alate and apterous populations [19].
In conclusion, the A. gossypii populations in northern China exhibit considerable genotypic diversity. Moderate levels of genetic differentiation among geographic populations and host plants, and geographic distances did not affect A. gossypii population structures. Our findings indicate the 31 populations belonged to three host biotypes (i.e., cotton, cucumber, and pomegranate biotypes). Additionally, the seasons clearly influenced the population genetic structure diversity in Anyang aphids.

Ethics Statement
The cotton-melon aphid is an insect pest of many plant species. Because studies of A. gossypii may provide a new method to control this pest, such investigations are welcomed by farmers. No specific permits were required for field studies, which did not involve endangered or protected species.

Sample collection and DNA extraction
Samples of wingless A. gossypii aphids were collected from cotton fields in northern China in late August 2014. A total of 564 individuals were sampled at 20 sites, including six populations in Hebei province, sampling locations were Nanpi (HBNP; 37°58'53.1N, 116°49'34.1E), Zhaoqiang (HBZQ; 37°31'25.4N, 115°39'22.5E), Weixian (HBWX; 36°58'59.6N, 115°17'37.7E), subsequent statistical analysis using CONVERT v. 1.31 software [29]. To examine the possible occurrence of null alleles at microsatellite loci, the van Oosterhout algorithm of Micro-Checker v. 2.2.3 was used [30]. Descriptive statistics, including the number of alleles per locus, allelic richness, and fixation index (F IS ), were estimated using FSTAT v. 2.9.3.2 software. The F IS value ranges from −1 to +1, with negative F IS values indicating heterozygote excess (outbreeding) and positive values indicating heterozygote deficiency (inbreeding) relative to HWE expectations [31]. The H E and H O values were determined with GenALEX v. 6, which was also used to complete the PCA [32]. To minimize any biases in the genetic diversity statistics induced by null alleles, the corrected dataset was used to determine the H O , H E , and F IS statistics. Additionally, the HWE test involved the EM and INA methods [33].
A neighbor-joining tree using genetic distance was constructed with PHYLIP v. 3.696 (University of Washington; http://evolution.genetics.washington.edu/phylip.html) and 5,000 bootstraps to estimate the phylogenetic relationships among populations. An F ST value of zero indicates a lack of divergence between populations, while an F ST of one indicates complete isolation of populations. Therefore, pairwise F ST values for each comparison of populations were calculated using FSTAT. The ENA method was also used to obtain unbiased pairwise F ST values (F ST [ENA] ) using the FreeNA program [33]. A Mantel test of IBD was completed in Gene-Pop, with significance tests performed with 1,000 permutations. Population structures were analyzed using the Bayesian clustering algorithm of Structure v. 2.3.4 software [34]. The Markov chain Monte Carlo simulation was run 20 times for each K value (1 K 10) for 500,000 iterations after a burn-in period of 750,000. The optimal K value was determined using the highest ΔK value as described previously [35]. The ΔK value was calculated using Structure Harvester v. 0.56.3 [36]. Population genetic variances were investigated further by AMOVA [37] using Arlequin v. 3.5.1.3 [38].