Population genetic structure of Bemisia tabaci MED (Hemiptera: Aleyrodidae) in Korea

The sweet potato whitefly, Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) is a major agricultural pest that causes economic damages worldwide. In particular, B. tabaci MED (Mediterranean) has resulted in serious economic losses in tomato production of Korea. In this study, 1,145 B. tabaci MED females from 35 tomato greenhouses in different geographic regions were collected from 2016 to 2018 (17 populations in 2016, 13 in 2017, and five in 2018) and analyzed to investigate their population genetic structures using eight microsatellite markers. The average number of alleles per population (NA) ranged from 2.000 to 5.875, the expected heterozygosity (HE) ranged from 0.218 to 0.600, the observed heterozygosity (HO) ranged from 0.061 to 0.580, and the fixation index inbreeding coefficient (FIS) ranged from -0.391 to 0.872 over the three years of the study. Some significant correlation (p < 0.05) was present between genetic differentiations (FST) and geographical distance, and a comparatively high proportion of variation was found among the B. tabaci MED populations. The B. tabaci MED populations were divided into two well-differentiated genetic clusters within different geographic regions. Interestingly, its genetic structures converged into one genetic cluster during just one year. The reasons for this genetic change were speculated to arise from different fitness, insecticide resistance, and insect movement by human activities.


Introduction
The sweet potato whitefly, Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) is a major agricultural insect pest that is distributed worldwide. B. tabaci has an extremely broad host range [1] and causes serious damage to diverse host plant species. B. tabaci is also a vector for more than 100 pathogenic plant viruses [2], particularly known to be a vector for begomoviruses [3], and a major vector for tomato yellow leaf curl virus (TYLCV), one of the most devastating viruses in cultivated tomatoes in the world [4]. B. tabaci is a complex of 11 welldefined high-level groups consisting of at least 36 putative species identified based on mtCOI (mitochondrial cytochrome oxidase I) [5,6]. These putative species are morphologically indistinguishable and differ in host range, virus transmission, fecundity, and insecticide resistance a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 Analyses of genetic diversity GENEPOP v.4.0 [35] and Micro-Checker v.2.2.3 [36,37] were used to determine the microsatellite data for scoring errors, allelic dropouts, and null alleles. The estimated frequency of null alleles per loci for each population was calculated in FreeNa [38] using the expectation maximization (EM) algorithm [39]. Each of the 1,145 collected samples were used to test deviations from Hardy-Weinberg equilibrium (HWE) conditions, the number of alleles (N A ), allele size range, and the observed (H O ) and expected heterozygosities (H E ), and the inbreeding coefficient (F IS ) were computed using GenAlEx v.6.5 [40] and Microsatellite Toolkit [41]. Analysis of molecular variance (AMOVA). AMOVA was performed using GenAlEx v.6.5. AMOVA was used to characterize genetic variation patterns and to estimate variance components. A two-part AMOVA analysis was conducted to check genetic divergence (F ST ) as a factor of variation among and within the populations. AMOVA computations were performed with 999 permutations to test for significance.

Analyses of genetic structure
The number of genetic clusters (K) was estimated in STRUCTURE v.2.3.2 with 60,000 Markov Chain Monte Carlo (MCMC) steps and a burn-in period of 600,000. The log-likelihood estimate was run for K = ranges from 1 to 10 with ten replicates each. They were used to determine the number of clusters based on a combination of the mean estimated Ln probability of the data [42] and the second-order rate of change in the log-probability of the data (ΔK) [43]. The Evanno method was then implemented in STRUCTURE HARVESTER Web v.0.6.93 [44].

Principal coordinate analysis (PCoA)
PCoA was conducted between multi-locus genotypes in all individuals. The codominant-genotypic option of GeneAlex v.6.5 was used to calculate the similarity genetic distance matrix [40]. The PCoA plot was based on factor scores along the two principal axes (axis 1 and 2) and enabled the visualization of population differences.
Discriminant analysis of principal components (DAPC). DAPC was performed in the 'adegenet' package [45] of R software v.3.5.1 (R Development Core Team, 2018) to identify an optimal number of genetic clusters to describe the data. DAPC is a multivariate algorithm, similar to principal component analysis (PCA) that identifies genetic clusters and can be used as an efficient genetic clustering tool [46]. The number of clusters was identified based on Bayesian information criterion (BIC). If the value of BIC is positive and low, it is a suitable model. When the BIC value is negative, a high number is a suitable model.
Isolation by distance (IBD). The Mantel test [47] was performed to assess isolation by distance. The relationship between pairwise geographic distance (Ln km) and pairwise genetic distance in terms of F ST /(1-F ST ) with 1,000 random permutations was conducted using the GenAlEx v.6.5, GENEPOP v.4.0, and 'ade4' package [48] of R software v.3.5.1. The IBD graph was generated by using the R software v.3.5.1 with 'ggpolt2' package.
Bottleneck test. The BOTTLENECK v.1.2.02 [49] was used to detect the effect of a recent reduction in all population sizes. The possibility of bottleneck events in the 35 populations was examined using a one-tailed Wilcoxon signed-rank test under three mutation models, the infinite allele model (IAM), the two-phase model (TPM), and the stepwise mutation model (SMM) (parameters for TPM: variance = 30.0%, probability = 70.0%, 1,000 replications). The Wilcoxon signed-rank test has been shown to be effective and reliable when eight microsatellite loci are analyzed [49].

Pairwise comparisons of fixation index (F ST ).
To assess the level of genetic differentiation between the samples, pair-wise fixation index (F ST ) value estimates were computed using GENEPOP v.4.0. To correct for null alleles, pairwise estimators of F ST values were calculated from each microsatellite dataset that potentially harbored null alleles using the excluding null alleles (ENA) correction method (F ST-ENA ) following 1,000 bootstrapping permutations over the loci. The ENA correction method was used to obtain unbiased pairwise F ST values using FreeNA. To investigate the relationship between the genetic distance revealed by the F ST values and geographic distance, an isolation-by-distance analysis was performed using a regression of F ST /(1-F ST ) values against the logarithm of the geographical distance (km) between the populations. Significance of the correlation between the two data matrices was assessed using a Mantel test with 1,000 permutations. This was performed with the ISOLDE program implemented in GENEPOP v.4.0.

Identification of the B. tabaci populations
All B. tabaci individuals collected were successfully sequenced and analyzed. Approximately 810 bp of the mtCOI gene was amplified from B. tabaci individuals by PCR. All populations identified belonged to the MED (Q1) species based on representative samples.

Genetic diversity
The values of the genetic diversity indexes for the Korea populations of B. tabaci MED are shown in Table 2 Table. AMOVA. AMOVA among the 35 B. tabaci MED populations showed that 48.0% of the total genetic variation was accounted for by variation among the populations and 52.0% of the variation was accounted for by individual variation within the populations ( Table 3) (Fig 2b and 2c). The populations of B. tabaci MED converged rapidly into one cluster (orange color) over time (Fig 3).
PCoA of B. tabaci MED. Principal component analysis of the 35 B. tabaci MED populations showed that the first principal components accounted for 27.6% of the total variation, followed by the second component, which accounted for 43.3% of the variation (Fig 4a). The first and second components of PCoA for each year are as follows: 32.3%, 52.6% for 2016 (Fig  4b), 30.7%, 53.1% for 2017 (Fig 4c), and 39.8%, 69.1% for 2018 (Fig 4d), respectively.
DAPC. In DAPC, the elbow in the curve of BIC was at K = 2 using the find. cluster function of R software v.3.5.1 [50]. In this study, the value of BIC was found to be 166.05, which was positive and the smallest value (Fig 5a). The DAPC results showed that the populations of B. tabaci MED were split into two well-differentiated genetic clusters with low overlap between them. The first cluster contained populations from 2016 and the second cluster contained populations from 2017 and 2018 (Fig 5b). The DAPC results agreed with the STRUCTURE results.

IBD.
A significant correlation was detected between genetic and geographic distances in the B. tabaci MED populations based on the Mantel tests of IBD (r 2 = 0.557; p = 0.01), indicating a pattern of isolation by distance (Fig 6). Multiple points in the scatterplot fit to the linear regression along the geographic distance range. This result indicates that gene flow between population increases with geographic distance. IBD analysis revealed that geographic distance had an effect on the population structure of the B. tabaci.
Bottleneck test. The mode-shift analysis of bottleneck test, a signature of recent population reduction was found only for the 16'GJ and 18'PT populations (Table 4). Departure from    (Table 4 bolded numbers), respectively. Under the SMM, however, significant heterozygosity excess was not detected in any population.

Pairwise comparisons of fixation index (F ST ).
The fixation index (F ST ) reflects the degree of genetic differentiation among the populations. F ST is close to 0 when the genetic variation shows no difference in fixation among the populations. It is close to 1 when genetic differentiation is high. In this study, the F ST values ranged from -0.0155 to 0.7501 and the ENA-corrected F ST values ranged from -0.0139 to 0.7327 among the populations ( Table 5). The highest F ST value was detected between the 16'JJ and 16'BY populations (0.7327). The lowest F ST value was

Discussion
This study is the first comprehensive genetic structure analysis of B. tabaci MED (Q1) populations in Korea using eight microsatellite loci. The Korean populations of tomato B. tabaci MED appeared to be classified into two genetic clusters based on STRUCTURE and DAPC analyses, and their genetic structure converged rapidly into one genetic cluster. This phenomenon was reported previously by Dinsdale et al. [52] in Australia. They reported that the genetic cluster of B. tabaci rapidly changed even in a period of just four months. The results of this study and those by Dinsdale et al. [52], suggested that one out of the two B. tabaci MED genetic clusters in Korea might become the dominant species in the future.
This phenomenon could be caused by different fitness between the two B. tabaci MED genetic clusters in Korea. Although the two B. tabaci MED genetic clusters might have been mixed when they were first introduced in new areas, one genetic cluster would become dominant if there is fitness difference between them. Fitness difference between two genetic clusters could result from different susceptibilities to insecticides. The use of various insecticides, such as neonicotinoids, organophosphates, and carbamates, has been the main control method for B. tabaci MED in Korea. Extensive use of these insecticides has rapidly resulted in high levels of insecticide resistance in B. tabaci MED populations [53]. The two genetic clusters of B. tabaci MED might have different potentials for developing resistance to different insecticides. This differentiation was partially supported by changing the frequencies and diversity caused by chemical control [54,55]. Results of the current study also showed low genotype frequencies and diversities, and limited founder or bottleneck effects.
However, the speed of this genetic cluster change in Korea could differ by areas. For example, the Jeju populations showed one genetic cluster of B. tabaci MED and this trend was maintained during the past three years. However, in the Pyeongtaek area, the genetic cluster of B. tabaci MED changed every year. The differences in the speed of genetic cluster change could Population genetic structure of Bemisia tabaci MED  be caused by human-related factors because B. tabaci has a low dispersal ability over long distances [56]. In the case of Jeju, the B. tabaci MED populations should not have been affected by other populations because almost all growers produce tomato seedlings themselves and Jeju is isolated because it is an island. On the other hand, the Pyeongtaek tomato growers have purchased tomato seedlings from different nurseries every year. Moreover, the city of Pyeongtaek has one of the most active agricultural trades of all Korean cities. Whitefly populations are generally affected by human activities, such as the movement of infested plants from nurseries, material shipments, and commercial trading, rather than by active flight [54,57]. Thus, the populations in areas with high human activities and diverse nursery routes (i.e., the Pyeongtaek populations) might show accelerated genetic cluster changes compared to populations in isolated areas with limited nursery routes (i.e., the Jeju populations). The information on the genetic characteristics of B. tabaci in areas where it usually occurs should be useful for efficient management of B. tabaci [58][59][60]. The genetic structure information gathered from the long-term and large-scale field analysis in this study facilitates a better understanding of the population dynamics of B. tabaci MED as an invasive pest in Korea. Thus, the results of this study could be a valuable foundation to develop efficient management strategies for B. tabaci MED in Korea. However, further studies are needed to clearly find the fitness differences between the two B. tabaci MED genetic clusters in Korea.
Supporting information S1