Genetic diversity analysis of the invasive gall pest Leptocybe invasa (Hymenoptera: Apodemidae) from China

Leptocybe invasa Fisher et LaSalle is a global invasive pest that seriously damages Eucalyptus plants. Studying the genetic diversity, genetic structure and introgression hybridization of L. invasa in China is of great significance for clarifying the breeding strategy, future invasion and diffusion trends of L. invasa in China and developing scientific prevention and control measures. Genetic diversity and phylogenetic analyses of 320 L. invasa female adults from 14 geographic populations in China were conducted using 10 polymorphic microsatellite loci (SSRs) and mitochondrial DNA cytochrome oxidase I gene sequences (COIs). (1) The Bayesian phylogenetic tree and haplotype network diagram showed that only haplotype Hap3 existed in L. invasa lineage B in China, while haplotypes Hap1 and Hap2 existed in lineage A, among which haplotype Hap2 was found for the first time. The nucleotide and haplotype diversities of lineage A were higher than those of lineage B. (2) The SSR genetic diversity of the Wuzhou Guangxi, Ganzhou Jiangxi and Panzhihua Sichuan populations was higher than that of the other 11 populations, and the SSR genetic diversity of lineage A was higher than that of lineage B. (3) The AMOVA analysis of mitochondrial COI data showed that 75.55% of the variation was among populations, and 99.86% of the variation was between lineages, while the AMOVA analysis of nuclear SSR data showed that 35.26% of the variation was among populations, and 47.04% of the variation was between lineages. There were obvious differences in the sources of variation between the COI and SSR data. (4) The optimal K value of COI and SSR data in structure analysis was 2, and PCoA analysis also divided the dataset into two obvious categories. The UPMGA phylogenetic tree based on SSR data clustered 14 geographic species into two groups. The results of genetic structure analysis supported the existence of two lineages, A and B, in China. (5) Structural analysis showed that there was obvious introgressive hybridization in Wuzhou Guangxi, Ganzhou Jiangxi, Panzhihua Sichuan and other populations. These results suggest that lineage introgressive hybridization has occurred in the L. invasa population in China. The introgressive hybridization degree and genetic diversity of lineage A are obviously higher than those of lineage B. Lineage introgressive hybridization may be the driving force for further L. invasa invasion and diffusion in China in the future.


Introduction
With the development of international trade and economic globalization, biological invasion has become a serious concern of the international community. China is one of the countries most affected by invasive organisms in the world. At present, there are more than 640 kinds of invasive organisms in agriculture and forestry, including 130 kinds of invasive insects, which cause direct economic losses of more than 50 billion yuan every year [1].
Leptocybe invasa Fisher et LaSalle (Hymenoptera: Apodemidae) As an invasive gall-pest that damages more than 30 species of Eucalyptus, it mainly harms the branches and leaves of eucalyptus, causing typical lumplike galls on the veins, petioles and shoots of new leaves of eucalyptus. Serious infection can lead to plant death and huge economic losses. L. invasa was first discovered in the Middle East and Mediterranean in 2000 [2]. In 2014, it spread to 29 countries on the five continents [3]. In 2018, the number increased dramatically to 45 countries [4]. L. invasa was first discovered in China in 2007 in Dongxing, Fangchenggang, Guangxi, which is near the border with Vietnam [5]. Now it has spread to Guangxi, Hainan, Guangdong, Fujian, Sichuan, Jiangxi, Yunnan, Taiwan and other 8 provinces [3,[6][7][8]. L. invasa has become an important invasive forest pest in eucalyptus worldwide with unprecedented scale and speed of invasion and spread.
L. invasa has three lineage A, B and C. lineage A is mainly distributed in Europe, the Middle East, South America and most parts of Africa. lineage A and B are distributed together in Laos, Thailand, Vietnam, South Africa and other places, while lineage C only appears in Australia [9][10]. Nugnes et al. first reported that L. invasa that invaded China belongs to lineage B [11]. However, Peng Xin et al. identified L. invasa from 14 geographical populations in China and found that there are both lineage A and lineage B in China, with a ratio of about 1:2 [12].
The main reproductive method of L. invasa is parthenogenesis, in which the genes of the female offspring are the same as their mother [2]. Some of the symbiotic bacteria in L. invasa, such as Rickettsia, can induce parthenogenesis [11]. Due to its superior parthenogenesis, L. invasa can rapidly expand its population in the invaded areas. In recent years, L. invasa males have been reported in Turkey (male/female ratio: 1:124) [13], India [14], China (male/female ratio: 1:126& 1:125) [15], Taiwan [16], Thailand [17] and other places successively. Liang Yiping's study found that the proportion of male L. invasa in Guangxi, China was 18% -48%, which was significantly higher than the proportion reported in other reports [18]. Yang Xianxian studied the reproductive biology of L. invasa and found that there is mating behavior between male and female L. invasa in China, and offspring can be produced through bisexual reproduction [19]. Through the analysis of genetic diversity and genetic structure, Dittrich-Schröder et al. found that L. invasa in the Vietnamese population had introgressive hybridization between lineages [9].
Introgression is a phenomenon in which genetic material from one species is transferred to another species through hybridization and repeated backcrossing, also called introgressive hybridization [20]. Hybridization occurs in about 10 percent of all animals and 25 percent of all plants in nature [21], Hybridization often leads to introgression [22]. Hybridization is considered to be an important driving force for adaptive evolution of species [23][24], Introgression caused by hybridization has been reported in many animals, such as insects [25], mollusks [26], mammals [27], reptiles [28], amphibians [29], fish [30], etc. In recent decades, with the frequent occurrence of invasive species introduction events, the hybridization opportunities between invasive species and their relatives have increased, and the introgressive hybridization progeny may be more invasive, leading to the extinction of the local parent groups [31][32]. Introgressive hybridization is easy to occur between lineage due to their close genetic relationships. Introgression leads to the rapid improvement of genetic diversity, which increases the fitness of lineage in new environments, and promotes the spread of lineage and population expansion [30,[33][34]. The mixing of L. invasa lineage may lead to introgression between lineage and the emergence of new genotypes that are more suitable for the environment [9]. In this paper, SSR and COI molecular markers were used to analyze the genetic diversity, genetic structure and introgressive hybridization of L. invasa in China, aiming to provide a basis for the prediction and scientific prevention and control of L. invasa invasion and spread trend in China.
Fresh shoots of eucalyptus with gall were collected from each geographic population, the branches were placed in a plastic bottle filled with water to retain freshness and transferred into a sealed net cage (40 cm×40 cm×80 cm) at room temperature to keep the adults from escaping. The water in the plastic bottle was renewed daily until the emergence of L. invasa adults. Sexes were identified by morphological observation [3]. Samples of female adults from each geographic population were placed into tubes with anhydrous ethanol, labeled and stored in a refrigerator at -80℃. In order to avoid contamination, only single geographical population L. invasa samples were cultivated each time, and the remaining branches were treated in time after the culture. Thirty-two samples were randomly selected from the samples collected from each geographic population for subsequent molecular analysis, and Some populations with insufficient samples use the entire samples.
The animal tissue DNA extraction kit (Beijing Tsingke Biological Technology) was used to extract the total genomic DNA of single L. invasa female adults according to the instructions. Nanodrop spectrophotometer was used to measure the quality and concentration of DNA at 260 nm/280 nm and 260 nm/230 nm ratios, and the qualified DNA was stored in a refrigerator at -20℃ for later use.

Microsatellite amplification and capillary electrophoresis genotyping
Reference was made to the polymorphic SSR primers of L. invasa developed by Dittrich-Schröder et al. [9] and Peng Xin et al. [12] ( Table 2). The 5' end of the Forward primer was labeled with four kinds of fluorescence (FAM, ROX, HEX and TAMAR) (Beijing Tsingke Biological Technology). The sample DNA used was consistent with mitochondrial sequencing. The PCR reaction system was 25μL, including 2×T5 Super PCR Mix 12.5μL (Beijing Tsingke Biological Technology), forward fluorescent primer and reverse primer (diluted to 10 μm) 1.0μL each, DNA template 1.0μL, ddH2O 9.5μL. The PCR reaction procedure was pre-denaturation at 98℃ for 3min, denaturation at 98℃ for 10s, annealing at suitable temperature for 10s, extension at 72℃ for 10s, 35 cycles and extension at 72℃ for 2min. PCR products were sent to Beijing Tsingke Biological Technology for capillary electrophoresis genotyping using QIAXCEL nucleic acid analysis system (QIAGEN, Irvine, USA) (S1 Fig). construction 831 COI sequence data downloaded from GenBank molecular database were input into Notepad++ v6.5 software and saved in FASTA format (S1 Table). The COI sequences were Mafft alignment using PhyloSuite v1.1.1 [36] and conserved sequences were extracted using Gblocks. The optimal model of Bayesian tree was found through the Model Finder program in PhyloSuite v1.1.1 software, and the optimal model was used to construct the Bayesian phylogenetic tree. The generation number was set as 100 million, and 3 replicates were run. L. invasa COI gene sequence haplotype network was constructed using Popart v1.7 software(http://popart.otago.ac.nz).

Genetic diversity analysis
Genetic diversity analysis based on SSR data: The peak data of capillary electrophoresis was input into Excel in GeneAlex v6.5 [37] format, and the format conversion required for subsequent software was completed using GeneAlex v6.5 [37] plug-in. Genetic diversity analysis software was used to analyze the genetic diversity of (1) different geographic populations (N =320), (2) lineage A (N = 104) (3) lineage B (N = 216). Arlequin v3.5 [38] was used to calculate genetic diversity indexes, including total number of alleles, average number of alleles, number of polymorphic loci, average observed heterozygosity, average expected heterozygosity and average genetic diversity. Arlequin v3.5 [38] was used to perform molecular analysis of variance (AMOVA) to quantitatively analyze the genetic variation among populations, within populations, between lineage and within lineage. Genepop v4.2 [39] was used to detect Linkage disequilibrium, And a significant correction of Linkage disequilibrium through the "sequential Bonferroni correction" [40] (S2 Table). Genepop v4.2 [39] was used to detect Hardy-Weinberg equilibrium at each site (S3 Table), and Cervus v3.0 [41] software was used to calculate polyphic information content (PIC) at each locus.
Genetic diversity analysis based on COI data: The haplotype diversity and nucleotide diversity of different lineage and all samples were calculated using DNAsp v6 [42]. Arlequin v3.5 [38] was used to analyze the molecular variance of L. invasa within and among populations, within and between lineage in China.

Genetic structure analysis
Population v1.2.32 software (http://bioinformatics.org/project/?group_id=84) was used to construct UPMGA population evolutionary tree based on SSR data. Structure version 2.3 (http://pritch.bsd.uchicago.edu/structure.html), which applies a Bayesian clustering method, was used to indicate the number of populations or clusters (K) based on genetic identity of the individuals and without geographic prior. An admixture model with correlated allele frequencies was implemented because it was more likely to detect subtle population structure. Tests included 20 repetitions for a range of K-values from 1 to 10. For the Markov chain Monte Carlo (MCMC) procedure, the 'burn in' period was set to 100000 iterations followed by 100000 samples. Structure Harvester [43] is used to determine the likelihood value and ΔK [44] changes to verify the optimal K value, which represents the most likely clustering of individuals. Visualize the results through the website (http://pophelper.com/). Principal co-ordinate analysis (PCoA) was performed on SSR and COI data using GenAlEx v6.5 [37].
Bayesian phylogenetic tree and haplotype network were constructed by using 831 605bp COI gene sequences downloaded from GenBank. The Bayesian tree and haplotype network diagram shows that there are three lineage of L. invasa: A (Hap1-Hap2), B (Hap3-Hap25), and C (Hap26-Hap27). Haplotype Hap4-Hap25 and Hap26-Hap27 were found only in Australia, while haplotype Hap3 and Hap1 was widely distributed in countries other than Australia where where L. invasa is invaded except Australia. Two lineage, lineage A and lineage B, exist in China, including Hap1, Hap2 and Hap3, among which Hap2 was first found in GXFCG1 population in China (Fig  1-2).

Genetic diversity
The genetic diversity analysis of COI data showed ( Table 3) that the base diversity and nucleotide diversity of L. invasa lineage A were significantly lower than lineage B and lineage C. There was no base mutation in the COI gene sequence of lineage B in China, and lineage A only had one base mutation site. The nucleotide diversity and haplotype diversity at the lineage level are both at a low level, indicating that L. invasa in China has experienced population expansion recently. AMOVA analysis (Table 4) showed that the genetic variation among and within populations accounted for 75.55% and 24.45%, respectively. The genetic variation between and within lineage accounted for 99.86% and 0.14%, respectively. Genetic variation is concentrated among populations and lineage.
Diversity analysis of SSR data shows (Table 5), At the population level, JXGZ had the most alleles, GXWZ had the highest observed heterozygosity and expected heterozygosity, and the genetic diversity of GXWZ, JXGZ and SCPZH populations was significantly higher than that of the other 11 populations. At the level of lineage, the total number of alleles of lineage A and B was similar, but the observed heterozygosity, expected heterozygosity, number of polymorphic sites and average genetic diversity of lineage A were significantly higher than those of lineage B. AMOVA analysis (Table 4) showed that the genetic variation among and within populations accounted for 38.61% and 61.39%, respectively. The genetic variation between and within lineage accounted for 48.86% and 51.14%, respectively. Compared with the COI gene sequence, the genetic variation of SSR was concentrated within the populations and lineage.

Genetic structure
In the UPMGA population evolutionary tree constructed based on SSR data (Fig 3), GXFCG1, JXGZ, GXWZ and SCPZH populations were clustered into one branch, and the other populations were clustered into another branch, which was consistent with the results of Structure analysis.
The principal co-ordinates analysis (PCoA) of COI data showed that lineage A and lineage B were clearly separated, and co-ordinates 1 and 2 represented 99.65% and 0.35% variation percentages, respectively (Fig 4). The principal co-ordinate analysis of SSR data showed that there was a mixing region between lineage A and B, and coordinates 1 and 2 represented 44.32% and 12.26% variation percentages, respectively (Fig 4), indicating the presence of gene introgression between lineages.
The optimal K value of structure analysis based on COI data is 2, The black and yellow represent the samples of lineage A and lineage B, respectively. The lineage A and B appeared simultaneously in 7 populations, namely GXFCG1, JXGZ, GXWZ, SCPZH, GXFCG2, GXNN1 and GXNN2 (Fig 5).
The optimal K value of structure analysis based on SSR data is also 2. Black and yellow represent the genes of lineage A and lineage B, respectively. Lineage mixing appears in part of the samples, indicating that the samples are introgression hybrids. The samples of GXFCG1, JXGZ, GXWZ and SCPZH populations were dominated by the genes of lineage A, while the other populations were dominated by the genes of lineage B. Different degrees of introgressive hybridization appeared in JXGZ, GXWZ, SCPZH, GXNN2, GXQZ, GXYL, GXLB and HNDZ populations. L. invasa in GXWZ and SCPZH populations were almost replaced by their introgressed hybrids (Fig 5).

Discussion
We captured L. invasa female adult samples from 14 geographic populations in 6 provinces of China, covering most of the L. invasa hazard areas in China. In particular, Guangxi Province, where the L. invasa outbreak first occurred, has set up multiple sampling sites to speculate on its possible transmission route. Because all susceptible varieties of eucalyptus have been replaced with highly resistant varieties in Guangdong Province, we did not collect the samples of L. invasa from Guangdong population, which indicated that the comprehensive promotion of resistant varieties has a very good effect on the control of L. invasa. There are two lineage of L. invasa in China: lineage A and lineage B [12], which are consistent with the lineage in other countries where L. invasa is invaded in the world. However, a new haplotype HAP2 was found in the lineage A of the Chinese population by Bayesian clustering analysis, which is of guiding significance for the traceability of lineage A. The lineage structure of L. invasa in China is similar to that in Southeast Asia, which indicates that L. invasa in China probably originated from Southeast Asia. When selecting appropriate pest control methods, it is necessary to consider the genetic diversity, genetic structure and gene flow of the pest population, which can help us predict the future distribution of invasive pests, the possibility of hybridization between different lineage, and the ability to overcome the resistance of different host plants. Mixtures between closely related species are of particular concern, as introgression within a mixed population can rapidly increase genetic diversity within the population, thus enabling pests to invade previously unsuitable environments or to overcome control measures [45][46]. The progeny of parthenogenesis is genetically the same as the mother, and the genetic diversity of its population is mainly determined by a few founders. The genotypes of its progeny are highly consistent, which is the root cause of the low polymorphism of L. invasa SSR loci [9]. The genetic structure analysis of L. invasa of different geographical populations in China using Structure and PCoA showed that there was obvious introgressive hybridization in China, which was similar to the genetic Structure analysis results of Dittrich-Schröder et al. in Laos [9]. Introgressive hybridization among lineage significantly increased genetic diversity in some geographic populations, such as Wuzhou in Guangxi, Panzhihua in Sichuan and Ganzhou in Jiangxi. In AMOVA analysis, the variation percentage within lineage based on COI data (99.84%) was significantly higher than that based on SSR data (44.86%). This phenomenon of inconsistency in the source of nucleoplasmic gene variation also proved that introgressive hybridization occurred between lineages. According to the genetic structure of the species, there were heterozygous samples in both lineage A and B, which indicated that the introgression between lineage of L. invasa may be bidirectional. Le et al. speculated that L. invasa lineage B might be more inclined to produce males [4]. Therefore, in the process of introgressive hybridization, lineage B is more likely to infiltrate genes into lineage A through sexual reproduction, which makes the genetic diversity and introgressive hybridization degree of lineage A significantly higher than that of lineage B. In addition, survival pressure may also be an important factor affecting L. invasa reproductive strategy. When food supply is insufficient, L. invasa will increase male differentiation, thus reducing reproductive ability and the number of offspring. For example, the proportion of male collected from E. exserta was 1.7-1.8 times that of DH201-2 (E. grandis ×E. tereticornis) [18]. In this study, the introgressive hybridization degree of Guangxi Nanning 2 population was significantly higher than that of Guangxi Nanning 1 population, which also indicated that the host plants would affect the reproduction strategy of L. invasa. In recent years, China has vigorously promoted resistant varieties of eucalyptus, resulting in the lack of suitable hosts for L. invasa and insufficient food supply, leading to the transformation of L. invasa reproduction strategy and more frequent sexual reproduction in the population.
The emergence of introgressive hybridization in lineage has sounded the alarm for the control of L. invasa in China. Hybrids have been widely distributed in several populations in China. Among them, L. invasa in Guangxi Wuzhou population and Sichuan Panzhihua population were almost replaced by their introgressed hybrid offspring, which indicates that there is obvious heterosis in L. invasa population. For most organisms, heterosis can only be maintained for one generation and is difficult to be inherited steadily. However, L. invasa can rapidly expand the number of offspring of dominant hybrids through parthenogenesis, thus breaking the reproductive barrier to avoid inbreeding decline and maintaining heterosis in the population for a long time. Hybrids can form large populations independent of their parents in a short period of time. Once a hybrid is able to adapt to the new environment, It can invalidate the previously dependent resistant varieties and specific natural enemies, And in a short time again outbreaks of insect pests. There are a large number of clonal eucalyptus forests in China, which will become a huge hidden trouble for L. invasa control. Especially the remaining stands of susceptible eucalyptus species such as DH201-1, DH201-2 (E. grandis × E. tereticornis) and E. exserta margin. L. invasa can quickly establish a large population on susceptible host plants. The aggregation of a large number of progenies increase the chance of gene exchange between lineage, and susceptible host plants accelerate the generation replacement and evolution rate of pests. Eucalyptus susceptible species can not only serve as the source of L. invasa's continuous spread, but also become a hotbed for L. invasa to adapt to the environment through genetic variation. With the intensification of L. invasa lineage mixing, gene penetration between lineage will become a trend in the future. L. invasa flexibly applies two reproductive strategies to adapt to the environment, which will increase the difficulty of its prevention and control.

Conclusion
Introgressive hybridization between L. invasa lineage A and B has occurred in many geographical populations in China. The genetic diversity and introgression degree of L. invasa lineage A were significantly higher than that of lineage B. lineage introgressive hybridization may be the driving force for L. invasa to invade and spread in China in the future.