Invasion History of the Oriental Fruit Fly, Bactrocera dorsalis, in the Pacific-Asia Region: Two Main Invasion Routes

The oriental fruit fly, Bactrocera dorsalis, was initially recorded in Taiwan Island in 1912, and has dispersed to many areas in the Pacific-Asia region over the last century. The area of origin of the species may be confidently placed in South-East China. However, routes of range expansion to new areas and underlying population processes remain partially unclear, despite having been the subject of several studies. To explore the invasion history of this species, a partition of the cox1 gene of mitochondrial DNA was used to investigate genetic diversity, haplotype phylogeny and demographic history of 35 populations, covering China and South-East Asia and including marginal populations from Pakistan and Hawaii. Based on neighbor-joining tree analysis and the distribution of haplotypes, two main invasion routes are inferred: one from South-East China to Central China, another from South-East China to South-East Asia, with both routes probably coinciding in Central China. Populations in Taiwan Island and Hainan Island might have originated in South-East China. The marginal populations in Pakistan and Hawaii might have undergone founding events or genetic bottlenecks. Possible strategies for the control of this species are proposed based on the invasion history and reconstructed expansion routes.


Introduction
Biological invasions constitute a growing threat to human economic activities and health, agriculture, and natural environments. Although many species of plants, animals and other organisms have been introduced around the world by human activities [1], invasion processes have been relatively slow in previous centuries. However, the pace has been accelerated by globalization [2]. Currently, human-mediated species invasions are a significant component of global environmental change [3].
An understanding of the history of invasion processes, i.e. a description of geographical pathways of invading populations, provides useful information about the origin and genetic composition of such populations and can be highly advantageous in attempts to quarantine, control or eradicate an invading species. Investigating the genetic architecture of invading populations offers an opportunity to infer a species' invasion history using molecular techniques [4]. Several different invasion scenarios have been identified, such as independent introductions [5], multiple introductions [6,7], population-genetic bottlenecks [8], founder events [9], invasive bridgeheads [10][11][12] and genetic admixture [13][14][15].
Although the reliability of some traits of mitochondrial DNA (mtDNA), such as neutrality and constant mutation rate, has been criticized [16][17][18][19][20], mtDNA markers are universally used in historical phylogeography. In contrast to nuclear markers, mtDNA exhibits some special evolutionary traits that are useful in the study of invading populations: the property of nonrecombination enables a relatively precise retracing of the origins of invasive populations [21]; high copy numbers in cells facilitate the amplification of DNA sequences, especially when historical specimens are used [22]; less effective gene size than in nuclear DNA, resulting from uni-parental inheritance, make it sensitive to selective neutrality and the loss of mutation-drift equilibrium and male-to-female sex ratio balance [22]. These characteristics combine to make mtDNA markers a mainstay of phylogeography [23].
The oriental fruit fly Bactrocera dorsalis, a phytophagous species of the Tephritidae family, was first recorded in Taiwan Island, China, in 1912 [24]. Enormous damage to agricultural production has been caused by this species through its larvae, which feed on fruit. Negative impacts on biodiversity in invaded regions have also been observed [25,26]. Due to the species' broad host range, wide climate tolerance and high dispersal capacity [27], its distribution range has covered the Asia-Pacific region in the last century, ranging from India to Hawaii and encompassing all of South-East Asia. Although the oriental fruit fly has been successfully eradicated in several regions (Ryukyu Islands in Japan, Nauru, Guam and Northern Mariana Islands) [28], the invasion process is rapid and continuous, and an invasion trend towards the poles in the wake of global environment changes has been predicted [28]. Despite the current and potential risk posed by this fruit fly, information about the invasion process of this species is scarce. Previous studies by Aketarawang et al. (2007) [29] and the authors [30] supported the supposition that the oriental fruit fly may have originated in the South-East China region facing the South China Sea, and spread further inland from there.
In this study, we aim to a) expand our previous work on oriental fruit fly expansion to cover all areas of the species' worldwide distribution, b) reconstruct the routes of the species' expansion from its origin in South-East China to areas of recent colonization in South-East Asia and to marginal populations in Pakistan and Hawaii, and c) discuss the application of these results to the planning of appropriate control measures.

Sampling, DNA extraction and amplification
A total of 552 oriental fruit flies from 35 populations were used in the analysis, including 256 B. dorsalis adults collected from 14 locations in China and one in Pakistan in 2008-2010 (for sampling information see Table 1). No specific permissions were required for these locations/activities, the locations are not privately owned or protected in any way. Information from these locations was completed with 296 additional sequences obtained from GenBank that together cover 20 locations in eight countries and constitute a reasonably complete coverage of the distributional range of the species (for sequence accession numbers see Table S1).
DNA was extracted using DNeasy Blood and Tissue Kits (QIAGEN) on each individual fly. A 505 base-pair partial segment of the mtDNA cox1 gene was amplified following Shi et al. (2005) [31]. PCR products were purified and sequenced on both strands by Invitrogen Biotechnology Co. (Shanghai, China). Unique sequences were deposited in GenBank under accession number JN643923-JN644053.

Data analysis
Sequences were aligned using ClustalX 2.0 [32] and then corrected manually. ARLEQUIN 3.5 [33] was used to identify unique haplotypes. Descriptive statistics (nucleotide diversity, number of haplotypes, number of variable sites, average number of nucleotide differences and haplotype diversity) were calculated with DNAsp 5.0 [34].
The Kimura two-parameter model in MEGA 5.0 [35] was used to estimate pairwise genetic distances of the 35 populations, then the population phylogenetic tree was reconstructed using the neighbor-joining (NJ) method in PHYLIP 3.69 [36], which is the most widely used method for building phylogenetic trees from distances [37]. Median-joining (MJ) networks of haplotypes were constructed using NETWORK 4.6 to infer the evolutionary relationships of haplotypes [38,39].
To infer asymmetric immigration rates between different regions, seven regions were defined based on geographic location and previous studies [30,31]  MIGRATE 3.27 [40] was used to estimate mutation-scaled effective immigration rate for entering and leaving each region per generation (M = m/m, where m is immigration rate and m is mutation rate per site per generation), and mutation-scaled effective population sizes (H = N e m, where N e is effective population size), by applying a Bayesian search strategy. Four independent MIGRATE runs of 20,000,000 generations with different random start seeds were performed to examine the consistency of the results, with the first 10,000 generations discarded as ''burn-in''. Analysis of molecular variation (AMOVA) and computation of fixation indices (F ST ) between pairwise regions were implemented in ARLEQUIN.
The demographic history of each region and of all populations pooled together was examined using mismatch distributions, population size before expansion (h 0 ), population size after expansion (h 1 ), population expansion time (t), Tajima's D, Fu's FS, and sum of squared deviations (SSD) between observed and expected mismatches. All parameters were calculated using ARLEQUIN and tested against the expected values of a recent population expansion with 1000 bootstrap replicates.

Genetic diversity
A total of 217 unique haplotypes were identified. Of these, 51 haplotypes were shared by at least two populations, the most frequent haplotype H3 being present in 16 populations. The HON population did not share any haplotypes with other populations. Detailed information of each population's haplotype composition is shown in Table S2.
Basic descriptive indices of genetic diversity for each population are shown in Table 2. The number of variable sites (V) ranged from 4-29. Haplotype diversity (H) ranged from 0.3250-1.0000, nucleotide diversity (p) from 0.0026-0.0128, and average number of nucleotide differences (k) from 1.3000-6.5000. Almost all populations showed high levels of genetic diversity, except for the LAH and HON populations.

Population genetic structure and gene flow
The AMOVA analysis revealed that the largest amount of genetic differentiation (86.23%) was to be found within populations, while genetic differentiation among groups (6.41%) and among populations within each group (7.36%) was limited. All fixation indices tested as highly significant (P,0.01) ( Table 3).
Pairwise F ST values ranged from 20.0018 (Taiwan Island and Hainan Island) to 0.4948 (South Asia and Hawaii), and increased with geographic distance. F ST values between Southeast China, Taiwan Island, Hainan Island, Central China and Southeast Asia were low. Relatively high and highly significant, F ST values were found between South Asia or Hawaii and the other five regions (Table 4).
Effective immigration rates per generation between paired regions were high. No asymmetric immigration rates were found, as indicated by overlapping 95% highest probability density (HPD) intervals between immigration rates into and out of each region. Due to the low effective population size, 0.00372 and 0.00141 in South Asia and Hawaii respectively, gene flows (H6M) entering these two regions were very limited (Table 5). A star-like MJ network was constructed, with some high frequency haplotypes (such as H3, H8, H22, H41, H43 and H49) located in the center and other rare haplotypes connected to them through several mutation steps. The haplotypes belonging to the Hawaii region were clearly separated from others in the network, with the exception of H208. H208 connected to H19 through one mutation step, as did the main haplotype of H210 with H154 (Fig. 3). Other haplotypes connected to H210 through several mutation steps (not shown).

Demographic history
Significantly negative Tajima's D and Fu's FS values were found among pooled populations, as well as in the separate populations from Southeast China, Central China, Hainan Island and Southeast Asia, suggesting that the B. dorsalis populations of those distribution regions did not conform to the theory of neutral evolution ( Table 6).
The unimodal mismatch distribution (Fig. 4) of these populations revealed that they were undergoing population expansion; however, a sudden population expansion model was rejected (based on significant SSD values between simulated and observed mismatch distributions) in the case of the pooled populations (P SSD = 0.0032) and the Central China region (P SSD = 0.0169).
Population expansion time (t) in the seven regions ranged from 2.506-5.828. Ratios between the effective population after

Invasion routes
Based on the South-East China origin of the oriental fruit fly [30,31], the observed increase in F ST values between South-East China and other regions of Asia, together with the increase in the geographic distances, indicate that this species may be colonizing westwards. This is demonstrated by asymmetric gene flow from South-East China to inland [31] and South-East Asia [30]. In the phylogeny of the 35 sampled populations, almost all of the South-East China populations and two South-East Asia populations were found in the first clade of the NJ tree, most South-East Asia populations and one South-East China population in the second clade, and Central China populations distributed among both clades, which suggests that the species may have been colonizing from the endemic region (South-East China) along two independent routes.
One invasion route may run from South-East China to Central China, as indicated by asymmetric gene flows from South-East China to inland China [31]. Sampling records from Mainland China and the genetic data (unimodal mismatch distributions, significant Tajima's D and Fu's FS values) suggest a gradual invasion process along this route, coupled with rapid population expansion [31]. Many shared haplotypes and high gene flows found between Taiwan Island, Hainan Island and South-East China also imply the same origin region. Although Taiwan Island and Hainan Island are divided from Mainland China by Taiwan Strait and Qiongzhou Strait respectively, the fly would have been able to disperse to the two islands by air and ocean currents [41] and especially via the increasingly frequent fruit and vegetable trade in recent decades [42]. Large numbers of founder individuals may have contributed to the high genetic diversity in the two islands.
Another invasion route may run from South-East China to South East Asia. The fact that the PX population (Guangxi Province, China -part of the South-East Asia region) located on the border of China and Vietnam shared haplotypes with South-East China and other South-East Asia populations (Table S2) indicates that the species may have been invading South-East Asia through Guangxi. Significantly negative Tajima's D and Fu's FS values suggest that due to population expansion, selective neutrality is not currently present in this species in South-East Asia [43]. Unimodal mismatch distributions [44] and nonsignificant SSD values, leading us to reject the null hypothesis of a sudden population expansion model [45], further support this hypothesis. The high genetic diversity of South-East Asian populations found in this study, which has also been detected using nuclear markers [30], indicates an absence of recent genetic bottlenecks. This is likely due to a large number of introduced individuals with high genetic variability, and the abundant host plants and suitable climate in this area. South-East Asia has been suggested as the hypothetical area of originof the oriental fruit fly populations now found in Yunnan, China [46]. Our finding that populations in Yunnan shared haplotypes with several inland China populations such as XS, WZ, JJ, WL and HX (Table S2) suggests that re-invasion into China may have been occurring from Yunnan. A similar invasion route was demonstrated using microsatellite markers [47].
The two independent invasion routes from South-East China may be coinciding in Central China. Populations in this area share haplotypes with both the South-East China and South-East Asia regions and are distributed in both clades of the NJ tree. Genetic admixture supplied by different independent routes in the form of multiple introductions, which have also been found in our previous study [31], is the likely cause of the high genetic diversity of the species in Central China.
Genetic evidence found in this study for the origins of B. dorsalis in Hawaii is equivocal. No shared haplotypes were found between Hawaii and other regions. MJ network analysis indicates that haplotype H208, unique to Hawaii, is derived by mutation from H19, mainly found in Taiwan Island, South-East China, Hainan Island and South-East Asia. However the other haplotypes found in this region are mutations of the unique haplotype H154 attributed to Wanzhou, a recently invaded area of Central China. This discrepancy may be due to insufficient haplotype sampling size. Based on historical records and the results of PCR-RFLP Table 5. Estimates of population size and effective immigration rate between population pairs. analysis of a mitochondrial control region [48], it has been suggested that the oriental fruit fly was introduced to Hawaii from Saipan Island (Northern Mariana Islands) by humans after the Second World War [49]. Although high population densities are found in many areas of the Hawaiian Islands [50], genetic diversity is relatively low (n = 5, H = 0.6000, p = 0.0034, k = 1.7000), which agrees with the results of previous research [30,48]. Small numbers of initially introduced individuals (h 0 = 0.01) may be the main cause of this genetic homogeneity; constant pest-control pressure associated with strong quarantine regulations imposed on the fruit trade may be another reason. Very limited gene flows detected between Hawaii and other regions indicate that the Pacific Ocean is a barrier to migrational exchange. No shared haplotypes (see also [48]), limited gene flow to other regions and numerous private alleles [30] suggest an independent evolutionary process taking place for B. dorsalis populations in the Hawaii Islands.
Introduction to the HP region (India) may have been directly from South-East China, from the nearby region of Myanmar. This interpretation is based on haplotype H6 being shared with the FZ and NN populations, H215, H216 and H217 being mutated from H3 (shared by Taiwan Island and South-East China), H22 (found in South-East China, Central China and South-East Asia) and H193 (unique haplotype of Myanmar). The high genetic diversity of HP (Table 2) is likely due to the different origin sources. Only two haplotypes were found in the LAH population of Pakistan: H134 is shared with the PZH population, and the main haplotype H201 is a mutation of H43 (shared by several Central China and South-East Asia populations). While detailed invasion routes are as yet unknown, it can be inferred that the main area of origin of the LAH population is South-East Asia and that PZH is a secondary source. Although multiple introductions have been observed in LAH, genetic diversity here was very low (Table 2), probably due to a relatively small number of individuals initially introduced to this region.

Key genetic characters for successful invasion
Because of the considerable differences in ecological conditions between the endemic and non-native regions, natural selection and adaption may be determinants for the success of an invasion before or during the settlement phase. Sufficient genetic variation is essential for evolutionary adaption in response to environmental change and can facilitate the adaption to new environments [24]. High genetic diversity in the oriental fruit fly was observed in its area of origin (South-East China). Large numbers of initiallyintroduced individuals containing much of the native genetic diversity can be assumed to represent the main contribution to the high genetic diversity in non-native regions like Central China and South-East Asia [51]. Multiple introductions and hybridization among distantly related populations in the non-native range may further enhance genetic diversity. Sufficient genetic variation may facilitate the adaptation of introduced individuals to selection pressures encountered in new habitats during the invasion process and help offset genetic drift in population settle phase with small number of immigrants.

Management strategies
The oriental fruit fly has to be considered a highly invasive species, as it is able to disperse efficiently and establish adventive populations in various tropical and sub-tropical climate zones. It may however be possible to develop some management strategies based on available invasion history information. For populations that are genetically well-connected and show high gene flow (e.g. Taiwan Island, Hainan Island, Mainland China and South-East Asia), any intervention that is geographically limited to one region is likely to fail, as neighboring populations would readily recolonize the region. Area-wide interventions aimed at reducing population numbers and economic damage would be a more feasible choice in this case, which would be also appropriate for HON population with relatively low genetic diversity but high population density [50]. For the geographically extreme and genetically independent population LAH, a local intervention aimed at eradication may, however, be the optimal solution.

Supporting Information
Table S1 Accession numbers of the cox1 sequences obtained from GenBank. (DOC)