Multigene Phylogeography of Bactrocera caudata (Insecta: Tephritidae): Distinct Genetic Lineages in Northern and Southern Hemispheres

Bactrocera caudata is a pest of pumpkin flower. Specimens of B. caudata from the northern hemisphere (mainland Asia) and southern hemisphere (Indonesia) were analysed using the partial DNA sequences of the nuclear 28S rRNA and internal transcribed spacer region 2 (ITS-2) genes, and the mitochondrial cytochrome c oxidase subunit I (COI), cytochrome c oxidase subunit II (COII) and 16S rRNA genes. The COI, COII, 16S rDNA and concatenated COI+COII+16S and COI+COII+16S+28S+ITS-2 nucleotide sequences revealed that B. caudata from the northern hemisphere (Peninsular Malaysia, East Malaysia, Thailand) was distinctly different from the southern hemisphere (Indonesia: Java, Bali and Lombok), without common haplotype between them. Phylogenetic analysis revealed two distinct clades (northern and southern hemispheres), indicating distinct genetic lineage. The uncorrected ‘p’ distance for the concatenated COI+COII+16S nucleotide sequences between the taxa from the northern and southern hemispheres (‘p’ = 4.46-4.94%) was several folds higher than the ‘p’ distance for the taxa in the northern hemisphere (‘p’ = 0.00-0.77%) and the southern hemisphere (‘p’ = 0.00%). This distinct difference was also reflected by concatenated COI+COII+16S+28S+ITS-2 nucleotide sequences with an uncorrected 'p' distance of 2.34-2.69% between the taxa of northern and southern hemispheres. In accordance with the type locality the Indonesian taxa belong to the nominal species. Thus the taxa from the northern hemisphere, if they were to constitute a cryptic species of the B. caudata species complex based on molecular data, need to be formally described as a new species. The Thailand and Malaysian B. caudata populations in the northern hemisphere showed distinct genetic structure and phylogeographic pattern.

B. caudata is easily recognized by the possession of three yellow vittae on the thorax, a transverse black band across the furrow of the face, two pairs of scutellar bristles and a slightly enlarged costal band at the apex of the wing [1,2]. It was first described from Java, Indonesia [1].
In a study of B. caudata from Peninsular Malaysia based on 14 gene-enzyme systems with 17 loci, the proportion of polymorphic loci was P = 0.41 and the mean heterozygosity was H = 0.11 [3]. Most of the molecular and phylogenetic studies involving B. caudata used only a single individual and from a single locality, e.g. Ranong, Thailand [4], Brunei [5], and Chongqing region, China [6]. Our recent study, using the partial DNA sequences of cytochrome c oxidase subunit I (COI) and 16S rRNA genes, revealed that B. caudata from Peninsular Malaysia was distinctly different from B. caudata of Bali and Lombok, without common haplotype between them [7]. Phylogenetic analysis revealed two distinct clades, indicating distinct genetic lineage. However, the study did not include the nominal taxon from the type locality Java, Indonesia.
The present study examined the partial DNA sequences of the nucler 28S rRNA and ITS-2 genes, and the mitochondrial COI, COII and 16S rRNA genes in several populations of B. caudata from mainland Asia (Malaysia and Thailand) and B. caudata from Indonesia (Java, Bali and Lombok) to determine their taxonomic and phylogenetic relationships.. We report here the genetic relationships of the nominal species from Java, Indonesia to the other taxa from various geographical regions. The Indonesian taxon is distinctly different from that of the northern hemisphere.

Ethics statement
Bactrocera fruit flies are pests of agricultural crops. They are not endangered or protected species. No permits are required to study these insects.

Specimens, DNA Extraction, Polymerase Chain Reaction, and Sequencing
Male Bactrocera fruit flies were collected by application of cue-lure on the surface a green leaf. They were collected by means of specimen tubes, brought back to the laboratory and placed in deep freezer for about 10 min. The flies were then preserved in absolute ethanol and stored in deep freezer until use for DNA extarction. Flies were identified according to existing literature [1,7]. Bactrocera tau (Walker) was used as an outgroup. Details of the taxa studied and Gen-Bank accession numbers of representative sequences of the haplotypes found in this study are listed in Table 1.

Data Analysis
The sequences generated from this study and also other sequences of B. caudata from the Gen-Bank (Table 1) were used for data analyses. The sequence data were edited and assembled using ChromasPro v.1.5 (Technelysium Pty Ltd., Australia) software followed by alignment with ClustalX program [12]. The analyzed data were trimmed using BioEdit v.7.0.5.3 [13].
Kakusan v.3 [14] was used to determine the best-fit nucleotide substitution models for maximum likelihood (ML) and Bayesian (BI) analyses, selected using the corrected Akaike Information Criterion [15] and the Bayesian Information Criterion [16], respectively. Phylograms were constructed using TreeFinder [17] prior to the annotations of bootstrap values (BP) generated via 1,000 ML bootstrap replicates. Bayesian analyses were conducted using the Markov Chain Monte Carlo (MCMC) method via Mr. Bayes v.3.1.2 [18]. Two independent runs of 2x10 6 generations with four chains were performed, with trees sampled every 200 th generation. Likelihood values for all post-analysis trees and parameters were evaluated for convergence and burn-in using the "sump" command in MrBayes and the computer program Tracer v.1.5 (http://tree.bio.ed.ac.uk/software/tracer/). The first 200 trees from each run were discarded as burn-in (where the likelihood values were stabilized prior to the burn-in), and the remaining trees were used for the construction of a 50% majority-rule consensus tree. The effective sample size for COI+COII+16S was 1374.35 and for COI+COII+16S+28S+ITS-2 was 9268.35.

Haplotype Network Reconstruction and Genetic Divergence
The median joining (MJ) network [19] was used to estimate the genealogical relationships of the haplotypes. The MJ network was calculated using NETWORK v4.6.1.0 (http://www.fluxusengineering.com). PAUP Ã 4.0b10 software [20] was used to access the level of variation in the COI, COII and 16S rDNA among the selected samples of different taxa based on uncorrected (p) pairwise genetic distances.

Haplotype Diversity
The number of haplotypes based on the median-joining (MJ) network of the individual and concatenated aligned nucleotide sequences (outgroup was not included) is shown in Fig 1 and summarized in Table 1. The maps corresponding to the locations and hoplotypes are shown in Fig 2. The DNA variation sites for individual markers are shown in Tables A-E in S1 File. The 28S rRNA and ITS-2 genes were least variable, with two haplotypes each. Only one ITS-2 haplotype was found in the Indonesian taxa. Two 28S haplotypes were present in the populations of Peninsular Malaysia, East Malaysia, Indonesia and Thailand, with 2-bp difference.
Of the four COI haplotypes, C1 was the most common among the populations of Peninsular Malaysia and East Malaysia. Haplotype C2 occurred only in Indonesia, C3 in Pulau Tioman (Peninsular Malaysia) and C4 in Thailand. The largest difference of 37 bp was found between haplotypes C2 (Indonesia) and C4 (Thailand). Of the eight COII haplotypes, D1 was the most common. The Indonesian taxa were characterized by haplotypes D3 and D8. The largest difference of 28 bp was between haplotypes D3 (Indonesia) and D5 (Thailand). Three 16S haplotypes were present, with haplotype R2 present only in Indonesia. The largest difference of 13 bp was between haplotypes R2 (Indonesia) and R3 (Peninsular Malaysia).  The concatenated COI+COII+16S rDNA nucleotide sequences yielded eight haplotypes, with haplotypes M3 and M8 present only in Indonesia. The largest difference of 79 bp was between haplotypes M3 (Indonesia) and M8 (Thailand). A great diversity of haplotypes (12) was evident in the concatenated COI+COII+16S rDNA+28S rDNA+ITS-2 nucleotide sequences. The Indonesian taxa were characterized by haplotypes N4 and N12, with N4 having 82-bp difference from N11 (Thailand).

Phylogenetic Relationships
The total length of the aligned sequences of 28S rRNA, ITS-2, COI, COII, 16S rRNA, 28S+ITS-2+COI+COII+16S rRNA, and COI+COII+16S rRNA genes as well as the selected models used for ML and BI analyses are summarized in Table 2.
The support values and the grouping of the taxa of B. caudata varied from one marker to the others. The nucleotide sequences of the slower-evolving nuclear ITS-2 and 28S rRNA genes could not differentiate unequivocally the taxa of B. caudata from northern (Malaysia, Thailand) and southern (Indonesia) hemispheres ( Table 1). The mitochondrial COI, COII and 16S rDNA nucleotide sequences unequivocally distinguish the taxa from these hemispheres, as also reflected by the concatenated COI+COII+16S rDNA (Fig 3) and 28S rDNA+ITS-2+COI+COII +16S rDNA (Fig 4) nucleotide sequences.

Genetic Divergence
The uncorrected 'p' distances between different taxa of B. caudata based on COI, COII, 16S rDNA, 28S rDNA, ITS-2 (see Tables F-J in S1 File for the "p" distance for individual markers) and concatenated COI+COII+16S and COI+COII+16S+28S+ITS-2 nucleotide sequences are summarized in Table 3. Excepting 28S rRNA and ITS-2 genes, the uncorrected 'p' distances between the southern and northern hemisphere taxa for COI, COII, 16S and the concatenated sequences were several times higher than intra hemisphere values.
Based on concatenated COI+COII+16S nucleotide sequences, the Thailand (Korat) taxon differed from the other northern hemisphere (Malaysia) taxa with uncorrected 'p' distances of 0.56-0.77%; the 'p' distance for the Malaysian taxa was 0.00-0.21%. The uncorrected 'p' distance between the southern hemisphere (Indonesia) and Malaysian taxa was 4.46-4.53%, and 4.94% between Indonesian and Thailand taxa.

Discussion
In the present study, the taxa of B. caudata from the northern (Malaysia, Thailand) and southern (Indonesia) hemispheres form clearly two distinct genetic lineages (Fig 3).
The Tephritid genus Bactrocera is represented by several species complexes, especially in the subgenus Bactrocera [21]. An example in the subgenus Zeugodacus is the B. tau species complex represented by some seven cryptic species [22,23]. Morphological characters have proven to be problematic in identifying these sibling/cryptic species. Currently, molecular sequence data are used for determining the systematic status and phylogenetic relationship at various taxonomic levels, e.g. Bactrocera dorsalis species complex [24], molecular phylogeny of Dacini tribe [25], and higher phylogeny of frugivorous flies [26]. The mitochondrial COI, COII and 16S rRNA genes have been commonly used to study the phylogenetics of Bactrocera species [5,7,[27][28][29][30][31]. COI sequences revealed the occurrence of eight species of the Bactrocera tau complex [32]. Species in the B. tau complex showed a sequence divergence of 0.06 to 28%, and the sequence divergence was up to 29% between the complex and its tephritid outgroups, B. dorsalis and C. capitata [32].
In an earlier study we showed that B. caudata of Malaysia-Thailand-China in the northern hemisphere was genetically distinct from B. caudata of Bali-Lombok in the southern hemisphere [7]. It was not clear then which was the nominal taxon as the study did not include the taxon from Java, the type locality. In the present study, the taxon from Malang, Java was genetically similar to those of Bali and Lombok of Indonesia in the southern hemisphere. The occurrence of B. caudata in Lombok may be the result of introduction from Java and/or Bali.
The occurrence of two distinct genetic lineages of the taxa attributed to B. caudata based on morphology (possession of three yellow vittae on the thorax, a transverse black band across the furrow of the face, two pairs of scutellar bristles and a slightly enlarged costal band at the apex of the wing) [1,2] indicates that the taxa of northern (Malaysia and Thailand) and southern (Indonesia) hemispheres may be members of a species complex. In that case, in accordance with the type locality the Indonesian taxa belong to the nominal species. Thus the taxa from the northern hemisphere, if they were to constitute a cryptic species of the B. caudata species complex based on molecular data, need to be formally described as a new species.
In the present study, based on concatenated COI+COII+16S rDNA nucleotide sequences the Thailand and Malaysian taxa of B. caudata in the northern hemisphere had an uncorrected 'p' distance of 0.56-0.77%, compared to 'p' = 0.00-0.21% for the Malaysian taxa. The distinct difference is also reflected by the concatenated COI+COII+16S+28S+ITS-2 nucleotide sequences. It indicates the occurrence of distinct genetic structure and phylogeographic pattern of the B. caudata populations in the northern hemisphere.
Supporting Information S1 File. Supporting tables.  Table 3. Percentage of uncorrected "p" distance matrix between representative Bactocera caudata from various geographical locations in northern and southern hemispheres based on COI, COII, 16SrDNA, 28S rDNA, ITS-2 and concatenated COI+COII+16S and COI+COII+16S+28S+ITS-2 nucleotide sequences. Bactrocera species for ITS-2 from various localities. Table F is S1 File. Percentage of uncorrected "p" distance matrix of Bactocera caudata from various geographical locations in northern and southern hemispheres based on COI. Table G in S1 File. Percentage of uncorrected "p" distance matrix of Bactocera caudata from various geographical locations in northern and southern hemispheres based on COII. Table H in S1 File. Percentage of uncorrected "p" distance matrix of Bactocera caudata from various geographical locations in northern and southern hemispheres based on 16SrDNA. Table I in S1 File. Percentage of uncorrected "p" distance matrix of Bactocera caudata from various geographical locations in northern and southern hemispheres based on 28S rDNA. Table J in S1 File. Percentage of uncorrected "p" distance matrix of Bactocera caudata from various geographical locations in northern and southern hemispheres based on ITS-2. (DOCX)