Distinct Genetic Lineages of Bactrocera caudata (Insecta: Tephritidae) Revealed by COI and 16S DNA Sequences

The fruit fly Bactrocera caudata is a pest species of economic importance in Asia. Its larvae feed on the flowers of Cucurbitaceae such as Cucurbita moschata. To-date it is distinguished from related species based on morphological characters. Specimens of B. caudata from Peninsular Malaysia and Indonesia (Bali and Lombok) were analysed using the partial DNA sequences of cytochrome c oxidase subunit I (COI) and 16S rRNA genes. Both gene sequences revealed that B. caudata from Peninsular Malaysia was distinctly different from B. caudata of Bali and Lombok, without common haplotype between them. Phylogenetic analysis revealed two distinct clades, indicating distinct genetic lineage. The uncorrected ‘p’ distance for COI sequences between B. caudata of Malaysia-Thailand-China and B. caudata of Bali-Lombok was 5.65%, for 16S sequences from 2.76 to 2.99%, and for combined COI and 16S sequences 4.45 to 4.46%. The ‘p’ values are distinctly different from intraspecific ‘p’ distance (0–0.23%). Both the B. caudata lineages are distinctly separated from related species in the subgenus Zeugodacus – B. ascita, B. scutellata, B. ishigakiensis, B. diaphora, B. tau, B. cucurbitae, and B. depressa. Molecular phylogenetic analysis indicates that the B. caudata lineages are closely related to B. ascita sp. B, and form a clade with B. scutellata, B. ishigakiensis, B. diaphora and B. ascita sp. A. This study provides additional baseline for the phylogenetic relationships of Bactrocera fruit flies of the subgenus Zeugodacus. Both the COI and 16S genes could be useful markers for the molecular differentiation and phylogenetic analysis of tephritid fruit flies.


Introduction
Fruit flies of the family Tephritidae are represented by over 4400 species worldwide [1]. Some 200 species are considered pests, causing direct losses to a wide variety of fruit, vegetable and flower crops [2]. The larvae of about 35% of the species attack soft fruits, and about 40% of species develop in the flowers of Asteraceae [3].
In the Oriental Region, fruit flies of the genus Bactrocerapreviously referred to the genus Dacus [4,5] -are of great economic and agriculture importance because of damage caused to commercial fruits and vegetables. Some 22 species have been listed as of economic importance in Asia [6]. Among these species, Bactrocera caudata (Fabricius) had not been recorded in the Australasian and Oceanian regions [5].
Bactrocera caudata has a Paleartic and Oriental distribution. It occurs in India, Sri Lanka, Myanmar, Thailand, Vietnam, China, Malaysia, Brunei and Indonesia (Sumatra, Java, Flores) [2]. It is a member of the subgenus Zeugodacus and a pest of commercial and edible flowers. Zeugodacus is almost exclusively associated with the flowers and fruits of Cucurbitaceae [3]. Specimens of B. caudata had been reared from flowers of pumpkin Cucurbita moschata in Peninsular Malaysia [7]. To-date there are no additional reports on the host plants of this fruit fly.
Known also as Dacus caudatus Fabricius and Bactrocera maculipennis Doleschall, B. caudata is recognized from other Zeugodacus with three postsutural yellow vittae by possession of a transverse black band across the furrow of the face, two pairs of scutellar bristles, and the costal band slightly enlarged at the apex [2,7]. The males are attracted to cue-lure.
Compared to other members of the Zeugodacus group, little attention has been given to the study on the genetic variation in B. caudata. In a study of Peninsular Malaysian B. caudata involving 14 gene-enzyme systems with 17 loci, the proportion of polymorphic loci was P = 0.41 and the mean heterozygosity was H = 0.11 [8].
To-date the molecular and phylogenetic studies involving B. caudata used only a single individual and from a single locality, e.g. Ranong, Thailand [9], Brunei [10], and Chongqing region, China [11]. Genetic information on B. caudata from various geographical areas of its distribution range also appear to be lacking.
The present study examined the DNA sequences of COI and 16S rRNA genes in several populations of B. caudata from Peninsular Malaysia and Indonesia (Bali and Lombok). These two mitochondrial genes have been commonly used for the study of the phylogenetics of Bactrocera species [9][10][11][12][13][14][15]. Furthermore, mitochondrion DNA markers possess simple structure, uniform organization of the genome, lack of recombination, and with maternal inheritance and relatively rapid evolutionay rates [13,16,17]. The resulting COI and 16S sequences revealed the occurrence of distinct genetic lineages in this fruit fly. They are genetically distinct from closely related species of the subgenus Zeugodacus.

Ethics Statement
No specific permits were required for the described field studies. The Bactrocera fruit flies are collected in gardens and not from any national parks or protected areas. No specific permissions were required as the locations were in abandoned areas or in campus gardens. The Bactrocera species are agricultural pests and are not endangered or protected species.

Specimens
Adult male B. caudata were collected by means of the sex attractant cue-lure (4-[4-(acetyloxy) phenyl]-2-butanone) obtained from Sigma. A small amount of this lure was applied on the upper surface of a green leaf. Fruit flies attracted to the lure were collected with the aid of specimen tubes and plastic bags. The lure remained effective for many hours. A related species, B. tau was hatched from infested fruits of Momordica charantia (bitter gourd) collected at University of Malaya campus. As outgroups, Dacus (Callantra) longicornis (Dlon1) was collected by cue-lure in Perlis, and Dacus sp. (Dlon2) from Gombak, Peninsular Malaysia. The specimens were preserved in ethanol and stored in the freezer until use. Identification of the fruit flies was based on available literature [2,3,7]

DNA Extraction, Polymerase Chain Reaction, and Sequencing
The genomic DNAs were isolated from two legs or thorax of each adult fruit fly preserved in absolute ethanol using i-genomic CTB DNA Extraction Mini Kit (iNtRON Biotechnology, Inc, Korea).
PCR amplification of both molecular markers was carried out using MultiGene Gradient Thermal Cycler (Labnet, USA). The total volume for the PCR amplification was 50 mL consisting of 5.0 mL of 106 i-Taq TM plus buffer, 5.0 mL of dNTP mixture (2.5 mM each), 0.25 mM of each primer, 1.0 unit of i-Taq TM plus DNA polymerase, and 50 pg to 1.0 mg DNA. The parameters of PCR amplification were: 3 min at 95uC, followed by 30 cycles of Table 1. Percentage of uncorrected ''p'' distance matrix between Bactrocera caudata and related species based on 16S (above diagonal) and COI (below diagonal) DNA sequences. denaturation at 94uC for 1 min, annealing at 50uC for 1 min, extension at 72uC for 1 min, and a final extension at 72uC for 10 min. PCR products were assayed by electrophoresis on 1.0% agarose mini gel stained with SYBRHSafe DNA gel stain (Invitrogen, USA) and visualized under UV light. The target DNA fragments were isolated and purified by the LaboPass TM PCR purification kit (Cosmo Genetech, South Korea). The purified PCR products were sent to a commercial company for sequencing. Samples were sequenced using BigDyeH Terminator v3.1 Sequencing Kit and analyzed on ABI PRISMH 377 Genetic Analyzer. Cycle sequencing conditions were as follows: 25 cycles of 96uC for 10 sec, 50uC for 5 sec and 60uC for 4 min at rapid thermal ramp of 1uC/sec. Samples were purified by Ethanol/EDTA/Sodium Acetate precipitation. The control DNA sequence used was the pGEM-3Zf (+) control template with M13F (229) control primer.

Genetic Divergence
To assess the level of variation in the COI and 16S rDNA among the selected samples of different taxa, uncorrected (p) pairwise genetic distances were estimated using PAUP* 4.0b10 software [19].

Haplotype Network Reconstruction
The genetic diversity or haplotype networks of B. caudata were analysed using TCS 1.13 [20] to calculate the minimum number of mutational steps by which the sequences can be joined with .95% confidence. The minimum number of mutational steps required to connect the different groups of haplotypes obtained using the Templeton et al. [21] method was identified using the fix connection limit option, as implemented in TCS software. Three separate data sets were carried out for network estimations: (1) all the COI Bactrocera sequences obtained from this study and sequences from GenBank; (2) all the 16S rDNA Bactrocera sequences obtained from this study and GenBank; and (3) combined sequences of COI and 16S rDNA from this study.   T   372  399  417  438  450  481  483  489  499  546  549  588  594  600  606  615  619 621

Sequence Alignment And Phylogenetic Analysis
The COI and 16S rDNA sequences were preliminarily aligned using the CLUSTAL X program [22] and subsequently manually aligned. The sequences of the COI and 16S rDNA were also combined to further understand the systematic relationships among B. caudata and closely related species. Several researchers suggested the need to use the incongruence of length differential (ILD) test or partition homogeneity test to determine whether the sequences contain congruent phylogenetic information [23,24]. In this study, partition homogeneity tests [25][26][27] were performed in PAUP* 4.0b10 software [19] with 100 replicates, heuristic search using the tree-bisection-reconnection (TBR) branch swapping algorithm. Due to some recent criticism against the application of the ILD [28], additional analyses on each gene were conducted for topology comparison.
The aligned sequences were subjected to maximum-parsimony (MP) and neighbour-joining (NJ) analyses using PAUP* 4.0b10 [19]. The MP tree was constructed using the heuristic search option, 100 random sequences additions, tree bisection reconnection (TBR) branch swapping, and unordered and unweighted characters. Bootstrap percentage (BP) was computed with 1000 replications. NJ bootstrap values were estimated using 1000 replicates with Kimura's two-parameter model of substitution (K2P distance) evolution model.
Maximum likelihood (ML) analysis was performed by Treefinder version October 2008 [29]. Bayesian (BI) analysis was performed using MrBayes 3.1.2 [30]. The best fit nucleotide substitution model was determined using KAKUSAN v.3 [31], which also generates input files for ML and BI. Best fit models were evaluated using the corrected Akaike Information Criterion [32,33] for ML and the Bayesian Information Criterion (BIC) with significance determined by Chi-square analysis.
The best selected model for COI marker was general timereversible (GTR) model of DNA evolution with a gamma shape parameter (G); the best selected model for 16S rDNA was J1 model with a gamma shape parameter (G); while the best selected model for the combined sequences of COI and 16S rDNA was J2 model with a gamma shape parameter (G).
ML analyses were performed with 1000 bootstrap replicates. Two parallel runs were performed in MrBayes analysis using four chains of Markov chain Monte Carlo (MCMC). One million Markov chain Monte Carlo (MCMC) generations were run, with convergence diagnostics calculated every 1000th generation for monitoring the stabilization of log likelihood scores. Trees in each chain were sampled every 100th generation. A 50% majority rule consensus tree was generated from the sampled trees after discarding the first 20%.

Sequences Alignment and Statistics
The aligned partial sequences of COI consisted of 637 characters; 47 sites were variable and 148 sites were phylogenetically informative. MP analysis yielded one single most parsimonious tree of 447 steps, a consistency index of 0.6130 and retention index of 0.7496. The aligned partial sequences of 16S rDNA consisted of 441 sites; 30 sites were variable and 29 sites were phylogenetically informative. MP analysis produced four single most parsimonious trees of 89 steps, a consistency index of 0.8090 and retention index of 0.8712. The combined partial sequences of COI and 16S rDNA consisted of 1078 characters; 124 sites were variable and 101 sites were phylogenetically informative. MP analysis yielded one single most parsimonious tree of 309 steps, a consistency index of 0.8770 and retention index of 0.83199.
The PH test for our datasets showed that COI and 16S rDNA as well as the combined data set of COI and 16S rDNA shared the same phylogenetic information, where P = 0.01. Hence combined data sets were used for the phylogenetic analyses.   Genetic divergence The uncorrected 'p' distances between different species of Bactrocera based on COI, 16S rDNA and combined COI and 16s rDNA sequences are summarized in Tables 1 and 2.

Haplotype Network Reconstruction
The aligned sequences of COI for B. caudata consisted of 637 sites. The haplotype network reconstruction showed two divergent groups of haplotypes ( Figure 1, Table 3). A minimum of 36 mutational steps was required to link these groups. The haplotype C1 differed from haplotype C2 by 36 changes.
The aligned sequences of 16S rDNA for B. caudata consisted of 435 sites. The haplotype network reconstruction showed three divergent groups of haplotypes ( Figure 2, Table 4). A minimum of 12 mutational steps was required to link these groups. The haplotype R1 differed from haplotype R2 by one base pair at site 23 of the aligned sequences and differed from R3 by 12 basepairs. Haplotype R2 differed from R3 by 13 basepairs.
The aligned combined sequences of COI and 16S rDNA for B. caudata consisted of 1072 sites. The haplotype network reconstruction showed three divergent groups of haplotypes ( Figure 3, Table 5). A minimum of 48 mutational steps was required to link these groups. The haplotype CR1 differed from haplotype CR2 by one change and from haplotype CR3 by 48 changes. Haplotype CR2 differed from CR3 by 49 changes.

Phylogenetic Relationships
All phylogenetic analyses produced the same topology of the phylogenetic trees. They differed only in associations at poorly supported nodes. Only ML trees were presented for the three sets of sequences.
The second group could be divided into two main subgroups. The first subgroup consisted of B. scutellata, B. ishigakiensis, B. diaphora and B. ascita sp. A as the most basal species and they were supported with moderate to high bootstrap values of 79% for ML; 99% for BI; 77% for MP and 87% for NJ. The second subgroup consisted of B. caudata and B. ascita sp. B and supported with low to high bootstrap values (ML = 91%; BI = 100%; MP = 60%; NJ = 79%).The second subgroup was further divided into two main clades: (1) Clade 1 comprising of only B. ascita sp. B with no bootstrap support; and (2) Clade 2 comprising B. caudata with low to moderate bootstrap support values of 55% for ML and 85% for NJ. Clade 2 was sub-divided into two sub-clades: B. caudata from Malaysia, China and Thailand; and B. caudata from Bali and Lombok -Nusa Tenggara, Indonesia and were supported with full bootstrap supports for all analyses.  Table 5. Variation sites in DNA sequences of Bactrocera species for mitochondrial COI and 16S rDNA from various localities.

Combined COI and 16S rDNA
The ML tree for the combined COI and 16S rDNA sequences consisted of two main groups with B. tau as the basal group but

Discussion
Among the component species of the subgenus Zeugodacus of the genus Bactrocera of tephritid fruit flies, distinct genetic lineages (cryptic species) have been found in B. ascita [9] and B. tau [12] based on COI sequences. The present finding of distinct genetic lineages in B. caudata based on COI and 16S sequences increases the number in the list.
The type locality of B. caudata is Java, Indonesia. However it had not been recorded to be present in Bali and Lombok [34] as well as Sulawesi [35]. The present specimens of B. caudata from Bali and Lombok are genetically clearly different from B. caudata of Malaysia and other parts of mainland Asia. Based on COI sequences, the uncorrected 'p' distance between B. caudata of Malaysia-Thailand-China and B. caudata of Bali-Lombok (Indonesia) was 5.65% (Table 1). The genetic distance based on 16S sequences ranged from 2.76 to 2.99% (Table 1). For the combined COI and 16S dataset, the genetic distance ranged from 4.45 to 4.46% (Table 2). These values for COI and 16S as well as the combined dataset were clearly different from the intraspecific values ('p' = 0-0.23). Furthermore, they are comparable to the genetic distance between related species of the subgenus Zeugodacus, e.g.'p' = 1.15% for 16S between B. cucurbitae and B. tau, and 'p' = 0.94% for COI sequences between B. diaphora and B.  (Table 1) -these 'p' values are the lowest between two distinct species.
Based on the main morphological characters (black band across the face, three yellow postsutural vittae and the costal band slightly enlarged at the apex) the Bali and Lombok specimens concur with the description of B. caudata. There are no distinct differences in other gross morphological characters that have been used for taxonomic determination. As in the case of B. ascita sp. A and B. ascita sp. B as well as B. tau complex, a detailed study based on bigger samples and specimens from various localities in Indonesia as well as other parts of the distribution range is needed to delimit the occurrence of B. caudata and determine the extent of distinct genetic lineages (or cryptic species). In particular, attention should be given to the taxa found in Sumatra, Java and Flores [2,34].
It is noteworthy that B. caudata from Indonesia is found in two adjacent islands, Bali and Lombok. Biogeographically, Bali is part of the Sundaland while Lombok is in Wallacea. However the close proximity of the two islands could easily facilitate the spread of the fruit fly from one island to the other through particularly movement of infested host plants. Studies are needed to determine the distribution of this species east of Bali and west of Lombok. Earlier studies based on COI [12] and COI and 16S sequences [11] indicated distinct separation of B. caudata from the group consisting of B. cucurbitae and B. tau. The present analysis concurs with these findings. Indeed B. tau, B. cucurbitae, B. cucumis (a member of subgenus Austrodacus) and B. depressa form a distinct clade from the other species.
In the present study which included species (e.g. B. ascita, B. scutellata, B. ishigakiensis, B. depressa) not treated together in earlier studies [9,11,36,37], the phylogenetic analysis based on COI sequences indicated that B. ascita sp. B was the closest relative of B. caudata and was clearly separated from B. ascita sp. A which formed a clade with B. scutellata, B. ishigakiensis and B. diaphora (Figure 4). The analysis based on COI sequences ( Figure 4) indicated that B. ascita sp. B was a sister group to B. caudata while B. ascita sp. A grouped with B. scutellata, B. ishigakiensis and B. diaphora (Figure 4). In an earlier study [37], without the inclusion of B. ascita and B. caudata, B. scutellata and B. ishigakiensis formed a clade with Bactrocera sp. (Japan).
In summary, this study has demonstrated distinct genetic lineages (or cryptic/sibling species) in B. caudata. Whether there exist more distinct genetic lineages (or cryptic species) in different parts of its distribution range needs to be studied. This study also confirms the usefulness of COI and 16S markers for species differentiation and phylogenetic determination.