The Mitochondrial Genome of Elodia flavipalpis Aldrich (Diptera: Tachinidae) and the Evolutionary Timescale of Tachinid Flies

Tachinid flies are natural enemies of many lepidopteran and coleopteran pests of forests, crops, and fruit trees. In order to address the lack of genetic data in this economically important group, we sequenced the complete mitochondrial genome of the Palaearctic tachinid fly Elodia flavipalpis Aldrich, 1933. Usually found in Northern China and Japan, this species is one of the primary natural enemies of the leaf-roller moths (Tortricidae), which are major pests of various fruit trees. The 14,932-bp mitochondrial genome was typical of Diptera, with 13 protein-coding genes, 22 tRNA genes, and 2 rRNA genes. However, its control region is only 105 bp in length, which is the shortest found so far in flies. In order to estimate dipteran evolutionary relationships, we conducted a phylogenetic analysis of 58 mitochondrial genomes from 23 families. Maximum-likelihood and Bayesian methods supported the monophyly of both Tachinidae and superfamily Oestroidea. Within the subsection Calyptratae, Muscidae was inferred as the sister group to Oestroidea. Within Oestroidea, Calliphoridae and Sarcophagidae formed a sister clade to Oestridae and Tachinidae. Using a Bayesian relaxed clock calibrated with fossil data, we estimated that Tachinidae originated in the middle Eocene.

Tachinid flies have a worldwide distribution and comprise nearly 10,000 described species [29]. Despite Tachinidae being the second-largest dipteran family, the mitogenomes of only two species have been sequenced completely: Exorista sorbillans (Exoristinae, Exoristini) and Rutilia goerlingiana (Dexiinae; Rutiliini) [26,27,28]. They are natural enemies of many lepidopteran and coleopteran pests of forests, agricultural crops, and fruit trees, and thus are of economic importance. The Palaearctic tachinid fly, Elodia flavipalpis Aldrich, 1933 (Exoristinae, Goniini), is usually found in Northern China and Japan and is in the same subfamily, Exoristinae, as Ex. sorbillans [30,31]. It is one of the primary natural enemies of the leaf-roller moths (Tortricidae), which are major pests of various fruit trees [32,33]. The monophyly of Tachinidae is broadly supported by phylogenetic studies, but questions remain about its place in the superfamily Oestroidea, particularly the relationship between Tachinidae and several large families of Oestroidea [27,28,[34][35][36]. There have been various studies of the divergence times of different groups of flies [7,11,12,37], with a recent study placing the rapid radiation of Schizophora 65 mya in the Paleocene [28]. However, owing to the uncertain taxonomic position of Tachinidae in Oestroidea, the evolutionary timescale of tachinid flies has not been well studied.
Here we describe the complete mitochondrial genome of El. flavipalpis. The mitogenome contributes to discussion on the evolutionary relationships and taxonomic positions of the Tachinidae, puts forward a bold hypothesis regarding the timescale in which the family originated, and will aid further molecular research of related taxa through the findings on primer selection for atypical regions and the optimization of PCR experiments.

Materials and Methods
Specimen Collection, DNA Extraction, and DNA Amplification Adult individuals of Elodia flavipalpis Aldrich, 1933 were collected directly from the pupae of their host species, the leafroller moth Spilonota lechriaspis Meyrick (Tortricidae). Moth pupae were collected at an organic apple orchard in Beijing, China, and hatched in the laboratory. The specimens were preserved in 99.5% ethanol and stored at 220uC for preservation of nucleic acids. DNA extraction from a single specimen was performed using the DNeasy Tissue kit (QIAGEN) following the manufacturer's instructions.
The fragments were first amplified with the universal PCR primers from Simon et al. [38] and some dipteran-specific primers from Han [39] and Weigl et al. [24] (Table 1). Primer pairs for amplification of the mitochondrial control region were modified according to Lessinger et al. [40] and Oliveira et al. [41]. Speciesspecific primers were designed using Primer Premier 6.0 software [42], based on the initial fragments aligned with sequences from three closely related species, Ex. sorbillans, Calliphoridae spp., and Oestridae spp. (Table 2). PCR products covering the remaining regions of the mitogenome were amplified using universal and species-specific primers ( Table 1). The entire genome of El. flavipalpis was amplified in 21 fragments. All of the primers were synthesized by Shanghai Sangon Bio-technology Co., Ltd (Beijing, China).
In order to reduce time required for sequencing and walking, we used both standard and nested PCR techniques. The PCR conditions for all of the fragments are shown in Table 1. The control region was amplified using a nested PCR approach. The ''external'' primers SR-J-14646 and N2-N-757 were used for the first step of the PCR (long target), followed by nested amplification (specific target) with the ''internal'' primers SR-J-14646 and N2-N-309. All of the fragments were amplified using TaKaRa LA Taq (Takara Co., Dalian, China), and performed on an Eppendorf Mastercycler gradient in 50 ml reaction volumes. The reaction volume consisted of 25.5 ml of sterilized distilled water, 5 ml of 106 LA PCR Buffer II (Takara), 5 ml of 25 mM MgCl2, 8 ml of dNTPs Mixture, 1.5 ml of each primer (10 mM), 3 ml of DNA template, and 0.5 ml (1.25 U) of TaKaRa LA Taq polymerase (Takara).

Sequence Assembly and Annotation
The PCR products were detected via electrophoresis in 1% agarose gel, purified using the 3S Spin PCR Product Purification Kit, and sequenced directly with the ABI-377 automatic DNA sequencer. All amplified products were sequenced directly using upstream and downstream primers from both directions. Sequencing was performed using ABI BigDye ver 3.1 dye-terminator sequencing technology and run on ABI PRISM 3730 61 capillary sequencers. Raw sequences were manually checked and assembled using the software BioEdit version 7.0.9.0 [43] and SeqMan (DNAStar, Steve ShearDown, 1998-2001 version reserved by DNASTAR Inc., Madison, Wisconsin, USA).
Protein-coding and ribosomal RNA genes and gene boundaries were identified by BLAST search (http://www.ncbi.nlm.nih.gov/ BLAST) and by comparison with sequences from other brachyceran species. The genomic positions and secondary structures of 20 of the 22 transfer RNAs (tRNAs) were identified by tRNAscan-SE software v.1.21 [44], with the remaining two (tRNA Arg and tRNA Ser(AGN) ) identified by visual inspection of genome sequence followed by inference using RNAstructure Ver.5.3 [45] (Table 3). Nucleotide composition was calculated using MEGA 4.1 [46]. Sequence data have been deposited in the NCBI database [GenBank: NC_018118].

Phylogenetic Analysis
We conducted a phylogenetic analysis using all of the mitogenomes of Diptera that were available as of August 2012 (58 species from 33 genera and 23 families), along with a lepidopteran outgroup species (Spilonota lechriaspis) ( Table 2). To assess the impact of saturation at third codon positions, we constructed two mitochondrial sequence alignments. The first dataset consisted of all 13 protein-coding genes and the 2 ribosomal RNA genes. The second dataset was the same, except for the exclusion of the third codon sites from the protein-coding genes. Sequences of protein-coding genes were translated into amino acid sequences, aligned using ClustalX v2.0.9 [47], then back-translated. Ribosomal RNA genes were also aligned using ClustalX, and refined by eye to account for their secondary structures. The GTR+I+G model was selected by the Akaike information criterion for both datasets in MODELTEST v.3.7 [48]. Maximum likelihood (ML) and Bayesian methods were used for phylogenetic analysis. For the ML analyses, we used RAxML v7.0.4 [49] with 1000 bootstrap replicates. The Bayesian phylogenetic analysis was performed using MrBayes v3.1.2. [50], with posterior distributions estimated using Markov chain Monte Carlo (MCMC) sampling. We conducted two independent runs, each with one cold and three heated chains. Samples were drawn every 100 MCMC steps over a total of 2 million steps. The first 25% of steps were discarded as burn-in.
Divergence times were estimated using the Bayesian phylogenetic method implemented in BEAST v.1.6.2 [51]. Rate variation among lineages was modelled using an uncorrelated lognormal relaxed clock [52], with the tree prior generated using a Yule speciation model. Based on previous research on divergence times in Diptera [7,11,12,28,37], we implemented fossil-based minimum ages for three clades and added one maximum age constraint. The origin of Diptera was assumed to be between 270 and 230 million years ago (mya) [28,53]. We also specified minimum age constraints of 195 mya for Brachycera and 64 mya for Schizophora [53][54][55]. Posterior estimates of parameters were obtained by MCMC sampling. Samples were drawn every 1000 steps over a total of 10 million steps. Tracer v1.5 [56] was used to confirm that the effective sample size of each parameter exceeded 100. The maximum-clade-credibility tree was calculated using TreeAnnotator v1.6.2, with the node times scaled to match mean posterior estimates.

Genome Organization
The 14932 bp mitogenome of El. flavipalpis contained all 37 genes usually present in bilaterians: 13 protein-coding genes, 22 tRNA genes, and 2 rRNA genes ( Figure 1, Table 3). The gene order corresponds to the typical pattern of brachyceran flies, such as Drosophila spp. [1,22,57]. Gene order is typically conserved in brachyceran flies, except for a few species in which additional tRNAs have been detected in the control region (Liriomyza trifolii, Chrysomya spp., and Cydistomyia duplonotata), and Dermatobia hominis, in which an additional tRNA Val is found on the minor strand. In contrast, there is an elevated number of tRNA rearrangements in nematoceran flies [13,14,20,21].
In total, there were 37 overlapping nucleotides between neighboring genes at 10 locations, with the length of overlapping sequence ranging from 1 to 7 bp. Excluding the control region, there were 90 intergenic nucleotides at 12 locations, in stretches ranging from 1 to 18 bp (Table 3).
As with other insects, the nucleotide composition of the El. flavipalpis mitogenome was biased towards A and T. The overall  Table S1). A detailed comparison of nucleotide composition, AT-skew, and GC-skew between the two closely related species, El. flavipalpis, Ex. sorbillans and R. goerlingiana, is given in Table 4.

Protein-coding Genes and Nucleotide Composition
Thirteen protein-coding genes were identified in the mitogenome of El. flavipalpis, with characteristics similar to those of other dipteran species (Table S1). The average A+T content across all protein-coding genes was 79.1%, similar to that of Ex. sorbillans (77.7%) and R. goerlingiana (76.2%). Table 4 shows the AT-skews and CG-skews of the three codon positions for El. flavipalpis, Ex. sorbillans and R. goerlingiana. In all three of these species, the A+T  (Table S1). AT-skew and GC-skew were used to analyse the biases in nucleotide composition. The A content was slightly lower than the T content at all three codon positions, but almost equal over the whole genome. Except for CO1 and ND1, all of the protein-coding genes have one of the common start codons for mitochondrial DNA, ATG, ATA, or ATT ( Table 3). The start codon TCG (Serine) in CO1 is also found in Ex. sorbillans, R. goerlingiana, and other Oestroidea species (Table 2). CO1 commonly uses nonstandard start codons in     many other flies [5,16,20,58,59]. Within Diptera, the start codon TTG (Leucine) in ND1 of El. flavipalpis is same as that in Dermatobia hominis (Oestroidea). Of the 13 protein-coding genes, CO2, ND4, and ND5 have incomplete stop codons and terminate with only a single thymine. Similar structural features have also been described for the mitogenomes of other dipteran taxa [4,5]. In addition, CYTB terminates with a complete stop codon TAG, whereas TAA or only a single thymine is utilized in some other fly species.

Other Regions of the Mitogenome
The mitogenome of El. flavipalpis bears all of the 22 standard tRNAs found in metazoan mitogenomes (Figure 1, Table 3). The total length of the tRNA genes is 1463 bp, with individual genes ranging from 62 to 71 bp and with A+T contents from 71% (tRNA Met ) to 92% (tRNA Glu ). With the exception of tRNA Ser (AGN) , all tRNAs possess the typical clover-leaf secondary structure ( Figure 2). The DHU-arm of tRNA Ser (AGN) is entirely absent, as observed in other insects [20,24,33,60]. In the secondary structures, the lengths of the amino acid acceptor arms (7 bp), anticodon arms (5 bp), and loops (7 bp) are relatively conserved, while the TyC loop (3-10 bp) is more variable. There are 21 mismatched base pairs in the tRNA genes, 14 of which are weak G-U matches (nine sites in DHU arms, three sites in amino acid acceptor arms, and two sites in anticodon arms). The other seven include U-U (5 bp), C-U (1 bp), and A-A (1 bp).
The lengths of the lrRNA and srRNA genes are 1337 bp and 783 bp, respectively. Both are encoded on the minor strand and the ends of those genes were assumed to be at the boundaries of the flanking genes [61]. As in other dipteran species, the lrRNA gene is flanked by tRNA Leu (CUN) and tRNA Val , while the srRNA gene is between tRNA Val and the control region. Their A+T contents were 84.6% for lrRNA and 81.7% for srRNA, which are within the range of other dipteran species (Table S1).
The length of the control region of El. flavipalpis is identical to that of Ex. sorbillans (105 bp), the shortest among the sequenced dipteran mitogenomes. It has an A+T content of 92.38%, which is lower than that of Ex. sorbillans (98.1%) and R. goerlingiana (92.6%) but higher than those of most other dipteran species. Owing to its short length, there is no distinct duplicate fragment found in this region. It should be noted that all three of the sequenced tachinid mitogenomes bear a control region that is shorter than those of most known in flies [62,63].

Phylogenetic Analysis
The phylogenies estimated using likelihood and Bayesian approaches were similar for both of the datasets that were analysed (Figures 3, 4, 5 and 6). Various higher-level relationships were consistent across the analyses. The monophyly of Brachycera, Cyclorrhapha, and Calyptratae were consistently supported (posterior probability = 1.00, ML bootstrap = 100), as was the monophyly of Schizophora (posterior probability = 1.00, ML bootstrap = 55, 66). The monophyly of superfamily Oestroidea has been widely accepted and has traditionally received good support from morphological characters [64][65][66][67]. Here we add support from the Bayesian analysis of mitochondrial DNA sequences that Oestroidea is a sister group of Muscidae (posterior probability = 1.00). However, our ML analyses of both datasets placed the family Muscidae within Oestroidea, although support was not high (bootstrap = 70, 60).
In Oestroidea, all four families are monophyletic, and Calliphoridae+Sarcophagidae was inferred as a sister group to Oestridae+Tachinidae. This result is similar to that obtained in analyses of morphology [34] and of 18S and 16S ribosomal DNAs [35]. The tree topology is broadly similar to that inferred from whole mitogenomes by Nelson et al. [27], except that Oestridae and a subfamily of Calliphoridae (Polleninae) are nested within Tachinidae. However, Nelson et al. [27] focused on the relationships within Calliphoridae, while species from other families of Oestroidea were only used as an outgroup in the analysis.
Our phylogenetic estimate differs from that obtained by Kutty et al. [36] in their analysis of four nuclear and four mitochondrial genes. They inferred the relationships (((Tachinidae, Oestridae), Calliphoridae), Sarcophagidae), with both Calliphoridae and Tachinidae paraphyletic (Oestridae is nested within Tachinidae, as are some calliphorid subfamilies). In contrast, Wiegmann et al. [28], considered Tachinidae to be more closely related to Calliphoridae, and as a sister taxon to Oestridae and Sarcophagidae. Owing to the morphological similarities shared by these four families, it is difficult to distinguish among these phylogenetic hypotheses using morphological data. The disparities among the molecular estimates are probably due to differences in the taxa sampled and the data being analysed. Kutty et al. [36] and Wiegmann et al. [28] used both nuclear and mitochondrial DNA sequences, but most were only partial sequences. Moreover, most of the Oestroidea species analysed by Wiegmann et al. [28] were represented by only two or three genes. Given that our analysis involved a larger and more complete data set, we believe that our results are more strongly supported.
The primary difference among the four phylogenetic trees here is in the placement of the superfamily Opomyzoidae, which consists of the families Furgusoninidae and Agromyzidae. It is closer to Drosophoridae in the Bayesian analysis (posterior probability = 1.00), but its position is unstable in the ML analysis (bootstrap = 55, 27). A similar result was obtained by Wiegmann et al. [28]. Another difference is seen in the placement of Bibionomorpha (Nematocera), which is the closest group to Brachycera in both of the trees inferred without third codon positions, but is unstable in the tree inferred from all three gene positions. It is even non-monophyletic in the Bayesian analysis, hinting at the possible negative phylogenetic effects of including third codon sites. The placement of Cecidomyiidae is problematic, which is also indicated by the long branch leading to this group. The two Cecidomyiidae species have undergone substantial reduction in mitogenome size, which results in their apparent distinctiveness and causes problems for sequence alignment. With additional sampling in Cecidomyiidae, future studies will be better equipped to reconstruct the molecular evolution of these mitochondrial genomes.

Estimates of Divergence Times
We used a Bayesian relaxed clock to estimate the evolutionary timescale of Brachycera (Figure 7). Our analysis suggests that the last common ancestor of extant Brachycera existed in the early Jurassic (,199 mya) (95% credibility interval: 195.9-206.7 mya). The schizophoran radiation took place during the late Cretaceous Figure 5. Bayesian tree of Diptera, inferred from a mitochondrial data set comprising 13 protein-coding genes (without third codon sites) and 2 ribosomal RNA genes. The tree was rooted using the outgroup taxon Spilonota lechriaspis (Lepidoptera). Numbers denote posterior probabilities of nodes. The lengths of very long branches have been reduced to aid viewing. The symbol '//' indicates a contracted branch, with the value above giving the length of contraction. Red lines indicate the differences among the four phylogenetic trees. doi:10.1371/journal.pone.0061814.g005 Figure 6. Maximum-likelihood tree of Diptera, inferred from a mitochondrial data set comprising 13 protein-coding genes (without third codon sites) and 2 ribosomal RNA genes. The tree was rooted using the outgroup taxon Spilonota lechriaspis (Lepidoptera). Numbers denote bootstrap values in percentages. Red lines indicate the differences among the four phylogenetic trees. doi:10.1371/journal.pone.0061814.g006 The Mitochondrial Genome and Timescale of Tachinid (,84 mya), and the clade Calyptratae containing Oestroidea and Muscidae (nested in Acalyptrata) split in the early Eocene (,60 mya) (95% credibility interval: 45.6-91.1 mya). These results are consistent with evidence from a number of amber specimens from Acalyptrata and Calyptratae [68]. Subsequently, Oestroidea divided into two groups in the middle Palaeocene (,56 mya) (95% credibility interval (CI): 42.7-85.0 mya). The split of Tachinidae with Oestridae is estimated to have occurred in the early Eocene (,48 mya) (95% CI: 37.8-77.9 mya), with this clade becoming the most speciose in Brachycera. Finally, the two tachinid flies El. flavipalpis and Ex. sorbillans are estimated to have separated in the early Oligocene (,33 mya) (95% CI: 15.5-60.1 mya).
Our estimates suggest that the most recent common ancestor of tachinid flies existed between 33 and 48 mya. The two tachinid samples used in this study are from the same subfamily Exoristinae, but are in separate tribes Goniini (El. flavipalpis) and Exoristini (Ex. sorbillans). Among the four subfamilies of Tachinidae, Exoristinae was probably the latest to emerge [69][70][71]. Therefore, the time of origin of tachinid flies should be in the earlier half of the time interval described above. We speculate that the most recent common ancestor of tachinid flies existed in the middle Eocene (,42-48 mya). Tachinid flies have a worldwide distribution, but are mainly found in Palaearctic and Nearctic regions. For a globally distributed family with significant differences in distribution between northern and southern continents, the relevant split is between the supercontinents Laurasia and Gondwana in the mid-Mesozoic. Our estimates for the origin of Tachinidae is much more recent than this. However, given that most tachinid species are distributed throughout the Holarctic Region, we suggest that the evolutionary history of tachinid flies is probably tied to the split of Laurasia in the Eocene rather than that between Laurasia and Gondwana.
We have shown that the availability of additional mitogenomes can make a valuable contribution to our understanding of the phylogeny and divergence times of Diptera. Our study serves as a useful primer for the evolution of tachinid flies, but the accuracy of divergence-time estimates can be improved by denser sampling of tachinid mitogenomes, a greater number of fossil calibrations, and combination with nuclear genes or morphological data. In addition, we suggest that the stable gene order among brachyceran species might be due to their comparatively short evolutionary timeframe.

Supporting Information
Table S1 Nucleotide composition of all available mitogenome sequences from Diptera. (DOC)