The Complete Mitochondrial Genome of the Beet Webworm, Spoladea recurvalis (Lepidoptera: Crambidae) and Its Phylogenetic Implications

The complete mitochondrial genome (mitogenome) of the beet webworm, Spoladea recurvalis, has been sequenced. The circular genome is 15,273 bp in size and encodes 13 protein-coding genes (PCGs), two rRNA genes, and 22 tRNA genes; it also contains a control region. Gene order and orientation are identical to those of other ditrysian lepidopteran mitogenomes. The nucleotide composition of the mitogenome shows a high A+T content of 80.9%, and the AT skewness is slightly negative (-0.023). All PCGs start with the typical ATN codons, except for COX1, which may start with the CGA codon. Nine of the 13 PCGs have the common stop codon TAA; however, COX1, COX2 and ND5 use a single T nucleotide and ND4 uses TA nucleotides as incomplete termination codons. All tRNA genes fold into the typical cloverleaf structure of mitochondrial tRNAs, except for the tRNASer(AGY) gene, in which the DHU arm fails to form a stable stem-loop structure. Intergenic spacers totalling 157 bp are scattered across 17 regions. Overlapping sequences total 42 bp and are found at eight locations. The 329 bp AT-rich region is composed of non-repetitive sequences, including the motif ATAG, which is followed by a 14 bp poly-T stretch; an (AT)11 microsatellite-like repeat, which is adjacent to the motif ATTTA; and a 9 bp poly-A stretch, which is immediately upstream of the tRNAMet gene. Phylogenetic analyses of the 13 PCGs and of the 13 PCGs+2 rRNAs, using Bayesian inference and Maximum likelihood methods, show that the phylogenetic position of Pyraloidea is inconsistent with the traditional classification. Hesperioidea is placed within the Papilionoidea rather than as a sister group to it. The Pyraloidea, rather than the Papilionoidea, is placed within the Macrolepidoptera together with other superfamilies.


Introduction
The animal mitochondrial genome is a double-stranded circular DNA molecule, 14 to 20 kb in size, which encodes a conserved set of 37 genes: 13 protein-coding genes (PCGs), two ribosomal RNA (rRNA) genes, and 22 transfer RNA (tRNA) genes [1,2]. It also contains a control region, known as the A+T-rich region in insects [3], which includes the initiation sites for transcription and replication of the mitogenome [1,4]. The length of the A+T-rich region varies widely because of indels and tandemly duplicated elements [5]. The mitogenome is characterized by its small size, maternal inheritance, lack of recombination, and rapid evolution [1,2,6]. Mitogenomes have been studied in a variety of fields, such as structural genomics [1,7], genetic resources [8], molecular evolution [9], population genetics [10], phylogeography [11], and inter-ordinal and intra-ordinal relationships [12-14].
Taxonomically, S. recurvalis is a member of the family Crambidae, superfamily Pyraloidea. However, the number of reported mitogenome sequences in this superfamily is very limited. For S. recurvalis, only a partial sequence of the COX1 gene has been reported [27,28]. In this study, we sequenced and described the complete mitogenome of S. recurvalis and compared its characteristics with those of other known lepidopteran mitogenomes. We then reconstructed phylogenetic relationships among nine lepidopteran superfamilies using Bayesian inference (BI) and Maximum likelihood (ML) methods.

DNA sample extraction
Adult individuals of S. recurvalis were collected in Chengdu, China. The samples were preserved in 95% ethanol and stored at -20°C until used for DNA extraction. Whole genomic DNA was isolated from a single specimen using a phenol-chloroform protocol [18,29]. The yield and quality of the DNA were assessed by electrophoresis in a 1.5% agarose gel stained with ethidium bromide.

PCR amplification, cloning, and sequencing
The whole mitogenome of S. recurvalis was amplified in nine overlapping fragments. All primer sequences are shown in Table 1. Primers F1F, F4F, F4R, and F6R were from Cameron and Whiting [7], primers F3R and F5F were from Simon et al. [30], primer F8F was from Bybee et al. [31], and primer F8R was from Skerratt et al. [32]. The other specific primers were designed based on the conserved nucleotide sequences of the mitogenome sequences in homologous lepidopteran species, or the mitogenome fragments that we have previously sequenced.
PCR amplification conditions were as follows: an initial denaturation for 2 min at 95°C, followed by 35 cycles of denaturation for 40 s at 92°C, annealing for 80 s at 53-57°C (depending on primer combinations), and elongation for 1-4 min (depending on the putative length of the fragments) at 62°C, with a final extension step of 72°C for 10 min. All PCR amplifications used Takara LA Taq (Takara Co., Dalian, China) and were performed on an Eppendorf Mastercycler and Mastercycler gradient. The PCR products were assessed by electrophoresis in a 1.5% agarose gel stained with ethidium bromide. All PCR products were sequenced directly in both directions except for fragment F1. Fragment F1 encompassed the A+T-rich region and some complex structures (e.g., a poly-T stretch and a microsatellite-like repeat), which caused direct sequencing to fail. We therefore amplified a shorter fragment with the primer pair AF1F and AF1R, under the same PCR conditions as for the long PCR amplifications. The PCR product of AF1 was purified with the E.Z.N.A. Gel Extraction Kit (Omega, USA) and ligated into the pMD19-T Vector (Takara Co., Dalian, China). Recombinant plasmids were isolated from the transformed E. coli DH5α competent cells and sequenced with the primers M13-F and M13-R. All fragments were sequenced using ABI BigDye ver. 3.1 dye terminator sequencing technology and run on ABI PRISM 3730xl capillary sequencers.

Sequence analysis and gene annotation
The whole mitogenome of S. recurvalis was assembled by aligning the overlapping sequences of neighboring fragments using CLUSTAL X [33]. The 13 PCGs, two rRNA genes, and the A+T-rich region were identified by comparison with homologous lepidopteran mitogenome sequences (e.g., Cnaphalocrocis medinalis, NC_015985 and Maruca vitrata, NC_024099). The nucleotide sequences of the 13 PCGs were translated into amino acid sequences on the basis of the invertebrate mitochondrial genetic code. The A+T content of nucleotide sequences and the relative synonymous codon usage (RSCU) were calculated using MEGA ver. 6.0 [34]. The AT skewness was calculated according to the formula: AT skew = [A-T] / [A+T] [35]. The secondary structures of rrnL and rrnS were drawn with XRNA 1.2.0 b (developed by B. Weiser and available at http://rna.ucsc.edu/rnacenter/xrna/xrna.html). The tRNA genes and their secondary structures were identified using tRNAscan-SE ver. 1.21 (http://selab.janelia.org/tRNAscan-SE/) [36]. The secondary structures of the two tRNA Ser genes, which could not be predicted with tRNAscan-SE, were analyzed by comparison with the nucleotide sequences of the tRNA genes in other Crambidae (e.g., Cnaphalocrocis medinalis, NC_015985 and Dichocrocis punctiferalis, JX448619). The secondary structures of the tRNA genes were drawn using DNA-SIS ver. 2.5 (Hitachi Engineering, Tokyo, Japan).
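The A+T content and AT skew statistics defined above can be illustrated with a minimal Python sketch. It is not the MEGA procedure used in this study; the function names and the toy sequence are hypothetical and serve only to show the arithmetic. The last line applies the skew formula to the A and T percentages reported for the S. recurvalis mitogenome (A 39.5%, T 41.4%).

```python
from collections import Counter


def base_counts(seq):
    """Count A, C, G and T in a DNA sequence (other symbols are ignored)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def at_content(counts):
    """Fraction of A+T among all counted bases."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (counts["A"] + counts["T"]) / total


def at_skew(a, t):
    """AT skew = (A - T) / (A + T); works on counts or percentages."""
    return (a - t) / (a + t)


# Hypothetical toy sequence, not taken from the mitogenome.
toy = "ATTTAAATTTGC"
toy_counts = base_counts(toy)
print(at_content(toy_counts), at_skew(toy_counts["A"], toy_counts["T"]))

# Skew from the reported composition of the whole mitogenome.
print(round(at_skew(39.5, 41.4), 3))  # -0.023
```

A negative value indicates more T than A on the strand analyzed, and a positive value indicates the reverse.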

Phylogenetic analysis
Phylogenetic analysis was performed on the 13 PCGs of the complete mitogenome of S. recurvalis and 54 other lepidopteran mitogenomes downloaded from GenBank (Table 2). The mitogenomes of Anopheles gambiae (NC_002084) [37] and Drosophila melanogaster (NC_001709) [38] were used as outgroups. Nucleotide sequences of the 13 PCGs from the 54 lepidopteran species, the two outgroup species, and S. recurvalis were translated into amino acid sequences, aligned with CLUSTAL X using default settings, and then back-translated into nucleotide alignments. Unaligned and unmatched regions were removed, and the remaining nucleotide alignments were concatenated. Nucleotide sequences of the two rRNA genes from the 57 species were aligned with CLUSTAL X using default settings, unaligned and unmatched regions were removed, and the resulting rRNA alignments were appended to the concatenated alignment of the 13 PCGs. The concatenated alignments of the 13 PCGs and of the 13 PCGs+2 rRNAs yielded nucleotide matrices of 10,719 bp and 12,072 bp, respectively, which were used for phylogenetic analysis with BI and ML methods. Substitution models were selected under the Akaike Information Criterion (AIC) [39] using Modeltest ver. 3.7 [40]. The TVM+I+G model was chosen as the best-fitting model for BI analysis of the 13 PCGs dataset, with GTR+I+G as the second-best model. The GTR+I+G model was chosen as the best-fitting model for BI analysis of the 13 PCGs+2 rRNAs dataset. The BI analysis was performed using MrBayes ver. 3.1 [41] under the following conditions: 10,000,000 generations, four independent chains (one cold chain and three hot chains), tree sampling every 100 generations, and a burn-in of 2500 trees. The confidence values of the BI tree were expressed as Bayesian posterior probabilities. Posterior probabilities greater than 0.9 were considered strongly supported [42].
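The back-translation step described above (align the amino acid sequences, then thread each coding sequence back onto its aligned protein) can be sketched as follows. This is only an illustration under stated assumptions: the function names and toy sequences are hypothetical, and the removal of every codon column containing a gap is a simple stand-in for the removal of "unaligned and unmatched regions", whose exact criteria are not specified in the text.

```python
def back_translate(aligned_protein, cds):
    """Thread an unaligned coding sequence onto its aligned protein.

    Each residue consumes one codon from `cds`; each gap '-' becomes '---'.
    """
    n_residues = sum(aa != "-" for aa in aligned_protein)
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is shorter than the protein")
    codons = iter(cds[i:i + 3] for i in range(0, 3 * n_residues, 3))
    return "".join("---" if aa == "-" else next(codons)
                   for aa in aligned_protein)


def drop_gapped_codon_columns(alignment):
    """Keep only codon columns in which no sequence has a gap."""
    n_codons = len(alignment[0]) // 3
    keep = [i for i in range(n_codons)
            if all("-" not in seq[3 * i:3 * i + 3] for seq in alignment)]
    return ["".join(seq[3 * i:3 * i + 3] for i in keep) for seq in alignment]


# Hypothetical toy data: two short proteins aligned with one gap.
codon_alignment = [
    back_translate("MI-F", "ATGATTTTT"),
    back_translate("MILF", "ATGATTTTATTT"),
]
print(codon_alignment)
print(drop_gapped_codon_columns(codon_alignment))
```

Aligning at the amino acid level and back-translating keeps gaps in multiples of three, so the reading frame and codon positions are preserved in the nucleotide matrix.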

Results and Discussion
Genome structure and organization
The mitogenome of S. recurvalis is a circular molecule of 15,273 bp, which is well within the range of other lepidopteran mitogenomes, whose lengths range from 15,122 bp in M. leda to 16,173 bp in T. renzhiensis (Table 2). The mitogenome of S. recurvalis contained the typical set of 37 mitochondrial genes (13 PCGs, 22 tRNA genes, and two rRNA genes) and a major non-coding region known as the A+T-rich region, as has been found in other lepidopteran mitogenomes. Twenty-three genes were coded on the majority strand (J-strand) and the rest were coded on the minority strand (N-strand) (Table 3 and Fig 1). This mitogenome was submitted to GenBank under the accession number KJ739310. The gene order and orientation of the S. recurvalis mitogenome were identical to those of other reported ditrysian lepidopteran mitogenomes, but differed from those of non-ditrysian groups with the ancestral arrangement tRNA Ile -tRNA Gln -tRNA Met, such as the species Thitarodes renzhiensis and Ahamus yunnanensis in Hepialoidea [46]. All the ditrysian lineages of lepidopteran mitogenomes are characterized by the gene order tRNA Met -tRNA Ile -tRNA Gln, revealing a translocation of tRNA Met to a position 5'-upstream of tRNA Ile, which differs from the hypothesized ancestral gene order of insects [47]. This suggests that the mitogenome arrangement of the lepidopteran insects may have evolved independently after splitting from a stem lineage of insects [48].
The nucleotide composition (A 39.5%, G 7.8%, T 41.4% and C 11.3%) of the S. recurvalis mitogenome indicated a high A+T content of 80.9%. This value is well within the range of lepidopteran mitogenomes, which vary from 77.0% in S. incertulas to 82.7% in C. raphaelis, and is the same as that of C. medinalis (80.9%). The A+T content was 79.3% in the 13 PCGs, 85.3% in the rrnL gene, 86.0% in the rrnS gene, and 93.9% in the A+T-rich region. These values were consistent with the high values found in other lepidopteran mitogenomes (Table 2). The AT skewness of the mitogenome was -0.023, indicating the occurrence of more T nucleotides than A nucleotides, as has been found in other lepidopteran mitogenomes, whose values range from -0.048 in E. kuehniella to 0.059 in B. mori (Table 2).
Table note: ※, incomplete mitogenomes that lack part of the rrnS gene, the entire A+T-rich region, and part of the tRNA Met gene. Termination codons were excluded from the 13 PCGs.

Protein-coding genes
The PCG regions of the S. recurvalis mitogenome were consistent with those of other lepidopteran mitogenomes. Nine of the 13 PCGs (ND2, COX1, COX2, ATP8, ATP6, COX3, ND3, ND6, and CYTB) were coded on the majority strand (J-strand), and the remaining four PCGs (ND5, ND4, ND4L, and ND1) were coded on the minority strand (N-strand) (Table 3 and Fig 1). All PCGs initiated with a canonical ATN start codon, with the exception of COX1, which may use CGA (arginine) as the start codon. Specifically, seven PCGs (COX2, ATP6, COX3, ND4, ND4L, CYTB, and ND1) started with ATG, four PCGs (ND2, ATP8, ND3 and ND5) started with ATT, and one PCG (ND6) started with ATC. As for stop codons, nine PCGs (ND2, ATP8, ATP6, COX3, ND3, ND4L, ND6, CYTB and ND1) terminated with the standard stop codon TAA, whereas COX1, COX2 and ND5 used a single T nucleotide, and ND4 used TA nucleotides, as an incomplete stop codon. These incomplete termination codons are thought to be completed to TAA by post-transcriptional modifications, such as polyadenylation, that occur during mRNA maturation [49,50]. Partial stop codons, which are observed in most lepidopteran species, minimize intergenic spacers and gene overlaps, and this may be one factor in the selection of a stop codon [51].
Fig 1. Map of the mitochondrial genome of Spoladea recurvalis. Genes coded on the J strand (clockwise orientation) are blue- or green-colored, while genes coded on the N strand (anti-clockwise orientation) are pink- or orange-colored. COX1, COX2 and COX3 refer to the cytochrome c oxidase subunits; CYTB refers to cytochrome B; ATP6 and ATP8 refer to ATP synthase subunits 6 and 8 genes; and ND1-ND6 and ND4L refer to the NADH dehydrogenase subunit 1-6 and 4L genes. tRNA genes are denoted by one-letter symbols according to the IUPAC-IUB single-letter amino acid codes: L1, L2, S1 and S2 refer to tRNA Leu(CUN), tRNA Leu(UUR), tRNA Ser(AGY), and tRNA Ser(UCN), respectively. CR refers to the A+T-rich region and is brown-colored.
The start codon of the COX1 gene in lepidopteran insects has been a source of controversy. In other insect groups, several non-canonical codons have been proposed as the COX1 start codon, such as TTA [52], TCG [53], TTG [54], and ACG [55]. In addition, some tetranucleotides, such as TTAG [48,56] and ATAA [57-60], and some hexanucleotides, such as ATTTAA [37,61,62], TATCTA [63], and TATTAG [20,64,65], located immediately upstream of the CGA, have also been proposed as start codons for the COX1 gene. However, a recent study based on transcript information from Anopheles funestus (Diptera) revealed that the translation initiation codon for the COX1 gene was TCG (serine), rather than the atypical and longer codons that have been proposed for several other insects [66]. Data from the transcript map, with expressed sequence tags (ESTs) used for the mitochondrial genome annotation of the legume pod borer Maruca vitrata (Lepidoptera: Crambidae), showed that the COX1 gene started with the CGA codon for arginine [67]. This start codon has previously been found to be well conserved in other lepidopteran species [68]; therefore, we tentatively designated CGA as the start codon of the COX1 gene.
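The start and stop codon assignments reported in this section can be tabulated with a short Python sketch. The dictionaries below simply restate the codons listed in the text; the helper names are hypothetical, and padding an incomplete stop signal with A residues is a simplified model of completion by polyadenylation, not a description of the actual RNA processing.

```python
START = {
    "COX1": "CGA",
    "COX2": "ATG", "ATP6": "ATG", "COX3": "ATG", "ND4": "ATG",
    "ND4L": "ATG", "CYTB": "ATG", "ND1": "ATG",
    "ND2": "ATT", "ATP8": "ATT", "ND3": "ATT", "ND5": "ATT",
    "ND6": "ATC",
}
STOP = {
    "ND2": "TAA", "ATP8": "TAA", "ATP6": "TAA", "COX3": "TAA", "ND3": "TAA",
    "ND4L": "TAA", "ND6": "TAA", "CYTB": "TAA", "ND1": "TAA",
    "COX1": "T", "COX2": "T", "ND5": "T",
    "ND4": "TA",
}


def is_atn(codon):
    """True for the canonical ATN start codons (ATA, ATC, ATG, ATT)."""
    codon = codon.upper()
    return len(codon) == 3 and codon[:2] == "AT" and codon[2] in "ACGT"


def complete_stop(signal):
    """Pad an incomplete stop signal (T or TA) with A residues to give TAA."""
    signal = signal.upper()
    if signal not in {"T", "TA", "TAA"}:
        raise ValueError(f"unexpected stop signal: {signal}")
    return signal + "A" * (3 - len(signal))


non_canonical = sorted(g for g, c in START.items() if not is_atn(c))
incomplete = sorted(g for g, s in STOP.items() if len(s) < 3)
print(non_canonical)  # genes whose start codon is not ATN
print(incomplete)     # genes with an incomplete stop signal
print({g: complete_stop(STOP[g]) for g in incomplete})
```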

Codon usage
The relative synonymous codon usage (RSCU) values of the S. recurvalis mitogenome are summarized in Table 4. Excluding all initiation and termination codons, the 13 PCGs were 11,118 bp in total length, encoding 3,706 amino acid residues. The codons CCG, UGG, CGC, CGG, AGC, GGC, and AGG were absent from these PCGs. The codons UUA (12.6%), AUU (12.0%), UUU (9.2%), AUA (7.0%), and AAU (6.7%) were the five most frequently used codons in the S. recurvalis mitogenome, together accounting for 47.5% of all codons. These codons are composed entirely of A and U nucleotides, indicating a strongly biased usage of A and T nucleotides in the S. recurvalis PCGs. Likewise, the most frequent amino acids in the S. recurvalis mitochondrial proteins were Leu2 (13.0%), Ile (12.6%), Phe (10.1%), Met (7.0%), and Asn (6.6%), together accounting for 49.3%. The least frequent amino acid was Cys (0.8%). Codon usage in the PCGs showed a significant bias toward high A+T content, which played a major role in the A+T bias of the entire mitogenome.
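For readers unfamiliar with RSCU, the statistic is the observed count of a codon divided by the mean count of all codons encoding the same amino acid. The following Python sketch computes it under the invertebrate mitochondrial genetic code (NCBI translation table 5). It is an illustration only, not the MEGA implementation used here: the function name is hypothetical, all codons of one amino acid are treated as a single synonymous family (so Leu and Ser are not split into two families each, an assumption), and only stop codons are skipped.

```python
from collections import Counter, defaultdict
from itertools import product

# Invertebrate mitochondrial code (NCBI table 5), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds_list):
    """Return {codon: RSCU} for a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if CODE.get(codon, "*") != "*":  # skip stops and ambiguous codons
                counts[codon] += 1
    families = defaultdict(list)
    for codon, aa in CODE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            values[c] = counts[c] * len(codons) / total if total else 0.0
    return values


# Hypothetical toy input: three Phe codons, two TTT and one TTC.
toy = rscu(["TTTTTTTTC"])
print(toy["TTT"], toy["TTC"])
```

An RSCU value above 1 means that a codon is used more often than expected under equal usage of its synonymous codons; a value below 1 means it is used less often.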

Transfer and ribosomal RNA genes
The mitogenome of S. recurvalis contained the typical set of 22 tRNA genes, as has been found in most lepidopteran mitogenomes. The tRNA genes were scattered throughout the circular molecule and ranged from 64 bp (tRNA Arg) to 71 bp (tRNA Lys) in size (Table 3). Fourteen tRNA genes were coded on the J-strand and the other eight on the N-strand, as in other lepidopteran mitogenomes (Table 3 and Fig 1). The putative secondary structures of the S. recurvalis tRNAs are shown in Fig 2. All tRNA genes folded into the typical cloverleaf secondary structure, except for the tRNA Ser(AGY) gene, in which the dihydrouridine (DHU) arm was simplified to a loop, as has been observed in several other metazoan species, including insects [2]. The anticodons of the tRNA genes were identical to those most commonly reported in insect mitogenomes. A total of 26 mismatched base pairs were found in 16 tRNA genes, including ten pairs in the amino acid acceptor stems, eight pairs in the DHU stems, seven pairs in the anticodon stems, and one pair in the pseudouridine (TΨC) stems. The 21 U-G mismatches may form weak bonds, whereas the other five U-U mismatches may not. Mismatched base pairs in tRNAs may be corrected via RNA-editing mechanisms that are well known in arthropod mitogenomes [69].
As seen in other insect mitogenomes, two rRNA genes (rrnL and rrnS) were present in the S. recurvalis mitogenome. The rrnL gene was located between tRNA Leu(CUN) and tRNA Val, and the rrnS gene was located between tRNA Val and the A+T-rich region. The lengths of the rrnL gene and rrnS gene were 1384 bp and 781 bp, respectively, which are well within the lengths reported for these genes in other lepidopteran mitogenomes. The A+T contents of the rrnL gene and rrnS gene of the S. recurvalis mitogenome were 85.3% and 86.0%, respectively. These values are also well within the range of other lepidopteran mitogenomes. Among the lepidopteran mitogenomes compared, the lengths of the rrnL genes varied from 1304 bp to 1474 bp and their A+T contents varied from 82.0% to 85.6%, whereas the lengths of the rrnS genes varied from 739 bp to 891 bp and their A+T contents varied from 81.1% to 86.3% (Table 2).
The secondary structures of both rrnL and rrnS broadly conformed to the secondary structure models proposed for these genes in other insects [7,17,70-72]. In the mitogenome of S. recurvalis, six domains with 49 helices were present in the rrnL secondary structure (Fig 3A and 3B), as in M. sexta [7], C. medinalis [17], C. suppressalis [17], A. emma [70], L. malifoliella [71] and P. xylostella [72]. A large internal loop of 31 bp was located among H991, H1057, and H1087, which is similar to that of C. medinalis, C. suppressalis, A. emma, L. malifoliella and P. xylostella but different from that of M. sexta. The H2347 region resembles that in the rrnL secondary structure of M. sexta, C. medinalis, C. suppressalis, A. emma, and L. malifoliella, but differs from that of P. xylostella, which contains a (TA)8 microsatellite-like repeat inserted into the loop region of H2347. This region is highly variable within Lepidoptera, and a consistent secondary structure for it has not been found within the available lepidopteran mitogenomes [7].
Fig 3. Predicted rrnL secondary structure in the S. recurvalis mitogenome. Tertiary interactions and base triples are connected by continuous lines. Base-pairing is indicated as follows: Watson-Crick pairs by lines, wobble GU pairs by dots and other non-canonical pairs by circles. A represents the 5' half of rrnL; B represents the 3' half of rrnL.
Three domains with 29 helices were present in the rrnS secondary structure of the S. recurvalis mitogenome (Fig 4). Compared with M. sexta and P. xylostella, a small loop was located in the H47 region of the rrnS gene. This region has been found to be variable within lepidopteran species [7] and, in combination with H39 and H367, has been used to infer phylogenetic relationships [73].
Helices H673, H1047, H1068, and H1074 in the rrnS secondary structure of the S. recurvalis mitogenome were similar in length and secondary structure to those of C. medinalis and C. suppressalis, but different from those of M. sexta, indicating that they are also variable regions in the secondary structure of the rrnS gene within lepidopteran species [7]. The secondary structures of the two rRNAs were predicted mainly on the basis of sequence comparison and mathematical methods. The region of rrnS containing H1047, H1068, and H1074 may yield several possible secondary structures, and it remains unclear which of these structures is actually adopted [7].

Non-coding and overlapping region
Intergenic non-coding sequences in the S. recurvalis mitogenome totalled 157 bp across 17 regions, ranging from 1 to 48 bp and including five major non-coding regions of more than 10 bp (Table 3). The longest intergenic spacer (Spacer 1, 48 bp) was located between tRNA Gln and ND2 and was extremely rich in A+T nucleotides (95.8%). This spacer is a common feature of the lepidopteran mitogenomes sequenced to date, but has not been found in non-lepidopteran insect species [7]. Spacer 2 (13 bp) was located between ND2 and tRNA Trp; among lepidopteran mitogenomes, it was smaller only than the 18 bp spacer of Hyphantria cunea at the corresponding location [74]. Spacer 3 (11 bp) was located between tRNA Asn and tRNA Ser(AGY), and Spacer 4 (21 bp) was located between ND5 and tRNA His. Spacer 5 (15 bp), which was located between tRNA Ser(UCN) and ND1, contained the motif "ATACTAA", which is a conserved feature across lepidopteran insects [7]. The motif has been proposed as a possible binding site for the mitochondrial transcription termination peptide (mtTERM protein) [4].
Overlapping sequences in the S. recurvalis mitogenome were found at eight locations, ranging from 1 to 16 bp with a total of 42 bp. The longest overlapping sequence (16 bp) was located between tRNA Leu(CUN) and rrnL, and the second longest (8 bp) was located between tRNA Trp and tRNA Cys. The third longest was located between ATP8 and ATP6, with a seven-nucleotide overlapping sequence (ATGATAA), which is a common feature of many other lepidopteran mitogenomes and has also been reported for many animal mtDNAs [1]. A similar-sized overlapping sequence at the same location has been reported in other lepidopteran mitogenomes [70]. The remaining overlapping sequences were all less than 3 bp.
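Intergenic spacers and overlaps of the kind summarized above follow directly from the annotated gene coordinates. The Python sketch below shows the calculation for consecutive genes with 1-based inclusive coordinates; the function names and the toy coordinates are hypothetical (they are not the coordinates in Table 3), and the junction that wraps around the origin of the circular molecule is not handled.

```python
def junction_gaps(genes):
    """Return (left_gene, right_gene, gap) for consecutive genes.

    `genes` is a list of (name, start, end) tuples sorted by start,
    with 1-based inclusive coordinates. A positive gap is an intergenic
    spacer, a negative gap is an overlap, and zero means adjacency.
    """
    gaps = []
    for (name1, _, end1), (name2, start2, _) in zip(genes, genes[1:]):
        gaps.append((name1, name2, start2 - end1 - 1))
    return gaps


def summarize(gaps):
    """Total the spacer and overlap lengths across all junctions."""
    spacers = [g for _, _, g in gaps if g > 0]
    overlaps = [-g for _, _, g in gaps if g < 0]
    return {
        "spacer_regions": len(spacers), "spacer_bp": sum(spacers),
        "overlap_regions": len(overlaps), "overlap_bp": sum(overlaps),
    }


# Hypothetical toy annotation used only to exercise the functions.
toy_genes = [
    ("geneA", 1, 68),
    ("geneB", 69, 132),    # directly adjacent to geneA
    ("geneC", 130, 195),   # overlaps geneB by 3 bp
    ("geneD", 244, 1257),  # 48 bp spacer after geneC
]
gaps = junction_gaps(toy_genes)
print(gaps)
print(summarize(gaps))
```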

A + T-rich region
The A+T-rich region of the S. recurvalis mitogenome was located between rrnS and tRNA Met. It was 329 bp long, with an A+T content of 93.9%, which is within the range of other lepidopteran mitogenomes (from 88.1% in C. vasava to 98.3% in P. atrilineata) (Table 2). The A+T-rich region has also been found to contain the origin sites for transcription and replication [4].
The A+T-rich region of the S. recurvalis mitogenome comprised non-repetitive sequences and had some features in common with other lepidopteran mitogenomes (Fig 5). In Bombyx, the O N (the origin of minority or light strand replication) has the motif ATAGA preceded by an 18 bp poly-T stretch and is located 21 bp upstream from the rrnS gene [75]. Although the length of the poly-T stretch varies between species, the motif ATAGA is conserved within Lepidoptera [7]. In S. recurvalis, the motif ATAG was similarly located 19 bp downstream from the rrnS gene and was followed by a 14 bp poly-T stretch. The poly-T stretch has been postulated to be a transcription control element and/or the initiation site of replication [76]. An (AT)11 microsatellite-like repeat, preceded by the motif ATTTA, was located at the 3' end of the A+T-rich region. A poly-A element was present upstream of tRNA Met, as has been found in most lepidopteran mitogenomes.
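Sequence features of this kind (poly-T and poly-A stretches, and (AT)n microsatellite-like repeats) can be located with simple regular expressions, as in the Python sketch below. The function name, the minimum run lengths, and the toy sequence are all assumptions made for illustration; the toy sequence only mimics the order of elements described above and is not the actual A+T-rich region of S. recurvalis.

```python
import re

# Minimum run lengths are arbitrary illustrative thresholds.
PATTERNS = {
    "poly-T": r"T{8,}",
    "(AT)n repeat": r"(?:AT){5,}",
    "poly-A": r"A{8,}",
}


def find_features(region):
    """Return (name, start, end, match) tuples with 1-based coordinates."""
    region = region.upper()
    hits = []
    for name, pattern in PATTERNS.items():
        for m in re.finditer(pattern, region):
            hits.append((name, m.start() + 1, m.end(), m.group()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence: ATAG, poly-T, ATTTA, (AT)11, poly-A.
toy_region = "ATAG" + "T" * 14 + "GCATTTA" + "AT" * 11 + "GC" + "A" * 9
for name, start, end, match in find_features(toy_region):
    print(name, start, end, len(match))
```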

Phylogenetic relationships
The estimated transition/transversion bias (R) at the first, second, and third codon positions of the 13 PCGs was 0.9, 0.6, and 3.7, respectively. The substitution rates were estimated under the Kimura 2-parameter model in MEGA ver. 6.0 [34]. Transitions and transversions at the first and second codon positions increased linearly with increasing phylogenetic distance, whereas those at the third codon position tended to reach a plateau (Fig 6).
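To make the quantities in this paragraph concrete, the Python sketch below counts transitions and transversions at each codon position for a pair of aligned sequences and computes the Kimura 2-parameter distance, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. This is a simple pairwise illustration with hypothetical function names and toy sequences; it is not the procedure MEGA uses to estimate R.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(x, y):
    """True if x and y differ but are both purines or both pyrimidines."""
    return x != y and ({x, y} <= PURINES or {x, y} <= PYRIMIDINES)


def ts_tv_by_codon_position(seq1, seq2):
    """Return {position: [transitions, transversions]} for positions 1-3."""
    counts = {1: [0, 0], 2: [0, 0], 3: [0, 0]}
    for i, (x, y) in enumerate(zip(seq1.upper(), seq2.upper())):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        counts[i % 3 + 1][0 if is_transition(x, y) else 1] += 1
    return counts


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    pairs = [(x, y) for x, y in zip(seq1.upper(), seq2.upper())
             if x in "ACGT" and y in "ACGT"]
    n = len(pairs)
    ts = sum(is_transition(x, y) for x, y in pairs)
    tv = sum(x != y for x, y in pairs) - ts
    p, q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy alignment of two codons.
a, b = "ATGCTA", "ATACTC"
print(ts_tv_by_codon_position(a, b))
print(k2p_distance(a, b))
```

When substitutions at a site class are saturated, observed differences stop increasing with divergence, which is the plateau pattern described for third codon positions.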
In our study, 54 lepidopteran mitogenomes were downloaded from GenBank to reconstruct phylogenetic relationships. The phylogenetic trees were inferred from the concatenated nucleotide sequences of the 13 PCGs and of the 13 PCGs+2 rRNAs using BI and ML methods. The four tree topologies were almost identical to each other and indicated that S. recurvalis grouped with the other species within the Pyraloidea with strong posterior probabilities and bootstrap support. Only one node was weakly supported in the phylogenetic tree inferred from the 13 PCGs dataset using the ML method, and two nodes were weakly supported in each of the phylogenetic trees inferred from the 13 PCGs+2 rRNAs dataset using the BI and ML methods (Fig 7B, 7C, 7D and 7E). In our analysis, the subfamily relationships within the Pyraloidea are consistent with previous studies based on molecular and morphological characteristics [42,77,78].
Fig 7. (A) ... Kristensen and Skalski (1999). (B) Phylogenetic tree inferred from nucleotide sequences of 13 PCGs using BI method (the numbers abutting branches refer to posterior probabilities). (C) Phylogenetic tree inferred from nucleotide sequences of 13 PCGs using ML method (the numbers abutting branches refer to bootstrap percentages). (D) Phylogenetic tree inferred from nucleotide sequences of 13 PCGs+2 rRNAs using BI method (the numbers abutting branches refer to posterior probabilities). (E) Phylogenetic tree inferred from nucleotide sequences of 13 PCGs+2 rRNAs using ML method (the numbers abutting branches refer to bootstrap percentages). Drosophila melanogaster (NC_001709) and Anopheles gambiae (NC_002084) were used as outgroups. doi:10.1371/journal.pone.0129355.g007
The 54 species represent nine Lepidoptera superfamilies: Tortricoidea, Bombycoidea, Noctuoidea, Pyraloidea, Geometroidea, Hesperioidea, Papilionoidea, Yponomeutoidea and Hepialoidea (a non-ditrysian superfamily). All of these superfamilies were recovered as monophyletic, with the exception of the Papilionoidea. According to the recent consensus view of lepidopteran relationships by Kristensen and Skalski [16] (Fig 7A), the Bombycoidea, Geometroidea, Noctuoidea, and Papilionoidea belong to the Macrolepidoptera; the Pyraloidea together with the Macrolepidoptera form the Obtectomera; and the Tortricoidea together with the Obtectomera form the Apoditrysia. However, in our analysis, the superfamily Pyraloidea, rather than the Papilionoidea, was placed within the Macrolepidoptera. The phylogenetic relationships reconstructed in our study showed that Papilionoidea was sister to the clade (Pyraloidea+(Noctuoidea+(Bombycoidea+Geometroidea))), which is congruent with previous studies [14,51,70-72,74,79-82], but differs from the morphological analysis of Kristensen and Skalski [16].
The phylogenetic relationship between Papilionoidea and Hesperioidea has been a subject of controversy for a long time [80]. Traditionally, Papilionoidea and Hesperioidea are considered two different superfamilies. Hesperiidae (skippers) are usually placed in their own superfamily Hesperioidea, while all other butterflies (Papilionidae, Pieridae, Lycaenidae and Nymphalidae) are placed in Papilionoidea [83]. Papilionoidea and Hesperioidea were proposed to be sister groups based on a total-evidence analysis of both traditional morphological characters and new molecular characters from three gene regions (COXI, EF-1α and wingless) [83]. However, our results indicated that the superfamily Hesperioidea was placed within Papilionoidea and was closely related to (Pieridae+(Lycaenidae+Nymphalidae)), which is in accordance with previous studies based on morphological characters, nuclear genes, and mitogenome sequences [14,42,77,84].
Pyraloidea is traditionally not considered part of the Macrolepidoptera, yet it grouped with the Macrolepidoptera, whereas the Papilionoidea was more distantly related to the Macrolepidoptera [77]. According to the yet-to-be-tested hypothesis of Regier et al. [77], the position of the thoracic or abdominal ultrasound-detecting "ears", which have never previously been theorized to have a common origin, is a candidate synapomorphy supporting Pyraloidea as a member of the Macrolepidoptera. The phylogenetic analysis by Regier et al. [77], based on five protein-coding nuclear genes (6.7 kb in total) from 123 species representing 55 families and 27 superfamilies of Ditrysia, yielded tree topologies very similar to ours, but our tree topologies were even more strongly supported. The phylogenetic analysis in our study suggests that complete mitogenome sequences are informative molecular markers for deep-level phylogenetic studies, both for testing morphology-based relationships and for reconstructing phylogenetic relationships.