Mitochondrial Genome Sequence and Expression Profiling for the Legume Pod Borer Maruca vitrata (Lepidoptera: Crambidae)

We report the assembly of a 14,054 bp near-complete sequence of the mitochondrial genome of the legume pod borer (LPB), Maruca vitrata (Lepidoptera: Crambidae), which we subsequently used to estimate divergence and relationships within the lepidopteran lineage. The arrangement and orientation of the 13 protein-coding, 2 rRNA, and 19 tRNA genes sequenced were typical of insect mitochondrial DNA sequences described to date. The sequence had a high A+T content of 80.1% and a bias toward codons with A or T nucleotides in the third position. Transcript mapping with midgut and salivary gland ESTs for mitochondrial genome annotation showed that translation of protein-coding genes initiates and terminates at standard mitochondrial codons, except for the coxI gene, which may start from an arginine CGA codon. The genomic copy of coxII terminates at a T nucleotide, and a proposed polyadenylation mechanism for completion of the TAA stop codon was confirmed by comparisons to EST data. EST contig data further showed that mature M. vitrata mitochondrial transcripts are monocistronic, except for bicistronic transcripts for the overlapping genes nd4/nd4L and nd6/cytb, and a tricistronic transcript for atp8/atp6/coxIII. This processing of polycistronic mitochondrial transcripts adheres to the tRNA punctuated cleavage mechanism, whereby mature transcripts are cleaved only at intervening tRNA gene sequences. In contrast, the tricistronic atp8/atp6/coxIII in Drosophila is present as separate atp8/atp6 and coxIII transcripts despite the lack of an intervening tRNA. Our results indicate that mitochondrial processing mechanisms vary between arthropod species, and that it is crucial to use transcriptional information to obtain full annotation of mitochondrial genomes.


Introduction
The mitochondrial genome encodes genes involved in oxidative phosphorylation and a unique translation system of 2 rRNAs and 22 tRNAs used for synthesis of the 13 inclusive protein coding genes (PCGs). Within animal species the mitochondrial genome shows a lack of introns and little intergenic space, the exception being the AT nucleotide rich displacement loop (D-loop), which encodes the origin of replication and promoters for the transcription of both the heavy (H) and the light (L) strands (heavy strand promoter, HSP; light strand promoter, LSP) [1]. Thus, both H and L strands of the mitochondrial genome are transcriptionally active, and produce polycistronic RNA transcripts (cistrons) [2,3,4]. The tRNA regions form characteristic cloverleaf secondary structures within the initial polycistronic transcripts, and they are recognized by cleavage mechanisms that result in processed transcripts that encode one or more PCGs [5]. This mode of gene expression is reminiscent of the bacterial cistronic mode of gene expression from which eukaryotic mitochondria are believed to have been derived. As a consequence of gene order in the mitochondrial genome, the tRNA-punctuated mode of transcript processing can result in mature transcripts that encode more than one PCG [5]. Bicistronic transcripts are predicted for mitochondrial PCG sequences (cds) that overlap within the genomic sequence, which include atp8/atp6 and nd4/nd4L within insect mitochondria.
Mitochondrial genomes have been sequenced for insect species within the order Lepidoptera: the first ones sequenced were from the silk moth Bombyx mori [6] and the corn borer Ostrinia spp. [7]. Lepidopteran mitochondrial genomes have a relatively conserved gene order and orientation [8], which is shared with that of Drosophila species [9]. Despite the generation of full mitochondrial genome sequences, the corresponding annotation of gene coding regions in Lepidoptera has largely relied upon comparison to Drosophila gene models. There are discrepancies in mitochondrial gene boundaries in Lepidoptera, among them (1) the ambiguity in the translation start site of cytochrome c oxidase subunit I (coxI) at either a proposed arginine codon (CGR) or a TTAG site [6], and (2) the assumed polyadenylation following a T nucleotide at the terminus of coxII that would complete a truncated stop codon [9]. Expressed sequence tags (ESTs) are a source of gene expression information, and typically include mitochondrial-derived transcripts. The use of EST data to assemble mitochondrial-derived transcripts has proven valuable in the characterization of gene boundaries, polycistronic transcripts, and differential transcript processing among tissues, as well as for the quantification of mitochondrial transcript stability [10,11]. Despite the utility of ESTs in the annotation of mitochondrial genomes, these data are rarely incorporated into annotation efforts.
In this paper, we describe the nearly complete mitochondrial DNA sequence for the legume pod borer (LPB), Maruca vitrata Fabricius (Lepidoptera: Pyraloidea: Crambidae). This insect species is a crop pest found throughout tropical and subtropical regions of the world. Larval stages of LPB feed upon leaves, flowers and pods of leguminous plants [12,13,14,15] and cause significant yield loss to legume crops cultivated in Southeast Asia [16,17,18], South Asia [19,20,21], and Central and South America [22,23]. Its most significant impact is in sub-Saharan Africa [24,25,26,27], where yield reductions of 20-80% are incurred [25]. Accordingly, M. vitrata has been identified as a major emerging threat to legume production in developing and under-developed nations. The mitochondrial genome information provided here complements the mitochondrial coxI DNA markers developed by Margam et al. (2010) [28] by providing additional sequence data to assist in defining population structure and gene flow, which will hopefully lead to sustainable and economically viable methods of pest management. With the exception of Drosophila melanogaster, this study provides the first instance where (i) mitochondrial genome annotations have been validated by transcribed sequence data, and (ii) predictions of processed mitochondrial transcripts (cistrons) have been used for both structural and functional annotation of genes. This added information suggests that available EST data are a valuable resource for mitochondrial genome annotation.

Materials and Methods
Sequence and annotation of the M. vitrata mitochondrial genome
Genomic DNA was isolated from a M. vitrata specimen (from Burkina Faso; BUR38) [28] using a DNeasy animal tissue kit following the manufacturer's instructions (Qiagen, Valencia, CA). A total of 100 ng genomic DNA was used as the template in 50 µl PCR volumes that also contained 10 pmol primers, 5 µl 10× PCR buffer, 0.4 µl Taq polymerase (New England Biolabs, Ipswich, MA), and 1.2 µl 10 mM dNTP. Polymerase chain reaction (PCR) was carried out in an Eppendorf Mastercycler thermocycler (Eppendorf, Hamburg, Germany) using the thermal cycling profile of 95 °C denaturation for 2 min, followed by 35 cycles of 95 °C for 30 s, 52 °C for 45 s, and 72 °C for 1 min, as well as a final cycle of 72 °C for 8 min. The PCR products were cleaned to remove any residual primers and nucleotides using Qiaquick PCR purification kits (Qiagen, Valencia, CA) following the manufacturer's protocols.
The PCR products were cycle sequenced using 1 µl of purified template, 1 µl of 10× BigDye™ (ABI PRISM™ BigDye™ Terminator Cycle Sequencing Ready Reaction Kit; Applied Biosystems, Foster City, CA), 1 pmol forward or reverse primer (Table 1), and 6 µl of ddH2O using the conditions: 95 °C for 2 min, followed by 98 cycles of 95 °C for 10 s, 50 °C for 5 s, and 60 °C for 4 min. Cycle-sequencing products were precipitated using an ethanol-sodium acetate procedure. Precipitates were dissolved in 35 µl of ddH2O. From this solution, 15 µl was separated on an ABI 3500 Sequencer (Applied Biosystems, Foster City, CA). Sequences from PCR fragments obtained from BUR38 and PR08 were assembled independently using the NextGENe software (SoftGenetics®, State College, PA), and the assemblies were exported in FASTA format.
The FASTA-formatted BUR38 mitochondrial sequence was imported into a local database using BioEdit [29], and was queried with O. nubilalis protein and tRNA sequences from GenBank ID: AF442957.1. Protein cds were defined based on the invertebrate mitochondrial translation table using the Virtual Ribosome (http://www.cbs.dtu.dk/services/VirtualRibosome/) [30]. The tRNA gene boundaries and folded structures were defined using tRNAscan-SE (http://lowelab.ucsc.edu/tRNAscan-SE/) [31] with the following parameters: (i) search mode: organellar; (ii) search type: cove only; (iii) cove cut off score: −15; (iv) translation code used: invertebrate mitochondrial; (v) maximum intron length: 40; and (vi) pseudogene checking: disabled. The mitochondrial sequences were imported into the Artemis visualization and annotation tool [32], positional information of sequence features was added, and the annotated data were exported in GenBank (.gb) format.

Phylogenetics
The FASTA-formatted mitochondrial genome assembly for M. vitrata was used to query the National Center for Biotechnology Information (NCBI) non-redundant nucleotide database (nr) with the blastn search algorithm [33]. The resulting hits to full mitochondrial genome sequences for species of Lepidoptera were downloaded in FASTA format. The FASTA-formatted mitochondrial genome sequences were imported into the MEGA 4.0 software package sequence alignment application [34] and a multiple sequence alignment was performed with the ClustalW algorithm using default parameters (gap opening penalty 15, gap extension penalty 6.66, weight matrix IUB, and transition weight of 0.5). The nucleotide diversity (π), base composition, transition to transversion bias (R; [35]), disparity in substitution among nucleotide sites (Disparity Index), and substitution homogeneity [36] of the aligned sequences were analyzed with MEGA 4.0 [34].
Maximum likelihood analysis of derived amino acid sequences, to infer the phylogenetic relationships among mitochondrial genome sequences, was performed using the PHYLIP software package [37]. Concatenated protein sequences from each species of Lepidoptera that has a complete mitochondrial genome within GenBank (current December 7, 2010) were imported into the MEGA 4.0 sequence alignment suite and aligned using the ClustalW algorithm (default parameters using the PAM protein weight matrix), and all gaps were deleted manually. One thousand bootstrap pseudo-datasets were constructed using the program seqboot and used as input to ProML, which used a Hidden Markov Model method to infer evolutionary rates among residue positions [38] with the Jones, Taylor, and Thornton [39] probability matrix. An unrooted majority rule consensus tree was generated using Consense, and viewed using TreeView 1.6.6 [40].
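The bootstrap step described above can be illustrated with a minimal sketch. It is not the seqboot implementation; it only shows the underlying idea of resampling alignment columns with replacement to build one pseudo-dataset. The function name, the toy two-species alignment, and the fixed random seed are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-dataset from an aligned set of sequences.

    `alignment` maps a species name to its aligned sequence (all the same
    length). Columns are sampled with replacement, and the same sampled
    columns are used for every species so that sites stay homologous.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    sampled = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in sampled)
            for name, seq in alignment.items()}


# Hypothetical toy alignment (not data from this study).
toy_alignment = {"species_1": "MKLV", "species_2": "MRLI"}
replicate = bootstrap_alignment(toy_alignment, random.Random(1))
```

Repeating the call 1,000 times with an unseeded generator would give the number of pseudo-datasets used in the text; each replicate would then be passed to the tree-building step.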

EST sequencing and mitochondrial transcript mapping
A total of 60 LPB larvae comprising 3rd, 4th and 5th instars (20 each from Taiwan, Puerto Rico and Australia) were dissected to obtain the midgut and the salivary-gland tissues. Total RNA was extracted from these tissues using a TRIzol® Reagent protocol (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. RNA was quantified on a NanoDrop 2000 (Thermo Scientific, Wilmington, DE). First-strand cDNA was synthesized from 1 µg of total RNA using a BD Smart PCR cDNA synthesis kit (BD Biosciences, San Jose, CA). The first-strand cDNA was then amplified by a BD mix for 15 cycles (BD Biosciences, San Jose, CA) following the manufacturer's protocol, except that a modified CDS III/3′ primer 5′-TAG AGG CCG AGG CGG CCG ACA TGT TTT GTT TTT TTT TCT TTT TTT TTT VN-3′ (IDT Inc.) was used to avoid long homopolymer repeats. Subsequent to first-strand synthesis, the cDNA was amplified using PCR Advantage II polymerase (Clontech Inc.) with the following thermal cycling program: (i) 1 min at 95 °C, (ii) 21 cycles of 95 °C for 7 s and 68 °C for 6 min. A 2 µl aliquot of the PCR product was analyzed on a 1% agarose gel to determine the amplification efficiency. The PCR product was then subjected to SfiI digestion (10 units) for 2 h at 50 °C to remove the concatemers formed by the CDS III/3′ and the SMART IV primers. A Qiaquick PCR purification kit (Qiagen, Valencia, CA) was used to remove the leftover primers and nucleotides from the amplified cDNA. The quality and quantity of the cDNA library were evaluated by both spectrophotometry and gel electrophoresis.
Sequencing and assembly: Amplified cDNA was submitted to the Keck Genomic Center (University of Illinois at Urbana Champaign) for library construction and sequencing. Two µg of amplified cDNA was used for library construction followed by pyrosequencing on a Roche 454 GS-FLX (Roche, Basel, Switzerland) using established protocols [41]. The adaptor sequences were identified and the trim positions changed in .sff files using cross-match (http://www.phrap.org), sff tools from Roche (https://www.rocheapplied-science.com) and custom-built Java scripts. Sequences shorter than 50 nucleotides or containing homopolymers (in which 60% over the entire length of the read is represented by one nucleotide) were not included for assembly. The sequences were assembled with the Lasergene software (http://www.dnastar.com), using the modified .sff files. Raw sequence data were obtained from .sff files and assembled into contigs using the Roche GS De Novo Assembler (i.e., Newbler assembler), and data comprising all non-redundant contigs were exported to FASTA format.
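The read-exclusion rule stated above (reads shorter than 50 nucleotides, or reads dominated by a single nucleotide) can be sketched as follows. This is not the custom Java code used in the study. It assumes that "60% over the entire length" means the most frequent base accounts for at least 60% of the read; the function and parameter names are hypothetical.

```python
from collections import Counter


def keep_read(seq, min_len=50, max_base_fraction=0.60):
    """Return True if a read passes the length and homopolymer filters.

    Assumption: a read is treated as homopolymer-rich when its single most
    frequent nucleotide makes up at least `max_base_fraction` of the read.
    """
    seq = seq.upper()
    if len(seq) < min_len:
        return False  # too short to be assembled
    most_common_count = Counter(seq).most_common(1)[0][1]
    return most_common_count / len(seq) < max_base_fraction


# Hypothetical reads, for illustration only.
reads = ["ACGT" * 20, "A" * 60 + "C" * 20, "ACGT" * 10]
passed = [r for r in reads if keep_read(r)]
```

Only the first toy read passes: the second is 75% adenine and the third is 40 nucleotides long.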
A local database, MvEST01, containing all Newbler-assembled non-redundant contigs was constructed using BioEdit [29]. MvEST01 was queried with 13 Ostrinia nubilalis mitochondrial protein sequences and 21 tRNA sequences obtained from GenBank accession AF442957 using the tblastn search algorithms [33]. The results were then filtered for hits with ≥90% identity, and the corresponding contigs were retrieved from MvEST01.
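The identity filter described above can be sketched as a small parser. The sketch assumes the search results are available as tab-separated text in which the first three columns are query ID, subject (contig) ID, and percent identity, as in the standard BLAST+ tabular layout; the study ran its searches through BioEdit, so this input format, the function names, and the example IDs are assumptions.

```python
def filter_hits_by_identity(tabular_lines, min_identity=90.0):
    """Keep hits whose percent identity is at least `min_identity`.

    Each line is expected to be tab-separated with query ID, subject ID and
    percent identity in the first three columns (assumed input layout).
    """
    kept = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        percent_identity = float(fields[2])
        if percent_identity >= min_identity:
            kept.append((fields[0], fields[1], percent_identity))
    return kept


def contigs_to_retrieve(hits):
    """Return the sorted, de-duplicated contig IDs among the kept hits."""
    return sorted({subject_id for _, subject_id, _ in hits})


# Hypothetical hit lines, for illustration only.
example = ["coxI\tcontig_A\t95.0", "coxI\tcontig_B\t89.9", "coxII\tcontig_A\t90.0"]
selected = contigs_to_retrieve(filter_hits_by_identity(example))
```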

Scaffolding of M. vitrata ESTs and transcript mapping
The EST reads were scaffolded against the M. vitrata mitochondrial genome assembly using the NextGENe software (SoftGenetics®, State College, PA; parameters used were matching base number ≥ 8 and matching base percent ≥ 70), and contiguous overlapping sequences were exported as consensus contigs in FASTA format. The coding region(s) of contig sequences were annotated manually via blastx search of the M. vitrata mitochondrial gene model developed previously, and derived peptide sequences were predicted using the Virtual Ribosome v.1.1 [30].

Results and Discussion
Sequence and annotation of the M. vitrata mitochondrial genome
Nineteen overlapping PCR products generated using the oligonucleotide primer pairs shown in Table 1 yielded a total of 15,385 bp of sequence data. The data were assembled into a 14,054 bp partial sequence of the M. vitrata mitochondrial genome that was submitted to GenBank (HZ751150). This sequence lacks only the displacement loop (D-loop) that encodes the origin of replication and the heavy strand promoter (HSP) and light strand promoter (LSP). The M. vitrata mitochondrial genome showed a high A+T nucleotide content of 80.1%, typical of animal and insect mitochondrial genome sequences [42,43]. The A+T bias in M. vitrata is lower than that in any other mitochondrial genome sequence from Lepidoptera (Fig. 1), but this is likely a consequence of our partial sequence, which does not contain the A+T-rich D-loop. Although it does not include the entire mitochondrial genome, the M. vitrata assembly allowed for the identification and analysis of 2 rRNAs, 13 protein coding genes, and 19 of 22 tRNA genes (Fig. 2). The gene order and orientation seen in M. vitrata is typical for insect mitochondrial genomes (Table 2) [9], and is identical to that of the other crambid species O. nubilalis and O. furnacalis [7]. Phylogenetic analyses indicated that the M. vitrata mitochondrial DNA grouped with a clade containing other species from the lepidopteran Family Crambidae, and clustered with other Families. A clear distinction between moth and butterfly species was apparent (Fig. 1; alignment not shown). Thus, the use of the entire mitochondrial genome may clarify phylogenetic relationships that could not be determined using smaller sequence sampling methods [44].
Protein coding regions encompassed 10,266 of the 14,054 bp assembled sequence (73.0%) and showed an A+T content of 82.9%, which is slightly higher than that of the entire sequence, but both were consistent with the high A+T content of insect [9,43] and lepidopteran mitochondrial DNAs [45]. This AT-bias was reflected in the codon usage of the 13 PCGs (Table 3), where phenylalanine (F; UUU), isoleucine (I; AUU), leucine (L; UUA), methionine (M; AUA), and tyrosine (Y; UAU) were proportionately the most represented. The AT-bias in codon usage was likewise observed at third codon positions, where 3218 of 3422 codons had either an A or T nucleotide (94.0%), a significantly higher proportion than at the first codon positions (73%; χ² = 4.69, P-value = 0.030) and second codon positions (69.8%; χ² = 6.23, P-value = 0.013). This difference may reflect selection for optimal codon usage [46] or for codons that match the anti-codons of tRNAs [47].
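The position-specific A+T proportions reported above can be computed with a short routine. This is a minimal sketch, not the software used in the study; the function name and the nine-nucleotide toy sequence are hypothetical, and the final line only re-derives the 94.0% third-position value from the counts given in the text.

```python
def at_fraction_by_codon_position(cds):
    """Return the A+T fraction at codon positions 1, 2 and 3 of a coding sequence.

    Any trailing partial codon (for example a truncated stop codon) is ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence contains no complete codon")
    counts = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "AT":
            counts[i % 3] += 1
    return [count / n_codons for count in counts]


# Hypothetical toy sequence: codons ATT, GCA, ATA.
fractions = at_fraction_by_codon_position("ATTGCAATA")

# Third-position value from the counts reported in the text.
third_position_percent = round(100 * 3218 / 3422, 1)
```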
The methionine start codon, ATG, was used by 6 of 13 protein coding genes (46.2%). In contrast, an atypical Ile codon (AUU) was used to initiate protein synthesis in the atp8, nd3, nd5, and nd6 genes, Phe (UUR) was used in the nd1 and nd2 genes, and Arg (CGA) was used for the coxI gene. The use of Ile for initiation of peptide synthesis has previously been reported in other insects [48] including Lepidoptera [6,7], which suggests that it is common in this insect Order. Additionally, the coxI gene was first annotated with an Arg (CGA) initiation codon for protein synthesis in Drosophila yakuba [9], and the same codon was subsequently shown for the lepidopteran species B. mori [6], O. nubilalis and O. furnacalis [7], Adoxophyes honmai [49], Coreana raphaelis [50], Antheraea pernyi [45], B. mandarina [51], Ochrogaster lunifer [52], Artogeia melete [53], Eriogyna pyretorum [54], and Hyphantria cunea [8]. The nucleotide sequence TTAG, located immediately upstream of the putative Arg CGR start codon of coxI, was proposed to serve in a non-standard initiation process [6]. Our alignment of the mitochondrial genome sequences from 19 species of Lepidoptera (Fig. 1; alignment not shown) indicated that this putative TTAG initiation site was conserved among 15 of 27 (55.6%) coxI genes from full lepidopteran mitochondrial genomes (Fig. 1). The lack of absolute conservation of the TTAG suggests that there may be flexibility in the function of the sequence, or that it may not serve as an initiation codon. In any case, further study is required to determine the mechanism of coxI initiation, but transcript information provided by Stewart and Beckenbach (2009) [55] and our data described below indicate that the Arg (CGR) indeed functions as the initiation codon.
The corresponding stop codons used by M. vitrata mitochondrial genes were predicted to be TAA in all instances except for coxII and nd1, where T and TAG motifs were observed, respectively (Table 2). Based on invertebrate mitochondrial genetics, we predict that the UAG codon serves as a stop codon; it has only been observed in the M. vitrata, Ostrinia sp. [7], Parnassius bremeri (GenBank ID: FJ871125.1), and Parnassius maraho (GenBank ID: FJ810212.1) mitochondrial genomes. The infrequent use of the TAG stop codon is likely a consequence of the high A+T bias at the third codon position [43]. The incomplete stop codon of coxII (T nucleotide only) was previously reported in insects [7,9], and for other mitochondrial genes in bivalves [56] and mammalian species [3]. Mitochondrial RNA processing occurs by the tRNA punctuation model [3], wherein the cloverleaf-like secondary structures of tRNAs are required for the cleavage of polycistronic transcripts into mRNAs. Maturation of the mitochondrial mRNAs occurs by the addition of poly(A) tails, and results in completion of the TAA stop codon for coxII (A nucleotides added via polyadenylation are underlined; [55]). Similarly, we present EST data in the following section that indicate that mature M. vitrata coxII transcripts are also polyadenylated.
Table 2 note: The tRNAs are presented in the order in which they occur on the mitogenome, along with their beginning and ending nucleotide positions. The anticodon region detected and the corresponding amino acid that the predicted tRNA transfers are presented. The orientation is indicated with respect to the heavy (H) or light (L) strand. doi:10.1371/journal.pone.0016444.t002

M. vitrata EST assembly and mitochondrial cistronic transcripts
The de novo assembly of the M. vitrata larval midgut and salivary EST read data resulted in a total of 3729 contigs (mean length 459.6 ± 287.3 bp; range 96 to 3299 bp), of which 10 contigs were annotated as derived from the mitochondrial genome (1335.7 ± 510.3 bp; Supplemental Data S1); these contigs resulted from the clustering of 7608 raw reads (Supplemental Data S2). The arrangement of genes within assembled contigs likely represents mature transcripts that are derived from tRNA punctuated cleavage of a polycistronic transcript [3,57], and in M. vitrata three contigs contained more than one gene (contig 4: atp8, atp6, and coxIII; contig 6: nd4 and nd4L; and contig 7: nd6 and cytb; Fig. 2; Table 4). The atp6 and atp8 genes, as well as the nd4 and nd4L genes, were located on the same mature mitochondrial transcript within the dipteran insects D. melanogaster and D. pseudoobscura [55,57,58] and the swine Sus scrofa [10]. According to the tRNA punctuated model, all M. vitrata mitochondrial polycistronic transcripts are cleaved at junctions with tRNAs, such that the 3′ ends of cistrons are each followed by a tRNA gene [3,5]. By default, mitochondrial genes on the same cistron that are contiguous within the genome lack an intervening tRNA. This contiguous structural arrangement of the atp8/atp6/coxIII, nd4/nd4L, and nd6/cytb genes was observed in the M. vitrata mitochondrial genome, and was also predicted within the resulting mature transcripts (cistrons; Fig. 2; Table 4). The overlap of the coding sequences of M. vitrata atp8 and atp6 (7 bp), and of nd4 and nd4L (1 bp), would logically dictate that the mature transcripts remain intact in order to preserve their respective coding sequences. This scenario was also predicted due to the absence of an intervening tRNA that would be necessary for transcript cleavage [3,5]. To our knowledge, the polycistronic atp8/atp6/coxIII transcript identified from M. vitrata has yet to be reported in insects despite ancestral genome arrangements that likely would lead to this tricistronic transcript. Drosophila melanogaster showed polyadenylation at the 3′ terminus of bicistronic atp8/atp6 and monocistronic coxIII mitochondrial transcripts [55], which suggests that variation in mitochondrial transcript cleavage events may exist among insect species. Indeed, differences occur between species of salamander, where a tricistronic atp8/atp6/coxIII transcript from Ambystoma tigrinum is processed into atp8/atp6 and coxIII transcripts within A. mexicanum [59]. Our mitochondrial transcript mapping data are the first to be reported for a lepidopteran, so direct comparison with data from other species cannot readily be made. The inclusion of mitochondrial transcript mapping data from additional species should help clarify the functional variation in mitochondrial transcript processing.
Only two out of the eight (25%) mature processed M. vitrata mitochondrial transcripts have possible poly(A) tails (coxII and atp8/atp6/coxIII cistrons; Supplemental Data S1), which is in contrast to the observation of 3′ polyadenylation for all cistronic transcripts in Drosophila [55]. Because the initial M. vitrata cDNA was generated with a poly(T) oligonucleotide, adenylated forms of each cistron would be expected within the EST assemblies if they were present in the biological samples. The post-transcriptional 3′ addition of adenosine nucleotides results in increased transcript stability in many systems, but it has been linked to increased degradation in plant mitochondria [60]. RNAi knockdown of the enzymatic machinery involved in polyadenylation was previously reported to produce no apparent change in mitochondrial transcript stability in humans [61,62]. Hence, the role of adenylation in mitochondrial transcript stability remains unknown [63]. Adenylation is required by the coxII transcript for the completion of its termination codon, but the functional or protective role of poly(A) tails remains uncertain given their absence from most cistronic transcripts in M. vitrata.
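The completion of a truncated stop codon by polyadenylation can be made concrete with a small sketch: when a genomic coding sequence ends in a partial codon (a single T for coxII), the first adenosines of the poly(A) tail supply the missing bases. The function name, the default tail length, and the seven-nucleotide toy sequence are hypothetical and are not taken from the M. vitrata data.

```python
def stop_codon_after_polyadenylation(genomic_cds, poly_a_length=10):
    """Return the terminal codon of a transcript once a poly(A) tail is added.

    If the genomic coding sequence ends with a partial codon (1 or 2 bases),
    the tail completes it; otherwise the existing last codon is returned.
    """
    cds = genomic_cds.upper()
    remainder = len(cds) % 3  # bases present in the partial final codon
    if remainder == 0:
        return cds[-3:]  # already ends on a full codon
    if poly_a_length < 3 - remainder:
        raise ValueError("poly(A) tail too short to complete the codon")
    transcript = cds + "A" * poly_a_length
    start = len(cds) - remainder
    return transcript[start:start + 3]


# Hypothetical toy CDS: two full codons followed by a lone T.
completed = stop_codon_after_polyadenylation("ATGAAAT")
```

For the toy sequence the lone T becomes TAA, which mirrors the mechanism proposed for coxII.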

Scaffolding of M. vitrata ESTs and transcript mapping
Since polycistronic transcripts are cleaved to generate mRNAs for each mitochondrial gene-derived cistron, the initial molar ratios are equal and subsequent variation in processed transcript quantities results from differences in mRNA stability [64]. The scaffolds composed of short read EST data from M. vitrata were representative of full-length transcripts (Fig. 2), and appear not to be significantly influenced by the over-representation of the 3′ ends of fragments that is sometimes observed in 454 data [57]. Furthermore, the scaffolds aligned with our previous contig (cistron) assemblies that were representative of mature processed mitochondrial transcripts, and showed a read depth of 1 to 784 (Fig. 2; Table 4). Similar data from Next Generation Sequencing (NGS) technologies have been used to characterize gene expression [57], and applied to describe the sequence and relative quantities of processed mitochondrial-derived transcripts [10,11]. The quantification described in those studies used simple calculations of the proportion of raw sequence reads mapped to each transcript (cistron) as a relative measure of cellular transcript level, which was used as an estimate of gene expression [57,65].
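The read-proportion measure described above amounts to dividing the reads mapped to each cistron by the total mapped reads, and the spread between cistrons can be summarized as the ratio of the largest to the smallest count. The sketch below illustrates this under the assumption that no length normalization is applied, as the text describes a simple proportion; the function names and read counts are hypothetical and are not the values in Table 4.

```python
def relative_transcript_levels(read_counts):
    """Return each cistron's share of all mapped reads."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    return {name: count / total for name, count in read_counts.items()}


def fold_range(read_counts):
    """Return the ratio of the highest to the lowest non-zero read count."""
    nonzero = [count for count in read_counts.values() if count > 0]
    if not nonzero:
        raise ValueError("no mapped reads")
    return max(nonzero) / min(nonzero)


# Hypothetical read counts, for illustration only.
counts = {"cistron_A": 300, "cistron_B": 100, "cistron_C": 100}
levels = relative_transcript_levels(counts)
spread = fold_range(counts)
```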
A similar quantification of M. vitrata mitochondrial-derived transcript levels from scaffolded EST data indicated a ~321-fold variance in cistron level, where the rrnL gene was present in the highest proportion (Table 4). The rrnL transcript was also present in the highest proportion among D. melanogaster [11] and porcine cistrons [10], and the analogous large abundance of rrnL transcripts in human expression data was attributed to the presence of a second HSP upstream of the gene, with transcription terminating near the tRNA-Leu(UUR) downstream of the 16S rRNA [66,67,68]. The HSP and LSP sequences have diverged rapidly in mammalian systems [69], such that defining these functional regions by comparative sequence analyses is difficult [70]. Among cistrons that encode proteins involved in the same process, levels showed a ≤11.15-fold difference for the NADH dehydrogenase complex (nd1 vs. nd2), and a ≤3.59-fold variance for the cytochrome c oxidase (coxII vs. coxIII). The estimated expression levels of coxI and coxII were approximately equal in M. vitrata ESTs. An overall lack of correlation between transcript levels within a protein complex has been observed among species of Drosophila [11], and likely can be attributed to differences in individual transcript stabilities [64]. Furthermore, these estimates do not take into account any differences in protein stability or turnover in the cell that would influence cellular metabolic processes [71]. The expression data represent additional data for annotation, but comparison with ESTs derived from different M. vitrata tissues or growth stages is likely to provide valuable information regarding metabolic variation.

Conclusions
This is the first study of a species of Lepidoptera to provide functional annotation of a near-complete mitochondrial genome using gene expression data. These results will be valuable for future studies to provide comparisons for other species of Lepidoptera and to describe tissue-specific mitochondrial transcript processing pathways. This study provides a paradigm for the integration of EST data and functional mitochondrial genome annotation that can be applied to any species for which both data sets are available.