Complete Mitochondrial Genome of Eruca sativa Mill. (Garden Rocket)

Eruca sativa (Cruciferae family) is an ancient crop of great economic and agronomic importance. Here, the complete mitochondrial genome of Eruca sativa was sequenced and annotated. The circular molecule is 247 696 bp long, with a G+C content of 45.07%, and contains 33 protein-coding genes, three rRNA genes, and 18 tRNA genes. Through recombination across three pairs of large repeats, the Eruca sativa mitochondrial genome may exist as six master circles and four subgenomic molecules, resulting in a more dynamic structure of the Eruca sativa mtDNA than in other cruciferous mitotypes. Comparison with the Brassica napus mtDNA revealed that most genes of known function are conserved between these two mitotypes, except for the ccmFN2 and rrn18 genes, and that 27 point mutations are scattered across 14 protein-coding genes. Analysis of evolutionary relationships suggested that Eruca sativa is more closely related to the Brassica species and to Raphanus sativus than to Arabidopsis thaliana.


Introduction
Mitochondria supply energy in the form of ATP through oxidative phosphorylation in almost all eukaryotic cells [1]. In comparison to their counterparts in animals and fungi, plant mitochondrial (mt) genomes have unique features, such as large and dramatic variations in size [2], dynamic structure [3], extremely low rate of point mutations [4] and incorporation of foreign DNA [5]. The largest known mitochondrial genomes are those of seed plants, with sizes ranging from 208 kb for Brassica hirta [6] to over 11.3 Mb for Silene conica [4]. The dramatic variation may occur within closely related species [7]. Active recombination via repeated sequences appears to be responsible for the dynamic nature and multipartite organization of the mt genome in all angiosperms investigated [8], and may produce significantly different gene orders even among close relatives [9].
Mitochondria play an important role in plant growth and development. Genomic rearrangements involving substoichiometric shifting (SSS), a consequence of intermediate repeat DNA exchange [10], are often accompanied by changes in the plant's phenotype. SSS activity in plant mitochondria has been reported to be associated with cytoplasmic male sterility [11], nitrate sensing and GA-mediated pathways for growth and flowering [12]. Plant mitochondria have also been associated with stress responses [13] and regulation of programmed cell death [14]. Therefore, determining mitochondrial genome sequences is important for understanding specific metabolic activities of plants [15].
Eruca sativa Mill. or Eruca vesicaria subsp. sativa (Miller) (garden rocket), a member of the Cruciferae family, has several desirable agronomic traits, such as resistance to salt, drought, white rust and aphids [16][17][18]. Introducing these beneficial genes of E. sativa into economically important cultivated species will promote crop improvement [19,20]. Crosses of E. sativa with other species of the family Cruciferae, including B. rapa, B. juncea, and B. oleracea, have been reported [20].
To date, several mt genomes from the Cruciferae family have been sequenced, including Arabidopsis thaliana (tha) [21], Raphanus sativus (sat) [22] and five species from the Brassica genus, i.e., B. napus (pol, nap), B. rapa (cam), B. oleracea (ole), B. juncea (jun), and B. carinata (car) [23][24][25]. In this study, we report the complete mitochondrial genome sequence of E. sativa and provide a comparison with other sequenced cruciferous mt genomes. This research will help to characterize the E. sativa crop and further our understanding of the evolution of mitochondrial genomes within the Cruciferae family.

Mitochondrial DNA isolation and sequencing
A commercial cultivar of E. sativa was used in this study. Mitochondrial DNA was isolated from 7-day-old etiolated seedlings according to Chen's methods, and stored at −80 °C until use. Genome sequencing was performed using the GS-FLX platform (Roche, Branford, CT, USA). The reads were assembled into contigs using Newbler v.2.6. Sanger sequencing of PCR products was used to join the contigs to form the complete genome.

Results
The mitochondrial genome of E. sativa
The mitochondrial genome of E. sativa was assembled as a single circular molecule of 247 696 bp (Figure 1, deposited in GenBank under the accession KF442616). The overall GC content of the mtDNA is 45.07%, which is comparable to those of other mtDNAs of Cruciferae. Non-coding sequences make up the largest part of the E. sativa mtDNA (85.14%), slightly below the average non-coding content (89.4 ± 3.1%) of other reported angiosperm mitochondrial genomes [29]. Genes account for 26.27% of the genome (65 070 bp in total length), of which 56.61% are exons (36 837 bp) and 43.39% are introns (28 233 bp).
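The gene, exon and intron shares quoted above follow directly from the reported lengths. The following is a minimal arithmetic check; the variable and function names are illustrative and not part of the study, and only the lengths are taken from the text.

```python
# Lengths reported in the text, in base pairs.
GENOME_BP = 247_696
GENE_BP = 65_070
EXON_BP = 36_837
INTRON_BP = 28_233


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


gene_share = pct(GENE_BP, GENOME_BP)   # genes as a share of the genome
exon_share = pct(EXON_BP, GENE_BP)     # exons as a share of gene sequence
intron_share = pct(INTRON_BP, GENE_BP) # introns as a share of gene sequence

print(gene_share, exon_share, intron_share)
```

Exon and intron lengths sum to the total gene length, so the two shares are complementary.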
Eighteen tRNA sequences (1 383 bp in total) were found in E. sativa mtDNA (Table 2), ranging from 71 to 88 bp in length. The A+T content of the tRNA genes is 48.81%, which is lower than the overall A+T composition of the mtDNA. Among these genes, tRNAs for 15 amino acids are encoded, with two copies for methionine (Met) and three for serine (Ser). The genome lacks tRNAs for the amino acids alanine (Ala), valine (Val), phenylalanine (Phe), threonine (Thr) and arginine (Arg). To enable protein synthesis in mitochondria, the missing tRNAs may be supplied by either the chloroplast or nuclear genomes [30].
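The base-composition comparison in this paragraph can be checked from values already given in the text. The sketch below shows how a G+C percentage is computed from a sequence, then derives the overall A+T content and the tRNA share of the genome from the reported figures. The function and variable names are illustrative assumptions, not tools used in the study.

```python
def gc_percent(seq):
    """G+C percentage of a DNA sequence, counting only A/C/G/T bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return round(100 * (seq.count("G") + seq.count("C")) / acgt, 2)


# Values reported in the text.
GENOME_BP = 247_696
GENOME_GC = 45.07   # overall G+C content, percent
TRNA_BP = 1_383     # combined length of the 18 tRNA genes

overall_at = round(100 - GENOME_GC, 2)            # overall A+T, percent
trna_share = round(100 * TRNA_BP / GENOME_BP, 2)  # tRNA share of genome, percent

print(overall_at, trna_share)
```

The overall A+T content obtained this way is higher than the 48.81% reported for the tRNA genes, in line with the statement above.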
Using ORF-Finder and BLAST searching, 50 ORFs longer than 100 codons were identified in the E. sativa mitochondrial genome. Among the 50 ORFs, only orf112, orf121, orf122, and orf275 have two copies; all others are single-copy ORFs. Most of the ORFs are between 300 and 500 bp in length, except for 10 ORFs that are longer than 500 bp, including the 1 200 bp orf399 and the 1 911 bp orf636.
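To illustrate what an ORF scan of this kind involves, the sketch below searches both strands and all three frames for ATG-to-stop reading frames longer than a given number of codons. It is not the ORF-Finder tool used in the study: the function names, the use of the standard stop codons, the ATG-only start, and the linear (non-circular) treatment of the sequence are all simplifying assumptions.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumed)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, longer_than=100):
    """Return (strand, start, end, n_codons) for each ATG..stop ORF.

    Coordinates are 0-based, half-open, and relative to the scanned
    strand; n_codons excludes the stop codon. ORFs that span the
    origin of a circular molecule are not detected.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the reading frame
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons > longer_than:
                        hits.append((strand, start, i + 3, n_codons))
                    start = None  # look for the next ORF in this frame
    return hits


toy = "ATG" + "GCT" * 5 + "TAA"
print(find_orfs(toy, longer_than=4))
```

Under this convention an ORF named for its codon count has a nucleotide length of (codons + 1) × 3 including the stop codon, which matches the lengths quoted above: (399 + 1) × 3 = 1 200 bp for orf399 and (636 + 1) × 3 = 1 911 bp for orf636.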

Subgenomic circles mediated by large repeats
Large repeats (>1 kb) have been identified in most of the seed plants analyzed, except for white mustard (Brassica hirta) (Palmer and Herbon, 1987). We analyzed the repeats in the E. sativa mitochondrial genome and identified three pairs of large repeats, which together account for 13.48% of the genome. The large repeats were designated R1, R2 and R3 (Table 3). The two copies of R1 (10 320 bp) are in opposite orientation, whereas the two copies of R2 (4 864 bp) and of R3 (1 513 bp) are in the same orientation. R1 contains two ORFs, orf112 and orf122, while R2 and R3 contain orf275 and orf121, respectively. No known protein-coding gene was found in these large repeats. Large repeats have been implicated in mediating high-frequency, reciprocal DNA exchange that can subdivide the genome into a multipartite configuration [31]. The multipartite structure of the E. sativa mitochondrial genome was predicted under the assumption of intramolecular homologous recombination (Figure 2). Six master circle (MC) structures of identical length (including MC1, shown in Figure 1) could be produced by intramolecular recombination between different repeat pairs. In addition, MC1 and MC6 may generate four subgenomic circles: two small circles of 129 447 bp (SC1) and 118 249 bp (SC2) via the repeat pair R2, and another two small circles of 132 016 bp (SC3) and 115 680 bp (SC4) via the repeat pair R3. MC3 may produce SC1 and SC2, and MC4 may produce SC3 and SC4, mediated by the pairwise large repeat R1.
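The repeat share and the subgenomic circle sizes can be checked with simple arithmetic. The sketch below sums the two copies of each repeat and models recombination between two direct repeat copies on a circle, which splits it into two smaller circles whose lengths add up to the master circle. The repeat start coordinates used in the example are hypothetical, chosen only to reproduce the reported circle sizes; the actual positions are given in Table 3 and Figure 2, not here.

```python
GENOME_BP = 247_696
# Length of one copy of each large repeat, as reported in the text.
REPEAT_BP = {"R1": 10_320, "R2": 4_864, "R3": 1_513}

# Each large repeat is present in two copies.
repeat_total = 2 * sum(REPEAT_BP.values())
repeat_share = round(100 * repeat_total / GENOME_BP, 2)


def split_by_direct_repeat(circle_bp, copy_a_start, copy_b_start):
    """Sizes of the two circles formed by recombination between two
    direct repeat copies on a circular molecule.

    One product spans from copy A to copy B, the other is the
    remainder; each product retains one copy of the repeat.
    """
    arc = (copy_b_start - copy_a_start) % circle_bp
    return arc, circle_bp - arc


# Hypothetical start coordinates, for illustration only.
sc1, sc2 = split_by_direct_repeat(GENOME_BP, 0, 129_447)      # R2-like pair
sc3, sc4 = split_by_direct_repeat(GENOME_BP, 5_000, 137_016)  # R3-like pair

print(repeat_share, (sc1, sc2), (sc3, sc4))
```

Recombination between inverted copies, as for R1, does not split the circle; it inverts the segment between the copies and yields an isomer of the same length, which is why it is not modeled by this function.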

Sequence comparison between E. sativa and B. napus mtDNAs
We compared the sequences of the mtDNAs from E. sativa and B. napus. Most of the protein-coding and RNA genes were conserved in length, except ccmFN2 and rrn18. The 5′ portion of the coding region of ccmFN2 in E. sativa mtDNA was quite different (Figure S1), and a 25-bp deletion was found in rrn18 of E. sativa mtDNA (Figure S2), compared with B. napus. Unlike the B. napus mitotype, the E. sativa mitotype lacks cox2-2. Twenty-seven single nucleotide polymorphisms (SNPs) were detected in 14 genes when compared with B. napus (Table 4). Thirteen synonymous substitutions were found in atp6, ccmB, cob, cox1, nad2, nad6, rpl2, rps3, and rps4. Fourteen nonsynonymous mutations were found in 11 genes, including an S to N (199aa) switch in atp1, a V to I (18aa) and an H to F (51aa) switch in atp6, a P to L (107aa) switch in ccmB, an R to K (113aa) switch in ccmFC, an H to Y (285aa) switch in cob, a P to L (112aa) switch in cox1, an S to L (126aa) and an S to N (438aa) switch in matR, a C to R (72aa) switch in nad2, an S to L (29aa) switch in rpl2, an L to P (172aa) switch in rpl5, and an M to I (50aa) switch in rps7. Of these 27 SNPs, most were transitions and only three were transversions (G→T in nad2, T→A in cox1, and T→A in atp6). All tRNAs in the B. napus mitochondrial genome were detected in E. sativa mtDNA. However, the ORFs were quite different between these two mitotypes.

Evolutionary relationships of the cruciferous mitotypes
To further illustrate the evolution of mitochondrial genomes within the Cruciferae family, the E. sativa mtDNA and other reported Cruciferous mtDNAs were compared using BLASTN [32]. E. sativa mtDNA was used as the reference sequence and similar regions in two or more mtDNA sequences were aligned. The alignable E. sativa sequence (93%) was 81% identical to that of R. sativus mtDNA. The sequence identity shared by the mtDNA of E. sativa and Brassica was more than 83%, with a coverage in the range of 83-85%. Only 63% of the E. sativa mtDNA matched that of Arabidopsis thaliana, with an identity of more than 68%, and the longest fragment was only 8.0 kb. This result suggested that the evolutionary relationship of mitochondrial genomes among E. sativa, the Brassicas and R. sativus is closer than that between E. sativa and A. thaliana.
In support of this hypothesis, a dot matrix analysis showed that the lengths of syntenic regions between E. sativa and A. thaliana are shorter than those between E. sativa and Brassica or R. sativus. Additionally, the distribution of syntenic regions between the mtDNAs of E. sativa and A. thaliana is more dispersed, and the identity is lower, than that between E. sativa and the Brassica mitotypes (Figure 3). Moreover, the phylogenetic relationships among the Cruciferae family (Figure 4) were inferred using the neighbor-joining method and 23 conserved genes among the reported Cruciferae mitotypes. The results are mainly consistent with previous reports based on mitochondrial genome analysis [22] and strongly support the conclusion that E. sativa is more closely related to the Brassica species and R. sativus than to A. thaliana.
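Returning to the SNP comparison with B. napus, the transition and transversion classes used there can be stated precisely: a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine, whereas a transversion exchanges a purine for a pyrimidine or the reverse. The sketch below applies this rule to the three transversions named in the text; the function name is illustrative and the individual transition SNPs are listed in Table 4 rather than reproduced here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Classify a single-base substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_chemical_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_chemical_class else "transversion"


# The three transversions named in the text (gene, reference, variant).
reported = [("nad2", "G", "T"), ("cox1", "T", "A"), ("atp6", "T", "A")]
classes = [substitution_class(ref, alt) for _, ref, alt in reported]

TOTAL_SNPS = 27
n_transversions = len(reported)
n_transitions = TOTAL_SNPS - n_transversions  # every SNP is one or the other

print(classes, n_transitions, n_transversions)
```

With 27 SNPs in total and three transversions, the remaining 24 SNPs are transitions.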

Discussion
The Cruciferae family is one of the largest dicot families of the flowering plant kingdom and includes several vegetable and oilseed crops, as well as several model species of great scientific, economic and agronomic importance [33]. Annotations for mitochondrial genomes from closely related species would improve the understanding of molecular evolution and phylogenetic relationships [34] in the Cruciferae family. E. sativa, a member of the Cruciferae family, is a conventional crop consumed as food and fodder. Its economic potential extends further: it is a protein source for edible purposes, a potential source of industrial oil, an effective biological control of crop pests, and a component of traditional pharmacopoeia for various purposes [35]. To better understand this important crop, the mitochondrial genome of E. sativa was sequenced and annotated.
Cruciferae mitochondrial genomes are generally small (208-367 kb) compared with other seed plants. The E. sativa mt genome (248 kb) is larger than most Brassica mitotypes, but smaller than that of B. oleracea (360 kb) and A. thaliana (367 kb). Comparison of the E. sativa mtDNA with the B. napus mtDNA revealed that the cox2-2 gene was absent from the E. sativa mt genome. This gene was also absent from the genomes of B. oleracea, B. carinata, and Ogura-cms-cybrid (oguC) rapeseed mitotypes [25,36]. A distinguishing feature of Cruciferae mitochondrial genomes is that the ccmFN genes are divided into two reading frames (ccmFN1 and ccmFN2) [23]. The translation of ccmFN2 has been confirmed in A. thaliana mitochondria, which demonstrated that ccmFN2 is not a pseudogene, although it lacks a classical ATG initiation codon [37]. Sequence alignments of ccmFN2 from reported Cruciferae mtDNAs showed that the first 45 bp of the putative ccmFN2 gene in the E. sativa mt genome are quite different from those of the ccmFN2 gene in Brassica and A. thaliana mitotypes (Figure S1), suggesting that this non-conserved region may not be critical for gene function. However, the tryptophan-rich WWD domain in ccmFN2, which is responsible for heme binding [38], is conserved among these mitotypes.
The 5S and 18S rRNA genes in E. sativa mtDNA are closely linked, as they are in other plants, and the 26S rRNA gene is separated from the 18S and 5S by 26 459 bp. To elucidate the evolutionary origins of mitochondria, the ribosomal RNA genes have been extensively examined [39]. Sequence analysis of the rrn18 gene from wheat, maize and soybean showed high similarity between the plant mitochondrial rrn18 genes and the eubacterial 16S rRNA, suggesting that there is a much slower rate of sequence change in plant mitochondria compared with their animal counterparts [40]. We compared the rrn18 among the reported Cruciferae mitotypes and found a 25-bp deletion in rrn18 in E. sativa mtDNA (Figure S2) compared with that in Brassica mitotypes. We also noticed a 46-bp deletion in rrn18 within the same region of the Brassica mitotypes when compared with that in A. thaliana mtDNA. However, the overall nucleotide identities of the rrn18 gene sequences were markedly high, from 89.50% between E. sativa and A. thaliana to 93.92% between E. sativa and the Brassica mitotypes. The nucleotide identity of the rrn18 gene between E. sativa and R. sativus was 93.86% (Figure S2). This result is consistent with the results of the phylogenetic analysis based on 32 protein coding genes (Figure 4), which suggested that E. sativa is closer to Brassica and R. sativus than to A. thaliana.
Eighteen tRNA genes were identified in E. sativa mtDNA, accounting for only 0.56% of the mitochondrial genome. Among them, six seem to be chloroplast derived and exhibit high sequence identity (>99%) to their chloroplast counterparts. The chloroplast-derived trnH-GTG, trnM-CAT, trnS-GGA, trnW-CCA, trnD-GUC, and trnN-GTT genes, which are frequently found in mitochondrial genomes of angiosperms [15], were identified in the E. sativa mtDNA. An additional chloroplast-originating tRNA gene (trnL-CAA), which is found in the R. sativus and Brassica mitotypes [22], was also identified in the E. sativa mitochondrial genome. This result indicated that mt tRNA genes are frequently transferred from chloroplast genomes during the evolution of angiosperms. However, two other gene transfer events reported in dicots (trnP-GGG and trnQ-UUG) [41,42] were not found in E. sativa.
Genes with known functions are relatively conserved among the Cruciferae mitotypes, especially the protein-coding genes. However, the structural differences among the mitochondrial genomes of the Cruciferae family are remarkable. Multipartite structures of mtDNA mediated by large repeats have been commonly observed in plant species [43]. Direct electron-microscopic evidence of the coexistence of multipartite molecules in the plant mitochondrial genome has been found in tobacco [44]. The large repeat, RB, which is 2 427 bp in length and has been identified in most of the reported Brassica (except the oguC rapeseed) mitotypes, was not found in the E. sativa mtDNA. Instead, three pairwise large repeats were identified. Large repeat R1 in E. sativa mtDNA showed significantly high sequence similarity to the 6 580-bp large repeat R in B. carinata mitochondria (Figure S3). The 1 513-bp large repeat R3 showed 99% identity to the corresponding segments of the large repeat R2 in the B. oleracea mitochondrial genome. Only 2% and 23% of R1 in E. sativa mtDNA showed high similarity (>83%) with the large repeats in A. thaliana and R. sativus mtDNA, respectively. The tripartite structure of the mitochondrial genome, including one master circle and two smaller subgenomic circles, has been reported in Brassica species (except the ole mitotype) and R. sativus [22,25,36]. Because of the three pairwise large repeats, the predicted multipartite structure of the mitochondrial genome in E. sativa, with six master circles and four smaller subgenomic circles, is more complex than that of other Cruciferae species.

Conclusions
In this study, we report the complete mitochondrial genome sequence of E. sativa, a member of the Cruciferae family. The E. sativa mtDNA is 247 696 bp and harbors 33 known protein-coding genes, three rRNAs (5S, 18S, and 26S rRNAs) and 18 tRNAs. In addition, the cox2-2 gene is absent, the ccmFN2 and rrn18 genes have different lengths and 27 SNPs are found in 14 protein-coding genes in comparison with B. napus mtDNA. Reorganization of the genome may have occurred via three pairs of large repeats, resulting in a more dynamic structure of the E. sativa mtDNA compared with other cruciferous mitotypes. This may produce six master circles and four smaller subgenomic circles. The analysis of evolutionary relationships among reported Cruciferous mitotypes revealed that the mitochondrial genome of E. sativa is divergent from that of A. thaliana, but closely related to those of Brassica and R. sativus. This study will improve our understanding of the E. sativa crop and the evolution of mitochondrial genomes within the Cruciferae family.

Figure S1 Sequence alignments of ccmFN2 from reported Cruciferae mtDNAs. The highly and partly conserved amino acids are shaded black or grey, respectively. The black block diagram indicates the non-conserved region of ccmFN2 in E. sativa mtDNA compared to other reported Cruciferae mtDNAs. (TIF)
Figure S2 Sequence alignments of rrn18 from reported Cruciferae mtDNAs. The highly and partly conserved amino acids are shaded black or grey, respectively. The black block diagram indicates the deletion region of rrn18 in E. sativa mtDNA compared to other reported Cruciferae mtDNAs. (TIF)
Figure S3 Alignment of the large repeats in Eruca sativa mtDNA with the large 6.6 kb repeats in car. The alignment was made using Mauve. Blocks of the same color denote homologous regions; the B. carinata blocks above or below the middle line represent direct or inverted orientation, respectively, compared with E. sativa. The extent to which a block is filled indicates the similarity of the syntenic region. (TIF)

Author Contributions
Conceived and designed the experiments: RG. Performed the experiments: YW. Analyzed the data: YW PC SC QY. Contributed reagents/materials/analysis tools: JC MH RG. Contributed to the writing of the manuscript: PC.