Figure 1.
Sequence divergence, genome size, and gene content in seed plant mitochondria.
Branch lengths are scaled to the number of synonymous nucleotide substitutions per site (dS) on the basis of an analysis of all shared protein genes. Genome size ranges are reported for species with multiple sequences available. Gene counts exclude duplicates and putative pseudogenes.
Figure 2.
Levels of synonymous (dS) and nonsynonymous (dN) sequence divergence in terms of substitutions per site for protein genes in Silene mitochondrial genomes.
Estimates were generated using B. vulgaris and A. thaliana as outgroups.
Figure 3.
Number of indels in mitochondrial protein genes and introns that are unique to each of the four Silene species.
Figure 4.
Protein and RNA gene content in sequenced seed plant mitochondrial genomes.
Dark shading indicates the presence of an intact reading frame or folding structure, whereas light shading indicates the presence of only a putative pseudogene. The numbers at the bottom of each group indicate the total number of intact genes for that species. Note that the ccmFc gene, which is present in all other seed plants surveyed to date [104], is classified as a pseudogene in S. conica. It has experienced numerous structural mutations in this lineage, including multiple frameshifts in the second exon that introduce premature stop codons. However, cDNA sequencing confirmed that this gene is transcribed, spliced, and RNA edited in S. conica (unpublished data), so it is possible that the gene is still functional in its truncated form. In some cases, the presence of an intact gene may not indicate functionality. This is particularly true for tRNA genes embedded within recently transferred regions of plastid DNA [20],[105]. For example, the trnN(guu) and trnR(acg) genes in S. vulgaris may not be functional, as they are within a 2.6-kb region that appears to have been recently transferred from the plastid genome (on the basis of its perfect sequence identity with the exception of a single 18-bp deletion). These two tRNA genes are not orthologous to the plastid-derived copies of trnN(guu) and trnR(acg) in other seed plant mitochondria. Intron-containing plastid-derived tRNA genes such as trnA(ugc) in Bambusa, trnV(uac) in Cycas, trnK(uuu) in Vitis, and trnI(gau) in Zea are also unlikely to be functional. In Cycas, the trnL(uaa), trnP(ugg), trnQ(uug), trnR(ucu), and trnV(uac)-Ψ genes are classified on the basis of sequence homology to other land plant tRNAs even though their genomically encoded anticodons differ (CAA, CGG, CUG, CCU, and CAC, respectively). It is possible that these anticodons undergo C-to-U RNA editing to restore the ancestral anticodon, as has been observed in other vascular plants [106],[107].
Plastid-derived tRNAs with substitutions in their anticodons, such as Citrullus trnT(ugu) and Silene latifolia trnP(ugg), are also classified (as pseudogenes) on the basis of homology.
Table 1.
Summary of four Silene mitochondrial genomes.
Figure 5.
Size distribution of repetitive content by the number of repeat pairs (left column) and total repeat length (right column).
Both datasets are based on all repeat pairs identified with BLAST by searching each genome against itself. Note that this method differs from counting individual repeat copies, which cannot be unambiguously identified when repeats exist in numerous partially overlapping copies, as they do in these genomes. For example, a repeat with four copies would be associated with six unique repeat pairs. Because of the enormous number of multicopy, overlapping repeats in S. conica, the total length of repeat pairs exceeds the size of the genome even though more than half of the genome is single-copy. For these same reasons, the distribution of repeat lengths in this figure differs from the repeat coverage statistics reported in Table 1, which consider what fraction of the genome is covered by repeats but not the total number of repeat pairs. The reported 50% coverage threshold represents the median of the total repeat length distribution.
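The pair-counting arithmetic in this legend can be illustrated with a minimal sketch. It is not the authors' analysis code. The function names are hypothetical, and the length-weighted median is only one possible reading of the "50% coverage threshold" (the repeat length at which pairs sorted from shortest to longest account for half of the total repeat length).

```python
from math import comb


def repeat_pairs(copies: int) -> int:
    """Number of unique pairs among `copies` copies of one repeat: C(n, 2)."""
    return comb(copies, 2)


def total_pair_length(copies: int, repeat_length: int) -> int:
    """Summed length of all pairs for one repeat family, counting the
    repeat length once per pair (an assumption made for illustration)."""
    return repeat_pairs(copies) * repeat_length


def length_weighted_median(pair_lengths):
    """Repeat length at which the cumulative pair length, accumulated from
    the shortest pair upward, first reaches half of the total.
    Assumed interpretation of the 50% threshold, not confirmed by the source."""
    ordered = sorted(pair_lengths)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("pair_lengths must not be empty")
```

Under these assumptions, a four-copy repeat of 1,000 bp yields six pairs and 6,000 bp of total pair length, although its copies occupy only 4,000 bp of sequence. This is why pair-based totals can exceed genome size when repeats are highly multicopy.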
Figure 6.
Repeat-mediated recombinational activity in the low-mutation-rate S. latifolia and S. vulgaris mitochondrial genomes (A) and the fast-evolving S. noctiflora and S. conica mitochondrial genomes (B).
Each point represents a pair of repeats, and its position on the y-axis denotes the proportion of recombinant genome conformations detected with paired-end 454 reads. The dashed lines indicate the level at which equal frequencies of read pairs support recombinant and nonrecombinant conformations. The S. latifolia mitochondrial genome was not sequenced with 454 paired-end reads, but Southern blot hybridizations indicated that alternative genome conformations associated with its six-copy 1.4-kb repeat exist at roughly equivalent frequencies [38], as indicated by the large X.
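The y-axis quantity can be expressed as a simple ratio. The sketch below assumes that each repeat pair has a count of read pairs supporting the recombinant conformation and a count supporting the nonrecombinant conformation. The function and parameter names are hypothetical and do not come from the original analysis.

```python
def recombinant_proportion(recombinant_reads: int, nonrecombinant_reads: int) -> float:
    """Fraction of informative read pairs that support the recombinant
    conformation for one repeat pair. A value of 0.5 corresponds to the
    dashed line, where both conformations are equally supported."""
    total = recombinant_reads + nonrecombinant_reads
    if total == 0:
        raise ValueError("no informative read pairs for this repeat pair")
    return recombinant_reads / total
```

For example, 50 recombinant and 50 nonrecombinant read pairs give 0.5, whereas 1 recombinant and 3 nonrecombinant read pairs give 0.25.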
Figure 7.
Distribution of percent sequence identity between pairs of repeats detected by BLAST.
Only repeat pairs longer than 300 bp were used to calculate these distributions.
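A filter of this kind could be sketched as follows. The sketch assumes BLAST tabular output with the standard 12 columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) from a genome searched against itself. The function name, the treatment of alignment length as repeat length, and the rule for collapsing reciprocal hits are illustrative assumptions, not the authors' pipeline.

```python
def repeat_pair_identities(blast_lines, min_length=300):
    """Collect percent identity for repeat pairs longer than `min_length` bp
    from self-versus-self BLAST tabular output (12 standard columns)."""
    seen = set()
    identities = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        pident, length = float(fields[2]), int(fields[3])
        # Normalize each interval so minus-strand hits compare consistently.
        query = tuple(sorted((int(fields[6]), int(fields[7]))))
        subject = tuple(sorted((int(fields[8]), int(fields[9]))))
        if query == subject:
            continue  # a region matching itself is not a repeat pair
        if length <= min_length:
            continue  # keep only pairs longer than the threshold
        key = tuple(sorted((query, subject)))
        if key in seen:
            continue  # reciprocal hit of a pair already counted (assumed identical coordinates)
        seen.add(key)
        identities.append(pident)
    return identities
```

In real output, reciprocal hits do not always share identical coordinates, so a production filter would likely need a more tolerant way to recognize the same pair.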
Figure 8.
Silene mitochondrial genome sizes relative to all sequenced mitochondrial and eubacterial genomes from the National Center for Biotechnology Information (NCBI) Genome database.