Mitochondrial Inverted Repeats Strongly Correlate with Lifespan: mtDNA Inversions and Aging

Mitochondrial defects are implicated in aging and in a multitude of age-related diseases, such as cancer, heart failure, Parkinson’s disease, and Huntington’s disease. However, it is still unclear how mitochondrial defects arise under normal physiological conditions. Mitochondrial DNA (mtDNA) deletions caused by direct repeats (DRs) are implicated in the formation of mitochondrial defects; however, mitochondrial DRs show a relatively weak correlation (Pearson’s r = −0.22, p < 0.002; Spearman’s ρ = −0.12, p = 0.1) with maximum lifespan (MLS). Here we report a stronger correlation (Pearson’s r = −0.55, p < 10^-16; Spearman’s ρ = −0.52, p < 10^-14) between mitochondrial inverted repeats (IRs) and lifespan across 202 species of mammals. We show that, in wild-type mice under normal conditions, IRs cause inversions, which arise by a replication-dependent mechanism. The inversions accumulate with age in the brain and heart. Our data suggest that IR-mediated inversions are more mutagenic than DR-mediated deletions in mtDNA and impose a stronger constraint on lifespan. Our study identifies IR-induced mitochondrial genome instability during mtDNA replication as a potential cause of mitochondrial defects.


Introduction
Mitochondria are responsible for generating most of the ATP in a eukaryotic cell. Mitochondria have their own genome, which is a small DNA molecule between 13 kb and 26 kb in most animals, encoding central components of the electron transport chain and some rRNAs and tRNAs. Mitochondria play a central role in the signal transduction of apoptosis and have been linked to cell loss and aging [1,2].
The role of mitochondria in aging was proposed 40 years ago [3] and is still the subject of heated debate [1,4-12]. The mitochondrial free radical theory of aging proposes that reactive oxygen species produced in the mitochondria cause damage to macromolecules such as proteins, lipids and mtDNA, and that organisms age as they accumulate free radical damage over time [3]. This theory is supported by the finding that catalase targeted to mitochondria attenuates aging in mice [13]. However, accumulating evidence argues against a simple link between free radical damage and aging [1,5-7].
The role of mtDNA in aging was first observed in the filamentous fungus Podospora anserina [14-16]. Later, mtDNA point mutations and deletions were found to accumulate in aging animal tissues [4,17-20]. MtDNA deletions accumulate to high levels in post-mitotic tissues, such as brain, heart, and skeletal muscle, and the fraction of deletions can exceed 60% in neurons [21,22]. Deletions were also suggested to be a driving force behind premature aging in mitochondrial mutator mice [23]. Deletions often arise between direct DNA repeats [24,25], and the frequency of direct repeats (DRs) in mtDNA was shown to negatively correlate with mammalian lifespan [26,27], suggesting that more DRs are associated with more deletions and faster aging.
Here we report that mitochondrial inverted repeats (IRs) have a stronger negative correlation with animal lifespan than DRs, and that IRs induce mitochondrial genome instability during mtDNA replication and cause mtDNA inversions that accumulate with age in mice.

Inverted Repeats (IRs) have a Stronger Correlation with Maximum Lifespans (MLS) than DRs
We analyzed the correlation between mtDNA IRs and lifespans for all animal species (a total of 529) that had both mitochondrial genome reference sequences in NCBI and lifespan information in the AnAge database [28] as of March 1, 2012. We used our algorithm, RollingRepeat, to accurately count repeats of all lengths. For any repeat between two genomic positions, RollingRepeat extends it to the longest matched sequence. To quantify the burden of repeats of different lengths, we calculated a mutagenic score for each repeat with i identical matches in l bps according to the formula $(i^2/(25l))^6$, and then summed the scores of all repeats (excluding the D-loop, which is often highly repetitive and not well sequenced) to obtain a single score for each species. The formula was derived from the empirical relationship between repeat length and DR-mediated deletion rate in yeast mitochondria [29]. We then tested the formula with the MITOMAP database [30], which contains reported human mitochondrial deletions. The total mutagenic score of all DRs of each length matched well with the reported number of mtDNA deletions in human (Fig. 1A).

Figure 1 (caption, partially recovered). (A) [...] where i is the number of identical matches in l bps for each DR length, has a good fit with the number of experimentally reported human mtDNA deletions. The human mtDNA deletions are from the MITOMAP database [30]. (B, C) The correlation between total mutagenic score, calculated as a sum of mutagenic scores of all DR (B) and IR (C) lengths for each species, and MLS within mammals. (D, E) The correlation between total mutagenic score and MLS within Chordata. (F, G) The correlation within Rodentia. R., Rattus. doi:10.1371/journal.pone.0073318.g001
The mutagenic score of IRs had a stronger Pearson's correlation with MLS than that of DRs (Fig. 1B, 1C) across all 202 available mammalian species, with r = -0.55 for IRs and r = -0.22 for DRs. The negative correlation did not exist for DRs in all Chordata (Fig. 1D) but existed for IRs in Chordata (r = -0.26, Fig. 1E) (and in Animalia, r = -0.31, Fig. S1) even after excluding Mammalia (r = -0.22, p < 10^-4), suggesting that IR-mediated mutagenesis is a universal mechanism involved in animal aging. Since our data did not meet the normality assumption, we also calculated Spearman's ρ, which only slightly affected the correlations for IRs but removed the significant negative correlations for DRs (Table 1). The weaker correlation between IRs and MLS outside Mammalia may be due to the diverse physiology of these animals. We took a closer look at the order Rodentia, which contains several important model animals and has a wide range of MLS. Long-lived rodents, such as the naked mole rat and beaver, have fewer IRs, but not fewer DRs, than short-lived rodents (Fig. 1F, 1G). All the above negative correlations between IRs and MLS remained significant after phylogenetic correction by phylogenetically independent contrasts [31] (Table 1). To confirm the robustness of our RollingRepeat algorithm for repeat discovery, we used the NCBI blastn program to identify repeats, and the results were similar (Fig. S2).
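The difference between the two coefficients reported above can be illustrated with a small sketch. It is not the analysis code used in the study (the Statistics section names MATLAB's corr); the function names and the toy numbers below are assumptions made only for illustration. Pearson's r is computed on the values themselves, while Spearman's ρ is Pearson's r computed on ranks, which is why it does not rely on normality.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy values, invented for illustration only (not data from the study).
toy_score = [1.0, 2.0, 4.0, 8.0, 16.0]   # hypothetical mutagenic scores
toy_mls = [30.0, 20.0, 12.0, 6.0, 3.0]   # hypothetical maximum lifespans
log_score = [math.log10(v) for v in toy_score]
log_mls = [math.log10(v) for v in toy_mls]
r = pearson_r(log_score, log_mls)
rho = spearman_rho(log_score, log_mls)
```

Because the toy lifespans decrease monotonically with the toy scores, ρ is at its minimum while r depends on how linear the log-log relationship is.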

Short Repeats also show Negative Correlation with MLS
IRs have a strong negative correlation with mammalian MLS even for repeats as short as 2 bps (r = -0.46, p < 10^-11; Spearman's ρ = -0.47, p < 10^-11, Fig. 2A). Short repeats may be important in aggregate, because the total repeat number increases exponentially as repeat length decreases. Short DRs also showed a significant, albeit weaker, negative correlation (p < 0.03 if r < -0.15, two-tailed) with MLS. This is contrary to previous studies that did not use an algorithm specifically for short repeat discovery and found no correlation between DRs and MLS for repeats shorter than 10 bps [26,27]. The correlation for a long repeat length was calculated after excluding species that do not have repeats of that length. The weaker correlation for the longest repeats was likely due to the small total number of long repeats. Similarly, after randomly discarding short repeats so that their expected numbers were close to those of the 12-bp repeats of the same species, the correlations for short repeats also decreased (Fig. 2B).

Correlation of IRs and DRs with MLS Tolerates Mismatches and Variations in Spacer Length
By default, we allowed one mismatch for every three matches. However, when allowing two mismatches for every match, even very long repeats had a strong negative correlation with MLS (Fig. 2C), suggesting that repeats with multiple mismatches still exert an effect on aging. The length of the spacer sequence between the two repeats had a negligible effect on the correlation, especially for IRs (Fig. 2D), suggesting that repeats are capable of long-distance interaction.

Repeats Correlate with MLS throughout the Mitochondrial Genome in Mammals
Repeats are found throughout the mitochondrial genome. Figures 2E, 2F show repeat maps of all repeats longer than 10 bps for mouse and human. Repeat maps for yeast S. cerevisiae and the 529 animal species analyzed are available upon request. Arrangement of genes in mtDNA is conserved in mammals. The strong correlation between repeats and MLS exists throughout the mitochondrial genome except for the D-loop, which showed weaker correlations with MLS (Fig. 2G).

IRs cause Inversions in Mouse mtDNA
IRs can cause inversions in bacterial and eukaryotic chromosomes [32-34]. To test whether inversions are generated in mtDNA, we designed PCR primers that anneal to the same DNA strand, with one primer outside of the repeat pointing towards the repeat, and the second primer inside the spacer region. These primers can yield a product only after an inversion occurs in mtDNA (Fig. 3A). We used seven such primer pairs amplifying inversions caused by different IRs in C57BL/6 mouse brain and heart. Sequencing confirmed that the PCR products corresponded to inversions in the mtDNA (Fig. 3B and Table 2).

Inversions Accumulate with Age in Mouse Brain and Heart
We next tested whether the frequency of inversions increases with mouse age. Three sets of primers amplifying across different repeats were used in a semi-quantitative PCR reaction normalized to un-rearranged mtDNA. No inversions were detected in the brains of one-week-old mice. Inversions appeared in 4-month-old mice (Fig. 3C). The frequency of inversion #1 increased with age. Inversions #2 and #3 showed a more complex behavior suggestive of secondary rearrangements taking place in older animals (Fig. 3C). We next tested for inversion #1 in the brain, heart, and liver of 12- and 30-month-old mice (Fig. 3D). No inversions were detected in the liver at either age or in the 12-month-old heart, whereas the brain and heart of 30-month-old animals showed inversions. These results suggest that inversions tend to accumulate with age in tissues with high-energy metabolism.
PCR amplification can potentially give rise to artifacts such as deletions or inversions. The following experiments argue that the inversions we observed in aged brain and heart were not the result of a PCR artifact. First, PCR amplification using the same sets of primers and DNA from the livers of 20 young and 20 old mice did not show inversions. Second, inversions in brain and heart displayed an age-related pattern. Third, the same sets of PCR primers did not detect any inversions in the DNA from a mouse fibroblast cell line even after 50 PCR cycles. In the latter experiment, an equal amount of mtDNA from a 24-month-old mouse brain was used as a positive control, which showed inversions before 40 cycles on the same PCR plate. Interestingly, a previous analysis [35] using high-throughput sequencing identified a large number of inversions, although the inversions were deemed to be sequencing errors. Quantitative PCR (qPCR) revealed that the relative concentration of inversion #1 in Fig. 3B in 30-month-old mouse brain was about 1 in 62900 mtDNAs. Since the corresponding IR had 17 matches in 18 bps and a mutagenic score of $(17^2/25/18)^6 = 0.0702$, and the whole mouse mitochondrial genome has a total IR mutagenic score of 36.3 (32.6 excluding the D-loop), this gives an estimate of 1 inversion in every $62900 \times 0.0702/36.3 \approx 122$ mtDNAs. This number seems too low to drive aging. However, given that inversions can induce complex secondary rearrangements (Fig. 3C; Fig. 4), which would not be detected by the original set of primers, the actual frequency of inversions may be much higher.
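The extrapolation in the paragraph above can be retraced with a few lines of arithmetic. This is only a sketch of the scaling step as described in the text; the function and argument names are hypothetical, and the underlying assumption (that inversion frequency scales in proportion to mutagenic score) is the one the text itself makes.

```python
def mtdnas_per_inversion(measured_mtdnas_per_inversion: float,
                         measured_ir_score: float,
                         total_ir_score: float) -> float:
    """Scale the frequency measured for one IR to all IRs in the genome.

    If one IR with score `measured_ir_score` yields 1 inversion per
    `measured_mtdnas_per_inversion` mtDNAs, and inversion frequency is
    assumed proportional to mutagenic score, then all IRs together
    (score `total_ir_score`) yield 1 inversion per the returned number.
    """
    return measured_mtdnas_per_inversion * measured_ir_score / total_ir_score


# Values quoted in the text: 1 in 62900 mtDNAs for inversion #1,
# score 0.0702 for its IR, and 36.3 for all IRs in the mouse genome.
estimate = mtdnas_per_inversion(62900, 0.0702, 36.3)
```

With the quoted inputs, `estimate` rounds to 122, matching the figure given in the text.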

IR-mediated Inversions are Generated through mtDNA Replication
Two models have been proposed to explain IR-mediated inversions [36] (Fig. 4A). In the first model, inversions are generated by homologous recombination repair of damaged DNA, resulting in a simple inversion of the sequence between the repeats. In the second model, inversions are generated via a DNA replication error, producing a head-to-head dimeric circular DNA molecule. Remarkably, duplicated mtDNA molecules were shown to accumulate in aged human tissues [37]. A simple inversion can be distinguished from an mtDNA dimer using long-range (LR) PCR with a single primer amplifying towards the repeat (Fig. 4A). Such a PCR reaction will only yield a product on a head-to-head dimeric template. The existence of such dimers was confirmed by sequencing the inversions generated with a single primer (the last three products in Table 2). To compare the levels of head-to-head dimers and simple inversion products, we performed 8 or 12 cycles of LR PCR nested with a qPCR. The LR PCR reactions contained either two primers, pLR1 and pLR2, a single primer, pLR1 (Fig. 2E & Fig. 4A), or no primers. Reactions containing two primers are able to amplify both the simple inversions and head-to-head dimers caused by the same IR, while a single primer can only amplify the head-to-head dimer. Following LR PCR, a qPCR using primers pq1 and pq2 (Fig. 2E & Fig. 4A) was used to specifically quantify the inversion amplified in the LR PCR. The qPCR Ct value (which decreases linearly with the log of template concentration) after the single-primer LR PCR was nearly the same as the Ct value after the two-primer LR PCR (Fig. 4B). The calculated ratio of template concentrations of single-primer-amplifiable inversions to two-primer-amplifiable inversions ranged from 92% to 119%, and is larger than 66% with 95% confidence (n = 12) for the lowest replicate (12 LR cycles for 30-month brain in Fig. 4C), suggesting that the majority of, if not all, inversions are generated through a replication-dependent mechanism. This is consistent with the hypothesis, proposed based on studies of mutator mice, that mtDNA damage arises as a result of errors by mtDNA polymerase [25].

Why do IRs have a Stronger Correlation with MLS than DRs?
The dimeric mtDNA molecule caused by IRs is itself a large inverted repeat (Fig. 4A), which is highly unstable and can cause additional complex rearrangements. This may explain why some of the inversions did not show a perfect gradual increase with age in Fig. 3C and why some inversions and deletions did not have a repeat at the junction (Table 2). Another explanation is that inversions and deletions may have a dominant-negative effect. mtDNA deletions and inversions can interrupt mitochondrial genes or create hybrid gene products by generating novel junctions. These hybrid proteins and RNAs may disrupt proteostasis and impair mitochondrial function. Since nearly all mtDNA consists of coding sequences, IRs may be more harmful than DRs because an inversion produces two aberrant products, while a deletion produces only one. In summary, the correlation we uncovered between the IRs in the mtDNA and lifespan highlights mtDNA inversion as a type of mtDNA rearrangement with a strong connection to lifespan. We propose a model in which IRs and DRs in the mitochondrial genome cause mtDNA inversions and deletions during mtDNA replication; these mtDNA rearrangements accumulate with age, disrupt ATP production, trigger apoptosis, and promote aging and age-related diseases.

Ethics Statement
All animal experiments were approved by the University of Rochester Committee on Animal Resources (UCAR).

Rolling Repeat Algorithm for the Analysis of Mitochondrial Repeats
We take two copies of the mitochondrial sequence, either exactly the same (for DRs) or reverse complementary (for IRs), and denote them seq1 and seq2. We first align them, and then rotate them relative to each other one bp at a time. For each rotation state, we do local alignments throughout the sequences and count all matched sequences. So, if two genomic positions are repeats, then there will be two rotation states for DRs in which one repeat sequence on seq1 overlaps the other repeat on seq2, and one rotation state for IRs in which the two repeat sequences on seq1 simultaneously overlap those on seq2 (Fig. S3). So, after a full rotation circle, each possible repeat will be counted exactly twice. The main algorithm in pseudocode is in Text S1.
For the local alignment, the default condition assigns a score of +1 for a match and a penalty of 3 for a mismatch, as in a standard BLAST search. Although RollingRepeat allows extending the repeats through a low-match region for 20 bps as long as the sequence at the end has high match quality, the stringency is not too low. For example, long repeats in the human mitochondrial genome identified by RollingRepeat have at most 3 mismatches, with the longest repeat being 18 matches in 21 bps. A repeat has to have at least 16 bps to have 3 mismatches.
Since the mitochondrial genomes of some species have highly repetitive sequences, usually tandem repeats (DRs), in the D-loop regions, which sometimes were also not well sequenced, we counted repeats only between gene regions (marked with "gene", "tRNA", "rRNA" or "CDS" in their GenBank files) when calculating the mutagenic scores, so the D-loop was excluded when calculating the correlations. However, for the repeat maps, we counted repeats between all regions.
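The rotation idea described in this section can be sketched in a few lines. The code below is a deliberately simplified illustration, not the RollingRepeat implementation or the pseudocode of Text S1: it finds exact matches only (no mismatch scoring or low-match extension), does not join runs that wrap around the sequence origin, and ignores the gene-region filter. All function names are hypothetical.

```python
from typing import Iterator, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def matched_runs(seq1: str, seq2: str, shift: int,
                 min_len: int) -> Iterator[Tuple[int, int]]:
    """Yield (start on seq1, length) for each maximal exact-match run
    when seq2 is rotated by `shift` positions relative to seq1."""
    n = len(seq1)
    run_start, run_len = 0, 0
    for p in range(n):
        if seq1[p] == seq2[(p + shift) % n]:
            if run_len == 0:
                run_start = p
            run_len += 1
        else:
            if run_len >= min_len:
                yield run_start, run_len
            run_len = 0
    if run_len >= min_len:  # run that reaches the last position
        yield run_start, run_len


def direct_repeats(seq: str, min_len: int) -> List[Tuple[int, int, int]]:
    """(start, partner start, length) for exact direct repeats."""
    n = len(seq)
    hits = []
    for shift in range(1, n):  # shift 0 is the trivial self-match
        for start, length in matched_runs(seq, seq, shift, min_len):
            hits.append((start, (start + shift) % n, length))
    return hits


def inverted_repeats(seq: str, min_len: int) -> List[Tuple[int, int, int]]:
    """(start, partner start, length) for exact inverted repeats;
    both coordinates refer to the original (plus) strand."""
    n = len(seq)
    rc = reverse_complement(seq)
    hits = []
    for shift in range(n):
        for start, length in matched_runs(seq, rc, shift, min_len):
            partner = (n - start - length - shift) % n
            hits.append((start, partner, length))
    return hits


# Toy circular sequences built around the 7-mer GATTACA.
dr_hits = direct_repeats("GATTACACCGATTACATTGG", 5)
ir_hits = inverted_repeats("GATTACACCTGTAATCTTGG", 5)
```

In both toy sequences the single 7-bp repeat pair is reported twice, once from each copy, which mirrors the statement above that each repeat is counted exactly twice after a full rotation. For the direct repeat the two reports come from two different rotation states; for the inverted repeat both come from the same rotation state.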

Self-BLAST Algorithm for Repeat Discovery
We used the NCBI blastn program to run a local BLAST search. The program was configured for short repeat discovery. Since blastn gives very few hits when the whole mitochondrial genome is BLASTed against itself, we first cut the mitochondrial genome into 100 bp blocks without overlaps and then BLASTed each of these blocks against the whole mitochondrial genome. Then, we counted all hits in gene regions except for self-hits and sorted them into direct and inverted repeats. The exact command we used for blastn was as follows:
blastn -task 'blastn-short' -num_descriptions 500000000 -num_alignments 500000000 -ungapped -query query.fa -db blastdb -word_size 4 -evalue 1e300
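The block-cutting step that precedes the blastn call can be sketched as follows. This is an assumed, minimal way to produce a multi-FASTA query from non-overlapping 100 bp blocks; the function names and the header format are hypothetical, and the code does not run blastn itself.

```python
from typing import List, Tuple


def split_into_blocks(genome: str, block_size: int = 100) -> List[Tuple[int, str]]:
    """Cut a sequence into consecutive non-overlapping blocks.

    Returns (0-based start, block sequence); the last block may be shorter.
    """
    return [(start, genome[start:start + block_size])
            for start in range(0, len(genome), block_size)]


def blocks_to_fasta(blocks: List[Tuple[int, str]]) -> str:
    """Format blocks as FASTA records named by 1-based coordinates,
    so each hit can later be mapped back to its genomic position."""
    lines = []
    for start, block in blocks:
        lines.append(f">block_{start + 1}_{start + len(block)}")
        lines.append(block)
    return "\n".join(lines) + "\n"


# Toy 242-bp sequence standing in for a mitochondrial genome.
toy_genome = "ACGT" * 60 + "AC"
toy_blocks = split_into_blocks(toy_genome)
toy_fasta = blocks_to_fasta(toy_blocks)
```

Keeping the genomic coordinates in the record names is one way to recognize and discard self-hits afterwards, since a block aligned to its own coordinates is not a repeat.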

Combining Different Lengths of Repeats into a Single Mutagenic Score
We used a power function $l^n$ to fit published data on the relationship between direct repeat length $l$ and deletion rate in yeast mitochondria (Fig. S4). We found that the deletion rate is roughly proportional to $l^6$. Since $l^6$ can be an arbitrarily large number and the longest repeats (allowing only a few mismatches) in mammals are about 25 bps, $(l/25)^6$ can be used to calculate the mutagenic score of each repeat so that the longest repeats will have a mutagenic score close to 1. Using this constant 25 gives a sense of the total mutagenic potential of a mitochondrial genome, and does not change the correlation between total mutagenic score and lifespan. To allow a few mismatches, we used $\left(\frac{i}{25}\cdot\frac{i}{l}\right)^6$ to calculate the mutagenic score of each repeat with $i$ identical matches in a repeat of $l$ bps, and $i/l$ was used so that more mismatches would cause a reduced score ($i = l$ if there were no mismatches). Then, the total mutagenic scores of DRs and IRs of a mitochondrial genome could be calculated, respectively.
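As a check on the formula, the score can be written as a short function. The sketch below follows the definition given in this section; the function names and the representation of repeats as (i, l) pairs are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def mutagenic_score(i: int, l: int,
                    reference_length: float = 25.0,
                    exponent: float = 6.0) -> float:
    """Score of one repeat with i identical matches in l bps.

    ((i / 25) * (i / l)) ** 6: the i/25 term scales by repeat length and
    the i/l term lowers the score when the repeat contains mismatches.
    """
    return ((i / reference_length) * (i / l)) ** exponent


def total_mutagenic_score(repeats: Iterable[Tuple[int, int]]) -> float:
    """Sum of scores over (i, l) pairs, e.g. all IRs of one genome."""
    return sum(mutagenic_score(i, l) for i, l in repeats)


# The IR discussed in the Results: 17 matches in 18 bps.
score_17_of_18 = mutagenic_score(17, 18)
```

For 17 matches in 18 bps this evaluates to about 0.0702, the value quoted in the Results, and a perfect 25-bp repeat scores exactly 1.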

Amplification and Sequencing of Inversions
We extracted total DNA from mouse brain, heart and liver tissues using the DNeasy Tissue Kit (QIAGEN) with the standard protocol. After PCR amplification with parallel primers (placed on the same mtDNA strand, synthesized by Integrated DNA Technologies), products were run on an agarose gel. Different bands were cut and extracted using the QIAEX II Gel Extraction Kit (QIAGEN). Extracted DNA was directly sent for sequencing, or was cloned into a plasmid using the TOPO TA Cloning Kit (Invitrogen) and then sent for sequencing. For some primer pairs with a single pure amplified band, DNA was directly extracted from the PCR product using the QIAprep Spin Miniprep Kit (QIAGEN) and then cloned or directly sent for sequencing. We originally designed primers of 22 to 30 bps and ran PCR using an annealing temperature of about 50 °C. Some inversions in Table 2 were amplified and sequenced with these primers. However, these primer pairs also gave nonspecific amplifications. Using longer primers (above 50 bps, usually below 60 bps) and a higher annealing temperature (above 60 °C and below 72 °C) allowed us to avoid nonspecific amplifications and primer dimer formation. Primers ended with G/C and were checked with the program Amplify 3.1 (http://engels.genetics.wisc.edu/amplify/) on Mac OS X to avoid nonspecific amplifications of mtDNA and to reduce the chance of primer dimer formation, with the "Strigency" (stringency) on the "Dimers" tab set close to the lowest to design primers with the lowest dimer formation potential. Inversions were amplified using 45 thermal cycles of 92 °C for 20 s, 64 °C for 30 s and 72 °C for 1 min using parallel long primers and Taq DNA polymerase.
Long-range (LR) PCR was performed using the Expand Long Template PCR System (Roche). The two primers of the LR single-primer PCR used to amplify the last three products of Table 2 were 5′-2590 CACCTTACAAATAAGCGCTCTCAACTTAATTTATGAATAAAATCTAAATAAAATATATACGTACACCCTCTAACC 2664-3′ and 5′-5405 CGCTCAGGCTCCGAATAGTAGATAGAGGGTTCCGATATCTTTGTGATTGGTTGAG 5351-3′, respectively, with the first primer on the plus strand of the mouse mitochondrial genome and the second primer on the minus strand. Brain total DNA of 30- or 24-month-old mice was used. The template was 0.2 µg of total genomic DNA. PCR was performed using the following thermal cycle: 92 °C for 2 min; 92 °C for 20 s, 68 °C for 4 min for 40 cycles; 68 °C for 5 min. PCR products were cloned into a TA cloning vector for sequencing.

Quantification of Inversions using Regular PCR
Since primer dimers formed when inversion quantity was very low (as in young mice or liver), we combined regular PCR with agarose gel electrophoresis to compare inversions in different tissues or of different ages, instead of using real-time quantitative PCR. The PCR was performed using the HotStarTaq Master Mix Kit (QIAGEN).
Equal amounts of total mtDNA were used as templates. The equal concentration of template DNA was verified using PCR with two primers that amplify a region of the mitochondrial genome. The reaction mix (20 µl) consisted of 10 µl Master Mix (2x), 0.1 µl of each primer (100 µM), 9 µl ddH2O (double distilled water) and the adjusted volumes of templates (averaging 1 µl). The thermal cycle was: 94 °C for 10 min; 92 °C for 15 s, 65 °C for 15 s, 72 °C for 30 s for >40 cycles; 72 °C for 5 min.

Figure 4 (caption, partially recovered). [...] Thin half-arrows indicate LR PCR primers (pLR1 and pLR2) and qPCR primers (pq1 and pq2) used to quantify the inversion. (B) Quantification of the recombination and replication inversion products using qPCR. Total DNA from a 24-month-old mouse brain was first amplified with 8 or 12 cycles of LR PCR with the primers pLR1 and pLR2 (B), pLR1 alone (S), or no primer control (N). Inversions resulting from replication errors can be amplified with pLR1 alone, while inversions resulting from homologous recombination require both pLR1 and pLR2 primers. The LR PCR was followed by qPCR with primers pq1 and pq2 to specifically quantify the inversion. The Ct values, which decrease linearly with the log of template concentration, are plotted for each PCR reaction. Error bars indicate s.e.m. (n = 6 for N; n = 12 for groups B and S). (C) A replicate of (B) using the total DNA from a brain of a different mouse (30-month-old). doi:10.1371/journal.pone.0073318.g004
Primer pairs used in Figure 3 were

Quantitative PCR (qPCR) Analysis
For this analysis we chose the inversion with the IR between positions 2742 and 5285 (Inversion 1). qPCR was done using GoTaq® qPCR Master Mix. The primer pair used to amplify this inversion was pq1 (5′-2604 GCGCTCTCAACTTAATTTATGAATAAAATCTAAATAAAATATATACGTACACCCTCTAACC 2664-3′) and pq2 (5′-5203 GAGATTTCTCTACACCTTCGAATTTGCAATTCGACATGAATATCACCTTAAGACC 5257-3′) as shown in Fig. 2E. Primer dimers did not form with this primer pair even after >40 thermal cycles.
To quantify the relative concentration of inversion #1 in total mtDNA in 30-month-old mouse brain, we made a 4X serial dilution of template to quantify total mtDNA. A 20X dilution of template was made to measure total DNA. The Ct values of the serial dilutions and the 20X dilution were measured by the same qPCR system on the same plate. For inversion #1, a 4X serial dilution of plasmid containing inversion #1 was made, and the Ct values of the serial dilutions and brain mtDNA were also measured by the same qPCR system on the same plate. We used the same threshold for both inversion and total mtDNA to calculate Ct values. The amplification efficiencies of total mtDNA ($f_m$) and inversion #1 ($f_i$) were calculated from the serial dilutions according to Fig. S5a. Given the Ct value of the total mtDNA ($Ct_m$) and of the inversion ($Ct_i$), the relative concentration of inversion #1 in mtDNA was calculated as $c = f_m^{Ct_m} / (20 f_i^{Ct_i})$.
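The two quantities used in this calculation can be sketched as small functions. The code assumes the standard relationship that template amount is proportional to $f^{-Ct}$ at a fixed threshold, and takes the efficiency from the number of cycles needed for a two-fold change in template, as described in the caption of Fig. S5a. The function and parameter names are hypothetical.

```python
def amplification_efficiency(cycles_per_doubling: float) -> float:
    """Per-cycle amplification factor f, from f ** cycles_per_doubling = 2.

    `cycles_per_doubling` is the number of cycles by which Ct shifts when
    the template concentration changes two-fold (from a dilution series).
    """
    return 2.0 ** (1.0 / cycles_per_doubling)


def relative_inversion_concentration(f_m: float, ct_m: float,
                                     f_i: float, ct_i: float,
                                     mtdna_dilution: float = 20.0) -> float:
    """c = f_m ** Ct_m / (dilution * f_i ** Ct_i).

    Template amount is proportional to f ** (-Ct); the total mtDNA was
    measured on a diluted sample (20X in the text), so its amount is
    multiplied back by the dilution factor.
    """
    return f_m ** ct_m / (mtdna_dilution * f_i ** ct_i)


# Efficiency for the 1.35 cycles per two-fold change quoted for Fig. S5a.
f_example = amplification_efficiency(1.35)

# Toy Ct values (invented): perfect doubling, inversion 10 cycles later.
c_example = relative_inversion_concentration(2.0, 10.0, 2.0, 20.0)
```

With 1.35 cycles per two-fold change, `f_example` is about 1.67, as stated for Fig. S5a. The toy Ct values are not measurements from the study.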

Nested PCR of LR PCR and qPCR to Quantify Inversion caused by mtDNA Replication Error
Three groups of reactions were performed simultaneously: Group B, pLR1 + pLR2; Group S, pLR1; Group N, no primers. If inversions were simple inversions caused by homologous recombination, then they could only be amplified by Group B. However, if inversions were inside a dimeric mtDNA circle caused by replication, then Groups B and S would have the same efficiency in amplifying the inversion. pLR2 was 1.6 kb from the nearest primer in qPCR, so it would not interact with it in the 30 s annealing/extension time.
To ensure that B and S had the same concentration of pLR1 and that all three groups had the same concentration of template DNA, we first thoroughly mixed 300 µl ddH2O, 6 µl dNTP, 30 µl Buffer 1, 4.5 µl Taq and 6 µl template, and then took 65 µl of the mix out for Group N. Then we added 1.4 µl pLR1, mixed thoroughly, and took 130 µl out for each of Groups B and S. Then we added 0.7 µl pLR2 to B, and 0.7 µl ddH2O to each of S and N. B and S were each aliquoted into 12 PCR tubes and N into 6 PCR tubes, with 10 µl in each tube.
For the qPCR, we first prepared a master mix (28 µl for each reaction) without template for 86 reactions. After the LR PCR finished, we added 2 µl of the LR PCR product into 27.5 µl qPCR mix to dilute the LR PCR mix and primers. Each reaction of B and S was repeated three times in qPCR and each reaction of N was repeated twice. We used the median Ct value of the replicates as the Ct of the product of each LR PCR reaction.

LR qPCR System
The LR qPCR system was the same as in the nested PCR, but with two additional dyes: a double-strand DNA dye, EvaGreen (from Biotium, similar to SYBR Green), and a reference dye, CXR (from the GoTaq® qPCR Master Mix Kit, Promega).

Animals
C57BL/6 mice were obtained from the NIA aged rodent collection.

Statistics
The p-values of Pearson's correlations were calculated by the function [r, p] = corr(x, y), and the p-values of Spearman's ρ were calculated by [r, p] = corr(x, y, 'type', 'Spearman') in MATLAB. The Shapiro-Wilk normality test of the residuals was performed by shapiro.test in R. Our data usually did not meet the normality assumption even after log transformation, but all data were log-transformed to improve normality. Comparative analysis by independent contrasts was performed using the CAIC algorithm [31] by treating MLS as the independent variable, and then Spearman's ρ and p-values of the contrasts were calculated.

Supporting Information
Figure S1 IRs can be rotated to simultaneously match their reverse complementary sequences. O, origin of the circular DNA. (EPS)
Figure S2 The correlations of DRs and IRs with MLS in all available animals. a, the correlation for DRs. b, the correlation for IRs. Arthropods (all labeled) have a high level of repeats, possibly because of their low body temperature and metabolic rates, although the available data are not enough to be conclusive. Caenorhabditis elegans has a relatively high level of repeats, since its mitochondrial genome is much shorter than that of other species, with only about 13.8 k bps in total. (EPS)
Figure S3 Results generated by self-BLAST. a, b, the correlation of DRs and IRs with MLS, respectively. c, the correlation of repeats of different lengths with MLS. d, e, repeat maps of both DRs (orange) and IRs (red) for mouse (Mus musculus) and human (Homo sapiens). Genomic annotations were extracted from the GenBank files of the reference genomes of the species in NCBI. Line widths are linear to the repeat lengths. Some repeats may be ignored because of the cutting process and because blastn does not output most short repeats (such as 1-6 bp repeats). All repeats are a subset of the repeats discovered by RollingRepeat even if we allowed extending a low-match region of only 5 bps instead of the default 20 in RollingRepeat, confirming the robustness of the RollingRepeat algorithm. (EPS)
Figure S4 Empirical relationship between DR length and deletion rate in yeast mitochondria. Since most animal repeats are short repeats and there are large errors for the longest repeats, we gave more weight to short repeats in the curve fitting using MATLAB. (EPS)
Figure S5 Supplementary qPCR results. a, qPCR Ct values of serial dilutions of the inversion amplified by the primer pair 5′-2604-2664-3′ and 5′-5203-5257-3′ (positions on the mouse mitochondrial genome) purified by the QIAprep Spin Miniprep Kit. The slope was 1.35, so a 2-times increase in the replicon concentration needed 1.35 cycles. So, the amplification efficiency f of each cycle of the qPCR was $2^{1/1.35} = 1.67$ (since $f^{1.35} = 2$). b, agarose gel of LR qPCR products of groups B and S. Arrows indicate the sizes of marker bands surrounding the expected bands. c, LR qPCR with the template being 4-times serial dilutions of the purified DNA between 5 kb and 6 kb of LR PCR group B. No-template controls were below the threshold (green horizontal arrow). Delta Rn, signal strength of double-stranded DNA during the annealing stage. (EPS)
Text S1. (DOC)