Genetic Analysis of a Rat Model of Aerobic Capacity and Metabolic Fitness

Aerobic capacity is a strong predictor of all-cause mortality and can influence many complex traits. To explore the biological basis underlying this connection, we developed via artificial selection two rat lines that diverge for intrinsic (i.e. inborn) aerobic capacity and differ in risk for complex disease traits. Here we conduct the first in-depth pedigree and molecular genetic analysis of these lines, the high capacity runners (HCR) and low capacity runners (LCR). Our results show that both HCR and LCR lines maintain considerable narrow-sense heritability (h2) for the running capacity phenotype over 28 generations (h2 = 0.47 ± 0.02 and 0.43 ± 0.02, respectively). To minimize inbreeding, the lines were maintained by rotational mating. Pedigree records predict that the inbreeding coefficient increases at a rate of <1% per generation, ~37-38% slower than expected for random mating. Genome-wide 10K SNP genotype data for generations 5, 14, and 26 demonstrate substantial genomic evolution: between-line differentiation increased progressively, while within-line diversity deceased. Genome-wide average heterozygosity decreased at a rate of <1% per generation, consistent with pedigree-based predictions and confirming the effectiveness of rotational breeding. Linkage disequilibrium index r2 decreases to 0.3 at ~3 Mb, suggesting that the resolution for mapping quantitative trait loci (QTL) can be as high as 2-3 cM. To establish a test population for QTL mapping, we conducted an HCR-LCR intercross. Running capacity of the F1 population (n=176) was intermediate of the HCR and LCR parentals (28 pairs); and the F2 population (n=645) showed a wider range of phenotypic distribution. Importantly, heritability in the F0-F2 pedigree remained high (h2~0.6). These results suggest that the HCR-LCR lines can serve as a valuable system for studying genomic evolution, and a powerful resource for mapping QTL for a host of characters relevant to human health.


Introduction
Increased intrinsic exercise capacity or aerobic capacity, generally measured as maximal work with a standardized treadmill test, is an excellent predictor of disease risk in humans, with higher capacities associated with enhanced health and resistance to metabolic disease [1][2][3][4][5][6][7][8][9]. Peak exercise capacity is a better predictor of mortality than other established risk factors such as hypertension, smoking, and diabetes [7,10]. This frequently observed statistical association suggests a causal connection. However, the biological basis for this connection remains largely unknown [11,12]. Aerobic capacity is a complex phenotype. While some studies document a strong genetic component of maximal oxygen uptake [13,14], other studies report low estimates of heritability [15,16]. In studies involving human subjects, it is often difficult to resolve the relative contributions of genetic and environmental factors to the inter-individual variability of aerobic capacity [13,[17][18][19][20][21], e.g., to resolve the relative effects of innate endurance from those due to aerobic training.
To understand the genetic and functional basis of aerobic capacity we sought to establish an animal model that allows indepth analyses of the biology and health impact of innate aerobic capacity. In 1996, a long-term experiment was initiated to create two lines of rats through bi-directional selection for untrained aerobic running capacity [22]. The two lines, termed high capacity runners (HCR) and low capacity runners (LCR), originated from a founder population of 186 genetically heterogeneous rats derived from outcrossing 8 inbred strains (N:NIH stock) [23]. The animals were selected by their performance in run-to-exhaustion tests on a progressively accelerating treadmill, with the highest and lowest runners, one for each sex from each of 13 families, entering into a rotational breeding scheme (see Methods).
One of the original aims for establishing the HCR-LCR lines was to test the hypothesis that artificial selection based on intrinsic aerobic capacity would yield models that also exhibit contrasts in disease risks. This hypothesis has been proven correct: after 28 generations of selection, the HCR and LCR diverged not only for running capacity, but also in other physiological measures, including blood pressure, body mass index, lung capacity, lipid and glucose metabolism [24]. The LCR, relative to the HCR, manifest numerous clinically relevant conditions, including increased susceptibility to cardiac ventricular fibrillation [25] and hepatic steatosis [26]. At the behavioral level the LCR score higher for dysfunctional sleep [27], diminished behavioral strategies for coping with stress [28], and impaired memory and learning [29]. In contrast, the HCR have reduced weight gain [30], increased resistance to the deleterious effects of a high fat diet [31,32], increased capacity for fatty acid oxidation in skeletal muscle [33] and liver [26], and a 28-45% increase of lifespan [34].
A major advantage of the HCR-LCR system is that the pedigree and running phenotype data (n = 11,422) are completely known; and tissue samples for most breeding members (n > 1,500) have been archived. This combination of existing data and reagents, combined with over 70 published physiological studies of the two lines, represents a valuable resource that allows comprehensive analyses of the effects of selection on genomic and phenotypic evolution.
In this study, we carried out a systematic analysis of the running phenotype and related traits over the known pedigree of 0-28 generations. We also collected a genome-wide 10K SNP dataset for a subset of breeding members from three generations (G) (n=142 over G5, G14, and G26), and used these data to examine patterns of genomic evolution in the two lines as they undergo selection. Finally, we performed the first intercross experiments between the HCR and LCR, and analyzed the phenotypic distribution and heritability of the F1 (n = 176) and F2 populations (n = 645). These analyses provided new insights into the genealogical structure, inbreeding patterns, and genetic variability of the two lines, and characterized the intercross animals to assess their suitability as a mapping population for identifying quantitative trait loci (QTL).

Rotational breeding and inbreeding coefficients
The protocols of animal maintenance, phenotyping, and rotational breeding have been described previously [22] (see also Methods). We analyzed the pedigree data for generations 1-28 (Files S1-S2), involving 5,976 HCRs and 5,446 LCRs. For each animal, we calculated its expected inbreeding coefficient (F) by tracing its parental lineage and documenting inbreeding loops. As expected, such pedigree-based estimates of F started to rise at G4-G5 and continued to increase over successive generations (Figure 1). The breeding history included occasional out-of-schedule pairings due to the lack of offspring of a certain sex in a given family or the need to substitute for unproductive mating pairs (see Methods for more details). Despite this, the pattern of F increase in actual pedigrees largely agrees with the expectation assuming perfect adherence to the planned rotation schedule (shown in solid lines in Figure 1). The cyclic rise of inbreeding coefficient every six generations is expected for 13 breeding pairs, due to the inevitable first-cousin paring every half cycle of the rotation [35,36] (further explained in Methods). Importantly, the average increase of estimated F is 0.94% and 0.95% per generation for HCR and LCR, respectively. These predictions are slower than the rate expected under random mating (shown in dotted lines in Figure 1), which increase at 1.51 per generation for both HCR and LCR, starting from the first generation.

Phenotypic response to selection and heritability
For each animal, we collected phenotype data that include maximal running distance, body weight at the time of running trial, and vertical work performed during each run. All animals were tested at 11-12 weeks of age. While both lines were derived from the same base population (indicated in yellow in Figure 2), their running performance gradually diverged over time. After 28 generations, the HCRs and LCRs differ by about 8.3-fold in running distance (9 times of the average within-line standard deviation), compared to ~2.8 fold (range of 298 to 840 meters) among eleven inbred lines commonly used in research [37]. The HCR continue to respond to selection (Figure 2, Table 1-2), with maximal running distance reaching >2000 m, 2.4 fold higher than the best recorded performance among the inbred lines [37]. The pattern of increase is consistent in both males and females ( Figure S1). Body weight increased in LCR and decreased in HCR In the first 12-13 generations, but did not diverge further after G13, stabilizing to a 0.7 to 0.8-fold difference through 28 generations (Figure 3, Table 1-2). In general, females are of lighter weight than males. However, as females tend to run longer, the overall vertical work is nearequivalent between males and females (1.3-fold difference in HCR, 1.1-fold difference in LCR) and larger in HCR than LCR by 6.8 fold.
The narrow-sense heritability (h 2 ) for the logarithm of running distance, which measures the proportion of total phenotypic variance explained by additive effects of genes, was calculated for each line separately, and was 0.47 ± 0.02 in HCRs and 0.43 ± 0.03 in LCRs when all 28 generations were considered. To evaluate potential change in heritability over time, we also calculated h 2 over four-generation, partially overlapping, intervals and found that while h 2 was variable across intervals, it maintained positive values, with no sign of abatement in later generations ( Figure S2). The within-line h 2 for bodyweight and vertical work are 0.45 ± 0.06 and 0.37 ± 0.02, respectively, for HCRs, and 0.17 ± 0.03 and 0.58 ± 0.02 for LCRs. These results indicate that although LCRs did not show a decrease of running capacity as dramatically as the increase in HCRs, the heritability of running performance was comparable in the two lines. Lower body weight is associated with better running capacity, as shown by the negative correlations between the two phenotypes for both sexes within each line. For HCRs, the spearman correlation is -0.41 ± 0.15 and -0.17 ± 0.16 for males and females, respectively. For LCRs, the correlation is -0.19 ± 0.16 and -0.12 ± 0.13 for males and females, respectively. The fact that HCRs continued to respond to selection and that both lines maintained within-line heritability suggest that causal genetic variants have not been fixed in either line, rather they continue to segregate in both pedigrees.

Increased genomic differentiation between lines
As an initial genetic characterization of the HCR-LCR system, we collected genotype data over a genome-wide panel of ~10K single nucleotide polymorphism (SNP) loci for 142 animals, consisting of 22-25 animals in each of three nonadjacent generations (G5, G14, and G26) in both lines (see Methods). We chose these three generations to profile the long-term genomic changes in the two lines.. We used the average heterozygosity of 61 X chromosome (ChrX) markers to infer sex (Figure S3A), and found no disagreement with the recorded sex of the 142 animals. Documented relatedness was also confirmed by plotting the pairwise proportion of not sharing DNA segments due to identity by descent (P(IBD) = 0, or Z0) against the proportion of sharing one copy IBD (Z1) using the 2,518 SNPs (SNP Panel-2, see Methods). Siblings and nonsibling relatives are separated into distinct clusters ( Figure  S3B), indicating that the sample identities reflected in the genotype data are consistent with the recorded pedigree. Multidimensional Scaling (MDS) analysis showed that at G5, HCRs and LCRs formed two readily separable clusters ( Figure  4). From G5 to G14 and from G14 to G26, between-lines  separation increased, indicating a progressively greater differentiation between LCR and HCR.
To measure the apportionment of total genetic variance into between-line and within-line components we performed an Analysis of Molecular Variance (AMOVA) [38]. The proportion of variance explained by among population difference, as a weighted average over all loci, is increasing over time, from 6.5% at G5, to 15.6% at G14, and to 26.5% at G26.

Decreased genomic diversity within lines
We analyzed genetic diversity at the individual level by calculating the average heterozygosity (H 0 ) across the 2,518 Panel-2 markers for each animal, and averaging within each of the six groups (two lines, at three time points) (   that reflect the altered genomic diversity. The observed rates of decrease are slower than in random mating populations. Using the known numbers of effective breeders at each generation we calculated the expected H 0 at each generation assuming random mating, and found that the expected rate of decrease in H 0 is on average 1.53 per generation for both HCR and LCR,. The observed rate of decrease, 0.95% and 0.97% per generation for HCR and LCR, respectively, is lower by 37-38%, consistent with the pedigree-based predictions ( Figure  1) and confirming that the rotational breeding scheme has successfully reduced inbreeding as predicted [39]. As the reduction of H 0 over time primarily reflects higher levels of inbreeding, a majority of the increase of homozygote genotypes should be accounted for by the emergence or expansion of long runs of homozygosity (ROH). Using 10,185 SNPs in SNP Panel-1 (see Methods), we found that for HCR, ROH covered an average of 46% of the genome in G5 animals, and this rate increased to 54.8% in G26. For LCR, ROH covered 46.8% of the genome in G5, and 55% in G26. Thus the non-ROH regions shrink by 0.73-0.77% per generation in the two lines, accounting for most of the decrease of H 0 .

Linkage disequilibrium
We examined linkage disequilibrium (LD) patterns using the Panel-1 SNPs on Chromosome 1 (n = 978). The LD index r 2 decays to 0.3 at ~3 Mb in both HCR and LCR (Figure 6). The level of LD is similar between HCR and LCR, showing slightly higher r 2 in later generations, and is consistent with those reported for the NIH Heterogeneous Stock [40]. These results also suggest that the resolution of QTL mapping using HCR-LCR can be as high as 2-3 cM, considerably higher than the 20-40 cM resolution of F2 intercross of inbred lines [41].

"F2" intercross of HCR-LCR
As HCR and LCR have evolved separately, a direct between-line comparison of phenotypes and genotypes would incur the effect of population stratification. To create a QTL mapping population with randomized genomes, we performed "F2" intercross experiments using 28 HCR-LCR pairs and obtained 242 F1 animals (176 phenotyped). From the phenotyped F1 population we set up 63 mating pairs and produced 645 F2 animals. The term "F1" or "F2" is applied loosely in this context because our crosses were not based on inbred lines. However, we use F0, F1, and F2 to indicate the generations.
The running phenotype of F1 fell in an intermediate range between that of their HCR and LCR parents, and the F2 animals exhibited larger variations than the F1 rats (Figure 7). This pattern is consistent with the model in which most neutral alleles no longer co-segregate with the phenotypes; and at functional loci (i.e., those responsible for the phenotypic differences between HCRs and LCRs), F1s tend to be heterozygous and F2s carry a wider assortment of genotypes.
Importantly, the h 2 of the maximal running distance in the F0-F2 pedigree remained high, at 0.60 ± 0.05. The h 2 for vertical work is comparable to those of the ancestral lines under selection (0.61 ± 0.05), while the h 2 for body weight is lower (0.03 ± 0.04). In addition, we phenotyped a number of other physiological measures in a subset of F2 animals, including heart/body mass ratio (n=380), extensor digitorum longus (EDL) mass/body mass ratio (n=387), and percent body fat (n=490), which showed strong heritability (0.42 ± 0.11, 0.36 ± 0.10, and 0.48 ± 0.10, respectively). These results indicate that the running capacity and related traits are influenced by genetic factors, confirming that it is feasible to use the F2s as a mapping population for QTL.

Discussion
While previous studies have focused on functional, physiological comparisons between HCR and LCR rats, here we conducted the first in-depth pedigree and molecular genetic analysis of the two lines. Phenotypic data over the 28generation pedigree not only revealed substantial heritability for the running capacity trait, but also showed that the heritability is maintained in later generations, suggesting continued selection. In addition, the strong heritability is recapitulated in the F2 intercross. These findings suggest that causal variants continue to segregate in both HCRs and LCRs and persist in the F2 population. Our study generates the first direct evidence that the trait under selection is highly heritable, providing justification for intercross-based QTL mapping. In addition, the heritability for vertical work, heart/body ratio, EDL/body ratio, and percent body fat suggests that the model can be used for simultaneous QTL mapping for multiple traits.
We observed continued response to selection in HCR during 20-28 generations. This observation is notable because it can be interpreted in two possible scenarios. The first is that the aerobic running capacity may be influenced by many interacting QTLs, and as variants in some loci become fixed under selection, the previously hidden phenotypic effects of other variants can be "released" and come under selection, thereby fueling prolonged responsiveness. This agrees with previous observations in similar systems that long-term selection did not exhaust the genetic variation for the selected trait due to the graduate shifts in the capacitors of cryptic genetic variation [42,43]. The second scenario is that the trait may be governed by many QTL of small effects, hence the strength of selection (~20% of animals become breeders in each generation (see Table 3)) may not have effectively driven the rapid changes of allele frequencies. The two scenarios are not mutually exclusive, and the observation means that not all causal alleles have been differentially fixed in the two lines. Therefore the F2 mapping approach needs to consider the possibility that the causal variants may be segregating within one or both lines.
HCRs exhibited accelerated improvement of running capacity during G12-G15 (Figure 2). To identify the cause(s) of this acceleration we examined factors such as diet, running protocols, the breeding schedule, and "Operator", i.e., the experimenter or a team of experimenters assessing the running phenotype. The average litter size (i.e., fecundity) in the recorded HCR pedigree was not changed significantly during this period (Figure S4A), hence there was no noticeable change in fertility or the strength of selection (i.e., the fraction of animals chosen as breeders). There was no systematic correlation between litter size and inbreeding coefficient of the offspring (not shown); and there was no documented change in diet, running apparatus, or running protocol. The breeding schedules for the two lines were closely synchronized across all 28 generations (Figure S5A). The pedigree-based prediction of F was increasing in both HCR and LCR as expected ( Figure  1). However, a more detailed retrospective analysis of the breeding records found three factors having changes during the G12-G15 period. The first is Operator: a team supervised by Operator 3 performed the running tests during G7-G13, while a team supervised by Operator 4 performed the tests during G14-G15. The second factor is the number of animals in the pedigree with no entries for running data which accrued mostly from rats that "refused" to run. The number of noncompliant rats in both lines gradually increased during G7-G13 in both lines, dropped immediately at G14, and remained low for most of subsequent generations ( Figure S4B). Despite presumed standardization of the running protocol, the loss of running data may be Operator dependent in the sense that "refusal to run" is a subjective measure. Third, the fraction of mating pairs that were out-of-schedule increased in G12-G14 in HCR, and dropped after G15 ( Figure S4C). The simplest interpretation of these co-occurrences is that Operator 3 subjectively determined that a large number of animals refused to run. Those who did run showed no improvement over G6-G13. With Operator 4, nearly all animals were able to run, and ran better than previous generations. While plausible, this simple scenario does not explain all the observations. First, despite being kept and tested under the same conditions as the HCRs, the LCRs exhibited no comparable acceleration or deceleration in running capacity. Second, the acceleration in HCRs began in G12-G13 with the unexplained emergence in some families of one or two exceptional runners, whose running distance were often more than twice as long as that of their siblings ( Figure S6). The performance of these runners could not be easily explained by Operator. Partly because the exceptional runners tended to be selected as breeders, such improved performance spread wider across the cohort in G14-G15 and gradually became the norm after G16. However, there was not a clear-cut Mendelian segregation pattern in these generations: the pairing of two exceptional runners often still produced mediocre offspring. Among HCR mating pairs in G12-G15 there were 13 out-of-schedule pairs, which did not produce more exceptional runners than on-schedule pairs (not shown). The location of animal facility changed between G15 and G16 for both lines, but this change took place after the acceleration had started. Despite these complications, heritability estimates for HCR, when calculated for threegeneration sections of the pedigree and shifted by one generation, did not show dramatic changes over the generations (Figure S5B). The accelerated improvement of running capacity in HCRs during G12-G15 could also reflect genetic changes. However, emergence of a single high-impact de novo mutation is unlikely, as prodigious running capacity arose in multiple families concurrently. Such a pattern, however, is compatible with a scenario in which causal "high" alleles in multiple genes interact in a non-linear fashion. Various combinations of the high alleles could have undergone gradual enrichment and in G12-G15, began to manifest as improved phenotype when the most favored combinations were formed. Future studies, including linkage analysis of these intermediate generations, are needed to characterize the genetic changes accompanying the apparent varying tempo of trait evolution.
In the F2 generation of the intercross, the running distance distribution is wider than in F1, but did not reach the full range seen in F0 animals. The fact that none of the F2 animals performed as well as their HCR grandparents, and very few performed as poorly as the LCR grandparents, strongly suggests that multiple genetic loci are involved. The Castle-Wright estimator of the effective number of QTLs is calculated as 4-10 using our F2 data [44,45]. Caution should be taken as the calculation is based on simplifying assumptions such as unlinked loci of equal effects that have no interaction. Only the actual linkage or association studies can reveal the number and impact of QTL underlying the trait in question.
The HCR-LCR system was initiated in 1996 [22] and reached G28 in 2011. During this time, the two lines have diverged in innate endurance running capacity and showed marked differences in body type and metabolic traits. The HCR animals show a lower weight gain than LCR, in both young and adult rat, and this can be partly accounted for by higher spontaneous activity and lower fuel economy during activity in HCRs [32]. The two lines also diverged for many health indicators, with HCR showing a relative resistance to obesity, higher insulin sensitivity, lower blood pressure, improved lipid parameters, and enhanced longevity [30,31,34]. These phenotypes are of immense public health interest, as prevalence of diabetes, cardiovascular disorders, obesity, and metabolic syndrome is rising at an alarming rate and account for a major portion of disease burden worldwide [46]. The model system used in this study is ideally suited for elucidating the fundamental biology of metabolic health. Understanding the genetic architecture and molecular underpinnings of the remarkable HCR-LCR differences has the potential to provide new insights into the relationship between exercise capacity and metabolic health in humans.
Taken as a whole, the results presented above suggest that the HCR-LCR system is well-suited to serve as a novel model system for studying genome evolution under sustained selection and for dissecting the functional and genetic basis of polygenic traits. The model exhibits large phenotypic divergence, sustained heritability for a wide range of cardiovascular and metabolic traits, and maintained outbred character. The complete pedigree is known, with running phenotype for all animals already collected, and tissue sample for most breeders archived. Compared to inbred line-based gene mapping, our system offers some additional advantages. First, while the F2 generation could be subjected to conventional linkage mapping [47], the two lines have accumulated ~60 generations of historical recombination (~30 as the NIH Heterogeneous Stock, 28 generations of divergent selection, and F2 intercross). Consequently, animals in both lines carry fine-grained genomic mosaics of eight "ancestral" inbred strains, with LD structure on the order of 3 Mb, allowing for greater resolution in association analysis [48][49][50]. Second, our system has maintained genetic diversity through rotational breeding, such that networks of interacting QTLs may have evolved jointly under selection, making the system particularly suitable for detection of interaction QTL [51]. Combining QTL mapping with the wealth of existing knowledge of the HCR-LCR system is expected to allow the identification and prioritization of high quality candidate genes that will shed insight into the biology of oxidative capacity and metabolic fitness.

Ethics statement
This study was approved by the University Committee on Use and Care of Animals, Ann Arbor, Michigan (Approval Numbers: #08905 and #03797). The proposed animal use procedures are in compliance with University guidelines, and State and Federal regulations.

Rotational breeding scheme
In practice, each line contains at least 13 mating pairs through all generations. From each of the family produced, one male and one female are selected as breeders for the subsequent generation. For HCR, the male and female with the greatest running distance are selected, whereas in LCR, those with the lowest distance are selected. The breeders are paired between different families to avoid brother-sister mating, and the pairings rotate in successive generations to minimize inbreeding [39]. When the 13th rotation is reached, samefamily mating is skipped, and the pairings are reiterated starting again in the same way as rotation 1 (Figure S7A). Sometimes, if a particular mating fails, or if a family lacks animal of one sex, substitute mating is attempted involving a male from another family ( Table 2). In some cases, one male is mated to two females. After G12, female HCR with extremely low body weights were not selected to be breeders in order to avoid reduced fecundity [52]. During G9-G13 in HCR there were also three cross-generation matings, whose offspring were incorporated into subsequent generations. Further, occasionally additional pairs are bred to generate experimental cohorts for study by us or for sharing with collaborators, and the progenies in these "analytical families" are not used for maintaining the lines, and are not counted in our calculation of the expected inbreeding levels (see below). They are, however, used to calculate heritability and the distribution of trait values.
One inevitable consequence of this breeding scheme is the mating of first-cousins at every half interval (Figure S7B). For example, at G7, every breeding pair, such as 1M-7F (a male from Family 1 and a female from Family 7) involves first cousins, because they are from 1F-7M and 7F-13M matings respectively, in G6, in which 7M and 7F are siblings. This results in a 6% spike in inbreeding values in G8 (Figure 1), and such a cyclic pattern continues in subsequent generations, resulting in spikes at G14, G20, and G26. The actual pedigree deviates from a perfectly executed breeding scheme due to the inclusion of substitute breeders, and the resulting estimates of inbreeding coefficient from the actual pedigree depart slightly from the expectations (Figure 1). As some breeding pairs were assembled to generate offspring for research use rather than line propagation, the "effective breeders" are those that contribute offspring who are also used as breeders, and do not include those whose offspring were used only for research ( Table 3).

Running phenotype
Eleven week old animals are subjected to run-to-exhaustion tests without prior training, except for brief sessions of treadmill education during the week prior to the tests. The purpose of such education sessions is to familiarize the rats to the experimenters and the testing equipment and to ensure that each rat has the ability to achieve a minimal level of continual running for 5 minutes at least once, which constitutes the threshold performance necessary for inclusion in the actual running tests the following week. During education, the rats learn to keep running in order to avoid a mild shock (1.2 mA of current at 3 Hz) induced by the electrified grid located at the back of the treadmill. For all sessions the treadmill is set at a 15-degree upward slope.
During the run-to-exhaustion test, each rat was evaluated on five consecutive days (Mon-Fri) for G0-G16 and on three alternating days (Mon-Wed-Fri) for G17-28. Each trial starts at a velocity of 10 m/min, which increases by 1 m/min every 2 min until the rat reaches exhaustion. Exhaustion point is defined as the third time a rat can no longer keep pace with the treadmill and remains on the shock grid for two seconds rather than resuming running. At this point the rat is removed from the treadmill and weighed. For each rat, the best distance out of the multiple trials is taken as the best estimate of its intrinsic capacity, and used as the criterion for breeder selection. The vertical work during each trial is estimated using the equation: work = (running distance) x (body weight) x (sin[15°]) x (9.8m/s 2 )/1000 in which the unit for work is joule (J=kg•m 2 /s 2 ). Unit for running distance and body weight is meter and gram, respectively.

Phenotype distribution
To display phenotype distribution we produced violin plots (as shown in Figures 1, 2, and 3) using the vioplot function from the Vioplot package in R. To calculate the Spearman's rank correlation coefficient (ρ) between maximal running distance and body weight, separately for two sexes within each line, we used the cor and cor.test functions in R. The ρ values were calculated for each generation, and averaged over G1-G28.

Inbreeding coefficient and Heritability
We calculated the inbreeding coefficient (F) for each animal in the pedigree using the calcInbreeding function from the pedigree package in R [53]. The pedigree for earlier generations of NIH:H animals, i.e., those that preceded the founders of our lines, was not available. We are therefore limited to calculate the increase of inbreeding coefficients from those of the founders, effectively assuming they were unrelated, while in fact they were related according to the (unknown) breeding patterns in the preceding generations. To calculate the expected F under random mating, we used the equation F n+1 = F n +(N f +N m )/(8*N f *N m )-F n (N f +N m )/(8*N f *N m ), where the F n and F n+1 are the inbreeding coefficients at the n-th and (n+1)-th generation, respectively, and N f and N m are the numbers of male and female breeders at the n-th generation (N f = N m = 13 in every generation for a 13-family breeding scheme). To calculate the expected F under perfect adherence to the rotational scheme, we generated an idealized pedigree of 13 mating pairs of exact mating patterns as intended, and used the pedigree package to calculate F for every member of the pedigree.
To calculate the narrow-sense heritability (h 2 ), we applied the variance and covariance component models as implemented in SOLAR version 4.3.1 [54]. We estimated h 2 for maximal running distance both over the entire pedigree and for fourgeneration intervals (with one and three generation overlap) to assess the h 2 variation over time. We also estimated h 2 for body weight and work for the G0-G28 pedigree, and for additional traits for the F2 intercross. For the G0-G28 analysis, we included sex and operator as covariates; and for the F2 intercross we included sex and batch because the breeding was performed in two batches, containing 154 and 491 F2 animals, respectively.
Genotyping and data processing DNA from 22-25 breeders from both lines in three nonadjacent generations (G5, G14, and G26, n=142) was extracted from frozen liver tissue, and genotyped across 10,846 SNP loci using the Affymetrix Rat Mapping 10K GeneChip. Attempts to extract DNA from generations earlier than G5 revealed that many samples in G0 and G4 were degraded. We therefore chose G5 as the earliest generation in our analysis due to its assured DNA quality. In assessing the quality of SNP markers we removed 28 duplicate SNPs, 496 SNPs with genotype missing rate >10%, and 137 SNPs with Hardy-Weinberg Equilibrium test p < 0.001. These steps led to 10,185 SNPs that formed the "Panel-1" markers. As some analyses require a reduced set of SNPs without rare variants and without strong linkage disequilibrium, we removed from Panel-1 an addition set of 7,284 SNPs selected by trimming SNP pairs in linkage disequilibrium with r 2 value >0.05 (in windows of 10 SNPs, sliding by 2 SNPs each time), and 572 SNPs with minor allele frequency (MAF) <5%. After these steps, 2,518 SNPs remained and formed the "Panel-2" markers. The Panel-2 markers were used in calculations of IBD, AMOVA, and genome-wide average heterozygosity. Pairwise Identity-by-State (IBS) matrix was estimated in PLINK [55] using the -genome command and Panel-2 markers. Multidimensional scaling analysis of the IBS matrix was performed in R [56]. Analysis of molecular variance (AMOVA), as implemented in the program Arlequin, was used to calculate the within-and among-group differentiation [57].
We assessed the accuracy of recorded sex for each genotyped animal by calculating the average heterozygosity of the X chromosome SNPs. Male and female animals are confirmed by non-overlapping distributions of ChrX heterozygosity values. We confirmed known sibling pairs among the genotyped animals by plotting pairwise Z0 vs Z1 values in R. Z0 and Z1 values were determined in PLINK using the -genome command.

Runs of homozygosity
Using the 10,185 Panel-1 markers, we identified long runs of homozygosity (ROH), in PLINK using the -homozyg command. We defined ROHs as genomic segments with at least 4 homozygous markers and having a density of at least 1 SNP per 500Kb. Total ROH length in each animal was obtained by summing over all ROH and also reported as the fraction of the rat genome (2.75 Gb).

Genomewide average heterozygosity
Using the Panel-2 markers, we calculated the average heterozygosity in PLINK using the -hardy command, and compared across genotyped lines and generations using boxplots. The expected heterozygosity values were calculated using the equation H n+1 = H n (1-((N f +N m )/(8*N f *N m ))), where the H n and H n+1 are the heterozygosity at the n-th and (n+1)-th generation, respectively, and N f and N m are the numbers of male and female breeders at the n-th generation, respectively (N f = N m = 13 in every generation for a 13-family breeding scheme) ( Table 3).

LD calculation
Pairwise measurements of LD (r 2 ) were calculated for marker pairs within 5 Mb on Chromosome 1 using Haploview [58]. Chromosome 1 was chosen as a representaive autosome. To show the relationship between r 2 and inter-marker distance, we calculated average r 2 values for groups of marker pairs falling in discrete bins of inter-marker distance, in 500Kb increments, and plotted the values for G5, G14, and G26 in both HCR and LCR (as shown in Figure 6).

"F2" intercross
We performed the F2 intercross in two batches. For the first batch, we randomly selected 4 males and 4 females from G26 of each line to form 8 HCR-LCR reciprocal pairs, which generated 79 F1 rats, from which 20 males and 20 females were randomly selected to form pairs between different F1 families (i.e., avoiding brother-sister mating). This generated 154 F2 rats. For the second batch, we selected 9 males and 9 females from G28 of each line to form 18 mating pairs, which generated 163 F1 rats, from which 97 were phenotyped. Out of the 97 F1 animals, 40 males and 40 females were selected across the 18 families with equal representation between HCR/LCR parentage, such that we had 4 combinations (HCRmom male with HCR-mom female, HCR-mom male with LCRmom female, LCR-mom male with HCR-mom female, and LCR-mom male with LCR-mom female) and generated 491 F2 rats. The two batches together yielded 645 F2 rats (Figure 7). Phenotyping for running performance followed the same protocols as described above. Additionally, for F2 animals in the second batch we measured lean mass, fat mass, fluid mass, fasting blood glucose, heart mass, and EDL muscle mass at 16-20 weeks of age. Body composition was determined via NMR using Bunter Optics Minispec LF90 II.
Blood glucose after a 4-hour fast was determined using Accu-Check Aviva meter. At time of dissection, heart and EDL muscles were weighed immediately upon harvesting. Figure S1. Distribution of maximal running distance for generations 0 to 28 for males (A) and females (B). Shown are "violin-plots" for individual generations for females and males separately. The blue tick marks on the y axis indicate the maximal running distance for eleven inbred lines, which are ordered, from top to bottom for males (A) as DA (968m) and G15 (D). Shown are the scatterplots of midparent running distance (x-axis) versus male and female offspring running distance (y-axis) for on-rotation mating (indicated by XX.0), off-rotation mating due to mother (XX.1), off-rotation mating due to father (XX.2), off-rotation mating due to both mother and father (XX.3), and either father or mother from the enrichment cohort (XX.4). Data points above the solid diagonal line indicate offspring with better running performance than their mid-parent. (TIFF)