Genome Wide Identification of SARS-CoV Susceptibility Loci Using the Collaborative Cross.

New systems genetics approaches are needed to rapidly identify host genes and genetic networks that regulate complex disease outcomes. Using genetically diverse animals from incipient lines of the Collaborative Cross mouse panel, we demonstrate a greatly expanded range of phenotypes relative to classical mouse models of SARS-CoV infection including lung pathology, weight loss and viral titer. Genetic mapping revealed several loci contributing to differential disease responses, including an 8.5Mb locus associated with vascular cuffing on chromosome 3 that contained 23 genes and 13 noncoding RNAs. Integrating phenotypic and genetic data narrowed this region to a single gene, Trim55, an E3 ubiquitin ligase with a role in muscle fiber maintenance. Lung pathology and transcriptomic data from mice genetically deficient in Trim55 were used to validate its role in SARS-CoV-induced vascular cuffing and inflammation. These data establish the Collaborative Cross platform as a powerful genetic resource for uncovering genetic contributions of complex traits in microbial disease severity, inflammation and virus replication in models of outbred populations.


Introduction
Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) emerged in humans in Southeast Asia in 2002 and 2003 after evolving from related coronaviruses circulating in bats [1,2]. SARS-CoV caused an atypical pneumonia that was fatal in 10% of all patients and 50% of elderly patients [3,4]. Patients infected with SARS-CoV experienced fever, difficulty breathing and low blood oxygen saturation levels [5,6]. Severe cases developed diffuse alveolar damage (DAD) and acute respiratory distress syndrome (ARDS), and disease severity was positively associated with increased age [7]. Host genetic background is also thought to influence disease severity, but this understanding is complicated by inconsistent sample collection, varying treatment regimens and the limited scope of the SARS epidemic in humans [3,8,9]. Existing animal models of SARS-CoV infection have revealed that this lethal pulmonary infection causes a denuding bronchiolitis and severe pneumonia which often progresses to acute respiratory failure [10,11,12]. More recently, a second emerging coronavirus designated Middle East Respiratory Syndrome Coronavirus (MERS-CoV) emerged from bat and camel populations [13,14,15], and has caused ~38% mortality. Given the complex interplay between environmental, viral and host genetic variation in driving viral disease severity, as well as the difficulty of studying those factors in episodic outbreaks of pathogens such as SARS-CoV, MERS-CoV and other highly virulent zoonotic pathogens that cross the species barrier at regular intervals, novel approaches are needed to identify and understand the factors contributing to these diseases.
Host genetics play a critical role in regulating microbial disease severity, evidenced by the identification of highly penetrant host susceptibility alleles within CCR5, FUT2, IL-28B in controlling HIV, norovirus and HCV infection and disease severity, respectively [16,17,18]. However, most microbial infections cause complex disease phenotypes that are regulated by the interactions of oligogenic traits with reduced penetrance, making them extremely difficult to identify and validate in human populations during outbreaks. Mannose binding lectin (MBL) polymorphisms were alternatively associated with successful recovery from SARS-CoV infection and a poor outcome of infection [19,20], reflecting the complexity of performing candidate gene or genome wide association studies with limited human samples. The generation of a mouse adapted strain of SARS-CoV, MA15, allowed for development of a small animal model that replicated both human lung disease and the age-dependency of SARS-CoV pathogenesis [10]. MA15 infection of inbred mice deficient in various immune genes has greatly contributed to our understanding of the host response to SARS-CoV infection [21,22]. However, such studies have focused on extreme abrogation of rationally selected candidate genes and have not evaluated the role of undescribed polymorphisms in genes in a model mimicking the genetic diversity seen in the human population. As a complement to human genome wide association studies, here we apply a new approach designed to dissect the identity and contributions of monogenic and oligogenic variants on multiple traits in complex disease outcomes following acute virus infection in a mouse model of human populations.
The Collaborative Cross (CC), a novel eight-way recombinant inbred (RI) mouse strain panel, has recently become available to the scientific community [23,24,25]. The power of the CC for genetic mapping is enhanced by availability of complete genome sequences of the founder strains and rich bioinformatics resources [26,27,28]. The eight founder strains used to generate the CC (A/J, C57BL/6J, 129S1/SvImJ, NOD/ShiLtJ, NZO/HILtJ, CAST/EiJ, PWK/PhJ and WSB/EiJ) are phenotypically diverse and capture single nucleotide polymorphisms (SNPs) and insertion/deletions (In/Dels) at approximately twice the frequency of common variants in human populations [24,29,30,31,32]. The derivation of CC strains from these multiple founders has proven to be useful for identifying polymorphisms that are responsible for a variety of traits [23]. The CC supports precise genetic mapping and, because the CC strains are genetically reproducible, it also serves as a robust validation platform and reference resource for integrative systems genetics applications.
Here, we studied incipient lines of the CC (the preCC) to identify host genes that contributed to SARS-CoV MA15 infection and pathogenesis. We identified four novel quantitative trait loci (QTLs) contributing to SARS-CoV pathogenesis. Within the HrS1 QTL, a combination of approaches applied to the CC platform predicted a single gene candidate, Trim55, as the principal regulator of vascular cuffing after infection. Vascular cuffing is a commonly reported phenotype observed in response to a variety of insults including chemical injury and infection [33,34,35]; high levels of vascular cuffing have been observed in models of severely pathogenic SARS-CoV infection [21,22]. Fluid vascular cuffing has been reported to decrease lung compliance, suggesting an important physiologic consequence of this response [36]. Using knockout mice, we confirmed the role of Trim55 in immune cell infiltration, demonstrating the utility of the CC platform for identifying single gene candidates that likely regulate novel immune functions in trans-endothelial migration and perivascular cuffing following virus infection.

Expansion of SARS-CoV associated phenotypes across the genetically diverse preCC
Mice from the eight founder strains as well as 147 eight to twenty week old female preCC mice were infected with 10^5 plaque forming units (PFU) of mouse adapted SARS-CoV, designated MA15 [10], and weight loss was observed over the course of a four day infection. At day four post infection mice were euthanized and tissue collected for assessment of viral load in the lung as well as virus-induced inflammation and pathology. A wide range of susceptibilities to SARS-CoV infection was found among the founder strains of the CC, and the overall heritability of weight changes following SARS-CoV infection was determined to have a coefficient of genetic determination of 0.72. NOD/ShiLtJ mice were resistant to infection and gained weight over the course of the experiment (Figs 1A and S1A). A/J, C57BL/6J, 129S1/SvImJ and NZO/HILtJ mice experienced moderate and transient weight loss as previously described [21,22], while CAST/EiJ, PWK/PhJ and WSB/EiJ mice demonstrated extreme susceptibility to SARS-CoV infection including substantial weight loss and death (Figs 1A, S1A and S1B). Subsequent dose response studies using the three highly susceptible wild-derived strains indicated an LD50 of between 100 and 500 PFU for CAST/EiJ mice, between 500 PFU and 1000 PFU for PWK/PhJ mice and between 10^3 and 10^5 PFU for WSB/EiJ mice (S1 Table). PreCC mice infected with SARS-CoV ranged from over 30% weight loss by day four post infection to over 10% weight gain (Fig 1A), exceeding the range of susceptibilities observed in the founder strains. Additionally, 26 preCC mice (18% of the preCC cohort) succumbed to infection prior to the day four harvest point, indicating extreme susceptibility to SARS-CoV infection.
Viral load in the lung at day four post infection was determined for each surviving preCC mouse as well as for each of the founder strains. Viral lung titers showed a heritability of 0.60 as measured by the coefficient of genetic determination amongst the 7 surviving founder strains. Amongst the founder strains, PWK/PhJ mice had the lowest viral loads in the lungs, with 1.75x10^3 PFU per lung at day four post infection (Figs 1B and S1B). PWK/PhJ mice also showed significant weight loss and a low LD50, indicating that viral load was unlikely to be responsible for pathogenesis in these mice. In contrast, C57BL/6J mice had the highest amount of virus at 6.35x10^6 PFU per lung. Lung titers in the preCC mice ranged from below the limit of detection (100 PFU/lung) to over 10^8 PFU per lung, greatly exceeding the range of viral loads in the founder strains. Some preCC mice had viral loads in the lung below the 100 PFU limit of detection despite having substantial weight loss. CAST/EiJ mice are extremely susceptible to SARS-CoV infection and did not survive until the day four post infection timepoint. Fig 1C shows the relationship between weight loss and lung titer at day four post infection. We found no correlation between viral load in the lung at day four post infection and weight loss (r = -0.014, p = 0.8938) when excluding those animals with viral loads below the limit of detection. When those animals were included in the analysis, there was a significant but weakly explanatory correlation (r = -0.347, p = 0.00019) between the two phenotypes.
Multiple aspects of lung pathology were assessed in surviving preCC animals including disease and immune infiltrates in the airways, vasculature, alveoli and parenchyma and signs of DAD (S2 Table). A wide variety of lung pathologies were found across the preCC mice including denudation of airway epithelial cells, airway debris, eosinophilia, hyaline membrane formation and vascular cuffing (Fig 2A-2F). Quantification of the overall pathology score along with select data ranges is shown in S2 Fig. Hyaline membrane formation and pulmonary edema with accompanying inflammation in the alveoli were hallmarks of SARS-CoV infection in human cases and are also evident in aged mouse models of disease [11]. In contrast to young founder strain animals, robust hyaline membrane formation was observed in 13% of preCC mice at day four post-infection, demonstrating that improved animal models are one likely outcome of infection studies in the CC. Phenotypic correlations of varying strengths were observed between aspects of lung pathology, inflammation, viral load at day four post infection, and weight loss across the course of the study (Fig 3).
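The heritability values reported above for weight change (0.72) and lung titer (0.60) are coefficients of genetic determination estimated from the founder strains. The text does not state which estimator was used, so the following is only a sketch under one common assumption for inbred strains: a one-way ANOVA by strain, with the coefficient taken as var_between / (var_between + 2 * var_within). The function name, the unbalanced-design correction (n0) and the example values are illustrative and do not come from the study.

```python
from statistics import mean


def genetic_determination(groups):
    """Estimate a coefficient of genetic determination from inbred strains.

    groups: dict mapping strain name -> list of phenotype values
            (at least two strains, at least two animals in total per df).
    Assumption: g2 = var_between / (var_between + 2 * var_within),
    with variance components taken from a one-way ANOVA by strain.
    """
    sizes = [len(values) for values in groups.values()]
    k, total = len(sizes), sum(sizes)
    grand = mean([x for values in groups.values() for x in values])

    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total - k)

    # Effective group size; equals the common n when the design is balanced.
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)

    # Negative estimates of the between-strain component are truncated at zero.
    var_between = max((ms_between - ms_within) / n0, 0.0)
    return var_between / (var_between + 2.0 * ms_within)


# Hypothetical percent-weight-change values for three strains.
example = {"strainA": [1.0, 2.0, 3.0],
           "strainB": [4.0, 5.0, 6.0],
           "strainC": [7.0, 8.0, 9.0]}
g2 = genetic_determination(example)
```

The n0 term is included because the founder group sizes in this study were unequal (n = 2 to 5 per strain); with equal group sizes it reduces to the per-strain sample size.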

Multiple genetic loci contribute to aspects of SARS-CoV pathogenesis
We genotyped 140 preCC animals at high density, including several that succumbed to infection prior to the scheduled day four harvest. As previously described [23,27], we conducted QTL mapping using Bagpipe (http://valdarlab.unc.edu/software.html) and the underlying eight founder strain haplotypes present in the CC to identify host genome regions containing polymorphisms significantly associated with SARS-induced disease responses. We identified four QTLs associated with phenotypes at day four post infection. We identified a significant main effect QTL for vascular cuffing (HrS1, Chr 3: 18286790-26668414), which explained 26% of the variation in vascular cuffing. We also identified two highly suggestive (genome-wide p-values based on permutations of 0.1>p>0.05) QTL for viral titer (HrS2, Chr 16: 31583769-36719997) and eosinophil infiltration (HrS3, Chr 15: 72103120-75803414), explaining 22% and 26% of the variation in these traits, respectively. Finally, we also searched for modifier QTL, those QTL additively influencing a trait of interest but whose presence was initially masked by our three other identified QTL. We identified a significant QTL further influencing vascular cuffing (HrS4, Chr 13: 52822984-54946286), explaining an additional 21% of the variance in this phenotype. HrS4 was a moderate peak even without considering HrS1 status, suggesting that these interactions are additive. Table 1 details each of the SARS susceptibility QTLs including LOD and p-values. Analysis of other phenotypes did not lead to discovery of QTLs passing the p<0.01 significance threshold.
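The genome-wide p-values above are described as permutation based. As a rough illustration of that idea only, the sketch below shuffles phenotypes across animals, records the strongest association seen anywhere in the genome for each shuffle, and compares the observed maximum against that null distribution. It is not Bagpipe's model: a simple squared Pearson correlation against a single dosage vector per marker stands in for the eight-haplotype regression, and all names and data are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def genome_wide_p(markers, phenotype, n_perm=1000, seed=0):
    """Permutation p-value for the best marker in a genome scan.

    markers:   list of per-marker dosage vectors (one value per animal)
    phenotype: list of phenotype values (one value per animal)
    """
    rng = random.Random(seed)
    observed = max(r_squared(m, phenotype) for m in markers)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        if max(r_squared(m, shuffled) for m in markers) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 12 animals, 2 markers; marker 0 tracks the phenotype.
markers = [[0] * 6 + [1] * 6, [0, 1] * 6]
phenotype = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 3.0, 3.1, 2.9, 3.2, 3.0, 2.8]
p_value = genome_wide_p(markers, phenotype, n_perm=200, seed=1)
```

Taking the maximum statistic over all markers within each permutation is what makes the resulting p-value genome-wide rather than per-marker.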

Integration of statistical, genetic and bioinformatic approaches identify high likelihood candidate genes
The genetic architecture of the preCC, with up to eight distinct haplotypes at each locus, provides a unique opportunity for narrowing QTL regions to candidate genes or SNPs. To narrow QTL regions we estimated the additive allele effects associated with each haplotype and correlated these to the allelic states at candidate causative polymorphisms. Allele effects [23] describe the estimated effect of each of the eight founder haplotypes on the phenotype (e.g. a large positive allele effect for the PWK/PhJ haplotype suggests that having a PWK/PhJ allele will increase the phenotypic trait value of interest). In our analysis we focused on polymorphisms corresponding to the largest contrast between allele effects at the peak QTL locus. For HrS1 we identified two haplotypes, C57BL/6J and WSB/EiJ, that increased vascular cuffing relative to the haplotypes of the other six founder strains. For each of HrS2-4, we identified a single founder haplotype altering the phenotype relative to the seven other founder haplotypes (HrS2: PWK/PhJ haplotype reduced viral titer; HrS3: A/J haplotype increased eosinophilic infiltration; HrS4: CAST/EiJ haplotype reduced vascular cuffing).
We then used high coverage whole genome sequence from the eight founder strains [37] to identify either private SNPs or small In/Dels in the case of a single causative haplotype, or regions of shared descent (in the case of two causative haplotypes) to narrow down the large QTL regions. HrS1 was initially an 8.38 Mb region which contained 26 genes and 9 non-coding RNAs (ncRNAs). Identification of the sub-regions where C57BL/6J and WSB/EiJ share private, common ancestry reduced this region to 449 kb, which contained only one gene, one pseudogene and one miRNA of unknown function (Trim55, GM7488 and AC107456.1, respectively). Allele effects for all four QTLs can be seen in S3 Fig. The HrS2 QTL on chromosome 16 was a 5.4 Mb region containing 92 genes and 30 ncRNAs. Across the eight founder strains, there were 95,936 SNPs or small In/Dels, and 33,288 of these were private to PWK/PhJ. Seven ncRNAs and 74 genes had private PWK/PhJ SNPs or In/Dels (S3 Table). We further prioritized these variants based on whether the PWK/PhJ private polymorphisms were likely to cause major functional changes to the gene (missense, nonsense, stop gained/lost, splice alterations or nonsense mediated decay). When we did so, we further reduced this list to 48 candidate genes including several mucins as well as genes involved in T cell activation and apoptosis.
The HrS3 QTL on chromosome 15 was a 3.7 Mb region containing six ncRNAs and 63 genes. There were a total of 71,208 SNPs or small In/Dels in the region, 932 of which were private to A/J. No ncRNAs and only 25 genes contained a private A/J SNP or In/Del, and we further reduced these to one candidate gene with major functional changes (S4 Table), Bai1. Bai1 is a high priority candidate gene given the association between eosinophils and angiogenesis [38]; however, we have not chosen to focus on Bai1 at this time because of the limited availability of tools for working on an A/J genetic background. Finally, HrS4 on chromosome 13 was a 2.12 Mb region containing three ncRNAs and 30 genes. There were a total of 461,46 SNPs or In/Dels in the region, 9,732 being private to CAST/EiJ. 29 of the genes and all three ncRNAs contained private CAST/EiJ polymorphisms (S5 Table). When we further prioritized based on major functional changes, we reduced the region to only one ncRNA and nine genes including Cdhr2, a member of the protocadherin family [39].
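The narrowing step described for HrS1-4 amounts to keeping only those variants whose allele pattern across the eight founders matches the pattern of the allele effects: private to one founder for HrS2-4, or shared by exactly C57BL/6J and WSB/EiJ for HrS1. A minimal sketch of that filter follows. The variant records, identifiers and helper names are hypothetical, and the real analysis worked from whole genome sequence and regions of shared descent rather than a small in-memory list.

```python
FOUNDERS = ("A/J", "C57BL/6J", "129S1/SvImJ", "NOD/ShiLtJ",
            "NZO/HILtJ", "CAST/EiJ", "PWK/PhJ", "WSB/EiJ")


def make_variant(variant_id, ref, alt, alt_strains):
    """Build a hypothetical variant record: alt_strains carry alt, the rest ref."""
    alleles = {s: (alt if s in alt_strains else ref) for s in FOUNDERS}
    return {"id": variant_id, "alleles": alleles}


def matches_pattern(alleles, target_strains):
    """True if the target strains share one allele that no other founder carries."""
    target = set(target_strains)
    target_alleles = {alleles[s] for s in target}
    if len(target_alleles) != 1:
        return False
    allele = next(iter(target_alleles))
    return all(alleles[s] != allele for s in FOUNDERS if s not in target)


def filter_variants(variants, target_strains):
    """Return ids of variants consistent with the causal haplotype pattern."""
    return [v["id"] for v in variants
            if matches_pattern(v["alleles"], target_strains)]


# Hypothetical variants within a QTL interval.
variants = [
    make_variant("var1", "A", "G", {"PWK/PhJ"}),
    make_variant("var2", "C", "T", {"C57BL/6J", "WSB/EiJ"}),
    make_variant("var3", "A", "G", {"PWK/PhJ", "CAST/EiJ"}),
]
private_to_pwk = filter_variants(variants, ["PWK/PhJ"])
shared_b6_wsb = filter_variants(variants, ["C57BL/6J", "WSB/EiJ"])
```

A further pass restricting the surviving variants to predicted functional classes (missense, nonsense, splice alterations and so on) would correspond to the prioritization step described in the text.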

Trim55 deletion results in altered immune cell infiltration
We focused our validation efforts on Trim55, the single HrS1 candidate and a member of the TRIM protein superfamily which has not previously been associated with any infectious disease phenotype. Although many TRIM proteins function in innate immunity and inflammation, Trim55 (also known as muscle-specific RING finger 2 or Murf2) has only been studied in the context of muscle development and cardiac function [40,41]. Trim55 is expressed in smooth muscle surrounding blood vessels [42], an appropriate location to influence perivascular cuffing phenotypes. Knockout mice on a C57BL/6J background have previously been reported [43] and were kindly made available to our laboratory. Groups of age matched Trim55-/- and C57BL/6J control mice were infected with 10^5 PFU of MA15 for four days and monitored daily for weight loss and signs of disease. Trim55-/- and C57BL/6J animals had similar weight loss profiles as well as similar viral loads in the lung at day four post infection (Fig 5A and 5B) and no differences in mortality. Hematoxylin and eosin stained lung sections showed significantly reduced vascular cuffing in the lungs of Trim55-/- mice (mean score of 0.69) compared to control animals (mean score of 1.15) (p < 0.05 by Student's t test, Fig 5C), confirming the role of Trim55 in contributing to SARS-CoV-induced vascular cuffing. Additional mice were infected for flow cytometric analysis of inflammatory cell populations in the lung after MA15 infection. While we observed a general trend towards increased numbers of T cells, B cells and macrophages in the lungs of C57BL/6J control mice compared to the Trim55-/- mice, only monocyte numbers were significantly different between the two groups. Total monocytes, as well as the subset of Ly6C positive monocytes, were present in significantly higher numbers in the lungs of infected control mice compared to Trim55-/- mice (Fig 5D).
RNA was isolated from the lungs of mock and infected control and Trim55-/- mice at days two and four post infection. 168 genes were identified as differentially expressed (DE, log2 fold change >2 relative to mock) between the two strains, predominantly at day two post infection (GEO accession GSE64660). We then used Ingenuity Pathway Analysis software to identify functionally enriched gene categories. This analysis identified the granulocyte and agranulocyte diapedesis gene ontology categories as among the most significantly enriched (first and third, respectively) among genes that were DE between Trim55-/- and C57BL/6J controls (Fig 6A). Diapedesis, or extravasation, is the process by which inflammatory monocytes and leukocytes bind to endothelial cells and migrate from the blood stream into surrounding injured tissues. The transcriptional analysis indicates decreased expression of tight junction genes and increased chemokine expression in C57BL/6J mice compared to that observed in Trim55-/- mice. Relative expression of genes involved in granulocyte adhesion and diapedesis at days two and four post infection is shown in Fig 6B and 6C.

Discussion
Emerging coronaviruses like SARS-CoV and MERS-CoV cause high morbidity and mortality in human populations. Because of limited access to clear human disease responses and samples from acute infections, as well as the limited number of overall infected individuals, it is extremely challenging to define the role of host genetic polymorphism in human disease. Coronavirus pathogenesis is heavily influenced by host genetics, as evidenced by the extreme resistance of SJL mice, which encode a defective variant CEACAM1 receptor for mouse hepatitis virus entry and infection [44]. Furthermore, genetic monomorphisms in the cheetah have resulted in extreme hypersensitivity to feline infectious peritonitis coronavirus infection, underscoring the importance of abundant genetic variation in controlling lethal coronavirus infection [45,46]. In this study we examined numerous phenotypes following SARS-CoV infection and identified four QTL related to various aspects of SARS-CoV pathogenesis. These data support previous predictions that the CC platform can identify genetic variants contributing moderate effect sizes (e.g. ~20%) to complex immune response traits.
Two of the four identified QTL, on chromosomes 3 and 13 respectively, pertained to perivascular cuffing. Perivascular cuffing in the lung is frequently observed during microbial and nonmicrobial lung disease [34,47,48] and is associated in part with extravasation, the process by which inflammatory cells migrate from the blood to surrounding tissues [49,50]. Previous reports of perivascular cuffing include lymphocyte and granulocyte involvement with limited insights into the genetic underpinnings of this phenotype. In vivo models of SARS-CoV infection have shown that vascular cuffing increases in cases of severe disease [21,22,35] and vascular congestion was observed in human SARS-CoV patients [7]. A recent study of pneumococcal infection [51] identified several QTL governing disease susceptibility including one on chromosome 13. The authors also found an association between perivascular inflammation and susceptibility to infection but did not extend their genetic analysis to that phenotype; there was no overlap between their chromosome 13 QTL and HrS4. Analysis of pulmonary inflammation following hyperoxia-induced lung injury [52] identified QTL on chromosomes 1, 2, 4, 6 and 7, and informative SNPs helped to identify Chrm2 as the causative gene on chromosome 6. In this study we identified QTL contributing to 26% and 21% of the total vascular cuffing phenotypic variance, respectively. The limited number of candidate genes under the larger effect size QTL allowed us to test and validate the role of Trim55 in the SARS-CoV-induced perivascular cuffing phenotype.
The CC was conceived to expand upon the genetic variation and mapping precision found within classical recombinant inbred (RI) panels, which often suffer from an inability to narrow the numbers of candidate genes due to the close genetic relationship of the founding lines. The classical BxD panel, derived from C57BL/6J and DBA/2J founder strains, was used previously to identify Klra8, the resistance gene to mouse cytomegalovirus (Cmv-1) infection [53]. Importantly, the validation experiments were conducted over a decade after the initial identification of the Cmv-1 susceptibility locus [54], as the wide initial QTL interval was not sufficient for identification of specific candidate genes. The Collaborative Cross provides a significant advantage in comparison to two-way crosses and other bi-allelic RI strain panels: as illustrated by our study, allele effects associated with founder haplotypes can provide a substantial reduction in the list of plausible candidate loci. Moreover, the inclusion of a diverse set of founder strains increases the likelihood of variants existing at loci that can influence any given trait. Indeed, in our study five of the eight founder strains contributed minor, causative alleles to the four QTL we identified. As the breeding of the CC lines preserved genetic variation across the genome, the CC lacks genetic blind spots and has multiple variant alleles at each locus. With a wide range of phenotypes [23], the CC recapitulates aspects of the genetic diversity of the human population, making it a powerful system for use in causal genetic analyses.
This study was part of an early pilot project to demonstrate the utility of the CC panel [23]. As such, we did not have access to fully inbred animals and were limited to a single animal per genotype. However, the increased control of the experimental conditions in these studies and the high frequency of minor alleles within the CC population (each allele is present in roughly 12.5% of CC genomes [23], whereas minor allele frequencies in the human population are typically much lower) allowed us to identify multiple host genome regions contributing to differential SARS-CoV infection. Although studies utilizing the full CC panel will be able to use the full potential of a reproducible genetic background to obtain repeated assays and high-precision phenotyping, even our limited proof-of-concept study proved adequate to identify multiple host genome regions contributing to differential responses to SARS-CoV infection.
Trim55 is part of the well-known superfamily of TRIM proteins, specifically the C-II subfamily. This subfamily consists of Trim54, Trim55 and Trim63, and is defined by an N-terminal region that contains a RING finger domain, a B-box 2 domain and a coiled-coil domain [42]. The C-II Trim family genes are solely expressed by muscle cells and to date have only been studied in the context of muscle cell development and cardiac function. Trim55 and Trim63, also known as Murf1, mediate muscle cell protein turnover through their E3 ubiquitin-ligase activities and function in muscle wasting phenotypes [40,43,55]. Trim55 specifically functions in myosin and myofibril maintenance, and knockdown studies correlate Trim55 levels with modified posttranslational microtubule modifications and defects in myofibril assembly, critical components in extravasation [55].
Blood vessels are composed of vascular endothelial cells, connective tissue and smooth muscle cells, all of which must be crossed during inflammatory cell trafficking into the lung. During extravasation, inflammatory cells tumble and bind to adhesion molecules, slowing their motion and expanding surface-surface interactions with endothelial cells [56]. Tumor necrosis factor-alpha and thrombin expression levels increase following SARS-CoV infection [12,21] and these proteins have both been shown to increase endothelial permeability [57,58]. Here we observed a complicated picture of altered chemokine and tight junction gene expression in the absence of Trim55 (Fig 6B and 6C). Increased expression of Ccl24, CCR3, IL4 and Pdgfc in C57BL/6J mice compared to that in Trim55-/- mice at day four post infection correlates with increased inflammatory cell recruitment and binding to extracellular matrix proteins. These expression changes are consistent with altered recruitment of inflammatory cells to the lung following SARS-CoV infection. Higher expression of Claudin19 at day two post infection in Trim55 deficient mice likely contributes to decreased tight junction permeability and a reduced ability for inflammatory cells in the bloodstream to cross the endothelial barrier. Additionally, one of the high priority candidate genes under the modifier QTL on chromosome 13 is Cdhr2, a cadherin superfamily member that may also play a role in extravasation of inflammatory cells into the infected lung. Different specific VE-cadherin residues are known to regulate leukocyte extravasation and vascular permeability [59], demonstrating the importance of cadherin family members in these processes. More recent work details the role of Cdhr2 in intestinal brush border assembly via adhesion links between adjacent microvilli [60].
Intravascular crawling and signaling through RhoA induces actin, microfilament and microtubule reorganizations and the production of endothelial cell docking structures, which surround the inflammatory cell and span tight junctions [56]. Although controversial, myofibril contractile structures may also contribute to the assembly of these structures. In any event, inflammatory cell transmigration requires the formation of actin-myosin II contractile structures which are attached to tight junction membranes by VE-cadherins, resulting in increased endothelial tension and programmed separation and expansion of the tight junctions, which allow for leukocyte/monocyte passage into the surrounding tissues [61]. It seems likely that Trim55, with its roles in myosin and myofibril maintenance and microtubule organization, contributes to the programmed formation of endothelial docking structures and regulation of inflammatory cell transmigration, key features associated with the formation of perivascular cuffs around vessels in the lung. Our data (Figs 5 and 6) demonstrate that Trim55 contributes to vascular cuffing following SARS-CoV infection. While the mechanism is not yet fully understood, the data strongly suggest that Trim55 is important for extravasation of inflammatory cells, and thus overall SARS-CoV pathogenesis, by altering intercellular junctions and chemotactic signals. Further studies of Trim55 and Cdhr2 function within the CC population, either via specific crosses of lines with high and low alleles at the HrS1 and HrS4 loci or via CRISPR-Cas9 modification of these loci, will allow further insight into the role that these two genes play during SARS-CoV pathogenesis and recovery, as well as increasing understanding of the more general process of extravasation.
The Collaborative Cross was conceived of as a resource to drive insight into a variety of biomedically important diseases via the reassortment of genetic variants and expansion of phenotypic ranges [62]. Indeed, previous studies with various preCC subsets have demonstrated expanded phenotypes in preCC mice for body weight and hematological parameters [23,63,64], response to Aspergillus [65] and susceptibility to Influenza A infection [27,66]. More recently it has been shown that novel combinations of alleles have also resulted in new models for human disease such as spontaneous colitis [67], and F1 hybrids of CC mice were used to create an improved mouse model for Ebola virus disease [68], including hemorrhagic signs of disease previously not observed in a small animal model. Within our study of SARS-CoV infection within the preCC, we showed more extreme disease phenotypes than those seen within the eight founder strains of the CC. These disease phenotypes included virus titer, weight loss, pathology and lethality. Further, we saw the emergence of new phenotypes including ARDS and DAD not traditionally seen within young inbred strains [11]. Importantly, our results highlight another exciting aspect of the nature of the CC genome: transgressive segregation, or the release of cryptic genetic variation [69,70]. As the three wild-derived CC founders all showed mortality early in the course of SARS-CoV infection, genetic variants within these three strains impacting later-stage SARS-CoV responses would normally not be seen. Only via the reassortment of these alleles into a variety of genetic backgrounds (some resistant to clinical disease, some susceptible) were we able to show that alleles from all three wild-derived founders impacted perivascular cuffing or viral titer levels independent of their effects on clinical disease or SARS-CoV mortality.
Collectively, these data support the hypothesis that the CC population represents a robust platform for developing improved animal models that more readily replicate disease phenotypes seen in human populations. All told, our identification of multiple QTL related to SARS-CoV pathogenesis, identification of a novel function for Trim55, and the development of new models of acute lung injury, further solidify the utility of the CC as a valuable community resource for research of infectious diseases and other biological systems driven by complex host response networks.

Ethics statement
Mouse studies were performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. All mouse studies were performed at the University of North Carolina (Animal Welfare Assurance #A3410-01) using protocols approved by the UNC Institutional Animal Care and Use Committee (IACUC).

Virus and cells
Recombinant mouse-adapted SARS-CoV (MA15) was propagated and titered on Vero E6 cells. For virus titration, half of the right lung was used to assess plaque forming units (PFU) per lung using Vero E6 cells with a detection limit of 100 PFU [71]. All experiments were performed in a Class II biological safety cabinet in a certified biosafety level 3 laboratory containing redundant exhaust fans while wearing personal protective equipment including Tyvek suits, hoods, and HEPA-filtered powered air-purifying respirators (PAPRs).

Animals
8-12 week old female animals from the 8 founder strains (A/J, C57BL/6J, 129S1/SvImJ, NOD/ShiLtJ, NZO/HILtJ, CAST/EiJ, PWK/PhJ, and WSB/EiJ) were obtained from the Jackson Laboratory (jax.org), and bred at UNC Chapel Hill under specific pathogen free conditions. 8-20 week old female pre-CC mice were bred at Oak Ridge National Laboratory under specific pathogen free conditions, and transferred directly into a BSL-3 containment laboratory at UNC Chapel Hill. One preCC mouse per line was infected; for the founder strains, group sizes at day four were n = 2 (A/J), n = 3 (C57BL/6J, 129S1/SvImJ, NOD/ShiLtJ, CAST/EiJ, PWK/PhJ and WSB/EiJ) and n = 5 (NZO/HILtJ). Trim55 -/- (Murf2 -/-) mice on a C57BL/6 background were a kind gift from Christian Witt at the University of Mannheim. Validation experiments used 8-12 week old female mice. All experiments were approved by the UNC Chapel Hill Institutional Animal Care and Use Committee. Animals were maintained in a SealSafe ventilated caging system in a BSL3 laboratory, equipped with redundant fans as previously described by our group.

Infections
Animals were lightly anesthetized via inhalation with Isoflurane (Piramal). Following anesthesia, animals were infected intranasally with 10^5 PFU of mouse adapted SARS-CoV (MA15) in 50 μL of phosphate buffered saline (PBS, Gibco), while mock infected animals received only 50 μL of PBS. Animals were weighed daily and at four days post infection, animals were euthanized via Isoflurane overdose and tissues were taken for various assays. No blinding was used in any animal experiments and animals were not randomized; group sample size was chosen based on availability of age-matched mice. Pearson's correlation was used to determine any correlation between weight loss and log-transformed viral load in the lung.
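As an illustration of the correlation analysis described above, the following minimal Python sketch computes Pearson's r between body weight and log10-transformed lung titer. The function names, the handling of titers below the 100 PFU detection limit (set to the limit), and the example values are assumptions made for illustration; they are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def log10_titer(pfu, detection_limit=100.0):
    """Log-transform titers; values below the detection limit are set
    to the limit (an assumption, not a rule stated in the text)."""
    return [math.log10(max(v, detection_limit)) for v in pfu]

# Hypothetical example values, not data from the study.
weight_pct_of_start = [98.0, 92.5, 85.0, 80.2, 88.1]
titer_pfu = [5e3, 2e5, 4e6, 9e6, 7e5]

r = pearson_r(weight_pct_of_start, log10_titer(titer_pfu))
```

In this toy input, animals with higher titers retain less of their starting weight, so r is negative.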

Histological analysis
The left lung was removed and submerged in 10% buffered formalin (Fisher) without inflation for 1 week. Tissues were embedded in paraffin, and 5 μm sections were prepared by the UNC Lineberger Comprehensive Cancer Center histopathology core facility. To determine the extent of inflammation, sections were stained with hematoxylin and eosin (H & E) and scored in a blinded manner for a variety of metrics relating to the extent and severity of immune cell infiltration and pathological damage on a 0-3 (none, mild, moderate, severe) scale. Significant differences in lung pathology were determined by a two-sample Student's t test. Images were captured using an Olympus BX41 microscope with an Olympus DP71 camera.
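The two-sample Student's t test used for group comparisons can be sketched as follows. This minimal example computes only the pooled-variance t statistic and its degrees of freedom; converting these to a p value requires the t distribution, for which a statistics package would normally be used. The function name and the scores are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a two-sample Student's t test
    that assumes equal variances in the two groups."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df

# Hypothetical 0-3 pathology scores, not data from the study.
wild_type_scores = [1, 1, 2, 1, 2]
knockout_scores = [2, 3, 2, 3, 3]

t_stat, dof = pooled_t_statistic(wild_type_scores, knockout_scores)
```

The sketch does not guard against two groups with zero variance, in which case the standard error is zero and the statistic is undefined.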

Flow cytometry
The right lung of each mouse was used for flow cytometric staining of inflammatory cells. Mice were perfused with PBS through the right ventricle before harvest; lung tissue was then dissected and digested in RPMI (Gibco) supplemented with DNase and Collagenase (Roche). Samples were strained using a 70 micron filter (BD) and any residual red blood cells were lysed using ACK lysis buffer. The resulting single cell suspension was stained with two antibody panels using the following stains (1) anti-B220 clone RA3-6B2 (eBio). While this FACS analysis was solely performed on mice of a C57BL/6J background, these antibodies have all been shown to recognize the relevant antigens in each of the CC founder lines. Samples were run on a Beckman Coulter CyAN, and data were analyzed within the Summit software. Significant differences in lung inflammatory cell populations were determined by a two-sample Student's t test.

Genotyping and haplotype reconstruction
Genotyping and haplotype reconstruction were done as described in [23]. Briefly, each pre-CC animal was genotyped using the Mouse Diversity Array [72] (Affymetrix) at 372,249 well performing SNPs which were polymorphic across the founder strains [31]. Once genotypes were determined, founder strain haplotype probabilities were computed for all genotyped loci using the HAPPY algorithm [73]. Genetic map positions were based on the integrated mouse genetic map using mouse genome build 37 [74].

Linkage mapping and identification of candidate regions
Linkage mapping was done as described in [23]. Briefly, QTL mapping was conducted using the BAGPIPE package [75] to regress each phenotype on the computed haplotypes in the interval between adjacent genotype markers, producing a LOD score in each interval to evaluate significance. Genome-wide significance was determined by permutation test, with 250 permutations conducted per scan. Phenotype data for mapping either satisfied the assumptions of normality or were log transformed to fit normality (titer data). For the likely regions of identified QTL peaks, SNP data for the eight founder strains from the Sanger Institute Mouse Genomes Project [37] were downloaded and analyzed as described in Ferris et al. [27].
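The logic of the scan described above (regression on founder haplotype probabilities, a LOD score per interval, and a permutation-based genome-wide threshold) can be sketched in Python with NumPy. This is not BAGPIPE and does not reproduce its model; it is a minimal sketch assuming ordinary least squares and no covariates. All function names, the simulated data, and the 0.05 significance level are assumptions for illustration; only the count of 250 permutations comes from the text.

```python
import numpy as np

def lod_score(phenotype, hap_probs):
    """LOD score for one interval.

    phenotype: shape (n_mice,)
    hap_probs: shape (n_mice, n_founders), each row sums to 1.
    """
    y = np.asarray(phenotype, dtype=float)
    X = np.asarray(hap_probs, dtype=float)
    n = y.shape[0]
    rss_null = float(np.sum((y - y.mean()) ** 2))
    # Rows of X sum to 1, so the founder columns already span the intercept.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss_full = float(np.sum((y - X @ beta) ** 2))
    return (n / 2.0) * np.log10(rss_null / rss_full)

def genome_scan(phenotype, probs_by_interval):
    """LOD score at every interval; probs_by_interval has shape
    (n_intervals, n_mice, n_founders)."""
    return np.array([lod_score(phenotype, p) for p in probs_by_interval])

def permutation_threshold(phenotype, probs_by_interval,
                          n_perm=250, alpha=0.05, seed=0):
    """Genome-wide threshold: the (1 - alpha) quantile of the maximum LOD
    obtained after shuffling phenotypes across animals."""
    rng = np.random.default_rng(seed)
    y = np.asarray(phenotype, dtype=float)
    max_lods = [genome_scan(rng.permutation(y), probs_by_interval).max()
                for _ in range(n_perm)]
    return float(np.quantile(max_lods, 1.0 - alpha))

# Simulated data (hypothetical): 120 animals, 8 founders, 20 intervals,
# with one founder allele at interval 5 raising the phenotype.
rng = np.random.default_rng(1)
n_mice, n_founders, n_intervals = 120, 8, 20
causal_interval = 5
probs = rng.dirichlet(np.full(n_founders, 0.5), size=(n_intervals, n_mice))
founder_effects = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0])
phenotype = probs[causal_interval] @ founder_effects + rng.normal(0.0, 0.5, n_mice)

scan = genome_scan(phenotype, probs)
threshold = permutation_threshold(phenotype, probs)
significant = np.flatnonzero(scan > threshold)
```

Taking the maximum LOD across the whole scan within each permutation is what makes the threshold genome-wide rather than per-interval.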

RNA preparation and oligonucleotide microarray processing
At two and four days after infection, mice were euthanized and a lung portion placed in RNAlater (Applied Biosystems/Ambion) and then stored at −80°C. The tissues were subsequently homogenized in TriZol (Life Technologies), and RNA extracted as previously described [12]. RNA samples were spectroscopically verified for purity, and the quality of the intact RNA was assessed using an Agilent 2100 Bioanalyzer. cRNA probes were generated from each sample by the use of an Agilent one-color Quick-Amp labeling kit. Each cRNA sample was then hybridized to Agilent mouse whole-genome oligonucleotide microarrays (4 x 44K) based on the manufacturer's instructions. Slides were scanned with an Agilent DNA microarray scanner, and the output images were then analyzed using Agilent Feature Extractor software. Microarray data have been deposited in the National Center for Biotechnology Information's Gene Expression Omnibus database and are accessible through GEO accession GSE64660.

Microarray data analysis/methods
Raw Agilent microarray files were feature extracted using Agilent Feature Extractor version 10.7.3.1. Raw microarray files were background corrected using the "norm-exp" method with an offset of 1 and quantile normalized using Agi4x44PreProcess [76] in the R statistical software environment. Replicate probes were mean summarized, and all probes were required to pass Agilent QC flags for 75% of replicates of at least one infected time point (41,267 probes passed). This microarray analysis was performed only on animals with a C57BL/6J background; thus it was not necessary to correct for probes with SNPs caused by the genetic variation of the other founder lines.
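Three of the preprocessing steps above (quantile normalization, mean summarization of replicate probes, and the 75% QC filter) can be illustrated with a short pure-Python sketch. It is not the Agi4x44PreProcess or limma implementation, it omits the "norm-exp" background correction, and it breaks ties by input order rather than averaging them. All function names and example values are hypothetical.

```python
from collections import defaultdict

def quantile_normalize(arrays):
    """Force every array to share the same intensity distribution.

    arrays: list of equal-length lists, one list of probe intensities
    per microarray. Ties are broken by input order (a simplification).
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean intensity at each rank across arrays.
    reference = [sum(s[i] for s in sorted_arrays) / len(arrays)
                 for i in range(n_probes)]
    result = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        result.append(out)
    return result

def summarize_replicates(probe_ids, values):
    """Average the values of probes that share an identifier."""
    groups = defaultdict(list)
    for probe_id, value in zip(probe_ids, values):
        groups[probe_id].append(value)
    return {pid: sum(vs) / len(vs) for pid, vs in groups.items()}

def passes_qc(flags_by_group, min_fraction=0.75):
    """True if, in at least one group (e.g. one infected time point),
    the probe passed QC flags in >= min_fraction of replicates."""
    return any(sum(flags) / len(flags) >= min_fraction
               for flags in flags_by_group.values() if flags)

# Hypothetical intensities for two arrays with four probes each.
example = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 6.0, 2.0]]
normalized_example = quantile_normalize(example)
```

After normalization, both arrays contain the same set of values, assigned according to each probe's rank within its own array.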

Statistical analysis
Differential expression was determined by comparing MA15 infected samples (C57BL/6J mice vs. Trim55 -/- mice) with mock and with each other, fitting a linear model for each probe using the R package Limma. Criteria for differential expression were an absolute log2 fold change (FC) of 1 and a q value of < 0.05, calculated using a moderated t test with subsequent Benjamini-Hochberg correction. Differentially expressed (DE) genes were observed for both C57BL/6J and Trim55 -/- infected mice compared to time matched mocks at day two and day four post infection. DE analysis was also run on the Trim55 -/- infected mice against the C57BL/6J infected mice, which provided a direct observation of the transcription signatures in the Trim55 -/- mice against the MA15 infected mouse background. To identify genes with similar patterns of variation at early and late times post infection, day two and day four gene signatures were intersected separately and then combined. There was no intersection of DE genes between the day two and day four time points when the Trim55 -/- infected mice were run against the C57BL/6J infected mice.
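The selection criteria above can be made concrete with a small Python sketch of the Benjamini-Hochberg adjustment followed by the fold-change and q value filter. The moderated t test itself (performed in Limma) is not reproduced; the p values are taken as given. The function names and example numbers are hypothetical, and treating the fold-change cutoff as inclusive (|log2 FC| >= 1) is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values),
    in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that each q value
    # is the minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

def differentially_expressed(log2_fc, p_values, fc_cutoff=1.0, q_cutoff=0.05):
    """Flag probes with |log2 FC| >= fc_cutoff (inclusive bound assumed)
    and q value < q_cutoff."""
    q_values = benjamini_hochberg(p_values)
    return [abs(fc) >= fc_cutoff and q < q_cutoff
            for fc, q in zip(log2_fc, q_values)]

# Hypothetical values for four probes, not data from the study.
example_log2_fc = [1.5, -2.0, 0.4, 3.0]
example_p = [0.01, 0.04, 0.03, 0.5]

example_q = benjamini_hochberg(example_p)
example_flags = differentially_expressed(example_log2_fc, example_p)
```

In this toy input the second probe has a large fold change and a raw p value below 0.05, yet it is not flagged because its adjusted q value is above the cutoff.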

Functional analysis
Functional analysis of statistically significant gene expression changes was performed using the Ingenuity Pathways Knowledge Base (IPA; Ingenuity Systems) [77]. Functional enrichment scores were calculated in IPA using all probes that passed our QC filter as the background data set.
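To illustrate why the choice of background set matters for enrichment scoring, the sketch below computes a right-tailed hypergeometric p value (equivalent to a one-sided Fisher's exact test), a common approach to over-representation analysis. IPA's internal scoring is not described in the text, so this should be read as an illustrative stand-in rather than the method IPA applies. The function name and the pathway, selection, and overlap sizes are hypothetical; only the background size of 41,267 probes comes from the text.

```python
from math import comb

def enrichment_p_value(n_background, n_in_pathway, n_selected, n_overlap):
    """Probability of seeing at least n_overlap pathway members when
    n_selected items are drawn without replacement from a background of
    n_background items, of which n_in_pathway belong to the pathway."""
    upper = min(n_in_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(comb(n_in_pathway, k)
               * comb(n_background - n_in_pathway, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Hypothetical pathway: 150 members among the 41,267 probes that passed QC,
# with 8 of 300 selected probes falling in the pathway.
p_filtered_background = enrichment_p_value(41267, 150, 300, 8)

# The same overlap scored against a smaller hypothetical background.
p_small_background = enrichment_p_value(5000, 150, 300, 8)
```

Restricting the background to probes that could have been detected avoids overstating enrichment relative to a whole-genome background.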