Expression profiling of disease progression in canine model of Duchenne muscular dystrophy

Duchenne muscular dystrophy (DMD) causes progressive disability in 1 of every 5,000 boys due to the lack of functional dystrophin protein. Despite much advancement in knowledge about DMD disease presentation and progression—attributable in part to studies using mouse and canine models of the disease–current DMD treatments are not equally effective in all patients. There remains, therefore, a need for translational animal models in which novel treatment targets can be identified and evaluated. Golden Retriever muscular dystrophy (GRMD) is a phenotypically and genetically homologous animal model of DMD. As with DMD, speed of disease progression in GRMD varies substantially. However, unlike DMD, all GRMD dogs possess the same causal mutation; therefore genetic modifiers of phenotypic variation are relatively easier to identify. Furthermore, the GRMD dogs used in this study reside within the same colony, reducing the confounding effects of environment on phenotypic variation. To detect modifiers of disease progression, we developed gene expression profiles using RNA sequencing for 9 dogs: 6 GRMD dogs (3 with faster-progressing and 3 with slower-progressing disease, based on quantitative, objective biomarkers) and 3 control dogs from the same colony. All dogs were evaluated at 2 time points: early disease onset (3 months of age) and the point at which GRMD stabilizes (6 months of age) using quantitative, objective biomarkers identified as robust against the effects of relatedness/inbreeding. Across all comparisons, the most differentially expressed genes fell into 3 categories: myogenesis/muscle regeneration, metabolism, and inflammation. Our findings are largely in concordance with DMD and mouse model studies, reinforcing the utility of GRMD as a translational model. Novel findings include the strong up-regulation of chitinase 3-like 1 (CHI3L1) in faster-progressing GRMD dogs, suggesting previously unexplored mechanisms underlie progression speed in GRMD and DMD. In summary, our findings support the utility of RNA sequencing for evaluating potential biomarkers of GRMD progression speed, and are valuable for identifying new avenues of exploration in DMD research.

Introduction Duchenne muscular dystrophy (DMD) is a devastating X-linked degenerative muscle disease affecting approximately 1 in 5,000 boys, caused by lack of functional dystrophin protein, Dystrophin is essential to muscle integrity via its role as a shock absorber that stabilizes the plasma membrane, or sarcolemma, of muscle cells against mechanical stress [1]. In the absence of dystrophin, myofibers undergo repeated cycles of necrosis and regeneration. As the disease progresses, stem cell reserves are depleted and muscle is replaced by fibrous connective tissue and fat.
Animal models of DMD have been essential to understanding the disease process and developing treatments. The well-known mdx mouse model has been used extensively to study disease pathogenesis and establish initial proof of principle for novel therapies [2]. The mdx model recapitulates the inflammatory environment seen in DMD-affected skeletal muscles and has been extensively used to examine the roles of inflammation in DMD [3]. However, while mdx mice lack dystrophin, they have a much less severe phenotype than DMD patients, raising questions about the degree to which preclinical data will translate to humans. Canine models of DMD, such as golden retriever muscular dystrophy (GRMD), more closely approximate phenotypic features observed in DMD [4]. In both DMD and GRMD, disease severity and progression differ among individuals and muscles. Phenotypic variation in dystrophindeficient species, individuals, and muscles suggests that modifier genes can influence the disease course. Gene expression array studies in the mdx mouse [e.g., 3,5], GRMD dogs [6,7], and DMD patients [8][9][10] have allowed identification of key potential genetic modifiers. These studies have shown that genes tied to inflammation and regeneration are particularly increased in dystrophic muscle, presumably in response to myofiber necrosis.
Genes that influence the speed of disease progression in dystrophin deficient individuals are of particular interest because they could potentially be therapeutic targets. In this study, we compared gene expression profiles of two groups of differentially affected GRMD dogs at 3-6 months of age, a period of particularly fast disease progression that corresponds to the [5][6][7][8][9][10] year period in DMD [11]. In keeping with prior gene expression studies, we hypothesized that dogs with more rapid disease progression would have a corresponding increase in genes associated with inflammation and regeneration and that differential expression would be particularly pronounced at the 3-month time point, as a prelude to fibrosis.
This study is one of the first examples of the use of RNA sequencing being used as a tool to investigate contributory genetic elements in a canine model of disease. It is also the first study to use RNA sequencing to investigate disease progression in GRMD dogs (though it should be noted that whole genome DNA sequencing has been done previously; for example [12,13]). Because phenotypically characterized samples were used, we anticipate that our findings will be relevant to DMD studies exploring analogous phenotypes in DMD.

Animals
All GRMD dogs were from the colony currently located at Texas A&M University, but located at the time of biopsy/necropsy at University of North Carolina-Chapel Hill (UNC-CH). This colony has been maintained through a breeding program in which at least one parent of each mating carries the same causal mutation inherited from a single founding sire [14]. Once the inbreeding coefficient of the colony reaches~0.20, unrelated dogs are introduced to the breeding stock to reduce neonatal mortality (which increases with inbreeding) [4]. As a result of this practice, variation is observable in the progression of disease and severity of the phenotype [15], giving the colony many of the advantages of an inbred population (e.g. reduced influence of confounding factors such as environmental variation), while capturing genetic diversity such as that of humans. As a result, studies using the GRMD colony require a smaller sample size than would a study using humans [16].
The dogs were maintained and treated according to the standards of the National Research Council Guide for the Care and Use of Laboratory Animals. Studies were approved by the UNC-CH Institutional Animal Care and Use Committee (IACUC) through protocols UNC IACUC 09-351, Standard Operating Procedures-Canine X-Linked Muscular Dystrophy. The GRMD genotype was diagnosed based on blood creatine kinase levels taken shortly after birth [17], and confirmed with PCR-based genotyping [18]. Nine dogs were included in this study based on the availability of longitudinal samples and phenotypic data. This cohort included 6 GRMD-affected and 3 unaffected dogs with samples taken via biopsy at~3 months (time point 1, T1) and via biopsy or necropsy at~6 months (time point 2, T2) of age. Of the affected dogs, 3 were classified as have slow-progressing disease and 3 were classified as fast-progressing, based on measurement values for objective GRMD biomarkers [4] found to be reliable indicators of GRMD disease severity. These biomarkers included tibiotarsal joint tetanic extension (N/kg), percent eccentric contraction decrement, maximum hip flexion angle, and pelvic angle [19]. We previously identified these particular biomarkers as having genomic inflation factors (λ) of 1, meaning that measurements for these biomarkers are useful for genetic association studies because their values are independent and not biased by pedigree relationships [20]. Each of the GRMD subsets consisted of 2 affected females and 1 affected male. The unaffected dogs were all males. A pedigree of the 9 dogs is presented in Fig 1. We studied the cranial sartorius (CS), a flexor muscle which is more severely affected than extensor muscles (such as the vastus lateralis) and undergoes a characteristic pattern of early necrosis with hypertrophy by 6 months [7,21]. CS hypertrophy, reflecting GRMD disease severity, has been previously correlated with measurement values for the aforementioned biomarkers [7].

RNA extraction
Total RNA was extracted from biopsy and necropsy samples from CS muscles archived at -80 o C for each of the 9 dogs using TriPure Isolation Reagent (Roche; Indianapolis, IN) as per manufacturer's instructions. RNA was precipitated using isopropanol and 75% EtOH, and dissolved in nuclease-free water (Ambion). To minimize DNA contamination, samples were DNase-treated using the DNA-Free DNase Treatment and Removal kit (Ambion). RNA was quantified using NanoDrop (Thermo Fisher Scientific) and quality was evaluated via Bioanalyzer (Agilent; RIN values ranged from 8 to >9).

Real-time quantitative PCR (RT-qPCR)
Total RNA was directly reverse transcribed to cDNA using SuperScript II (Invitrogen). DNase-treated RNA (100ng) was reverse transcribed with oligo-dT and random primers and Superscript II (Invitrogen, Carlsbad, CA). The reverse transcription reactions consisted of 100ng of DNase-treated RNA in a 50μl reaction, ultra pure water, oligo dT (2.5μl of 500ng/μl) and random hexamer (0.48μl of 1mM stock) heated to 65˚C for 5 minutes and cooled to room temperature. To this, Superscript II (2μl), 5X 1st Strand buffer (10μl), 0.1M DTT (5μl), 10mM dNTPs (2.5μl) and an RNase block Ambion's Superasin (1μl) were added. All were heated to 37˚C for one hour followed by reverse transcription inactivation at 90˚C for 5 minutes.
SYBR Green technology was used to measure expression levels for CHI3LI, IL-6, SPP1, and the housekeeping gene hPRT. Primers were designed from two neighboring exons flanking one intron (when possible), or from a single exon, using Primer3 software (http://biotools.umassmed. edu/bioapps/primer3_www.cgi; [22]). Each 20μl reaction contained 10μl Power SYBR Green PCR Master Mix (Applied Biosystems), 2μl each forward and reverse primers (3μM each), 5.5μl ultrapure water, and 0.5μl (1ng) cDNA. All RT-qPCR reactions were performed in triplicate using the 7900HT Fast qPCR System (Applied Biosystems); cycling parameters were 50˚C for 2 minutes, 95˚C for 10 minutes, and cycling 40 repeats of 95˚C for 15 seconds and 60˚C for 1 minute.
Primer information is listed in Table 1. Relative expression data were calculated using the ΔΔCt method of Livak and Schmittgen [23], and normalized to the housekeeping gene hPRT. Differential gene expression was statistically evaluated via ANOVA using JMP version 13 software [24]. In all cases, p 0.05 was considered significant.

RNA sequencing
Coding transcriptome sequence was captured using TruSeq RNA library preparation kits (Illumina, catalog numbers RS-122-2001 and RS-122-2002) and rRNA removed using RiboZero Note that this pedigree does not show the entire colony or all siblings of the dogs used in this study, but is presented to illustrate the degree of relatedness between dogs in this study. Circles represent females; squares represent males; open symbols represent non-GRMD (nondystrophic); blackened symbols represent GRMD (dystrophic); a small circle within a larger circle represents GRMD carrier females; animals listed in more than one position within the pedigree are indicated by a larger circle or square encompassing the primary symbol for the animal, along with a dotted line connecting the multiple positions for that animal in the pedigree. The 9 dogs investigated for this study are listed across the bottom of the pedigree, with "fast" and "slow" labels indicating disease progression speed for GRMD dogs and "control" indicating non-GRMD dogs.

Statistical analysis
The R (version 3.2.4)-based Bioconductor (version 3.3) package DESeq2 (release 3.3) [27] was used for statistical analyses. Un-normalized raw reads were used as input, because DESeq2 corrects internally for library size; we also took into consideration differences in sequencing depth by using variance stabilizing transformation [27]. The Benjamini-Hochberg approach was used to adjust P-values for multiple testing [28]. For the selection of differentially expressed (DE) genes, we used a false discovery rate (FDR) adjusted p-value (i.e., q-value) of < 0.05.

Gene ontology and pathway analysis
The PANTHER Overrepresentation Test in PANTHER version 11.0 [29, 30] was used to perform gene ontology and PANTHER pathways analyses. Furthermore, we used Ingenuity Pathway Analysis (IPA, QIAGEN Redwood City, www.qiagen.com/ingenuity) to identify orthologous genes and pathways in human, mouse, and rat databases. Multiple testing was performed in PANTHER suing the binomial test and Bonferroni correction, and in IPA using the right-tailed Fisher Exact test. For both PANTHER and IPA, we used default settings for statistical analysis, with pvalue < 0.05 and log 2 fold change >2 set as cutoff values.

Single nucleotide variant (SNV) discovery and functional assessment
We pooled short reads generated from RNAseq of dogs affected with disease (including both slow and fast progression groups) as one group and the healthy dogs as another group. The resulting read files were aligned to the dog (Canis lupus familiaris) reference genome using the STAR 2-pass alignment [31]. Picard (http://broadinstitute.github.io/picard) was used to add read group information, sorting, marking duplicates and indexing. We then used a function of the GATK tool called SplitNCigarReads to split reads into exon segments to get rid of Ns but maintain grouping information, and hard-clip sequences overhanging into the intronic regions. The variant calling was performed by using another GATK function-HaplotypeCaller. To filter the resulting callset, hard filters (FS > 30.0, QD < 2.0 and DP > = 20, where FS stands for Fisher Strand values, QD for Quality by Depth and DP for Depth of Coverage) were applied ([32-34],https://software.broadinstitute.org/gatk/documentation/article.php?id= 3891). After variant calling for diseased and control dogs independently, we compared the two groups to obtain the list of SNVs present only in the GRMD dogs. We predicted the effect of each SNV on protein function using the Variant Effector Predictor at the Ensemble website (https://useast.ensembl.org/info/docs/tools/vep/index.html).

Results
To evaluate the relationship between gene expression and speed of GRMD disease progression, we performed RNA sequencing analysis using cranial sartorius muscle from 9 dogs classified as GRMD slow-progression (n = 3), GRMD fast-progressing (n = 3), or unaffected dogs from the same colony (n = 3). After removal of low-quality reads (Phred score >30) a total of over 400 million reads (average of 24 million per dog) were mapped to the canine genome, Can-Fam3.1, with alignments ranging from 95.09-96.38%. Of these alignments, 76.25-80.84% were uniquely mapped. See Table 2 for a summary of RNA sequencing results.

Principal component analysis (PCA) and hierarchical clustering
Using DESeq2, we performed principal component and hierarchical clustering analyses to identify underlying stratification across samples (Fig 2). All control dogs clustered separately from GRMD samples for both types of analysis. GRMD dogs did not show additional clustering when viewed by time point or progression speed.

Differentially expressed genes between groups
We evaluated gene expression differences between dogs with fast-vs. slow-progressing GRMD disease, both GRMD groups vs. unaffected controls, and gene expression at 3 and 6 months representing early (T1) and stable (T2) disease stages, respectively (Fig 3). Using DESeq2, we identified those genes with significant differential gene expression (considered as q < 0.05; Table 3). Similar to human DMD and mdx mouse studies, we found differences in gene expression were greatest in those genes with functions pertaining to inflammation/immune response, metabolism, and myogenesis/muscle regeneration. Furthermore, these processes are often interrelated, with several of the most strongly differentially expressed genes (DEGs) having roles in both inflammation and regeneration. Top 10 DEGs, log 2 fold changes, and q-values are listed in S1 Table. DEGs between GRMD and unaffected control dogs a. DEGs related to muscle regeneration. Based on findings from studies using mdx mice [35-37] and DMD patients [8-10], we hypothesized that there would be an increase in expression of genes related to muscle regeneration in GRMD compared to control dogs. Myosin genes are expressed at different stages of muscle development, subsequently down regulated at birth, and then re-expressed during muscle regeneration [38]. Consistent with our hypothesis, expression of the embryonic form of myosin heavy chain (MYH3), associated with muscle regeneration following injury, was increased~7-fold at both time points in all GRMD dogs compared to controls, suggesting that muscle regeneration is a hallmark of GRMD for at least the first 6 months. MYH3 has previously been shown to be over-expressed in DMD [10] and has also been correlated with muscle regeneration in mdx mice [35]. Myosin light chain 4 (MYL4) is expressed later than MYH3 during normal mouse embryogenesis, corresponding to the beginning of the fetal stage of muscle development and the time of fiber type differentiation [38,39]. Not surprisingly, we found MYL4 to be among the most up-regulated genes in all GRMD dogs at the later (6-month) time point. Similarly, increased expression of cardiac troponin type 2 (TNNT2) has been described in transcriptome analyses of DMD and mdx [40,41]; this up-regulation coincides with early myoblast activation followed by a gradual decline in expression corresponding with maturation of the regenerated muscle [40]. TNNT2 is strongly expressed at later stages in embryonic skeletal muscle [42]. In keeping with this pattern, TNNT2 was one of the most up-regulated genes in all GRMD dogs at the earlier (3-month) time point.
b. DEGs related to metabolism. In GRMD and DMD (and, to a lesser extent, mdx), skeletal muscles show aberrations in energy metabolism, or "metabolic crisis." Based on prior studies [43, 44], we anticipated decreased expression for genes involved in energy metabolism. In fact, many of the top 10 DEGs that were down-regulated in GRMD relative to control-regardless of time point or progression speed-had functions related to metabolism. Myo-inositol oxygenase (MIOX) was down-regulated by as much as 5-fold in all GRMD dogs at both time points. Carboxylesterase 1 (CES1) was down-regulated by~4-fold in slow-progressing dogs at 3 months and all dogs at 6 months. Solute carrier family 25 member 48 (SLC25A48; encoding a mitochondrial carrier  c. DEGs related to inflammation. Inflammation plays a major role in disease progression in DMD and mdx. Genes with roles in inflammatory and immune-related processes were included among the top 10 most strongly up-or down-regulated genes in all comparisons. Of these genes, chitinase 3-like 1 (CHI3L1) is particularly worth noting. CHI3L1 was found to be up-regulated~6-fold in fast-progressing dogs compared to controls at both time points.

DEGs between fast-and slow-progressing GRMD dogs
One of the main purposes for this study was to identify differences in gene expression between fast-and slow-progressing GRMD dogs that could help define what causes some dogs to deteriorate faster than others, thus giving insight into potential avenues for investigation in DMD. At T1, the most up-regulated DEG in fast-progressing compared to slow-progressing GRMD dogs was protein tyrosine phosphatase, receptor type T (PTPRT). This gene encodes a signaling molecule that regulates several processes including cell growth and differentiation. In humans, PTPRT has been associated with autism [45-47], which is comorbid with DMD [48]. On the other hand, the most strongly down-regulated gene at T1 when comparing fast-progressing to slow-progressing GRMD dogs was ENSCAFG00000031467, also called LOC100682772 leucinerich repeat-containing protein 37A3-like. This gene was also the most down-regulated gene overall in fast-vs. slow-progressing dogs. The protein encoded by this gene is a member of the leucine-rich repeat containing family, the members of which have been shown to regulate collagen fibrillogenesis [49].
At T2, the most highly up-regulated gene in fast-progressing (as compared to slow-progressing) GRMD dogs was interleukin 6 (IL-6). IL-6 is affiliated with the innate immune response and its increased expression in fast vs. slow-progressing dogs likely reflects a stronger inflammatory reaction due to a relatively larger number of regenerative muscle fibers. The most down-regulated gene in fast-progressing dogs relative to slow-progressing dogs was a novel processed pseudogene (ENSCAFG00000032111) with no known orthologs or relevant literature to provide clues as to its function. In fact, many of the other top-hit DEGs in the fast vs. slow-progressing DEGs; graphs on the right of each panel indicate numbers of DEGs with functions related to regeneration, metabolism, immune response, both regeneration and metabolism (R,M), both metabolism and immune response (M,I), or both regeneration and immune response (R,I).
https://doi.org/10.1371/journal.pone.0194485.g003 Table 3. Numbers of differentially expressed genes (DEGs) and gene ontology analysis output. Total numbers of DEGs for the comparisons "Fast vs Control", "Slow vs Control", and "Fast vs Slow" are presented in gray rows, with these results broken down by time points below the total numbers (T1 = age 3 months; T2 = age 6 months). For DESeq2, genes with q value < 0.05 were considered as significant. For Gene Ontology, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Panther, genes matched with Canis familaris genome in respective database with q value<0.05 were reported. GRMD comparison were novel genes and pseudogenes for which little functional information was available.

Samples
Only one gene was found to be among the top 10 DEGs at both time points when comparing expression levels between fast-and slow-progressing dogs. This gene, ENSCAFG00000014968, is an ortholog of peptidylprolyl isomerase a (PPIA) and was previously identified as a gene of interest in our genome-wide association study [19]. We had originally considered PPIA as a potential housekeeping gene for normalization in that previous study, but instead in that study we observed PPIA expression to be significantly different in 6-and 12-month-old GRMD compared to normal dogs in the cranial tibialis muscle. PPIA is a regulator of a several processes relevant to GRMD, including inflammatory response to injury and apoptosis in certain cell types such as smooth muscle and endothelial cells [50, 51].

Gene ontology and pathway analyses for DEGs related to speed of disease progression
Results for gene ontology and pathway analysis for comparisons across fast, slow, and control dogs are found in S2 Table. Inflammation. As expected, gene expression pathways related to inflammation were enriched in GRMD dogs compared to control dogs at both time points and progression speeds. At 3 months of age, significantly up-regulated genes in the fast-progressing GRMD dogs included those involved in T-cell activation. This suggests a strong early immune response in these dogs, likely in reaction to damage-associated molecular patterns, as seen in mdx mice [52]. Migration of regulatory T cells to damaged muscle can ameliorate the degenerative phenotype, allowing for muscle repair early in the disease process [53, 54]; however, T-cell activation was not identified as a top pathway for these dogs at age 6 months. This would be in keeping with the disease course of the CS in GRMD, with necrosis occurring even in utero and then stabilizing [7,21]. Interestingly, T-cell activation was not a top pathway for slow-progressing dogs at either time point.
Growth and regeneration. Angiogenesis and cholecystokinin (CCK) signaling were among the top most-enriched pathways in all GRMD dogs at age 3 months, consistent with a regenerative phenotype at this younger age together with changes to neuromuscular signaling. The integrin signaling pathway was enriched in all GRMD dogs at 6 months of age, consistent with our prior studies [55], perhaps as part of a compensatory mechanism to ameliorate increasing amounts of muscle damage [56, 57].
At age 3 months, the fast-progressing GRMD dogs exhibited higher levels of expression in genes involved in the platelet-derived growth factor (PDGF) signaling pathway and tricarboxylic acide cycle pathway [18]. These two pathways were not as enriched in comparisons of control dogs vs. fast-progressing dogs at 6 months of age, or slow-progressing dogs at either time point; therefore, these pathways may also be critical to the more rapid progression of GRMD disease. This is not surprising, given the functions of these pathways: PDGF signaling is involved in growth, particularly blood vessel formation; while the TCA cycle is generally related to energy and metabolism.

RT-qPCR of specific genes of interest
Expression profiles for CHI3L1, IL-6, and SPP1 were further evaluated using RT-qPCR (Fig  4). These genes were selected based on their association with GRMD progression speeds, especially in conjunction with processes (such as inflammation) likely to be affiliated with fibrosis. We chose to measure expression in the same tissue used for RNAseq (CS) as well as in the vastus lateralis (VL), an extensor muscle which functions in the opposite direction as CS (which is a flexor) [58, 59]. While no statistically significant differences were identified between expression levels of any of these genes in fast and slow progressing dogs, age had a significant effect on expression levels for CHI3L1 and SPP1. Muscle type also had a significant effect on SPP1 expression levels.
Comparisons of CHI3L1 expression levels were mostly in agreement between RNAseq and RT-qPCR: as observed via RNAseq, RT-qPCR results showed CHI3L1 expression was approximately 6 times higher in fast-progressing GRMD dogs than controls at T1. However, slow-progressing dogs also showed a nearly 6-fold increase in expression at T1, and both fast-and slow-progressing dogs showed an increase of nearly 5-fold at T2, based on RT-qPCR (but not RNAseq) results. The expression of IL-6 in the CS at T2 follows the same overall pattern in both RNAseq and RT-qPCR, in that IL-6 was up-regulated in fastprogressing GRMD dogs compared to slow-progressors, though RNAseq data showed this difference to be greater (2.59 fold via RT-qPCR, vs 3.97 fold via RNAseq). SPP1 expression in the CS was up-regulated in slow-progressing GRMD dogs compared to controls for both RT-qPCR and RNAseq.
Somewhat surprisingly, expression patterns were similar in VL compared to CS for all 3 genes-in other words, genes higher expression levels in the CS of fast-vs slow-progressing GRMD dogs also had higher expression levels in the VL for fast vs slow. The only exception here was for CHI3L1 at T1 (fast-progressing GRMD dogs had relatively higher fold change levels at T1 in the CS, but not VL, as compared to slow-progressing dogs). Given the opposing nature of CS and VL functions, the expression pattern similarity almost certainly reflects a broader pattern of expression changes related to progression speed rather than the influence of muscle type. In the cases of the 3 genes evaluated here, this "broader pattern" could reflect an overall increase in inflammatory markers in fast-progressing vs slow-progressing GRMD dogs. However, due to small sample sizes and large standard error values, expression comparisons in the VL-while intriguing-will require larger sample sizes before conclusions can be drawn.

GRMD-associated SNVs and their potential impacts
Next, we set out to use the mapped RNA-seq reads to identify SNVs that are present in the genomes of dogs affected with GRMD but absent in those of unaffected dogs. These identified SNVs should be located in the transcribed regions and some of these SNVs located in coding regions are likely to alter the function of proteins. Using a GATK-based SNV identification pipeline (Materials and Methods), we identified 13,108 GRMD-specific coding SNVs (with minimal read depth = 20). These coding SNVs are composed of 53% synonymous variants, 36% missense variants, 9% frameshift variants, 1% in-frame insertions, and 1% in-frame deletions (Fig 5). A total of 6,066 missense and frameshift SNVs were predicted to have a moderate or high impact on protein function (S3 Table). Altogether, there are 2,693 proteins that contain at least one of these GRMD-specific SNVs.
Of the GRMD-specific coding SNVs, 32 SNVs are located within the 16 of the top 10 upor down-regulated DEGs in any given comparison (such as fast vs. slow progressing GRMD, or GRMD vs. control, at T1, T2, or overall). In 8 cases, multiple SNVs are present within the same gene: ADAMDEC1, ARHGAP36, ENSCAFG00000024944, ENSCAFG00000032099, ENSCAFG00000032358, F5, SMPDL3A, and TRIM22. There are no obvious effects of these SNVs on GRMD phenotype, but future studies are warranted, as the affected genes have roles in metabolism as well as immune response and could potentially influence disease progression.

Discussion
We have evaluated gene expression differences relevant to disease progression speed based on a series of quantitative, objective biomarkers [4]. Many of our findings from comparing GRMD to unaffected control dogs echo those of human and mouse studies, as well as prior array-based GRMD gene expression analyses [e.g., 6, 7]. This reaffirms GRMD as a relevant model for studying DMD. Furthermore, our findings provide evidence of genetic modifiers of GRMD, in addition to those found in our previous genome-wide association study [19].
Perhaps more interesting, however, are findings related specifically to speed of disease progression, which may open new avenues of exploration in the field of GRMD/DMD research. Expression profiling of disease progression in canine model of Duchenne muscular dystrophy When we compared healthy dogs to GRMD dogs, we identified potential genetic biomarkers of disease progression. Most of these genes fall under one or more of three main categories: muscle regeneration and growth, metabolism, and inflammation.
To gain insight into how differences in muscle regeneration could affect disease progression, we took a closer look at the ontology of significant PANTHER pathways specifically in fast-progressing GRMD dogs early in the disease course-the 3-month time point. In those dogs, we found increased expression levels in genes included in the PDGF pathway, which is involved in growth and angiogenesis, and likely related to clearing of necrotic debris and regeneration of damaged muscle. PDGF staining has previously been associated with regenerative fibers in muscular dystrophies, including DMD and mdx [60,61]. These studies showed that once the disease process advanced beyond the regenerative stage to fibrosis, PDGF was no longer observed in dystrophic muscles. In fact, PDGF signaling has been associated with mesenchymal progenitors involved in adipogenesis and fibrosis [62,63]. PDGF signaling was not among the pathways enriched in 6-month-old GRMD dogs, regardless of disease progression speed. This suggests that by age 6 months the rapidly progressing GRMD dogs had reached a point where fibrosis had become more prominent than regeneration, in keeping with the acute onset of CS necrosis in GRMD and relative stability by 6 months. The fact that PDGF signaling was not among the enriched gene ontology pathways for slow-progressing GRMD dogs at either time point would be consistent with less pronounced early necrosis and reduced or slowed accumulation of muscle damage, adipogenesis and fibrosis. These findings provide further support to the idea that blocking PDGF signaling may ameliorate fibrosis in dystrophic muscles, a concept currently being explored as a potential treatment for DMD [64].
While genes with functions related to muscle regeneration were often up-regulated in GRMD dogs, many of the top DEGs related to metabolism were down-regulated in GRMD dogs compared to controls. Down-regulation of those DEGs could contribute to muscle degeneration or failed regeneration, aggravating the ongoing battle to compensate for disease-related muscle loss. Genes that inhibit myofiber differentiation and growth counter the muscle regenerative process. Indeed, genes that promote and inhibit muscle differentiation must be held in balance to avoid either atrophy or hypertrophy [65]. In recent years, there has been considerable interest in blocking inhibitory genes to promote greater muscle regeneration in muscular dystrophy and other muscle wasting disorders [66]. Myostatin (MSTN) is a member of the TGF-β family that acts as a negative regulator of muscle growth [67,68]. Inhibition of MSTN signaling enhances muscle growth and regeneration in the mdx mouse [69][70][71] and both normal [72] and GRMD [73] dogs. The results of previous studies have shown that MSTN is naturally down-regulated by the body in muscular dystrophy, perhaps to promote muscle regeneration [5,7,9]; therefore, we hypothesized that we would find decreased expression of MSTN in the dystrophic dogs, more greatly reduced in the fast-progressing group and especially at the earlier time point (3 months). Indeed, we observed MSTN expression levels reduced over 3-fold in the fast-progressing GRMD dogs compared to control dogs at age 3 months. However, at 6 months there was little difference in MSTN expression between these two groups, and at no point did the slower-progressing dogs show any significant differences in MSTN levels compared to control dogs.
The TCA cycle (also known as the Krebs or citric acid cycle) was a top PANTHER pathway for fast-progressing dogs at the earlier time point and was among the top PANTHER pathways for slow-progressing dogs at the later time point. Relationships between the TCA cycle and muscular dystrophy/GRMD pathogenesis are not well understood. However, studies indicate a link between muscle wasting (as seen in muscular dystrophies) and dysfunctional energy-related pathways. For example, decreased amounts of adenosine triphosphate (ATP), a product of the TCA cycle, have been observed in dystrophic muscle [74,75]. This deficit likely arises in part due to increased intracellular calcium [76] and muscle regeneration, resulting in an increased need for ATP production via the TCA cycle [77]. Furthermore, an increase in the activity of TCA cycle enzymes was observed in several brain and muscle tissues of the mdx mouse, suggesting that mitochondrial dysfunction and oxidative stress are part of mdx pathophysiology [78]. Additional evidence is needed to evaluate whether there is a delay in the onset of oxidative stress and muscle wasting in the slow-progressing dogs compared to fast-progressing dogs.
Inflammation is a typical response to muscle damage, as well as a precursor to the onset of fibrosis. As expected, we identified several immune-related DEGs with functions related specifically to inflammation. Some of these, such as secreted phosphoprotein 1 (SPP1, also known as osteopontin), have been comprehensively described before in the context of DMD, mdx, and GRMD pathogenesis and fibrosis [79][80][81][82]. SPP1 is strongly up-regulated in DMD and mdx [3,10,43,79] as well as GRMD [7, 82] and has been identified as a therapeutic target for DMD [83]. In the current study, we have also confirmed the dramatic up-regulation of SPP1 in GRMD dogs compared to controls (roughly 7-fold increase in expression for both fast-and slow-progressing dogs) using RNA sequencing.
A surprising finding related to inflammation and GRMD was the strong,~5-6-fold up-regulation of CHI3L1 in fast-progressing vs. control dogs. The connection between CHI3L1 and speed of GRMD disease progression is particularly intriguing. The up-regulation of CHI3L1 has been previously linked with fibrosis in other diseases, most notably idiopathic pulmonary fibrosis [84] and liver fibrosis [85]. CHI3L1 expression in normal human muscle cell culture occurs in a differentiation dependent fashion, being more pronounced in myotubes versus myoblasts [86]. The GRMD response, therefore, likely reflects relatively mature muscle regeneration, as would be expected in the CS muscle by 3 months. The significant increase in CHI3L1 in fastprogressing GRMD dogs could indicate that this gene contributes to accelerated disease progression in GRMD by playing a pro-fibrotic role.
A reciprocal relationship has been demonstrated in myotube cultures between CHI3L1 and the inflammatory cytokine TNFα, with CHI3L1 being induced by and also inhibiting TNFαinduced inflammation [86]. The interplay between TNFα and CHI3L1 is mediated through the NF-κB pathway, which is also involved in DMD inflammation [87]. Germane to disordered metabolism in DMD, CHI3L1 also counteracts TNFα-induced insulin resistance in muscle cells [86]. Therefore, CHI3L1 levels could be increased as part of a feedback mechanism.
Some skeletal muscle gene expression profiles for DMD patients have also shown CHI3L1 to be up-regulated compared to healthy controls [8, 10,43], and at least one mdx expression profile has shown a slight increase in the murine homolog to CHI3L1, Chil1, in the skeletal muscle of mdx compared to control mice [88]. Chil1 levels were also slightly increased in the diaphragms of some mdx mice compared to control [44]. However, these findings were neither sufficiently remarkable nor consistent to be discussed in any accompanying literature. In fact, some other expression studies found that CHI3L1 expression levels in DMD patients, and Chil1 levels in mdx mice, were not substantially different from those of controls [35, [89][90][91]. Additional studies are therefore warranted to clarify the nature of the role of CHI3L1 in GRMD inflammation/fibrosis and disease progression and the potential relevance of CHI3L1 in DMD pathogenesis and progression.
The small sample number is a major limitation to this study, but not unusual for a longitudinal study involving a large animal model such as dogs. Furthermore, lack of histological data makes drawing conclusions difficult, as physiological data alone are the basis for categorizing dogs as fast-or slow-progressing without additional evidence for comparing fibrosis levels. The dogs and samples used in this present study have also been used in other studies, in an attempt to retrieve maximal data from a data set that must be as small as reasonably possible. However, given the sample data at hand, this project is particularly valuable in that it provides a number of starting points for future investigations.
Supporting information S1