Discovery of potential urine-accessible metabolite biomarkers associated with muscle disease and corticosteroid response in the mdx mouse model for Duchenne

Urine is increasingly being considered as a source of biomarker development in Duchenne Muscular Dystrophy (DMD), a severe, life-limiting disorder that affects approximately 1 in 4500 boys. In this study, we considered the mdx mice—a murine model of DMD—to discover biomarkers of disease, as well as pharmacodynamic biomarkers responsive to prednisolone, a corticosteroid commonly used to treat DMD. Longitudinal urine samples were analyzed from male age-matched mdx and wild-type mice randomized to prednisolone or vehicle control via liquid chromatography tandem mass spectrometry. A large number of metabolites (869 out of 6,334) were found to be significantly different between mdx and wild-type mice at baseline (Bonferroni-adjusted p-value < 0.05), thus being associated with disease status. These included a metabolite with m/z = 357 and creatine, which were also reported in a previous human study looking at serum. Novel observations in this study included peaks identified as biliverdin and hypusine. These four metabolites were significantly higher at baseline in the urine of mdx mice compared to wild-type, and significantly changed their levels over time after baseline. Creatine and biliverdin levels were also different between treated and control groups, but for creatine this may have been driven by an imbalance at baseline. In conclusion, our study reports a number of biomarkers, both known and novel, which may be related to either the mechanisms of muscle injury in DMD or prednisolone treatment.


Introduction
Oral corticosteroids are the standard of care in Duchenne Muscular Dystrophy (DMD)-an X-linked genetic disease that affects 1 in 4000 to 5000 boys worldwide [1,2]. DMD is caused by mutations in the dystrophin gene located on the short arm of the X chromosome [3,4]. Oral corticosteroid use in young boys with DMD changes the natural history of the disease. One of the largest natural history studies in DMD showed that corticosteroids increased survival in DMD, improved motor outcomes such as continued ambulation, as well as cardiopulmonary outcomes [5]. Despite these unambiguous clinical benefits, there is great variation in the dosage and regimen of corticosteroids, and with compliance to treatment in DMD [6]. Patients and families often discontinue corticosteroids mainly due to unpleasant side-effects including sleep disturbance, weight gain, increased fracture risk, and metabolic syndrome. Identification of non-invasive pharmacodynamic biomarkers associated with glucocorticoid side effects in boys may help predict those individuals who might be at a higher risk for developing corticosteroid-induced side-effects [7].
We have previously identified a metabolic signature consisting of serum biomarkers in DMD [8]. Metabolites that were increased in the serum of DMD patients compared to age-matched healthy controls included creatine, arginine, and unknown compounds with m/z values of 357. 25 and 312.01 Da; metabolites that were decreased in the serum of DMD patients compared to age-matched healthy controls included creatinine, androgen derivatives, and other yet-to-be-identified metabolites. To further develop metabolite biomarkers for DMD, we used the mdx mouse-a widely used pre-clinical model-to agnostically evaluate urine metabolites associated with dystrophin deficiency and/or response to corticosteroids. Urine is an easily accessible bodily fluid, can be collected non-invasively and on multiple time-points, and is becoming an attractive source of biomarker development in DMD [9][10][11][12][13][14][15]. The identification of a palette of pharmacodynamic urine metabolites will allow the identification of metabolic pathways that are affected by the disease pathogenesis and that might be responsive to therapies such as corticosteroids and exon skipping. Such pharmacodynamic biomarkers may also help future drug development and dose finding.

Animals
The experiments using mice described in this paper were conducted according to the institutional guidelines regarding the humane treatment of animals. This study was done in compliance with the National Institutes of Health guidelines for pre-clinical studies as described by [16] and the standard operating procedure for pre-clinical studies [17]. The protocol was approved by the Institutional Animal Care and Use Committee (IACUC) at Children's National Health System (#30424).
Age-matched 5-week-old male wild-type (C57BL/10ScSnJ) and mdx (C57BL/10ScSn-Dmd<mdx>/J) mice litters were purchased from Jackson Laboratory (Bar Harbor, ME). Mice were acclimatized for 4 days prior to commencement of experiments. A total of 48 mice (23 wild-type (WT) and 25 mdx) were used to ensure a rigorous statistical analysis. WT and mdx mice matched for body weight were randomized into treatment with either vehicle control (cherry syrup) or prednisolone (dissolved in cherry syrup) for a total of 4 weeks as previously published [18]. Animals were treated with vehicle or prednisolone (5mg/kg/day) by mouth based on body weight. Animals had access to water and food ad libitum. The study team was blinded to the treatment groups. Unblinding of the different treatment groups occurred after data collection and data processing.

Urine collection
Urine from mice was collected by placing them in clean cages early in the morning. Urine was collected at three different time points: baseline (prior to treatment), 15 days following commencement of treatment, and at completion of treatment at 4 weeks; these are coded as times T0, T1, and T2 in the remainder of this manuscript. We note that at T0, the mice had not received any vehicle cherry syrup. The urine collection had the following caveats: 4 mdx and 2 WT mice died between times T0 and T1, and some mice did not urinate at every time point; one sample was not available for T2 and therefore not analyzed. In total, 48 mice were evaluated at time T0 and 43 unique mice at times T1 and T2, out of which 41 provided measurements at T1 but not T2 and 41 provided measurements at T2 but not T1. The number of mice categorized by genotype, time point, and treatment is provided in Table 1.

LC-MS method
For metabolite extraction, 80 μl of a solution of 50% acetonitrile in water containing internal standards (10μl of 1mg/ml debrisoquine and 50μl of 1mg/ml 4-nitrobenzoic acid added to 10 ml of 50% acetonitrile in water) was added to 20 μl of each urine sample in an Eppendorf vial. The samples were centrifuged at 13,000 rpm for 20 minutes at 4˚C and the supernatant transferred to fresh vials for UPLC-Qtof analysis. A 2μl aliquot from each sample was injected onto a Waters Acquity BEH C18 1.7 μm, 2.1 × 50 mm column using an Acquity UPLC system by Waters Corporation, Milford, MA connected to a quadrupole time of flight mass spectrometer (Waters G2-Qtof). The total run time for each injection was 11 minutes. Metabolites were eluted from the column at a flow rate of 0.5 mL/min using a gradient mobile phase consisting of solvent A-100% water with 0.1% formic acid-and solvent B-100% acetonitrile with 0.1% formic acid. The gradient started at 5% of solvent B and gradually ramped to 20% of solvent B in 4 minutes at curve 6. Then it ramped to 95% of solvent B in 4 minutes and maintained at 95% of solvent B for an additional 9 minutes to wash the column before returning to initial conditions of 95% solvent A and 5% solvent B in 2 minutes. Mass spectrometry analysis was performed by electrospray ionization in both positive and negative mode at a capillary voltage of 3.0 kV and sampling cone voltage of 30 V. The source temperature was set to 120˚C and the desolvation temperature to 500˚C. The cone gas flow was maintained at 25L/hr and desolvation gas flow at 1000L/hr. Leucine-encephalin solution in 50%

Data processing
LC-MS metabolomics samples were preprocessed as previously described [8]. In brief, the XCMS approach [19], available through the xcms package on Bioconductor [20] was used to detect features and estimate intensities, perform retention time correction, and group peaks from different samples, followed by filling in missing peaks via integrating the signal in the peak region defined by the other samples. The resulting peaks were then annotated using the CAMERA package, [21] resulting in the prediction of possible isotopes. Following these initial preprocessing steps, peak groups with fewer than 37 peaks were removed, along with predicted isotopes. This led to 3,674 peaks in the positive mode and 2,663 peaks in the negative mode. Internal standard normalization was then performed, with the intensities in each mode being divided by the corresponding standard and the standards being removed from the list of peaks, leading to a total number of 6,335 peaks. A very small number of peak-sample combinations had missing values, coded as intensities of 0 (51 for the negative mode and 3 for the positive mode); these intensities were replaced by the smallest non-zero intensity for that peak across all samples. The intensities were then log2-transformed and the peaks from both modes were quantile-normalized together [22]. These data are in S1 File.

Principal components analysis (PCA)
Prior to individual metabolite-level statistical analyses, we performed a principal components analysis (PCA) using the peaks detected in both modes and all the urine samples, including the QC samples. Thus, each principal component corresponds to a linear combination of the normalized, log2-transformed metabolite measurements. We considered this approach to better understand the variation in our dataset and identify possible outliers or artifacts, as recommended in [23]. All the QC samples were removed from the downstream analyses following the PCA.

Statistical analyses
Given that the mice had a different diet at time T0 compared to times T1 and T2, the analyses considered were: 1) a comparison of the genotypes at time T0 via a two-sample t-test and 2) a mixed effects linear model for times T1-T2 which included genotype, time, and treatment, as well as all their interactions, with a random intercept to account for the within-mouse correlation. We note that for analysis 1) we did not consider treatment given that all the mice were at baseline, with none being yet treated with prednisolone. Prior to analysis 1), a single peak that had the same value in all samples at T0 was removed, so that this analysis was performed on 6,334 peaks. Levene's test was initially applied to each peak to test whether the variances were equal within the groups defined by genotypes. The percent of null hypotheses, i.e. of peaks that have equal variances was estimated via the qvalue package [24,25] as 82%. We also compared the p-values from two-sample t-tests assuming equal variances and the two-sample t-tests that do not make that assumption (Welch's t-test.) Given that the vast majority of peaks appear to have equal variance between genotype groups and that the results between the two methods are very similar (correlation between p-values > 0.9999), moving forward, we used the results from the analysis with the equal variance assumption. For analysis 2), we focused on the top 50 most significant peaks from analysis 1), with the addition of creatine and creatinine, which are well-known to be important in DMD and were among the top metabolites associated with disease status in previous work [8]. In the mixed effects models, likelihood ratio tests were used to assess the association of genotype, time, and treatment with the transformed peak intensities. To account for multiple testing, statistical significance was determined to be reached if the Bonferroni-adjusted p-values were less than 0.05, when considering all 6,334 peaks, corresponding to a critical p-value of 7.89 × 10 −6 . All analyses were performed using the R statistical programming language. [26] Our code is available at https://github.com/SiminaB/DMD-mdxpred-metabolomics.
For analysis 2), we used the lme function in the nlme package in R [27] to fit linear mixed effects models, using the maximum likelihood approach. The outcome is the normalized, log2-transformed metabolite intensity. For each of the 52 metabolites considered, a "full" model is fit considering genotype, time, and treatment and all the genotype x time x treatment interactions as fixed effects, with a random intercept fit for each mouse, to account for the within-mouse correlation: where i indicates the mouse, t indicates 1 of 2 measurements per mouse, u i indicates the mouse-specific random intercept, and � it indicates the random noise term. We note that u i and � it are both random variables with independent normal distributions and their own variances, which can be denoted by s 2 u and s 2 � . "Null" models are then fit that exclude genotype (includes time, treatment, time x treatment), time (includes genotype, treatment, genotype x treatment), treatment (includes genotype, time, genotype x time). Thus, each null model is nested in the full model. To obtain the association of each metabolite with genotype, the full model is compared to the model that excludes genotype using the anova function in R, which compares the likelihoods of the fitted linear mixed models and reports a p-value from the likelihood ratio test.

LC-MS/MS experiments for identification of top peaks
The top 50 peaks in terms of significant differences between mdx and WT mice at baseline were considered as queries in the Human Metabolome Database (HMDB), looking for possible H+, Na+, and K+ adducts for the positive mode peaks and H-adducts for the negative mode peaks, within 10 ppm mass error. Eleven of these peaks were selected for identification by MS/MS. Most of them were also significantly different between the genotypes longitudinally. Creatine and creatinine were previously identified, so we did not include them in the MS/MS experiments here. The positive mode peaks considered in the MS/MS experiment had the following m/z values: 229. 16 [8], running the samples on Waters G2-Qtof. Pooled QC samples were processed similarly to the profiling experiment and the chromatographic conditions used for data acquisition remained the same. In particular, checks were performed by measuring the accuracy of the metmix (cocktail of standards) peaks which fall under 10 ppm. The metmix is run before the start of the batch and after the end of the batch. The MS/MS spectra obtained were then inspected by matching to the online available spectra from the databases Metlin [28] and HMDB [29] and manually. Further putative identification was performed as follows: The MZXML files were parsed using the pyteomics python library [30]. For each sample, the retention time was used to access the scan of the targeted compound. The mzxml function in pyteomics was used to access the m/z and corresponding intensity information of the retention time. The top 30 peaks with highest intensities were used for compound identification using MS/MS spectral libraries such as Metlin, HMDB, and MONA (http://mona.fiehnlab.ucdavis.edu/). In addition to searching tools provided by these sources, we used MetaboQuest (http://omicscraft.com/MetaboQuest) for spectral matching considering +/-10 ppm mass tolerance for both precursor and fragment peaks.

Principal components analysis (PCA)
The PCA analysis showed that 23% of the variability can be explained by the first principal component (PC1) and 12% by the second principal component (PC2). Each individual remaining principal component explained less than 10% of the remaining variability. Fig 2  shows plots of PC2 against PC1. We note that there appear to be no outlying samples and that all the QC samples cluster together, a good indication of data quality. There are no disjoint clusters based on either genotype (Panel a), time (Panel b), or treatment (panel c), which is a positive indication of the absence of batch effects. However, the baseline samples are somewhat separated from the T1 and T2 samples in PC2, strengthening our decision to consider them in a separate analysis.

Comparison of genotypes at baseline
Out of a total of 6,334 detected peaks, 868 were determined to show significant differences between the mdx and the WT mice (Bonferroni-adjusted p-value < 0.05). Given this large number, for the downstream analyses, we chose to focus on the top 50 most significant peaks, along with creatine (m/z = 132.08) and creatinine (m/z = 114.06), as described above. Results for these peaks are in S2 File, including the m/z values, the mode, the retention time in seconds, the p-value, and the log2 fold change (difference in means between the mdx and WT groups after normalization, log2 transformation). The results from the HMDB query for the top peaks with a window of 10 ppm are also in S2 File. 11 of these 50 peaks were selected for additional annotation and identification via MS/MS. The combined annotation approach using the HMDB+Metlin+MONA+MetaboQuest search yielded one additional annotation for the peak at 234.18, namely 5-hydroxyiminoisocaryophyllene, within a 10 ppm window. The MS/MS spectra for the 11 selected peaks considered in this follow-up experiment are in S1 Fig. Creatine and creatinine are known to be important in DMD and were among the top peaks in our human serum study [8]. We note that the positive mode peak, with m/z = 357.25 (p = 5.11 x 10 -25 unadjusted), appears to be the same as the top peak in our previous work with DMD patients [8]. Interestingly, creatine was not among the top 50 peaks in the mouse model, but ranked 472 and was still deemed significant (p = 4.91 x 10 -8 unadjusted). Creatinine, however, was not significant (p = 0.98 unadjusted). Boxplots representing results for these 3 peaks for the comparison between genotype are shown in Fig 3a-3c. We considered two additional peaks at m/z = 234.18 and m/z = 583.26 which were significant (p = 1.26 x 10 -14 unadjusted and p = 2.31 x 10 -16 unadjusted, respectively) and identified using MS/MS in Fig 3d and 3e.
The peak at m/z = 234.18 was identified as hypusine and the peak at m/z = 583.26 was identified as a geometric isomer of biliverdin in accordance with recently published MS/MS data of biliverdin [31]. For all peaks in Fig 3 besides creatinine, the intensities are significantly elevated in the urine of mdx mice compared to WT mice. In particular, for the peak with m/z = 357.25, this finding recapitulates our earlier findings from the serum of younger individuals with DMD, assuming the two peaks represent the same metabolite. For creatine, older boys with DMD had higher values [8].

Comparison of genotypes, time, and treatment for T1-T2
As discussed above, we focused our analysis at the later time points on the peaks found to have the lowest 50 p-values in the genotype comparison at T0, in addition to creatine and creatinine. This is because we are of the opinion that it is important to focus on peaks that are likely to be indicative of the disease process, with differences seen at an early age. Creatine and all the peaks selected for the MS/MS experiments except for the one at m/z = 370.04 were among Overall, 31 of these 52 peaks were significantly associated with genotype, 6 with treatment, and 17 with time. The overlaps between these categories are summarized in Fig 4. In particular, we note that the peak with m/z = 357.25 was associated with genotype (p = 1.71 x 10 -40 unadjusted) and time (p = 1.99 x 10 -25 unadjusted), but not with treatment (p = 7.11 x 10 -4 unadjusted, p = 1 when adjusting for multiple comparisons). Creatine was associated with genotype, as well as with treatment and time. However, the treatment association for creatine may be driven at least in part by an imbalance at baseline between the WT mice later randomized to prednisolone versus those randomized to the control group Overlaps between the peaks significantly associated with treatment, time, and genotype at times T1-T2. Significance was defined as Bonferroniadjusted p-value < 0.05 for 6,334 peaks. The peaks considered at times T1-T2 were out of the 52 peaks considered, consisting of the top 50 peaks most significantly associated with genotype at T0, plus creatine and creatinine. 19 peaks were not significantly associated with treatment, time, or genotype (bottom right corner).
Once again, the results for the peak with m/z = 357.25, showing a decrease over time in the genotype with the disorder, recapitulate the results from the human serum study [8]. An example of a peak showing treatment differences (p = 6.87 x 10 -9 unadjusted) that did not have a large imbalance at baseline and that, interestingly, showed overall higher values in the mdx versus the WT group is the peak with m/z = 484. 16 We also highlight a peak with m/z = 884.92, which demonstrated a significant decrease over time, as well as a significant association with treatment (p = 1.37 x 10 -8 unadjusted) and a clear separation between mdx and WT mice in Fig 7 (boxplot for time T0 in S3 Fig).

Discussion
Our primary goal in this study was to identify urinary metabolite biomarkers associated with (i) muscle pathology and (ii) prednisone response in an animal model of DMD. We foresee that such information can guide future studies to evaluate biomarkers to predict disease progression and response to corticosteroids and other anti-inflammatory drugs. To our knowledge, this is the first study evaluating urine metabolites in the murine model of DMD. There is a growing body of literature that shows that urine is becoming an attractive biomarker in DMD [9][10][11][12] including that changes in urinary biomarker signatures may inform regarding response to therapy [32]. It is also easily accessible and less invasive, especially in very young boys and can be used to evaluate biomarker responses. The myriad biomarkers reported here are reflective possibly of the different mechanisms that result in muscle injury in DMD. The best-known biomarkers of DMD is creatine kinase, which seeps from muscle into blood and is generally used to identify young boys with this disorder, being many times higher than in healthy individuals [33]. While this an appropriate diagnostic biomarker, we have shown that it decreases with disease progression and muscle loss [34]. We have also previously shown that creatine and creatinine are substantially different in the serum of DMD cases compared to controls, with creatine being higher in the cases and creatinine being lower; these metabolites are both in the creatine metabolism pathway, which includes creatine kinase [8]. Our current work shows an increase in creatine at baseline in the mdx mice over the WT mice, although it tends to decrease over time, whereas in the human serum study, it stayed constant or slightly increased over time in DMD cases and decreased in controls. However, creatinine was not significant in any of the analyses considered. Differences in these results could be due to the biospecimen (serum vs urine), model (human disease vs murine model), or type of study (observational natural history study vs randomized laboratory study). Recent work on urinary metabolites by a different group also found a significant increase of creatine -normalized to urinary creatinine -in mdx compared to WT mice at 3 months, though not at 1 month [15]. Urine metabolite biomarkers for muscle disease and corticosteroid response in the mdx mouse model for Duchenne Particularly interesting is the metabolite corresponding to the peak with m/z = 357.25 (positive mode). This metabolite may have also been the top metabolite in our previous study [8]. Furthermore, it recapitulates the trend of being generally higher in cases than in controls, as well as decreasing over time in cases. While we have been unable to identify it herein, we consider that could be an important key to the pathogenesis of DMD and thus worthy of further investigation.
Two novel observations from our study concern the peaks identified as biliverdin and hypusine, metabolites which were higher in mdx urine compared to WT mice. At times T1 and T2, both metabolites showed associations with genotype and time, with biliverdin also showing an association with treatment. Biliverdin is an end-product in the heme pathway. Heme oxygenase catabolizes heme to equimolar biliverdin, carbon monoxide, and ferrous iron. Heme oxygenase is considered a regulator of oxidative stress and is anti-inflammatory. Second, its product carbon monoxide has vasorelaxant property. Recent studies have provided contradictory information on whether heme oxygenase-1 is decreased [35,36] or increased [37,38] in the muscle of mdx mice compared to controls. We postulate that a pro-inflammatory state seen in mdx mice drives active breakdown of heme, resulting in an increase in biliverdin levels in urine [39,40]. Further investigation of this pathway could thus lead to both an improved understanding of the disease pathogenesis and a validated biomarker to monitor disease progression and response to treatment. Hypusine is an amino acid found exclusively in the eukaryotic translation factor 5A (eIF5A) family [41]. It is important in nonsense-mediated decay [42] but its role in muscle pathology has not yet been established. It is possible that the increase in hypusine is a reflection of ongoing extensive muscle regeneration in mdx mice, especially during early disease stages. It is also known that [43] eIF5A induces inflammatory cytokine cascade and the nitric oxide synthase iNOS and inhibition of eIF5A protects mice from developing diabetes. While the precise role of biliverdin and hypusine in vivo in skeletal muscle in mdx is unknown, we postulate that their levels in urine likely reflect inflammation and regeneration in dystrophin-deficient skeletal muscle.
The detection of a larger number of metabolite biomarkers in positive mode compared to negative mode was most likely driven by the low pH of the eluting solvent used in the LC-MS analysis (e.g. water and acetonitrile containing 1% formic acid), which favors protonation rather than deprotonation of metabolites. Additionally, some metabolites are more easily ionized in positive mode.
We note that our study has a number of important caveats. Most notably, we were unable to identify most of the 11 peaks selected for MS/MS, including the top peak with m/z = 357.25. We also found nearly 14% of peaks to be significantly different at baseline between the mdx and WT genotypes. While somewhat surprising, we focused only on the top 50 peaks with the most significant difference between mdx group and WT group in follow-up analyses and interpretation. Finding similar trends for the peak with m/z = 357.25 and for creatine as in our previous study is reassuring, as is the exploratory PCA analysis which showed the clustering of QC samples.
In conclusion, metabolome profiling of urine samples collected from mdx mice and WT mice enabled identification of a panel of potential metabolite biomarkers that might be associated with muscle pathogenesis. A subset of these metabolites was responsive to prednisolone treatment. These metabolites could prove to be useful biomarkers if evaluated in urine biospecimens of boys with DMD. (TIF) S1 File. Log2-transformed, quantile-normalized metabolite peak data. along with additional covariates. Metabolite peaks are labeled according to their m/z, mode, and retention time in seconds (RT). For the mouse samples considered in the statistical models: "Sample" column indicates mouse label and time point (T0, T1, or T2), separated by "_"; "SampleID" indicates mouse; "Time" indicates time point (T0, T1, or T2); "Genotype" is labeled "mdx23" or "WT"; "Group" is "group 1" (prednisolone) or "group 2" (vehicle control). The pooled quality control samples are labeled "QC" in the "Sample" and "SampleID" columns, "N" in the "Time" column, and NA in the "Genotype" and "Group" columns. (TSV) S2 File. T-test and HMDB query results from HMDB query for top 50 peaks from baseline (T0) genotype analysis. A cutoff of 10 ppm was considered for the database search. The log2 fold change, corresponding to the mean difference between mdx and WT groups, the number of degrees of freedom of the test (DF), the 95% confidence interval (CI) lower and upper bounds, the t-statistic, and the p-value from the t-test are given, along with the rank of the peak and results from the HMDB query. Each row represents a possible HMDB annotation; if no metabolites are found within 10 ppm, the HMDB entries are labeled with "NA." (CSV) S3 File. Results from the linear mixed models for times T1 and T2, applied to the top 50 peaks from the baseline (T0) genotype analysis, along with creatine and creatinine. The first 8 sheets represent fixed effects (FE) estimates for the: intercept; genotype, time, and treatment main effects; genotype x time, genotype x treatment, and time x treatment pairwise interactions; and genotype x time x treatment interaction. They give the coefficient (Beta) estimate, its standard error (SE), the number of degrees of freedom (DF), the 95% confidence interval (CI) lower and upper bounds, the t-value, and the p-value. The next sheet presents the random effect (RE) estimates, which for these models are the random intercept standard deviations (SD). The last sheet presents the results from the likelihood ratio tests (LRT) comparing the full model with models that exclude genotype, time, and treatment, thus testing the associations with genotype, time, and treatment, respectively. The baseline values in the model were WT (for genotype), T1 (for time), and control (for treatment). (XLSX)