NMR metabolomics of cerebrospinal fluid differentiates inflammatory diseases of the central nervous system

Background Myriad infectious and noninfectious causes of encephalomyelitis (EM) have similar clinical manifestations, presenting serious challenges to diagnosis and treatment. Metabolomics of cerebrospinal fluid (CSF) was explored as a method of differentiating among neurological diseases causing EM using a single CSF sample. Methodology/Principal findings 1H NMR metabolomics was applied to CSF samples from 27 patients with a laboratory-confirmed disease, including Lyme disease or West Nile Virus meningoencephalitis, multiple sclerosis, rabies, or Histoplasma meningitis, and 25 controls. Cluster analyses distinguished samples by infection status and moderately by pathogen, with shared and differentiating metabolite patterns observed among diseases. CART analysis predicted infection status with 100% sensitivity and 93% specificity. Conclusions/Significance These preliminary results suggest the potential utility of CSF metabolomics as a rapid screening test to enhance diagnostic accuracies and improve patient outcomes.

Introduction Encephalomyelitis (EM) is a condition characterized by inflammation of the brain (encephalitis) and spinal cord (myelitis) that frequently causes permanent disability. There are myriad causes of EM syndromes, which are in aggregate relatively common [1][2][3][4] and include viral, bacterial, fungal, protozoal and prion infections, autoimmune encephalitis, intoxications, and metabolic encephalopathies, while other EM cases have unknown causes [5]. Clinicians face significant challenges to the rapid and accurate diagnosis and treatment of EM. Due to the rarity of a definitive diagnosis, many arbovirus and other viral causes of EM, including rabies, have limited evidence-based therapies; this may change with newer broad-spectrum antivirals currently in clinical trials [6]. Treatment of autoimmune EM relies on corticosteroids, immunoglobulin, plasmapheresis, cytotoxic agents and biologicals [7,8], which are typically contraindicated until infections can be excluded. Physicians are often forced to treat empirically for infections and delay appropriate therapy for autoimmune EM, thereby worsening patient outcomes. Moreover, for many causes of EM, no rapid diagnostic testing exists, and long delays pending laboratory test results commonly occur before definitive treatment may be initiated; however, superior outcomes depend on early intervention. Because there are numerous causes of EM, including multiple infectious agents that overlap or coincide in geographic distribution, diagnosis reliant on single-target testing is unsatisfactory as it requires quantities of tests that are not only prohibitive in cost but also involve collecting unsafe volumes of blood or cerebrospinal fluid (CSF) from patients. Improved diagnostics and proxy markers of therapeutic efficacy are sorely needed, especially as new treatment regimens develop.
In recent years, the development and expansion of omics technologies have presented opportunities for discovering disease mechanisms and biomarkers of clinical significance [9][10][11]. Metabolomics, the comprehensive study of small-molecule metabolites in a biofluid or tissue, offers a set of clues to the biochemical workings of a body system, organ, or compartment in a given physiological state, and has diverse applications in improving clinical diagnosis and treatment of central nervous system (CNS) diseases and intoxications [9][12][13][14][15][16][17].
Metabolomics panels may also provide information about a broader spectrum of metabolic processes involved in a disease presentation than traditional single-molecule assays. Metabolites present in CSF may originate from brain metabolic processes, including intermediate and end products of energy metabolism, neurotransmission, inflammation and oxidative stress responses; thus, their analysis provides insights into metabolic disturbances occurring in CNS diseases. Among the methodological approaches taken in metabolomics studies of CSF, 1H-NMR spectroscopy carries advantages for exploratory studies both in the scope of metabolite detection and its quantitative ability [18]. An additional advantage of this method is the lack of sample consumption, given practical limitations on the volume of CSF usually available. Further, many CNS diseases and intoxications are prevalent in countries where advanced imaging facilities, reference laboratories and therapeutics are in short supply. Recent studies have applied 1H NMR-based metabolomics of CSF to identify single-molecule biomarkers and panels of metabolites associated with a range of neurological diseases such as infectious meningitis [14], multiple sclerosis (MS) [13,19,20], Alzheimer's [21,22], Parkinson's [23] and Huntington's diseases [24]. Further, this method has detected metabolic changes characterizing different stages of disease progression in rabies and MS [12,25]. Proxy markers of disease progression or response to therapy may also accelerate therapeutic trials while lowering their cost.
Despite significant advances in the application of NMR metabolomics in the investigation of certain CNS diseases, such as multiple sclerosis, its potential to describe metabolic changes occurring in many infectious neurological diseases has been less studied. Lyme disease and West Nile Virus (WNV) are vastly under-studied in this sense, despite being the most common causes of vector-borne bacterial and viral disease, respectively, in the United States [26,27]. Rabies is an important global zoonosis but may be underdiagnosed in some contexts due to challenges in distinguishing it clinically from other CNS infections, such as cerebral malaria, in areas where these are endemic [28]. Infectious diseases that invade the CNS have distinct molecular mechanisms driving their respective pathologies [29,30]. Further, pathogen strategies to replicate while evading host immune responses can involve the disruption of a range of endogenous metabolic processes [31], many of which have yet to be illuminated for specific diseases; thus, exploratory studies of the CSF metabolome in different disease states can provide an important window for examining potential pathogen effects on metabolism within the CNS and lay the groundwork for future targeted diagnostics or therapeutic interventions. In the present study, CSF samples from patients representing diverse infectious and non-infectious diseases of the CNS were analyzed by 1H NMR spectroscopy to determine whether metabolomics profiling could distinguish diseases. We find preliminary evidence of the existence of discriminating metabolic features.

Subjects
Twenty-seven patients were diagnosed with CNS Lyme disease (n = 5, all ages, at the New York State Department of Health), WNV meningoencephalitis (n = 5, all ages, New York State Department of Health), Clinically Isolated Syndrome (CIS) of multiple sclerosis (MS, n = 4, adults, Intermountain Healthcare), rabies (n = 10, all ages, at Canadian Food Inspection Agency, Centers for Disease Control and Prevention, Kimron Veterinary Institute, National Institutes of Health-Colombia, and New York State Department of Health), or Histoplasma meningitis (n = 3, anonymous, at Indiana University School of Medicine). Due to ethical concerns surrounding the collection of CSF from healthy individuals, healthy controls were not available for this study. Specimens obtained as discard material from 25 anonymous children aged 5-20 years at the Children's Hospital of Wisconsin with no concurrent microbiological testing and no known encephalopathy or encephalitis served as a control group. This population includes mostly children with cancer in remission or children being treated for pseudotumor cerebri, a common non-inflammatory condition. Given that these samples were anonymous discard material, the study was ruled not to be human research requiring informed consent by the Children's Hospital of Wisconsin IRB (protocol CHW 10/24). For rabies patients, for whom multiple specimens were available, the specimen taken closest to the fourth day of hospital admission was selected to minimize the influence on the CSF metabolome of hypoglycemia, ketosis or renal insufficiency at presentation. Although the CSF was collected for diagnostic purposes, the precise timing of collection is uncertain other than for rabies patients. Initially, four specimens from patients with Histoplasma meningitis were analyzed, but one specimen had a metabolite profile inconsistent with CSF and was excluded on the basis of containing implausible values. Three Histoplasma specimens remained after this exclusion.

Storage and preparation of CSF samples
After collection, specimens were stored refrigerated and/or frozen until transport on dry ice to the site of analysis, where they were stored at -80˚C until sample preparation. Once thawed, samples were filtered using washed Amicon Ultra-0.5 mL centrifugal filters with a cut-off of 3000 MW (Millipore, Billerica, MA) to remove lipids and proteins. When needed, filtrate volume was adjusted to 207 μL when preparing for 3 mm NMR tubes or 585 μL when preparing for 5 mm NMR tubes with Type I ultrapure water from a Millipore Synergy UV system (Millipore, Billerica, MA). Samples were prepared for analysis by the addition of 23 μL or 65 μL of internal standard containing approximately 5 mmol/L of DSS-d6 [3-(trimethylsilyl)-1-propanesulfonic acid-d6] and 0.2% NaN3 in 99.8% D2O to 207 μL or 585 μL of CSF filtrate, respectively. The pH of each sample was adjusted to 6.8 ± 0.1 by adding small amounts of NaOH or HCl. A 180 or 600 μL aliquot was subsequently transferred to 3 mm or 5 mm Bruker NMR tubes, respectively, and stored at 4˚C until NMR acquisition (within 24 hours of sample preparation). NMR spectra were acquired as previously described [12] on a Bruker Avance 600-MHz NMR equipped with a SampleJet autosampler using a NOESY-presaturation pulse sequence (noesypr) at 25˚C.
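The dilution introduced by the internal standard follows directly from the volumes given above. The sketch below is illustrative only: the function names and the example concentration are hypothetical, and it assumes that addition of internal standard is the only dilution step (any water used to adjust the filtrate volume would need its own factor).

```python
def dilution_factor(filtrate_ul: float, standard_ul: float) -> float:
    """Ratio of final volume to CSF filtrate volume."""
    if filtrate_ul <= 0:
        raise ValueError("filtrate volume must be positive")
    return (filtrate_ul + standard_ul) / filtrate_ul


def correct_for_dilution(measured: float, filtrate_ul: float, standard_ul: float) -> float:
    """Scale a concentration measured in the NMR tube back to the filtrate."""
    return measured * dilution_factor(filtrate_ul, standard_ul)


# Volumes from the preparation protocol (microlitres).
factor_3mm = dilution_factor(207, 23)  # 230 / 207
factor_5mm = dilution_factor(585, 65)  # 650 / 585

# Hypothetical example: 90 umol/L measured in a 5 mm tube.
corrected = correct_for_dilution(90.0, 585, 65)
```

Both tube formats give the same factor of 10/9, so under this assumption a single correction applies regardless of tube size.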

Data analysis
NMR spectra were manually phased and baseline-corrected using NMR Suite v6.1 Processor (Chenomx Inc., Edmonton, Canada), and Chenomx NMR Suite v.8.1 Profiler (Chenomx Inc., Edmonton, Canada) was used for quantification of metabolites. Selected NMR spectral data from a previous rabies study in this lab [12] were compared to additional samples acquired from Lyme, WNV, histoplasmosis and MS patients.
After correcting metabolite concentrations for dilution, data were cluster-analyzed in two ways for comparison using RStudio software (RStudio Version 1.0.136, Boston, MA, USA) or Stata software (SE 14, College Station, TX, USA). First, taking a data-driven approach, concentrations were log10-transformed before principal component analysis (PCA) was carried out on the covariance matrix of the centered data as an unsupervised search for trends. Alternatively, to provide clinical context, data were normalized to z-scores using published reference ranges in CSF (www.hmdb.ca). In instances when published norms were discrepant, those that encompassed the range of our control population were selected. In rare instances when normal ranges were unavailable, means and standard deviations were constructed using our 25 controls. Normalization by z-scores constructed from population norms generated more skewed data than log10-transformation across the entire spectrum of diseases and controls. Factor analysis better tolerates skewed data than PCA and was applied to the z-scores.
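The two normalization routes described above can be sketched in a few lines. This is a minimal illustration using NumPy rather than the RStudio or Stata workflows actually used; the function names are hypothetical, the input is assumed to be strictly positive (the handling of zero concentrations is not described here), and the reference mean and standard deviation are assumed to have already been derived from the published ranges.

```python
import numpy as np


def pca_covariance(concentrations: np.ndarray, n_components: int = 2):
    """PCA on the covariance matrix of centered, log10-transformed data.

    Rows are samples, columns are metabolites; values must be > 0.
    Returns (scores, loadings, fraction of variance explained).
    """
    logged = np.log10(concentrations)
    centered = logged - logged.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]       # re-sort to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]
    scores = centered @ loadings
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, loadings, explained


def reference_z_scores(concentrations, ref_mean, ref_sd):
    """z-scores relative to an external reference mean and SD."""
    return (np.asarray(concentrations) - ref_mean) / ref_sd


# Synthetic data only: 20 samples by 5 metabolites.
rng = np.random.default_rng(0)
demo = 10 ** rng.normal(size=(20, 5))
scores, loadings, explained = pca_covariance(demo)
```

Using the covariance rather than the correlation matrix means metabolites with larger variance on the log scale carry more weight in the components.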
Based on the separation found by PCA and factor analysis, differences in metabolite concentrations by infection status and by individual disease diagnoses were assessed on the untransformed data using Mann-Whitney U tests and Kruskal-Wallis tests, respectively. P-values were adjusted for multiple comparisons using false discovery rates. Homogeneity of variance between groups was tested using the Levene test to inform interpretation of the rank sum test results. For metabolites with significant differences by Kruskal-Wallis testing, Dunn's multiple comparisons tests were performed between each pair of groups to determine which diseases were different from each other. For these tests, p-values were Bonferroni-adjusted within the 15 multiple comparisons carried out for each metabolite. After adjustment, p-values of less than 0.05 were considered significant. Cliff's Delta statistics [32] were calculated to assess the degree of overlap in metabolite concentrations by infection status and between diseases that were found to have significant differences by the Dunn's test.
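The adjustment and effect-size steps above can be illustrated with a short sketch. Several points are assumptions: the false discovery rate procedure is taken to be Benjamini-Hochberg, which the text does not specify; the figure of 15 comparisons is taken to be the number of pairs among six groups (five diseases plus controls); and all function names are hypothetical.

```python
from itertools import combinations


def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def benjamini_hochberg(p_values):
    """FDR-adjusted p-values (Benjamini-Hochberg step-up; an assumption)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bonferroni(p_values):
    """Bonferroni adjustment within one family of comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


groups = ["control", "Lyme", "WNV", "MS", "rabies", "Histoplasma"]
n_pairs = len(list(combinations(groups, 2)))
```

A Cliff's delta of 1 or -1 indicates no overlap between two groups, while 0 indicates complete overlap, which is why it complements the rank-based tests.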
Untransformed data were also analyzed by predictive analysis [33,34]. Classification and regression trees (CART) and Random Forests were performed using Salford Predictive Modeler software suite CART and suite Random Forests (Salford Systems, San Diego, CA, USA). For CART, the minimum numbers of cases in parent and terminal nodes were 10 and 5, respectively, and 10% leave-out samples were used for cross-validation. Random Forests are collections of decision trees, and each tree was grown on a random (~2/3) subsample of the data. The remaining data were used to determine the performance of the trees. The number of trees built was 1000. The number of predictors considered at each node was the square root of the number of potential predictors, and the minimum number of cases in a parent node was 2. Variable importance was assessed using the Gini method. Target variable and predictors were the same as for CART.
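To make the node-size settings concrete, the sketch below shows how a single CART-style split on one predictor might be chosen by Gini impurity. It is not the Salford implementation: the function names are hypothetical, only one predictor is searched, and pruning and cross-validation are omitted.

```python
def gini_impurity(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels, min_parent=10, min_child=5):
    """Best binary split on one predictor, or None if no split is allowed.

    Returns (threshold, weighted child impurity). A node with fewer than
    min_parent cases is not split, and each child keeps >= min_child cases.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    if n < min_parent:
        return None
    best = None
    for i in range(min_child, n - min_child + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini_impurity(left)
                 + len(right) * gini_impurity(right)) / n
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Toy data: one hypothetical metabolite that separates two classes.
demo_split = best_threshold(list(range(1, 11)), [0] * 5 + [1] * 5)
```

In a Random Forest the same search would be repeated at each node over a random subset of predictors, which is where the square-root rule applies.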

Results
CSF samples obtained from 25 controls and 27 patients with different neurological diseases were analyzed by 1H-NMR spectroscopy. Table 1 summarizes clinical characteristics of patients included in this study. A total of 57 compounds were identified and quantified in CSF samples; rabies spectra from a prior study [12] were repeat-profiled. Quantification for 13 metabolites present at very low concentrations in a majority of samples was considered not to be exact (S1 and S2 Tables) but still useful in detecting differences between groups. To further minimize the reversible behavioral effects of starvation and dehydration in the analysis normalizing by z-scores, we excluded 3 ketone bodies (3-hydroxybutyrate, acetoacetate, and acetone) and creatinine from the dataset.

Metabolite profiles by infection status
A major clinical challenge is determining whether infection exists as a contraindication to immunosuppression. Unsupervised PCA was performed on metabolite data from patients diagnosed with a neurological disease and controls. Six compounds (acetaminophen, ethanol, ethylene glycol, glycerol, propylene glycol, and valproate) of likely exogenous origin were excluded from cluster analysis models. The first two principal components (PC) in this model accounted for 37.8 percent of the variation in metabolite concentrations. Prominent overlap was apparent between controls and MS, which separated distinctly from infectious diseases along PC 1 (Fig 1). In a scores plot of the first two components, PC 2 identified an apparent outlier in the WNV group, which upon closer examination was observed to have extremely low levels of citrate, lactate, and amino acids coupled with markedly high glutamate, pyruvate, acetate and 2-oxoglutarate compared to the rest of the samples. Since the general patterns generated by PCA did not change when this individual was removed from the dataset, the results shown in Fig 1 reflect this exclusion in order to better visualize clusters in the data. When overlaid with loadings vectors, the scores plot of the first two PCs revealed two patterns of metabolites among infectious diseases, one characterized by higher levels of ketone bodies and the other by higher levels of pyruvate, glutamate, 2-oxoglutarate, carnitine, and glycine (Fig 1). Pearson correlation coefficients reflect moderate to high correlation among the metabolites in each pattern, with correlation coefficients ranging from 0.77 to 0.93 among ketone bodies and from 0.23 to 0.58 among metabolites in the second pattern. In contrast, metabolites including acetate, isobutyrate, myo-inositol, threonine, and glutamine appeared to characterize controls and MS using loadings vectors.
The contribution of ketone bodies to the PCA analysis prompted a second, clinically applicable analysis using z-scores of normal human values for each metabolite while excluding the potentially non-specific markers of dehydration and starvation, which yielded similar results. Unsupervised factor analysis discriminated CNS disease from controls, with 2 factors accounting for 35.6 percent of the variation. The WNV sample that appeared as an outlier by PCA was not influential in this analysis. Factor analysis excluding ketones and creatinine did not discriminate infections from normal as well as did the PCA analysis.
Given the graphical separation by infection status shown by PCA and factor analysis, Mann-Whitney U tests were performed to test for differences in metabolite concentrations between patients with an infectious CNS disease and those with no CNS infection (MS and controls). All metabolites were included. These results are summarized in Table 2. After correcting for multiple comparisons, significant univariate differences were detected in the concentrations of 29 compounds; these included several metabolites that appeared to drive separation in the PCA (ketones, pyruvate, carnitine, and glycine). Median concentrations of glutamate and 2-oxoglutarate were significantly higher in infectious diseases than patients with no infectious disease, and there was a trend towards higher citrate concentrations in the infectious disease group (p = 0.07). Also, in agreement with the PCA results, median concentrations of isobutyrate, fructose, N-acetylneuraminate, and serine were higher in the noninfectious disease group, and acetate exhibited different distributions between the groups. In a similar univariate analysis on z-scores for 43 variables, nine metabolites were identified ( Table 2, among bolded metabolites), all of which were also identified using the previous method.

Metabolite profiles by CNS disease
While CNS infections overlap as a syndrome, they are caused by viruses, bacteria, fungi, protozoa and prions that require different therapies. We therefore evaluated PCA discrimination within CNS diseases without the influence of controls. In the resulting model, PC1 and PC2 cumulatively accounted for 38.9 percent of the variation, and when loadings vectors were overlaid with PC scores, the resulting Gabriel's biplot revealed the most important metabolites to be ketone bodies, glutamine, glutamate, and threonine. In a scores plot of the first two PCs, moderate separation by disease diagnosis pointed to differential as well as overlapping metabolic patterns among diseases (Fig 2), which were further dissected in additional analyses and are summarized in Tables 3 and 4. After removing ketones and creatinine, factor analysis of z-scores did not cleanly separate disease groups.
After correcting for multiple comparisons, Kruskal-Wallis tests on untransformed data detected significant differences among diseases and controls in the concentrations of 31 metabolites. Metabolites and diseases for which concentrations were significantly different from control samples according to Dunn's multiple comparisons tests are shown in Table 4. In particular, the CSF of WNV patients had markedly higher concentrations of pyruvate (p = 0.0008) and formate (p = 0.0005), and Lyme disease and WNV patients shared higher levels of formate and glycine compared to controls. Rabies patients had significantly different concentrations of energy-related metabolites including ketone bodies, lactate and 2-hydroxybutyrate, some of which were also elevated in WNV but not in histoplasmosis or Lyme disease.

Predictive analyses by infection status and CNS disease
CART analysis differentiated infection status with 100% sensitivity and 93% specificity (Table 5). High pyroglutamate alone discriminated WNV, Lyme and histoplasmosis from controls. MS or rabies could be identified from controls with 100% sensitivity and 76% specificity by high 2-hydroxybutyrate or low 2-hydroxybutyrate and high carnitine. Random Forest analyses confirmed the importance of the majority of metabolites identified by CART.

Discussion
NMR metabolomics distinguished infectious and inflammatory disorders using laboratory-confirmed samples of 5 disorders, 2 approaches to normalization of the data, and 2 unsupervised cluster analytical approaches. CART decision analysis easily differentiated bacterial (Lyme), fungal (Histoplasma) and viral (WNV) causes of encephalomyelitis from controls. Decision analysis also differentiated rabies and the prodromal form of MS from controls, while separation by cluster analyses was incomplete between MS and controls. Notably, the greatest source of variation in metabolomics data found by PCA was the presence or absence of an infectious pathogen. If replicated, this finding is of paramount clinical impact because treatments for infections are almost the polar opposite of those for autoimmune diseases. There was also substantial agreement in the identification of influential metabolites between different approaches to data normalization and reduction and predictive approaches, including CART and random forest analysis. Metabolites driving separation in PCA (pyruvate, glutamate, quinolinate, 2-oxoglutarate, carnitine, and glycine) potentially suggest alterations in energy metabolism, excitotoxicity and antioxidant response. Patterns of these metabolites were not uniform.
Rather, overlapping as well as distinguishing metabolic features were seen, highlighting the potential utility of measuring a suite of metabolites rather than searching for individual metabolic biomarkers for diseases, which may not exist. Overlap of profiles makes strong clinical sense given that EM syndromes overlap in signs and symptoms. The overlap also supports a clinical rationale for syndromic metabolic therapies across a range of infectious or autoimmune causes of EM. Distinguishing features provide promise of rapid, relatively specific diagnoses that enable prompt pathogen or process-directed therapies. Significant differences by disease group were found in the CSF concentrations of several metabolites known to be involved in the synthesis of the antioxidant glutathione (GSH) and related pathways, including glycine, formate, pyroglutamate, and 2-hydroxybutyrate. The transsulfuration pathway links the methylation cycle of one carbon metabolism to GSH synthesis and produces 2-hydroxybutyrate as a secondary byproduct during the conversion of cystathionine to cysteine [35,36]. Formate, an endogenous and bacterial metabolite that along with glycine was found at significantly higher levels in WNV and Lyme disease patients compared to controls in this study, is formed as a byproduct in several pathways including the tryptophan kynurenine pathway [37], pterin metabolism [38] and protein demethylation (following hypermethylation by S-adenosyl-L-methionine [39]), while it is also consumed in the folate cycle during the conversion of tetrahydrofolate (THF) to 10-formyl-THF [40]. An end product of purine catabolism, neopterin, has been found to be elevated in patients with rabies [41], Lyme disease, and other neuroinfections, while remaining low in MS and other neuroinflammatory conditions [42]. 
Pyroglutamate, which converts to glutamate before being incorporated into GSH and also activates amino acid transport systems at the blood brain barrier [43], was higher in histoplasmosis, Lyme disease and WNV and was an important predictor distinguishing these conditions from control samples. Given individual metabolites can participate in a number of biochemical pathways, further studies are required to parse out the mechanisms at play in the diseases studied here. A likely interpretation is that infection or inflammation in the CNS is associated with redox imbalances including glutathione metabolism and NADH/NAD + ratios. It is of particular interest that these metabolites may profile mechanisms leading to insulin resistance and vascular disease [36], given that low dose insulin therapy was added to the Milwaukee protocol, version 4, with statistical improvements in survival [44].

Our analytical design sought to minimize the effects of starvation/ketosis and dehydration/ uremia on the metabolic profile of rabies by prioritizing rabies samples taken four days after admission. Nevertheless, PCA analysis identified the importance of ketone bodies in identifying rabies. Factor analysis that deliberately excluded primary ketones, urea and creatinine from analysis still identified isopropanol and methanol (Table 3), both downstream metabolites of ketones, as discriminators of rabies. RF and CART analyses also identified ketones and carnitine (fatty acid oxidation) as predictors of rabies but not other infections (Table 5). Despite our experimental design, CNS ketosis may be a valid indicator of rabies encephalitis.
This study was originally intended to further explore the specificity of NMR metabolomics for the diagnosis of rabies, which is often confused with Guillain-Barre syndrome, acute psychosis and N-methyl-D-aspartate receptor (NMDAR) encephalitis and currently requires multiple tests for diagnosis at remote reference laboratories. Our findings suggest that the utility of the approach may instead lie in excluding competing diagnoses, many of which are more treatable. NMR metabolomics performed on a par with current rabies diagnostics (100% sensitivity, 76% specificity) and is likely complementary (particularly after 5 days). When restricted to the first week of hospitalization with rabies (when most patients die), NMR metabolomics did not perform as well as for other infections; gene expression studies of rabies CSF and detection of rabies-specific antibodies also performed poorly in the first week. Rabies can clearly be delineated from controls by NMR at later time points, and NMR of CSF also measures recovery [12]. The promise of an NMR metabolomics profile as a proxy marker for therapeutic response would be welcome for rabies, WNV, NMDAR encephalitis or acute disseminated encephalomyelitis for which efficacious treatments remain undefined.
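The sensitivity and specificity figures quoted for the decision trees are simple ratios from a confusion matrix. The sketch below illustrates the definitions only. The counts are hypothetical, chosen to be consistent with the group sizes in Subjects (23 infected patients; 29 without CNS infection, i.e., 25 controls plus 4 MS) and with the reported percentages; the actual classification counts are those in Table 5 and are not reproduced here.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased cases correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased cases correctly identified."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for infection status (assumption, not Table 5 data).
infection_sens = sensitivity(true_pos=23, false_neg=0)
infection_spec = specificity(true_neg=27, false_pos=2)

# Hypothetical counts for MS or rabies versus the 25 controls.
ms_rabies_spec = specificity(true_neg=19, false_pos=6)
```

With groups this small, each misclassified control shifts specificity by several percentage points, which is one reason the estimates should be regarded as preliminary.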
This study is exploratory and is limited by the number of samples available for CNS diseases of rare incidence. The possibility of confounding effects of age, sex, disease stage, or other acute variations in metabolic processes should be considered in interpreting these results. Our control group was aged 5-20 years, while ages in the disease group ranged from 4 to 83 years. However, we confirmed that the distribution of metabolites of our controls overlapped with adult norms reported by the Human Metabolome Database (www.hmdb.ca). Further, clear inter-disease differences within groups of adult diseases (MS, WNV) were evident in PCA (Fig 2), suggesting disease was much more influential in driving variation than was age. Sensitivity analyses in rabies in a larger dataset [12] did not identify meaningful age differences, although we cannot exclude the possibility that this might occur for other inflammatory diseases of the CNS. Another potential source of confounding is the timing of sample collection, which was not precisely known for samples other than rabies. All forms of encephalitis are treated empirically upon hospitalization, so early diagnostic samples such as those analyzed here may reflect early empirical therapies that often overlap (e.g., rehydration, provision of glucose, use of antibacterials, sedation) but may also differ between diseases. Our choice of rabies samples centered on the fourth day of hospitalization was intended to minimize effects of dehydration and malnutrition, but may have biased rabies samples toward normality. Finally, differences in some metabolites should be interpreted with caution, since low concentrations in some specimens precluded exact quantification (carnitine and glycine), which may have artificially led to statistical differences. Other metabolites (glutamine and pyroglutamate) are potentially affected by protein removal [45], although this has not been shown in CSF.
This study provides justification for further analysis of samples from these and other causes of encephalomyelitis. Several prominent and as of yet unidentified peaks observed in the spectra of some patients may indicate the presence of important metabolites involved in disease pathogenesis that have not yet been elucidated. While further studies with larger sample sizes will be needed to determine the clinical utility of NMR in the diagnosis of EM, NMR or other 'omics technologies may in the future serve as a rapid initial screening test that would allow medical practitioners to initiate treatment with antivirals or biological immune modifiers, while patient samples can then be triaged to appropriate reference laboratories for confirmation without delaying treatment. Rabies and many arbovirus reference laboratories require specialized containment facilities, immunization of laboratory workers, and highly trained personnel who perform subjective assays such as immunofluorescence. Reference laboratories for rabies, arboviruses, bacteria and fungi are often dispersed geographically, leading to substantial requirements in volume, delay, and cost for diagnosis of encephalomyelitis when all are considered. NMR and MS instruments, on the other hand, exist at most research universities, i.e. at a state or provincial rather than national level. NMR analytical procedures are easily standardized and permit detection of multiple diseases using a single experiment, as illustrated here. NMR spectra can be transmitted electronically for analysis, which can be automated [46]. Decision analytical approaches such as CART and RF offer diagnostic flow charts that are easily implemented once validated, with quantifiable diagnostic probabilities. Considering current challenges, its relative ease of use makes NMR metabolomics of CSF a potentially important tool for emergent diseases and distinguishing between autoimmune and infectious EM.
Supporting information S1