Metabolomic Method: UPLC-q-ToF Polar and Non-Polar Metabolites in the Healthy Rat Cerebellum Using an In-Vial Dual Extraction

Unbiased metabolomic analysis of biological samples is a powerful and increasingly common tool, especially for the analysis of biofluids to identify candidate biomarkers. To date, however, only a small number of metabolomic studies have examined the metabolite composition of tissue samples. This is due in part to a number of technical challenges, including scarcity of material and difficulty in extracting metabolites. The aim of this study was to develop a method for maximising the biological information obtained from small tissue samples by optimising sample preparation, LC-MS analysis and metabolite identification. Here we describe an in-vial dual extraction (IVDE) method, combined with reversed phase and hydrophilic interaction liquid chromatography (HILIC), which reproducibly measured over 4,000 metabolite features from as little as 3mg of brain tissue. The aqueous phase was analysed in positive and negative modes following HILIC separation, in which 2,838 metabolite features were consistently measured, including amino acids, sugars and purine bases. The non-aqueous phase was also analysed in positive and negative modes following reversed phase separation, from which 1,183 metabolite features were consistently measured, representing metabolites such as phosphatidylcholines, sphingolipids and triacylglycerides. The described metabolomics method includes a database of 200 metabolites with retention time, mass and relative intensity, and presents the basal metabolite composition of brain tissue in the healthy rat cerebellum.


Introduction
The brain is the centre of the nervous system in all vertebrates, and is responsible for controlling bodily functions ranging from walking and talking to heart rate and endocrine function. In addition, diseases of the brain and central nervous system represent a major cause of global morbidity and mortality, with over 600 recognised neurological diseases [1], including developmental disorders such as Down syndrome and autism spectrum disorders [2][3], seizure disorders such as epilepsy [4] and neurodegenerative disorders including Alzheimer's and Parkinson's diseases [5][6][7][8]. Despite the importance of the brain and the pathological burden associated with it, its mechanisms remain relatively poorly understood, and it is hoped that a better understanding of cerebral metabolism will help to address this. Arguably, the biggest challenges of working with both human and animal brain tissue are twofold: firstly, the small amount and preciousness of sample material, owing to its inaccessibility; and secondly, the reproducible extraction of metabolites from the tissue. These obstacles make the development of analytical approaches that maximise the number of metabolites that can be reproducibly measured from small tissue samples an important challenge.
Metabolomics is the unbiased analysis of the composition of small molecule metabolites in a given biological tissue or fluid, under a specific set of environmental conditions [9][10]. Due to the wide range of concentrations at which these metabolites are present and their diverse physicochemical properties, it is challenging to obtain a comprehensive analysis of all metabolite classes using a single method [11][12][13][14]. Therefore, many metabolomic approaches that aim to maximise metabolite coverage utilise a combination of analytical platforms, including liquid chromatography-mass spectrometry (LC-MS), nuclear magnetic resonance (NMR) and gas chromatography-mass spectrometry (GC-MS) [11][15][16]. These multi-platform approaches measure metabolites with a wide range of concentrations and physicochemical properties; however, the downside of increasing metabolite coverage in this way is a significant increase in the amount of tissue required.
LC-MS is one of the most widely used analytical techniques for metabolite fingerprinting and has been used to analyse a range of metabolite classes in a variety of biological matrices [17][18][19][20]. One of the major advantages of this approach is that it separates complex sample mixtures into their constituent components prior to mass spectral analysis. Separation enables the discrimination of some isobaric compounds, which mass spectrometry alone cannot achieve. It also helps to reduce matrix effects in the ionisation chamber, such as ionisation suppression, in which different components of the matrix compete to be ionised, resulting in a suppressed metabolite signal and incorrect metabolite quantitation [21][22][23][24]. However, one important limitation is that the physicochemical properties of metabolites are diverse, and a single chromatographic technique cannot separate thousands of metabolites. For example, reversed phase chromatography will separate non-polar metabolites such as lipids, but will not separate polar compounds such as amino acids [13]. This means that all of the polar metabolites will co-elute at the start of the chromatogram, with many not being measured correctly due to ion suppression. As a result, multiple chromatographic separation techniques are required to achieve broad coverage of the metabolome. Sample preparation for LC-MS metabolite fingerprinting usually involves a solvent-based protein precipitation (usually with methanol, ethanol or acetonitrile) [25] to reduce surface adsorption and protein-metabolite interactions. Different chromatographic conditions require distinct sample preparations, increasing analysis time, analytical variability and the amount of sample material required. A further major obstacle in metabolomics is metabolite identification: measured metabolite features need to be translated into chemical identities, that is, metabolites that can give biological information.
Metabolite annotation has repeatedly been identified as a significant bottleneck in untargeted mass spectrometry workflows [26][27]. Several challenges make metabolite annotation difficult. The first is that there are up to an estimated 200,000 distinct metabolites [10], less than 50% of which have been structurally identified. Many metabolites, especially esoteric compounds, have unknown structures, so complete identification can only be achieved by compound synthesis; in addition, sharing of in-house databases is unusual. Secondly, whilst fragmentation patterns are used for identification, their interpretation requires expertise and good quality fragmentation is not always possible.
To date there have been a number of metabolomic studies that have looked at the metabolite composition of brain tissue. Salek et al. [28] used 1H-NMR to measure the metabolite composition in the hippocampus, cortex, frontal cortex, midbrain and cerebellum of CRND8 mice, identifying 23 metabolites from tissue samples ranging in mass from 10-50mg. In humans, brain tissue is in short supply and to date only small numbers of samples (n = 10-15) have been profiled, using reversed phase fingerprinting. However, two groups were able to make important contributions. Graham et al. [29] used 5g of human post mortem brain and UPLC-ToF to develop a method that detected 1,264 metabolic features, with 10 features shown to be correlated with Alzheimer's disease (AD). Koichi et al. [30] also applied UPLC-ToF metabolomics to human brain and found spermine and spermidine to be increased in AD pathology.
Therefore, this study aimed to measure both polar and non-polar metabolites from a single small sample of brain tissue. For this, HILIC and reversed phase (RP) methods were investigated together. A further aim was to provide the means for metabolite identification alongside the method. The data generated represent the basal metabolome of the rat cerebellum, which can be applied in clinical investigations.

Samples
Experimental tissue material was obtained from the cerebellum of adult male Sprague-Dawley rats obtained from Harlan Laboratories UK. The animals were euthanised in the Biomedical Services Unit, King's College London, by inducing carbon dioxide (CO2) anoxia followed by cervical dislocation, as per Schedule 1 of the Animals (Scientific Procedures) Act 1986. All animal procedures were approved by local animal welfare and the Ethics Review Body (King's College London). The cerebellum was isolated according to the Springer protocol for the dissection of rodent brain regions [31]; samples were weighed and subsequently stored at -80°C. The cerebellum was sectioned on sterile glass slides (Thermo Scientific, Menzel-Gläser slides) using a sterile scalpel. Both scalpel and slide were cooled in liquid nitrogen to reduce sample thawing during sectioning. Sectioned tissue samples were transferred to Eppendorf tubes containing a clean, pre-cooled, 5mm stainless steel ball bearing.

Experimental Design
In this study two primary experiments were performed to assess the precision and sensitivity of the IVDE, instrument methods and tissue homogenisation, as well as to determine the effect of sample mass on metabolite recovery. The first experiment was designed to assess the combined variability of the IVDE and instrument methods. This was done by homogenising a single piece (18mg) of rat cerebellum, removing sample mass and tissue homogenisation as sources of variability. The homogenate was split into 7 aliquots of 50μl, which underwent parallel extractions prior to injection on both HILIC and reversed phase methods (Fig 1A). The second experiment was designed to assess the effect of the mass of tissue extracted and of tissue homogenisation on method sensitivity and precision. Four Sprague-Dawley rat brains were obtained and this material was used to perform Experiment 2. To do this, 15 tissue samples ranging from 3-17mg were homogenised and extracted in parallel prior to analysis (Fig 1B). Sensitivity was assessed in terms of the number of metabolite features that were routinely detected, whilst precision was assessed in terms of the variability (coefficient of variation) of the abundance of internal standard and metabolite peaks, as well as the degree of compositional similarity between samples as determined by principal component analysis (PCA). A graphical description of the analytical workflow used in this study is shown in Fig 2.

Tissue homogenisation
Prior to homogenisation, 20μl of methanol and 5μl of HILIC internal standard cocktail (2.5mM L-serine 13C3 15N and L-valine 13C5 15N in methanol:water (4:1)) were added per milligram of sample material. The tissue was then homogenised using a TissueLyser (Qiagen) in 10 cycles of 30 seconds at 25 Hz. Subsequently, a 50μl aliquot of homogenate was transferred to a Chromacol HPLC vial (400μl fixed insert).

In-vial dual extraction of brain tissue
First, 10μl of water was added to the homogenate and vials were vortexed for 5 minutes. Then 250μl of MTBE containing tripentadecanoin (10 μg/ml) and heptadecanoic acid (10 μg/ml) was added, and samples were again vortexed at room temperature for 60 minutes. Following the addition of a further 40μl of water containing 0.15mM ammonium formate to enhance phase separation, samples were centrifuged at 2500×g for 30 minutes at 4°C. This resulted in a clear separation of MTBE (upper) and aqueous (lower) phases, with protein precipitate aggregated at the bottom of the vial. Quality control samples were created by pooling excess tissue homogenate from biological samples (after a 50μl aliquot had been taken). This excess homogenate was then split into 50μl aliquots for in-vial extraction.
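The per-milligram solvent volumes given under Tissue homogenisation lend themselves to a quick bench calculation. The following is a minimal sketch, assuming only the ratios stated in the text (20μl methanol and 5μl internal standard cocktail per milligram, 50μl aliquots). The function name and returned fields are illustrative and not part of the published method, and the calculation ignores the volume of the tissue itself and any dead volume.

```python
# Illustrative helper (not part of the published method): volumes to add
# before homogenisation, from the per-milligram ratios given in the text.
METHANOL_UL_PER_MG = 20.0   # from the text
ISTD_UL_PER_MG = 5.0        # HILIC internal standard cocktail, from the text
ALIQUOT_UL = 50.0           # homogenate volume taken forward for IVDE


def homogenisation_volumes(tissue_mg: float) -> dict:
    """Return solvent volumes (in microlitres) for a tissue piece of given mass."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    methanol = METHANOL_UL_PER_MG * tissue_mg
    istd = ISTD_UL_PER_MG * tissue_mg
    total = methanol + istd  # assumption: tissue volume itself is ignored
    return {
        "methanol_ul": methanol,
        "istd_ul": istd,
        "total_added_ul": total,
        "max_full_aliquots": int(total // ALIQUOT_UL),
    }


if __name__ == "__main__":
    for mass in (3, 18):
        print(mass, "mg ->", homogenisation_volumes(mass))
```

Under these assumptions an 18mg piece receives 450μl of solvent, which covers the seven 50μl aliquots used in Experiment 1, while a 3mg piece receives 75μl, enough for a single 50μl aliquot.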

LC-MS analysis of IVDE non-aqueous phase
LC-MS analysis was performed on a Waters Acquity ultra performance liquid chromatography (UPLC) system coupled to a Waters Premier quadrupole time-of-flight (Q-ToF) mass spectrometer (Waters, Milford, MA, USA). The needle height in the auto-sampler was set to 13mm, with 5μl of sample extract injected onto an Agilent Poroshell 120 EC-C8 column (150mm × 2.1mm, 2.7μm). Separation was performed at 55°C with a flow rate of 0.5 ml/min using 10mM ammonium formate in water (mobile phase A) and 10mM ammonium formate in methanol (mobile phase B). For analysis in the positive mode, the gradient started at 80% mobile phase B, increased linearly to 96% B at 23 minutes and was held until 45 minutes. The gradient was then increased to 100% B by 46 minutes and held until 49 minutes. Initial conditions were restored in 2 minutes ahead of 7 minutes of column re-equilibration. For analysis in the negative ionisation mode, the gradient started at 75% B, increased linearly to 96% B at 23 minutes, then increased further to 100% B by 35 minutes, after which initial conditions were restored to allow 7 minutes of column re-equilibration. In the positive mode, a capillary voltage of 3.2 kV and a cone voltage of 45 V were applied. Data were collected between m/z 50 and 1000, the desolvation gas flow was 400 L/h and the source temperature was 120°C. In the negative mode, a capillary voltage of 2.6 kV and a cone voltage of 45 V were used. Desolvation gas flow and source temperature were fixed at 800 L/h and 350°C, respectively. All analyses were acquired using the lock spray to ensure accuracy and reproducibility. A reference solution of leucine-enkephalin (200 ng/mL, infused at a flow rate of 10 μL/min) was used as lock mass (m/z 556.2771 and 278.1141) to update accurate mass values. Data were collected in centroid mode over the mass range m/z 50-1000 with an acquisition time of 0.1 seconds per scan.
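The positive-mode gradient described above can be written as a table of breakpoints and interpolated linearly, which is a convenient way to check the mobile phase composition at any retention time. This is a sketch under stated assumptions: the breakpoints at 51 and 58 minutes are one reading of "restored in 2 minutes ahead of 7 minutes of column re-equilibration", and the names `POSITIVE_RP_GRADIENT` and `percent_b` are hypothetical and do not come from any instrument software.

```python
# Positive-mode reversed-phase gradient as (time in min, %B) breakpoints.
# The 51 and 58 min points are an interpretation of "restored in 2 minutes
# ahead of 7 minutes of column re-equilibration".
POSITIVE_RP_GRADIENT = [
    (0.0, 80.0), (23.0, 96.0), (45.0, 96.0), (46.0, 100.0),
    (49.0, 100.0), (51.0, 80.0), (58.0, 80.0),
]


def percent_b(gradient, t_min):
    """Linearly interpolate %B at time t_min between gradient breakpoints."""
    if t_min < gradient[0][0] or t_min > gradient[-1][0]:
        raise ValueError("time outside gradient programme")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            if t1 == t0:
                return b1
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for a sorted gradient")


if __name__ == "__main__":
    for t in (0, 11.5, 30, 45.5, 50, 58):
        print(f"{t:>5} min: {percent_b(POSITIVE_RP_GRADIENT, t):.1f}% B")
```

The same function can be reused for the negative-mode and HILIC programmes by supplying a different list of breakpoints.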

LC-MS analysis of IVDE aqueous phase
The auto-sampler needle height was set at 2mm, with 5μl of aqueous phase extract being analysed on a Merck Sequant ZIC-HILIC column (150 × 4.6mm, 5μm particle size) coupled to a Merck Sequant guard column (20 × 2.1mm). A 40 minute gradient (0.3 ml/min) was applied at room temperature using 0.1% formic acid in water (mobile phase A) and 0.1% formic acid in acetonitrile (mobile phase B). The gradient started at 80% mobile phase B, followed by a linear reduction to 20% mobile phase B at 30 minutes, after which initial conditions were restored to allow 10 minutes of column re-equilibration. Mass spectral data were acquired between 75 and 1000 Da in both positive and negative ionisation modes. The applied mass spectrometry conditions were the same as for the reversed phase method.

Data processing and metabolite identification
The generated data were processed using MarkerLynx (MassLynx 4.1, Waters, USA), which provides automated peak detection based on peak alignment and normalisation to total peak area. The reversed phase data were processed with a mass tolerance of 0.01 daltons (Da), a mass window of 0.05 Da, a retention time window of 12 seconds and a peak width of 10 seconds. The HILIC data were processed with a mass tolerance of 0.01 Da, a mass window of 0.05 Da, a retention time window of 18 seconds and a peak width of 20 seconds. Processed data were evaluated using principal component analysis (PCA) performed in SIMCA 13.0.3 (Umetrics, Umeå, Sweden). The data in all of the generated PCA models were logarithmically transformed (base 10) and scaled to unit variance (UV). The performance of the PCA models was assessed based on the cumulative correlation coefficients (R²X[cum]), and predictive performance was assessed based on seven-fold cross validation (Q²[cum]). Hotelling's T² plots were used to assess the departure of samples from the origin in the model plane, which shows the distance of a sample from a calculated average observation (i.e. an average metabolite composition). The DModX plot corresponds to the residual standard deviation of an observation in the x-variables and was used to assess the distance of an observation from the fitted model. Metabolite annotation was performed by searching the m/z of measured metabolite features in a range of publicly accessible metabolite databases, including the Human Metabolome Database (HMDB), METLIN and LipidMaps. Once a potential metabolite had been identified, it was confirmed by matching the fragmentation pattern of the peak being annotated to the fragmentation pattern shown for the given metabolite in the literature and in standard compounds. In addition, some peaks in the reversed phase method were annotated by comparing the m/z and retention time of metabolite features to metabolite features previously annotated in Whiley et al. [32].
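To make the pre-treatment applied before PCA concrete, the sketch below log-transforms (base 10) a small intensity matrix and scales each feature to unit variance. It is an illustration only and does not reproduce SIMCA's implementation. The use of the sample standard deviation (n - 1) and the rejection of zero or missing intensities are assumptions, since the text does not state how these cases were handled, and the function name `log_uv_scale` is hypothetical.

```python
import math
import statistics


def log_uv_scale(matrix):
    """Log10-transform, then unit-variance scale each column (feature).

    matrix: list of rows (samples), each a list of positive intensities.
    Returns a new list of rows in which every column has been mean-centred
    and divided by its sample standard deviation (assumption: n - 1).
    """
    # math.log10 raises ValueError for zero or negative intensities.
    logged = [[math.log10(v) for v in row] for row in matrix]
    columns = list(zip(*logged))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    if any(sd == 0 for sd in sds):
        raise ValueError("constant feature cannot be unit-variance scaled")
    return [
        [(v - m) / sd for v, m, sd in zip(row, means, sds)]
        for row in logged
    ]


if __name__ == "__main__":
    # Three samples, two features (made-up intensities).
    data = [[10.0, 100.0], [100.0, 1000.0], [1000.0, 10000.0]]
    for row in log_uv_scale(data):
        print([round(v, 3) for v in row])
```

After this scaling every feature contributes equally to the model regardless of its absolute abundance, which is why low-abundance features near the limit of detection can influence the PCA as much as abundant ones.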

Results/Discussion
Assessing the effect of IVDE and LC-MS on method performance and precision (Experiment 1)
The first step in assessing the precision of the in-vial dual extraction (IVDE) and both the reversed phase and HILIC methods was to determine the recovery of four internal standards (Fig 3). In the HILIC method both internal standards were measured in both the positive and negative ionisation modes. In the positive mode the recovery of the internal standards was highly consistent, with coefficients of variation (CV) of 2.4% and 3.7% (Fig 3B) for the serine and valine standards respectively. In the negative mode, recovery was more variable than in the positive mode, with CVs of 9.1% and 5.7% (Fig 3C) for serine and valine respectively. In the reversed phase method heptadecanoic acid was measured in the negative mode and tripentadecanoin was measured in the positive mode. The recovery of both standards was consistent, with CVs of 2.5% and 4.4% for heptadecanoic acid and tripentadecanoin respectively. The standard recoveries suggest that the IVDE and both the HILIC and reversed phase methods have good precision, with all internal standard measurements having CVs of less than 15% [33], and with mass spectrometry in the negative mode adding more variability than in the positive mode.
The next step in determining the method's performance was to identify the number of metabolite features measured following HILIC and reversed phase separation and to assess the precision of these peaks. This was done by initially identifying the features present in all samples, then identifying those features measured in at least 85% of samples, with a minimum cut-off of peaks present in at least 70% of samples analysed (Tables 1 and 2). In total 5,841 metabolite features were measured in 100% of samples across the HILIC (3,713 metabolite features) and reversed phase (2,128 metabolite features) methods. When a 70% sample presence cut-off was applied, 12,274 metabolite features were identified, with 6,570 and 5,704 metabolite features measured in the HILIC and reversed phase methods respectively. The measured metabolite features show good precision, with 3,468 of the 5,841 peaks (59.4%) seen in 100% of samples, and 6,362 of the 12,274 peaks (51.8%) measured in at least 70% of samples, having CVs of <15%. In general the features with CVs of >15% are lower in abundance: peaks with CVs of <15% have an average abundance of 6.62, whereas peaks with CVs of >15% have an average abundance of 1.93, potentially accounting for the lower precision. It is also interesting to note that the metabolite features that are measured in all samples have a higher average abundance (4.76) than those measured in 85% (2.04) and 70% (1.83) of samples. This is due to these groups possessing more peaks that are close to the limit of detection (LOD), with the peak falling below the LOD in some samples accounting for the missing values.
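The feature counting described above combines two steps: a presence filter (the fraction of samples in which a feature is detected) and a precision check (the CV of its abundance). A minimal sketch of both is given below. The data layout, the function names and the choice to compute the CV over detected values only are assumptions made for illustration; the text does not specify how missing values entered the CV calculation.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def summarise_features(features, min_presence=0.70, cv_cutoff=15.0):
    """features: dict of feature id -> list of intensities (None = not detected).

    Returns (kept, precise): ids passing the presence cut-off, and the
    subset of those whose CV over detected values is below cv_cutoff.
    """
    kept, precise = [], []
    for fid, intensities in features.items():
        detected = [v for v in intensities if v is not None]
        if len(detected) / len(intensities) < min_presence:
            continue
        kept.append(fid)
        if len(detected) >= 2 and coefficient_of_variation(detected) < cv_cutoff:
            precise.append(fid)
    return kept, precise


if __name__ == "__main__":
    # Seven replicate extractions of made-up features.
    example = {
        "A": [10.0, 10.5, 9.8, 10.2, 10.1, 9.9, 10.3],   # always detected, precise
        "B": [5.0, None, None, None, 6.0, 5.0, None],    # detected in 3 of 7
        "C": [1.0, 3.0, 2.0, 5.0, 1.0, 4.0, None],       # detected in 6 of 7, variable
    }
    print(summarise_features(example))
```

With seven replicates, the 85% and 70% thresholds used in the text correspond to detection in at least six and at least five samples respectively.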
Having considered the behaviour of individual metabolite peaks, the final step in assessing the method performance was to look at the similarity of the overall composition of the analysed samples. Principal component analysis (PCA) was performed on all 12,274 metabolite features that were identified in at least 70% of samples (Fig 4). This PCA revealed little structure within the data, with the first component accounting for only 25.3% of the total variability with a predictive performance of Q² = -0.10, and the first two components accounting for just 43.9% of variability with a predictive performance of Q² = -0.21. The distance of a sample's metabolite composition from a calculated average composition was assessed using the Hotelling's T² range plot (Fig 4B). This plot shows that all of the samples are compositionally similar both to each other and to the calculated average, with all samples having a T² of < 5 with the 95% confidence interval set at 13.88. The distance of samples from the model was assessed using the DModX plot (Fig 4C), which shows that the samples have a low residual distance to the fitted model, with all of the observations falling below the Dcritical(0.05) threshold. This, combined with the Hotelling's T² plot, shows that all of the samples are compositionally similar and that there are no outliers to the model.

Assessing the effect of tissue homogenisation and sample mass on method performance and precision (Experiment 2)
As with assessing the performance of the IVDE and instrument methods, the first step in assessing the effect of tissue homogenisation and sample mass was to look at the recovery of the internal standards. As in experiment 1, both HILIC internal standards were seen in positive and negative ionisation modes (Fig 5B and 5C). In the positive mode the CVs of the internal standard recoveries were 13.5% and 14.7% for serine and valine respectively. In the negative mode the CVs of the internal standard recoveries were 14.9% and 14.4% for serine and valine respectively. In the reversed phase data heptadecanoic acid was measured in the negative mode with a CV of 13.4%, and tripentadecanoin was measured in the positive mode with a CV of 3.8%. The recovery of the HILIC internal standards was more variable in these samples than in experiment 1, suggesting that the tissue homogenisation step contributes significantly to analytical variability. This is further supported by the absence of any increase in the variability of tripentadecanoin, which is spiked into the sample after tissue homogenisation. The recovery of the HILIC internal standards in the quality control samples, which are pooled after tissue homogenisation, was more consistent than in the analytical samples and comparable with experiment 1, with CVs of 3.8% and 4.8% in positive mode and 5.3% and 7.1% in negative mode for serine and valine respectively. This further supports the hypothesis that tissue homogenisation contributes significantly to the observed variability. With the increased CVs showing that tissue homogenisation contributes to an increase in data variability, it is important to assess the effect of the extracted tissue mass on the recovery of the internal standards. Spearman's correlation was used to assess the relationship between standard recovery and sample mass. This analysis revealed no significant correlations, showing that internal standard recovery is independent of the sample mass extracted.
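Spearman's correlation, used above to relate internal standard recovery to sample mass, is the Pearson correlation of the ranks of the two variables. The sketch below shows that calculation in plain Python, with tied values given the mean of their ranks. It computes the coefficient only and not the significance test, and the function names are illustrative rather than taken from the software used in the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)
```

In use, `x` would hold the tissue masses (3-17mg) and `y` the internal standard peak areas for the same samples; a coefficient near zero is consistent with recovery being independent of mass.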
The next step in assessing the method performance was to determine the number of metabolite features measured and the precision of these peaks. As in experiment 1, this was initially done by identifying peaks that were measured in all samples, working down to a cut-off of peaks present in at least 73% of samples. In total 4,021 peaks were measured in 100% of samples, with 2,838 and 1,183 measured in the HILIC (Table 3) and reversed phase (Table 4) methods respectively, and 10,934 peaks were measured in at least 73% of samples, with 6,737 and 4,197 measured in the HILIC and reversed phase data respectively. The precision of the measured peaks is lower than was seen in experiment 1, with 1,726 of 4,021 (43.7%) of the peaks seen in 100% of samples and 3,151 of 10,934 (28.8%) of the peaks seen in at least 73% of samples having CVs of <15%. The finding of higher sample-to-sample variability of the measured metabolite features lends further support to the hypothesis that tissue homogenisation is a source of variability within the method. A transformation of the HILIC data to correct for the variability introduced during tissue homogenisation was performed by normalising peak intensity to the average abundance of the two internal standards; however, this correction did not improve the precision of the measured metabolite peaks (S1 Table).
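The internal standard correction mentioned above can be sketched as follows. The text does not give the exact formula, so this assumes the simplest reading: within each sample, a feature's intensity is divided by the mean response of the serine and valine internal standards in that same sample. The function and argument names are hypothetical.

```python
def normalise_to_internal_standards(feature_intensity, serine_is, valine_is):
    """Divide each sample's feature intensity by the mean of its two internal standards.

    All three arguments are equal-length lists, one value per sample.
    """
    if not (len(feature_intensity) == len(serine_is) == len(valine_is)):
        raise ValueError("inputs must have one value per sample")
    normalised = []
    for peak, ser, val in zip(feature_intensity, serine_is, valine_is):
        reference = (ser + val) / 2.0  # per-sample mean of the two standards
        if reference <= 0:
            raise ValueError("internal standard response must be positive")
        normalised.append(peak / reference)
    return normalised


if __name__ == "__main__":
    # Two made-up samples: raw feature intensity and both standards.
    print(normalise_to_internal_standards([10.0, 20.0], [1.0, 2.0], [3.0, 2.0]))
```

A correction of this kind can only reduce variability that affects the standards and the metabolites in the same way, which may be one reason it did not improve precision here.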
Having considered metabolite features individually, it is important to consider the composition of samples as a whole. As in experiment 1, PCA was applied to all metabolite features that were measured in at least 73% of samples (Fig 6). The analysis revealed little structure within the data, with the first component accounting for only 22.3% of total variability with a poor predictive performance of Q² = 0.07, and the second component explaining only a further 13.1% of variability (Q² = 0.05) (Fig 6A). The Hotelling's T² plot (Fig 6B) shows that all samples fall within the 95% confidence interval (T² = 8.19), with all bar one sample having a T² < 4, demonstrating that the samples are compositionally similar both to each other and to the calculated average. The DModX plot (Fig 6C) shows that all samples have a low residual distance to the fitted model, with all of the observations falling below the Dcritical(0.05) threshold. This, combined with the Hotelling's T² plot, shows that all samples are compositionally similar and that there are no outliers to the model. Whilst all samples are compositionally similar, it is important to determine the effect of the extracted tissue mass on metabolite composition. The PCA scores plot (Fig 6A) shows no bias in the distribution of samples based on tissue mass, with low and high mass samples clustering together within the plot, showing that they possess high levels of compositional similarity. As well as looking at the effect of sample mass on compositional similarity, it is important to assess its effect on the abundance of individual metabolites. Fig 7 shows the abundance of 9 annotated metabolites from both HILIC and reversed phase methods plotted against tissue mass. These plots show no relationship between metabolite abundance and sample mass, with the strongest correlation being for glutamate (r = -0.24). These data show that using between 3 and 17mg of sample material has no effect on the overall sample composition or on the abundance of individual metabolites, showing that this method can provide broad metabolite coverage when sample material is limited.

Annotated metabolites
Having optimised the sensitivity and reproducibility of the metabolite features measured by the analytical method, the final step was to demonstrate its biological relevance by linking the data directly to metabolism, annotating metabolites from a variety of chemical classes and across a range of concentrations. To do this, 200 metabolites, 100 from each of the HILIC and reversed phase methods, were annotated (Tables 5 and 6). The annotated metabolites come from a wide range of metabolite classes including amino acids, purines, phospholipids and glycerides, across 3.5 orders of magnitude, ranging in abundance from 0.1 to 576.9. There is no overlap between the metabolites identified by the two analytical methods, which demonstrates the necessity of using complementary separation techniques such as HILIC and reversed phase chromatography to obtain a comprehensive view of chemical space. These annotations provide a reference for basal metabolite abundance in the healthy rat cerebellum and provide valuable information allowing the method to be accurately replicated by other laboratories.

Conclusions
The method described in this paper is shown to be capable of measuring over 4,000 metabolite features from as little as 3mg of tissue with a high degree of reproducibility. Of these, we were able to annotate 200 metabolites from a variety of metabolite classes across a range of concentrations. It is hoped that the low required sample mass and improved sensitivity of this method will provide a valuable tool for analysing cerebral metabolism, providing new insights into the functioning of the brain as well as the mechanisms of pathology of neurological disorders.

Supporting Information
S1 Table. Measured metabolite features in the HILIC method in experiment 2. The table shows the number of metabolite peaks identified and their relative variability in 100%, 93%, 87%, 80% and 73% of 15 sample replicates after transformation based on the recovery of both internal standards. (a) Percentage of samples in which a peak is detected; (b) coefficient of variation of peak intensity between samples. (DOCX)