Targeted Metabolomics Identifies Reliable and Stable Metabolites in Human Serum and Plasma Samples

Background Information regarding the variability of metabolite levels over time in an individual is required to estimate the reproducibility of metabolite measurements. In intervention studies, it is critical to appropriately judge changes that are elicited by any kind of intervention. The pre-analytic phase (collection, transport and sample processing) is a particularly important component of data quality in multi-center studies. Methods Reliability of metabolites (within-and between-person variance, intraclass correlation coefficient) and stability (shipment simulation at different temperatures, use of gel-barrier collection tubes, freeze-thaw cycles) were analyzed in fasting serum and plasma samples of 22 healthy human subjects using a targeted LC-MS approach. Results Reliability of metabolite measurements was higher in serum compared to plasma samples and was good in most saturated short-and medium-chain acylcarnitines, amino acids, biogenic amines, glycerophospholipids, sphingolipids and hexose. The majority of metabolites were stable for 24 h on cool packs and at room temperature in non-centrifuged tubes. Plasma and serum metabolite stability showed good coherence. Serum metabolite concentrations were mostly unaffected by tube type and one or two freeze-thaw cycles. Conclusion A single time point measurement is assumed to be sufficient for a targeted metabolomics analysis of most metabolites. For shipment, samples should ideally be separated and frozen immediately after collection, as some amino acids and biogenic amines become unstable within 3 h on cool packs. Serum gel-barrier tubes can be used safely for this process as they have no effect on concentration in most metabolites. Shipment of non-centrifuged samples on cool packs is a cost-efficient alternative for most metabolites.


Introduction
The inclusion of the serum or plasma metabolome analysis in clinical trials is an appealing approach for several reasons. Observed changes in the metabolome could be linked to the clinical response to a study medication or any other kind of intervention. This could enable future predictions of drug efficacy or side effects based on the metabolome. Other potential benefits include a better understanding of an intervention's mode of action. However, two questions are critical in ascertaining whether such an approach is feasible. The first question concerns the reproducibility of metabolite measurements. Metabolite levels in an individual need to be reasonably stable over time to allow for the measurement of changes elicited by an intervention. Few studies have investigated the reliability of metabolite concentrations across repeated measurements [1][2][3]. However, these are limited by a smaller number of metabolites analyzed. The second issue arises from the fact that almost all larger clinical trials are multicenter studies. To incorporate metabolomics into such studies, one has to validate realistic and cost effective ways of pre-analytic sample handling, such as 1) shipment, 2) choice of tube type and 3) repeated freeze-thaw cycles. To date, studies investigating sample stability during shipment focus on a small metabolite panel including cholesterol [4][5][6], vitamins [6] lipids [6,7], amino acids [8], glucose [9] or acylcarnitines [10], or they are limited by a small sample size [11].In this study, we address questions regarding the reproducibility of targeted metabolomics measurements in the same individual at three different time points and of pre-analytic stability of metabolites.

Ethics Statement
All participants of this study gave written informed consent. The study was conducted according to the principles expressed in the Declaration of Helsinki and approved by the ethics committee of the Ludwig-Maximilians-University Munich (LMU), Germany (no. 086-06).

Sample Collection and Preparation
Blood samples were collected from 22 healthy volunteers (5 men and 17 women), with a mean age of 30 (range: 22-52) after an overnight fast. Gender was found to be no confounder in this study. Information regarding medication and the last meal before each fasting period was collected for each sampling day. All participants were non-smokers. On day one, blood was taken from 20 participants (5 men and 15 women) in five 7.5 mL safetymonovettes (Sarstedt, Nümbrecht). For preparation of plasma ('plasma-direct'), the K + EDTAmonovette was centrifuged directly (20006g, 10 min). One monovette for serum preparation ('serum W' with clot activator) was centrifuged after 30 min of coagulation at room temperature (RT) (,21uC). Serum W and plasma-direct samples were stored as 0.25 mL aliquots on dry ice and frozen at 280uC before measurement. The other three serum tubes (serum gel-barrier tubes with clot activator) were stored on cool packs (CP) (,4uC) for 3, 6 and 24 h before centrifugation. Aliquots of 0.25 mL were stored at 280uC before measurement.
On day two, blood was taken from the same 20 participants in six 7.5 mL tubes. One tube for serum (serum W) and one for plasma (plasma-direct) were prepared as described above. The remaining four plasma tubes from each person were stored on CP (,4uC) for 3, 6 and 24 h and one tube at RT (,21uC) for 24 h before centrifugation. Aliquots of 0.25 mL plasma were stored at 280uC before measurement. On day three, blood was collected from the same 20 participants in two 7.5 mL tubes for serum W and plasma-direct preparation (see above). For a subgroup of 13 volunteers (4 men and 9 women), blood was collected in three 7.5 mL serum gelbarrier tubes. One serum gel-barrier tube from each person was frozen immediately at 220uC and stored at 280uC without additional thawing before measurement (for thawing and tube type experiments). Aliquots of 0.25 mL of the remaining two serum gel-barrier tubes were frozen at 220uC and thawed once and twice after one week, respectively before storage at 280uC. From five subjects of this subgroup a second set of serum gel and serum W samples was collected for the thawing and tube type experiments during an additional collection day (1 men, 4 women). Additionally, 30 mL blood was taken from two different participants (2 women) in four 7.5 mL tubes, one for serum W and three for serum gel preparation (thawing and tube type experiments).

Metabolite Analysis
The targeted metabolomics approach was based on electrospray ionization liquid chromatography-mass spectrometry (ESI-LC-MS/MS) and MS/MS measurements using the AbsoluteIDQ TM p180 kit (BIOCRATES Life Sciences AG, Innsbruck, Austria). The assay allows simultaneous quantification of 188 metabolites out of 10 mL plasma, including free carnitine, 40 acylcarnitines (Cx:y), 21 amino acids (19 proteinogenic amino acids, citrulline and ornithine), 21 biogenic amines, hexose (sum of hexosesabout 90-95% glucose), 90 glycerophospholipids (14 lysophosphatidylcholines (lysoPC) and 76 phosphatidylcholines (PC diacyl (aa) and acyl-alkyl (ae)), and 15 sphingolipids (SMx:y). The abbreviation Cx:y is used to describe the total number of carbons and double bonds of all chains, respectively (for more details, see [12]). The method of the AbsoluteIDQ TM p180 kit has been proven to be in conformance with the FDA-Guideline ''Guidance for Industry -Bioanalytical Method Validation (May 2001'') [13], which implies proof of reproducibility within a given error range. Measurements were performed as described in the manufacturer's manual UM-P180. The assay procedures of the AbsoluteIDQ TM p180 kit as well as the metabolite nomenclature have been described in detail previously [12,14]. Sample handling was performed by a Hamilton Microlab STAR TM robot (Hamilton Bonaduz AG, Bonaduz, Switzerland) and an Ultravap nitrogen evaporator (Porvair Sciences, Leatherhead, U.K.). Mass spectro-

Statistical Analysis
All statistical analyses were performed with the statistic platform R, version 2.14.1 [15]. 29 metabolites were excluded from further analysis as they did not pass quality control (with more than 12% of measurements at zero concentration or missing and concentrations below LOD in more than 95% of all samples). 16 metabolites had a coefficient of variance (CV) across all plates above 25% in reference samples (see table S1). These were not excluded but labeled (*) in the tables and figures, as these metabolites may have a lower CV across all plates in future investigations. Between and within-subject CVs (BCV and WCV) and between and within-plate CVs were calculated as the square root of the between-and within-subject/plate variance as determined using random effects models (R package nlme) divided by the mean metabolite concentration. The median within-plate CV was 9.31% and the median between-plate CV was 1.74E-4% (table  S2). Reliability was expressed by the intraclass correlation coefficient (ICC (1)), calculated as between-subject variance divided by the sum of between-and within-subject variance [16]. Bias corrected and accelerated bootstrap confidence intervals were calculated for between-and within-subject CVs and ICC (1) using 10,000 bootstrap samples [17].
In order to explore time and temperature effects during simulation of transport in serum and plasma samples and the effect of tube type and number of thawing cycles on metabolite concentration in serum samples, nonparametric Friedman tests were performed [18]. Significant changes in concentration during the Friedman tests were afterwards tested in a pairwise manner against the reference (0 h) with Wilcoxon signed-rank tests. Similarly, significant differences in metabolite concentrations between serum and plasma baseline samples (serum W and plasma-direct) were tested in a pairwise manner using Wilcoxon signed-rank tests. For all performed tests the level of statistical significance was set at p,0.01.
A power analysis was performed for each metabolite for selected settings. The minimum concentration change between two paired settings (e.g. 0 h and 24 h) was calculated, for which the present study had a power of 80% to recognize it as significant, at the given sample size of 20 and at an alpha level of 0.01. To account for the reduced power of the non-parametric Wilcoxon signedrank as compared to a parametric paired t-test, an asymptotic relative efficiency of 0.955 was assumed [19]. Results are given in table S3.

Reliability of Serum and Plasma Metabolite Concentrations
Samples taken from 20 healthy human subjects on three different days were used to calculate the intraclass correlation coefficient (ICC), the within-person CV (WCV) and the betweenperson CV (BCV) for 159 metabolites in serum (figure 1) and plasma (figure S1). Table S4 summarizes results for ICC, WCV and BCV. In serum and plasma metabolites, 80% of WCV and 70% of BCV values were below 0.25 (figure 2, figure S2). In serum, the median WCV was 0.15 (range: 0.08-1.09) and the median BCV 0.20 (range: 0.05-1.56). Median WCV values of metabolite subclasses were concordant in serum and plasma samples, whereas median BCV values were slightly higher in serum samples. Both within-and between-person variance contribute to the ICC. Accordingly, the median ICC was higher in serum as compared to plasma samples.
The median ICC in plasma was 0.63. The lowest reliability in plasma was detected for PC ae C30:1* (ICC = 2.28E-09, confidence interval (CI): 0.00-0.47) and the highest for creatinine (ICC = 0.90, CI: 0.84-0.95). In serum samples, the median ICC was 0.66. The lowest reliability in serum was observed for asymmetric dimethylarginine (ADMA) (ICC = 0.05, CI: 0.00-0.38) and the highest for spermidine* (ICC = 0.87, CI: 0.77-0.93). Regarding metabolite subclasses, reliability was lowest for acylcarnitines both in serum and plasma, as hydroxyacylcarnitines showed low reliability due to low concentrations. Long chain (. C10) and unsaturated acylcarnitines showed low reliability either due to low concentration or due to higher within-person variability compared to between-person variability. All other subclasses showed good (ICC .0.50) reliability. 101 metabolite concentrations were significantly higher in serum compared to plasma, comparing the median values of the baseline samples that were collected on three different days. Only sarcosine showed significantly lower concentrations in serum compared to plasma samples (table S5). There was no evidence of any effects of gender and last meal composition before each overnight fasting on metabolite concentrations (data not shown).

Stability of Metabolites during Shipment Simulation
Simulated shipment on cool packs and at room temperature. We simulated different ways of sample collection and shipment that might occur in multicenter clinical trials: Transport of non-centrifuged samples on cool packs (CP) within 3, 6 and 24 hours (plasma and serum) or at room temperature (RT) for 24 h (plasma only).
The compound class of amino acids was the least stable metabolite class. In the compound class of glycerophospholipids, all metabolites had stable concentrations on CP for at least 24 h. Concentrations in 2 out of 10 lysoPCs (lysoPC a C28:0 and lysoPC a C28:1) were even stable at RT for 24 h. 32 of 36 PC aa concentrations and all PC ae concentrations remained stable at RT. The compound class of sphingolipids was the most stable one, as all analyzed sphingolipid concentrations were stable on CP and at RT for at least 24 h.
All significantly increased or decreased metabolite concentrations during shipment simulation in serum showed similar tendencies in plasma samples over time on CP (see figures S3, S4, S5, S6, S7, S8, S9, S10, S11, S12, S13, S14, S15, S16, and S17). In plasma, a larger number of metabolites had stable concentrations for 24 h on CP as compared to serum. In total, 140 of 159 serum metabolites had stable concentrations on CP for at least 24 h. 19 metabolites showed significant changes in serum concentration during storage on CP within the first 24 h (table 1). 14 metabolite concentrations increased significantly with time (figure S13, S14, S15, S16, and 17), whereas five metabolite concentrations decreased significantly (arginine, putrescine*, serotonin*, spermidine* and hexose). Figure 4 shows examples of altered metabolite concentrations in serum. In concordance with plasma samples, the group of amino acids was the least stable one in serum metabolites.
Effect of tube type on serum metabolites. In multicenter studies, the use of serum gel-barrier tubes is preferred due to the easy handling at the blood collection site and persistent separation of serum during shipment of tubes. In this study, possible interactions of the gel with the metabolites in human serum samples were analyzed. Blood was drawn into serum gel-barrier tubes and serum W tubes with clotting activator only (control). After centrifugation, whole gel-barrier tubes and separated serum from serum W tubes were frozen until analysis.
Methionine sulfoxide showed significantly increased concentrations in serum gel-barrier tubes as compared to serum W (see figure 5). For all other investigated metabolites, serum gel-barrier tubes had no significant effect on concentration, compared to serum W tubes.

Effect
of freeze-thaw cycles on serum metabolites. Repeated freezing and thawing of samples is sometimes inevitable due to a limited number of aliquots. When serum samples were subjected to two freeze-thaw cycles, most metabolite concentrations were stable. Except methionine sulfoxide, all 159 investigated metabolites of this panel maintained stable concentrations after one freeze-thaw cycle. Only eleven out of 159 metabolites revealed significantly decreased concentrations after two freeze-thaw cycles, including C10:1, three amino acids (isoleucine, tryptophane and valine), five phosphatidylcholines (PC aa C32:2, PC ae C36:2, PC ae C36:5, PC ae C40:1 and PC ae C42:0), acetylornithine and SM C16:0 (table S7). Power calculation. The minimal effect size (i.e. absolute difference in metabolite concentration) that could be detected for each metabolite with a given power of 0.8 was calculated for selected sample pairs. table S3 gives estimates of detectable effect sizes for individual metabolites in our study. This information should be informative for planning future studies. This study was not powered to significantly detect smaller effects than those displayed.

Reliability of Serum and Plasma Metabolite Concentrations
The analysis of three fasting blood samples within a 14 day period showed that most of the 159 metabolite levels remained stable within the same individual. In serum and plasma metabolites, 80% of WCV and 70% of BCV values were below 0.25 and showed good reliability within individuals, independent of the characteristics of the study population. An ICC above 0.50 has been defined as good reliability [2]. In our study, more than 75% of metabolites in serum and plasma samples fulfilled this criterion. Compared to a population-based approach, our ICC values were artificially low due to the homogeneous study cohort. Metabolites with good reliability included saturated short-and medium acylcarnitines, amino acids, biogenic amines, glycerophospholipids, sphingolipids and hexose. These findings are in line with Floegel et al. [2].
Creatinine was found to have the lowest WCV and highest reliability in plasma. In muscle cells, the ring closing of creatine forms creatinine in a chemical equilibrium [20]. As the quantity of creatine per unit of skeletal muscle and the reaction to creatinine is constant [21], the produced amount of creatinine is dependent on the muscle mass. Concentration of creatinine is therefore constant and within-person variability is relatively low in healthy people, as long as the muscle mass remains constant.
Reliability of metabolites was slightly higher in serum compared to plasma. 101 metabolite concentrations were significantly higher in serum samples compared to plasma. 86 of these metabolites showed similar results in Yu et al. Two of the 101 metabolites (C10:2 and PC aa C40:2) were excluded by Yu et al. and 12 of these 101 metabolites were not part of the metabolite panel of Yu et al. In our study, SM (OH) C24:1 showed significantly higher concentrations in serum compared to plasma. This finding is not in line with Yu et al. [22,23]. Ladenson et al. found a 5% lower glucose concentration in plasma compared to serum [23]. Although glucose concentration was lower in plasma samples, the difference between serum and plasma levels of glucose was not significant in our study. We found no effect of sex and last meal composition before overnight fasting in contrast to other studies [24][25][26]. This could be explained by the sample size and the unbalanced ratio of men to women.
Our study goes beyond previous investigations by using a broader metabolite panel and measuring three samples per individual at the same time, in a well-established and validated targeted metabolomics approach. The observed high reliability of amino acids was consistent with the results of Floegel et al. [2]. However, reliability for six additional amino acids was shown in our study. The good repeatability of phospholipid measurements in plasma samples observed by Ma et al. was in line with our observations [1]. Classification of reliability of acylcarnitines, glycerophospholipids, sphingolipids and hexose was concordant with that of Floegel et al. [1]. Widjaja et al. in concordance with our study showed that a single measurement of glucose is sufficient in plasma samples [3].
No analytical duplicates were measured. Therefore, it was not possible to further divide the total within-person variability into biological and analytical variability. However, regarding the between and within-plate CVs derived from repeated measurements of reference samples (table S2), analytical variance can be assumed to be low. Only samples of healthy subjects were analyzed, therefore variability in non-healthy patients could vary from these results.
In sum, we found that most metabolite concentrations in a targeted approach are highly stable over time in a fasted individual. The WCVs obtained in this study can help to judge whether single measurements of a metabolite are sufficient to describe differences between groups of human subjects. They can also assist with the interpretation of the effects of a medication or another intervention on an individual's serum or plasma metabolome.

Stability of Metabolites during Shipment Simulation
Simulated shipment on cool packs and at room temperature. The majority of the analyzed metabolites had stable concentrations for at least 24 h on CP (plasma and serum) and at RT (plasma). In plasma and serum samples, no significant changes in concentration during transportation for 24 h on CP and at RT were detected for PC ae's and sphingolipids. The compound class of amino acids was most fragile during simulated transportation. Plasma and serum metabolite stability showed good coherence.
Within the group of acylcarnitines, Yang et al. showed decreasing concentrations of C8:0, C18:1 and C18:2 and stable concentrations of carnitine and eight other acylcarnitines (C2, C3, C4, C5, C6, C14, C16, C16:1) after 24 h at 37uC. The stability of carnitine, C2, C3, C4 and C5 for 24 h at 37uC is in line with our results (24uh at RT), whereas C18:1 and C18:2 showed an opposing trend. Stability of C14 and C16 was not confirmed in our study. In contrast to Yang et al., our results showed stability of C8 for 24 h at RT. The reason for these partly contradictory results could be the small size of three subjects in Yang et al. and the different origin of specimen (rat in Yang et al.) [11]. LC-MS measurement of 18 amino acids in porcine cerebrospinal fluid, stored at RT for up to 2 h, showed increased concentrations in ten amino acids (alanine, asparagine, glycine, glutamate, histidine, isoleucine, phenylalanine, serine, threonine, tyrosine), which is in line with our observations, except that glycine concentration is stable in our experiments [8]. Yang et al. showed stability of nine amino acids for up to 24 h at 37uC in rat plasma. [11]. Only valine and tryptophane stability is in accordance with our results in human plasma samples. The decreasing levels of four PC aa's and increasing levels of eight lysoPCs during storage at RT could be explained by the hydrolysis of phosphatidylcholine to yield a lysophospholipid and free fatty acid, catalyzed by a phospholipase [27]. This result shows remaining activity of phospholipase in noncentrifuged plasma samples at RT but not at 4uC. Increasing lysoPC levels at RT were also observed in Yang et al. [11]. Sphingomyelins, which were found to play an important role in bcell function [28], were the most stable compound class both in plasma and serum samples in this study. This finding indicates low enzymatic activity of sphingomyeline degrading and synthesizing enzymes at 4uC and RT. Decreasing hexose concentration during storage due to glycolysis, especially at RT, is consistent with the literature [9].
Effect of tube type on serum metabolites. The use of serum gel tubes was investigated by comparing samples drawn in serum W tubes and serum gel-barriertubes. Only methionine sulfoxide showed significant changes in concentration in gelbarrier tubes compared to serum W tubes. This indicates a possible interaction of the gel with the investigated metabolite. Thus, based on our data, the use of serum gel-barrier tubes should be avoided if the measurement of methionine sulfoxide is of interest. However, the use of serum gel-barrier tubes is preferred for all other analyzed metabolites in multicenter settings due to the advantages for sample handling.
Effect of freeze-thaw cycles on serum metabolites. Except methionine sulfoxide, all 159 investigated metabolites of this panel had stable concentrations after one freeze-thaw cycle. Only eleven metabolites showed significantly decreased concentrations after two freeze-thaw cycles. Targeted and non-targeted analyses have investigated the stability of several serum and plasma metabolites in repeated freeze-thawing experiments [29][30][31] and showed only minor changes in metabolite concentration due to thawing. Cuhadar et al. showed stability of creatinine and glucose in serum samples for up to ten freeze-thaw cycles, which is in line with our results [31]. In the study of Zivkovic et al. serum samples were exposed to one, two or three freeze-thaw cycles [30]. Concentrations of lysoPC a C16:0 were significantly altered after two freeze-thaw cycles (p = 0.048), which is not in line with our results, as we used a significance level of p = 0.01. The stability of phosphatidylcholine PC 15:0, PC 16:0 and PC 18:0 for up to three freeze-thaw cycles, which was shown by Zivkovic et al., is in line with our results. Based on our data, the optimal method of sample shipment in multicenter trials is to collect samples in serum in gel-barrier tubes, to centrifuge them on site and to ship the samples while frozen, as at higher temperatures some amino acids and biogenic amines are not stable in both serum and plasma. Shipment of non-processed serum or plasma results in instability of more metabolites, but can be a simple and cost-effective alternative for selected questions. Table S1 List of all metabolites. Metabolites that did not pass quality control (#) and metabolites that had a coefficient of variance (CV) across all plates above 25% in reference samples (*) are marked appropriately. (XLS)   (1), WCV and BCV with confidence intervals (CI) in serum and plasma metabolites.

(XLS)
Table S5 Comparison of the median metabolite concentrations in serum and plasma samples. Serum W and plasma-direct baseline samples that were collected on three days within 20 subjects were tested in a pairwise manner using wilcoxon-signed rank test (p,0.01). (XLS) Table S7 Significant changes in serum metabolite concentrations due to freeze-thawing cycles. Overall significant changes in metabolite concentrations in serum gel tubes (Friedman test, significance level p,0.01) and changes after one and two freeze-thaw cycles, each one tested against serum gel control (Wilcoxon signed rank, significance level p,0.01). (XLS)