α-Hydroxybutyrate Is an Early Biomarker of Insulin Resistance and Glucose Intolerance in a Nondiabetic Population

Background Insulin resistance is a risk factor for type 2 diabetes and cardiovascular disease progression. Current diagnostic tests, such as glycemic indicators, have limitations in the early detection of insulin resistant individuals. We searched for novel biomarkers identifying these at-risk subjects. Methods Using mass spectrometry, non-targeted biochemical profiling was conducted in a cohort of 399 nondiabetic subjects representing a broad spectrum of insulin sensitivity and glucose tolerance (based on the hyperinsulinemic euglycemic clamp and oral glucose tolerance testing, respectively). Results Random forest statistical analysis selected α-hydroxybutyrate (α–HB) as the top-ranked biochemical for separating insulin resistant (lower third of the clamp-derived MFFM = 33 [12] µmol·min−1·kgFFM −1, median [interquartile range], n = 140) from insulin sensitive subjects (MFFM = 66 [23] µmol·min−1·kgFFM −1) with a 76% accuracy. By targeted isotope dilution assay, plasma α–HB concentrations were reciprocally related to MFFM; and by partition analysis, an α–HB value of 5 µg/ml was found to best separate insulin resistant from insulin sensitive subjects. α–HB also separated subjects with normal glucose tolerance from those with impaired fasting glycemia or impaired glucose tolerance independently of, and in an additive fashion to, insulin resistance. These associations were also independent of sex, age and BMI. Other metabolites from this global analysis that significantly correlated to insulin sensitivity included certain organic acid, amino acid, lysophospholipid, acylcarnitine and fatty acid species. Several metabolites are intermediates related to α-HB metabolism and biosynthesis. Conclusions α–hydroxybutyrate is an early marker for both insulin resistance and impaired glucose regulation. The underlying biochemical mechanisms may involve increased lipid oxidation and oxidative stress.

Traditional clinical tests do not measure IR directly and, as a result, a variety of methods have been developed: the gold standard hyperinsulinemic euglycemic clamp (HI clamp); insulin tolerance test; steady state plasma glucose (SSPG) following fixed somatostatin/glucose/insulin infusions; and modeling analysis of the oral glucose tolerance test (OGTT) or frequently sampled intravenous glucose tolerance test (FSIVGTT) [16]. However, such procedures are mostly confined to clinical research settings due to cost and time constraints. Fasting insulin and derived indices (HOMA, QUICKI) have been widely used [17], but lack of insulin measurement standardization strongly limits their accuracy and has prevented adoption in routine clinical practice. The identification of novel markers for detection of IR subjects remains an unmet need.
Further, this approach may reveal markers that are useful for identifying individuals at risk of progression to T2D and CVD, thereby enabling implementation of effective strategies for disease prevention and patient monitoring.
The RISC study (Relationship of Insulin Sensitivity to Cardiovascular Risk), comprising a nondiabetic cohort, was initiated to address how IR may contribute to T2D and CVD progression. We report here on a global biochemical profiling technology developed for the discovery of new biochemical biomarkers. This technology has been successfully applied to identify biochemicals associated with disease, toxicity and aging [18,19,20]. Here it was applied to identify biochemicals associated with IR and dysglycemia in 399 subjects, a subset of the RISC cohort, in which insulin sensitivity was measured directly by the HI clamp. We found that α-hydroxybutyrate (α-HB) is the metabolite most significantly associated with insulin sensitivity and, interestingly, an early marker for dysglycemia. The biochemical pathway for α-HB and its potential involvement in IR and dysglycemia are briefly discussed. Monitoring changes in the concentration of α-HB in fasting human plasma may provide novel insights on how early stages of IR evolve into T2D or CVD.

Biochemical Profiling Analysis
Fasting plasma samples from the RISC cohort were analyzed in a non-targeted fashion on three separate mass spectrometry platforms, UHPLC-MS/MS (+/-ESI) and GC-MS (+EI), with 485 biochemicals measured, as illustrated in Figure 1A. Each participant's insulin sensitivity was measured using the hyperinsulinemic euglycemic (HI) clamp; the distribution of MFFM (MFFM = insulin-mediated glucose disposal rate, µmol·min−1·kgFFM−1) in the 399 RISC subjects analyzed is shown in Figure 1B. Taking a commonly used classification approach [11,21,22,23], the bottom tertile of insulin sensitivity of the entire EGIR-RISC cohort (n = 1293) (i.e., MFFM ≤45 µmol·min−1·kgFFM−1) was defined as IR. By this criterion, MFFM was 33 [12] µmol·min−1·kgFFM−1, median [interquartile range], in the IR group (n = 140) and 66 [23] µmol·min−1·kgFFM−1 in the more insulin sensitive (IS) subjects. The demographic and metabolic characteristics of the 399 subjects under analysis are described in Table 1.

α-HB is inversely associated with insulin sensitivity
To assess the ability to classify subjects as IS or IR, Random Forest (RF) analysis was performed. As shown in Figure 2, the organic acid α-hydroxybutyrate (α-HB) was the top-ranked metabolite in the resulting importance plot, which ranks the classifiers based upon the contribution of each to the separation of the subjects into classes. In this analysis the subjects were classified as either IS or IR with approximately 76% accuracy (inset). This result did not change when normalizing the M value for kg of body weight rather than kg of fat-free mass (data not shown).
Univariate correlation analysis of the data from the biochemical profiling screen also ranked α-HB as the metabolite with the highest correlation to the glucose disposal rate (r = −0.45, p-value 1.40E-21, Table 2). α-HB negatively correlated with total glucose disposal for both MFFM (fat-free mass, µmol·min−1·kgFFM−1) and MWBM (whole body mass, mg·min−1·kg−1, data not shown). Summarized in Table 2 are additional candidate biomarkers correlative to insulin sensitivity as measured by the euglycemic clamp (MFFM), with overlap observed with the initial RF analysis (Figure 2).
Since the initial analyses were based upon relative quantification data obtained from the non-targeted biochemical profiling technology, a targeted assay was developed to provide absolute quantitative results. As shown in Figure 3, α-HB was consistently higher (p<0.0001 for both the screening and targeted data) in IR subjects compared to IS subjects, whether measured by the screening platform or by the targeted isotopic dilution assay.

α-HB in dysglycemic subjects
Subjects were classified as normoglycemic or dysglycemic based upon the results of fasting plasma glucose (FPG) and the oral glucose tolerance test (OGTT) as illustrated in Figure 4. Subjects with 2-hour glucose levels <7.8 mmol/l were deemed normal glucose tolerant (NGT) while those with 2-hour glucose between 7.8-11.1 mmol/l were deemed as having impaired glucose tolerance (IGT). Individuals with fasting plasma glucose levels ≥5.6 mmol/l were classified as having impaired fasting glucose (IFG). Thus, based on insulin sensitivity and glucose tolerance, subjects were classified into four categories: NGT insulin sensitive (NGT-IS); NGT insulin resistant (NGT-IR); IFG; and IGT. Shown in Figure 5 is a heat map of the global biochemical profiling data set illustrating the statistical significance of changes in the biochemicals in the various pair-wise group comparisons.
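The four-way grouping described above can be written as a small decision rule. The following is a minimal sketch rather than the study's own procedure: the function and constant names are hypothetical, the IR cutoff is the bottom-tertile value of 45 µmol·min−1·kgFFM−1 given earlier, and the precedence of IGT over IFG for a subject meeting both criteria is an assumption that the text does not state.

```python
# Thresholds taken from the text; names are illustrative only.
IR_CUTOFF_MFFM = 45.0   # µmol·min−1·kgFFM−1, bottom tertile of insulin sensitivity
IFG_FASTING_MIN = 5.6   # mmol/l, fasting plasma glucose
IGT_2H_MIN = 7.8        # mmol/l, 2-hour OGTT glucose


def classify_subject(fasting_glucose: float,
                     two_hour_glucose: float,
                     m_ffm: float) -> str:
    """Return one of 'NGT-IS', 'NGT-IR', 'IFG', 'IGT'.

    Assumption (not stated in the text): a subject who meets both the
    IGT and the IFG criteria is labelled IGT. Upper limits are not
    checked here because diabetic-range values were excluded at screening.
    """
    if two_hour_glucose >= IGT_2H_MIN:
        return "IGT"
    if fasting_glucose >= IFG_FASTING_MIN:
        return "IFG"
    # Normal glucose tolerance: split by clamp-derived insulin sensitivity.
    return "NGT-IR" if m_ffm <= IR_CUTOFF_MFFM else "NGT-IS"


if __name__ == "__main__":
    print(classify_subject(5.0, 6.0, 66.0))  # normoglycemic, insulin sensitive
    print(classify_subject(5.0, 6.0, 33.0))  # normoglycemic, insulin resistant
```

Only NGT subjects are split by insulin sensitivity here, which mirrors the four categories named in the text.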
Four classes of metabolites that differentiate NGT-IS from NGT-IR and/or NGT-IS from dysglycemia (IFG or IGT) are highlighted. The organic acids α-ketobutyrate (α-KB), α-HB and creatine readily distinguish NGT-IS subjects from both IGT and IFG subjects, whereas α-HB and creatine serve as early indicators of IR by readily distinguishing NGT-IS from NGT-IR subjects. Similarly, lipid species such as acylcarnitines and lysoglycerophospholipids also distinguish NGT-IS from NGT-IR subjects and NGT-IS from IGT subjects, with high statistical significance. In contrast, fatty acids such as palmitate are later stage markers of impaired glucose regulation, and only distinguish NGT-IS from IGT subjects in the continuum of insulin resistance.

Targeted analysis of metabolites correlative of insulin sensitivity
Consistent with previous reports [24], MFFM was significantly lower in each of the IFG, IGT, and NGT-IR groups in comparison with the NGT-IS group (p<0.0001 for each), as illustrated in Figure 6A, while plasma α-HB concentrations (Figure 6B) were the mirror image of MFFM. Using the targeted assay, the measured levels of α-HB were significantly (p<0.0001) higher in the NGT-IR, IFG and IGT groups as compared to the NGT-IS group. Relatedly, by partition analysis, an α-HB concentration of 5 µg/ml was found to best separate IR from IS subjects. Furthermore, based upon multiple logistic regression analysis, α-HB was significantly associated with IR independently of center (collection site), sex, age, and BMI, with an odds ratio of 2.84 (C.I.: 2.02-4.00, p<0.0001) for each SD (= 1.7 µg/ml) of plasma α-HB.
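Partition analysis of this kind amounts to searching for the single cut on a predictor that best splits a response into two groups. The sketch below illustrates the idea with a least-squares criterion, which is an assumption: the splitting criterion used by the JMP platform in the study is not described in the text. The function names are hypothetical and the numbers in the usage example are synthetic, not study data.

```python
def _sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Find the cut on x that minimises the within-group sum of squares of y.

    Returns (cutoff, sse), where cutoff is the midpoint between the two
    neighbouring x values that the cut separates.
    """
    pairs = sorted(zip(x, y))
    best_cut, best_sse = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no cut is possible between tied predictor values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        sse = _sse(left) + _sse(right)
        if sse < best_sse:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_sse = sse
    return best_cut, best_sse


if __name__ == "__main__":
    # Synthetic toy values: a predictor (x) and a response (y).
    x = [2.0, 2.5, 3.0, 3.5, 4.5, 5.0, 5.5, 6.0]
    y = [70, 68, 65, 66, 35, 33, 30, 34]
    print(best_split(x, y))
```

With a continuous response such as MFFM the criterion above is the one used by a regression tree at its first node; a categorical response (IR versus IS) would call for a different impurity measure.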
Interestingly, RF analysis ranked α-HB as the most important metabolite to classify NGT and IGT subjects, with a >70% classification accuracy (data not shown). Consistent with these observations, α-HB levels were significantly higher in IGT than NGT subjects (p<0.0001), as shown in Figure 6B. To test whether α-HB levels segregated with glucose dysregulation in general, we grouped together IFG and IGT into one IGT category, and by multiple logistic analysis α-HB was significantly associated with IGT independently of center, sex, age, and BMI, with an odds ratio of 2.51 (C.I.: 1.81-3.49, p<0.0001) for each SD of plasma α-HB. Furthermore, both IR and IGT were each independently associated with an α-HB concentration in the top tertile of its plasma concentrations (i.e., 5.9 [1.7] µg/ml), with respective odds ratios of 3.26 (C.I.: 1.83-5.81, p<0.0001) and 2.72 (C.I.: 1.51-4.92, p = 0.0009) after adjustment for center, sex, age, and BMI.
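Odds ratios reported per SD of a predictor can be re-expressed per concentration unit, which may make them easier to compare across studies. The sketch below assumes the logistic model contains α-HB as a single linear term, so that the odds ratio for a change of Δ is exp(β·Δ); the function names are hypothetical, and the inputs in the usage example are the odds ratio of 2.84 and the SD of 1.7 µg/ml reported above for IR.

```python
import math


def odds_ratio_for_change(or_per_sd: float, sd: float, delta: float) -> float:
    """Odds ratio for a change of `delta` in the predictor.

    Assumes a logistic model that is linear in the predictor, so that
    log(OR) scales proportionally with the size of the change.
    """
    log_or_per_unit = math.log(or_per_sd) / sd
    return math.exp(log_or_per_unit * delta)


def odds_ratio_per_unit(or_per_sd: float, sd: float) -> float:
    """Odds ratio for a one-unit increase in the predictor."""
    return odds_ratio_for_change(or_per_sd, sd, 1.0)


if __name__ == "__main__":
    # OR 2.84 per SD (1.7 µg/ml) corresponds to roughly 1.85 per 1 µg/ml.
    print(round(odds_ratio_per_unit(2.84, 1.7), 2))
```

The conversion does not carry over the confidence interval exactly unless both limits are transformed in the same way, and it says nothing about the top-tertile odds ratios, which come from a categorical predictor.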
In addition to measuring α-HB by absolute quantitation, targeted assays were also developed for candidate IR biomarkers identified by RF and correlation analyses, with examples of representative biochemical classes highlighted in Figure 5. The results of these targeted assays are presented in Figure 6C-F. For example, the lysophospholipid 1-linoleoylglycerophosphocholine (Figure 6C) and long-chain acylcarnitines such as decanoylcarnitine (Figure 6D) decrease in concentration with increasing insulin resistance and dysglycemia. Similarly, levels of the amino acid glycine were observed to trend downward with IR (Figure 6E). In contrast, similar to α-HB, the saturated fatty acid palmitate is inversely correlated with insulin sensitivity (Figure 6F). Related to this latter finding, a direct relationship between fasting plasma α-HB concentrations and the mean free fatty acid (FFA) level during the clamp (which averaged 30 [40] µmol/l) was observed; this association was highly statistically significant (r² = 0.25, p<0.0001) even after adjusting for center, sex, age, and BMI (data not shown).
Figure 3. A. The X-axis shows the groups and the Y-axis shows the relative normalized intensity for α-HB, median scaled to 1. B. Box plot of α-HB concentrations measured using targeted isotopic dilution assays (targeted data). The X-axis shows the groups and the Y-axis shows α-HB concentration in µg/ml. In the box plots the top and bottom of the box represent the 75th and 25th percentile, respectively. The top and bottom bars ("whiskers") represent the entire spread of the data points for α-HB and each group, excluding "extreme" points, which are indicated with black squares. The black arrowheads indicate the mean value and the gray arrowheads indicate the median value. doi:10.1371/journal.pone.0010883.g003
Shown in Table 3 are representative targeted assay results for top-ranking IR candidate markers, with regard to their correlation to the MFFM value and their fold changes in concentration from the bottom tertile to the top two-thirds of insulin sensitivity (green: decreased fold-change; red: increased fold-change). Consistent with the screening data, α-HB is highly correlated to the glucose disposal rate (r = 0.45, p-value 1.15E-21).
Figure 5. Biochemicals showing significant change in subjects with IR and/or dysglycemia. A heat map graphical representation of p-values obtained from statistical analysis of the global biochemical profiling of metabolites measured in plasma collected from NGT-IS, NGT-IR, IGT, and IFG subjects. t-tests were performed to determine those metabolites that significantly increase or decrease in insulin resistant (IR) and dysglycemic individuals (IGT, IFG). Highlighted from the main heat map are an organic acid, α-HB, the top-ranked biochemical for separating NGT-IS from NGT-IR and NGT-IS from IGT; a cluster of long-chain fatty acids such as palmitate that are pronounced when comparing NGT-IS to IGT; and acyl-carnitines and acylglycerophosphocholines that distinguish NGT-IR and IGT from NGT-IS. The color coding used, from white to dark blue, indicates the most significant to least significant, respectively, with white, most statistically significant (p≤1.0E-16); light blue (1.0E-16≤p≤0.001); royal blue (0.001≤p≤0.01); and dark blue, not significant (p≥0.1). doi:10.1371/journal.pone.0010883.g005

Discussion
Using a non-targeted biochemical screening approach in a large and well characterized cohort of nondiabetic subjects representing a wide spectrum of insulin sensitivity, we identified α-hydroxybutyrate (α-HB) as a biomarker segregating with clamp-derived IR in subjects with normal glucose tolerance. Furthermore, α-HB segregated with dysglycemia (IFG+IGT) independently of, and in addition to, IR. Importantly, these associations were independent of sex, age, and BMI. Thus, together with other biomarkers, α-HB may provide a diagnostic tool to identify IR and/or IGT earlier than currently used clinical tests.
α-HB is an organic acid derived from α-ketobutyrate (α-KB) (Figure 7). α-KB is produced by amino acid catabolism (threonine and methionine) and glutathione anabolism (cysteine formation pathway) and is metabolized to propionyl-CoA and carbon dioxide [25]. α-HB is formed as a by-product during the formation of α-KB via a reaction catalyzed by lactate dehydrogenase (LDH) or α-hydroxybutyrate dehydrogenase (α-HBDH) (Figure 7), an LDH isoform present in the heart [26]. Accumulation of α-HB is postulated to occur in vivo when either (a) the formation of α-KB exceeds the rate of its catabolism, which leads to substrate accumulation, or (b) there is product inhibition of the dehydrogenase that catalyzes the conversion of α-KB to propionyl-CoA [25,27].
α-KB is also produced as a result of the conversion of cystathionine to cysteine. Under conditions of increased oxidative stress, a higher flux of cysteine into production of glutathione, the primary antioxidant in cells, occurs from a shift in homocysteine production from transmethylation of methionine to transsulfuration of homocysteine to produce cystathionine [28] (Figure 7). In one report, α-HB was associated with excess glutathione demand and disrupted mitochondrial energy metabolism and shown to derive from hepatic glutathione stress [28], supporting the idea that elevated α-HB may be associated with increased oxidative stress in the IR state.
α-HB may become elevated by at least two mechanisms: (1) elevation of hepatic glutathione stress resulting in an increased demand for glutathione production, and (2) elevation of the NADH/NAD+ ratio due to increased lipid oxidation. The first mechanism likely contributes to increased α-HB formation by supplying more α-KB substrate from increased cysteine anabolism (Figure 5). Consistent with this interpretation, we observe statistically significant elevation of both α-KB and cysteine with increasing insulin resistance from the global screening data (Figures 2 & 5, Table 2), similar to the trend observed with α-HB. In support of the second proposed mechanism, increased lipid oxidation is a metabolic feature of IR, and is indexed by the insulin-inhibited FFA concentration [7,14]. Our finding of a positive association between steady state FFA and plasma α-HB concentrations in the whole cohort supports the possibility that an increased NADH/NAD+ ratio favors reduction of α-KB to α-HB (Figure 7).
Changes in other important IR-associated metabolites within metabolic pathways leading to the formation of α-KB and α-HB are highlighted in Figure 7. For example, reduced levels of glycine (Figure 6E) and serine upstream of α-KB formation may be consistent with increased gluconeogenesis, which is observed with IR in db-/db- mice [29]. Our interpretation that a redox imbalance may contribute to elevated α-HB in the context of IR is consistent with our finding that branched-chain alpha-keto acids, such as 3-methyl-2-oxobutyrate, are elevated with IR (Table 3). These increases may be due to the effect of the redox imbalance on the directionality of the dehydrogenases that reduce/oxidize these keto acids (Figure 7). In addition, α-HB has also been observed to be elevated in T2D subjects and animal models of T2D, as well as in severe lactic acidosis and ketoacidosis [25,27,30,31,32,33,34]. Interestingly, it has been shown that restoration of the NADH/NAD+ redox balance by glutathione infusion therapy resulted in improvement of insulin sensitivity and β-cell function in normal subjects and in T2D patients [35].
In a recent study comparing the urinary profiles of 98 intermediary metabolites measured by targeted MS in 74 obese and 67 lean individuals, Newgard et al. identified a metabolic signature for the accumulation of branched-chain amino acids, the glutamine/glutamate couple, several acylcarnitines, and some aromatic amino acids (phenylalanine and tyrosine) using principal component analysis [36]. These metabolites were also related to insulin resistance (as determined by the HOMA index) and interpreted as marking the metabolic consequences of excessive fat and protein intake, with impairment of insulin signaling and mitochondrial overload. It is noteworthy that in the non-targeted metabolomics approach of the present study, lipid molecules, branched-chain amino acids, and acylcarnitines were also featured among the top 30 metabolites that RF analysis associated with the M value (Figure 2). The current data narrow down the complex interactions of amino acid and lipid metabolism [37] to highlight the importance of a single marker, α-HB, which may reflect oxidative burden in the context of IR.
With an unmet need for a practical clinical test that accurately measures IR in individuals, identification of α-HB as a significant biomarker for separating IR from IS subjects using a fasting plasma sample could lead to development of such a diagnostic test. α-HB in combination with other biochemical and clinical parameters may also prove to be useful as a clinical indicator of subclinical abnormalities of glucose metabolism.

Study subjects
RISC is a prospective, observational cohort study whose rationale and methodology have been published previously [38].
In brief, participants were recruited at 19 centers in 13 countries in Europe, according to the following inclusion criteria: either sex, age 30-60 years, clinically healthy, stratified by sex and by age according to 10-year age groups. Initial exclusion criteria were: treatment for obesity, hypertension, lipid disorders or diabetes, pregnancy, cardiovascular or chronic lung disease, weight change of ≥5 kg in last month, cancer (in last 5 years), and renal failure. Exclusion criteria after screening were: arterial blood pressure ≥140/90 mmHg, fasting plasma glucose >7.0 mmol/l, 2-hour plasma glucose (on a standard 75-g oral glucose tolerance test [OGTT]) ≥11.0 mmol/l, total serum cholesterol ≥7.8 mmol/l, serum triglycerides ≥4.6 mmol/l, and ECG abnormalities. Baseline examinations began in June 2002 and were completed in July 2005.
Of 1293 clamped RISC subjects, 194 males and 205 females (median age 45 years; median body mass index [BMI] 25.0 kg·m−2, range 16.9-42.9) were selected for non-targeted biochemical profiling analysis. Based on the OGTT, 256 subjects had normal glucose tolerance (NGT, i.e., fasting plasma glucose <5.6 mmol/l and 2-hour glucose <7.8 mmol/l), 82 subjects had impaired glucose tolerance (IGT, i.e., 2-hour glucose between 7.8-11.1 mmol/l), and 61 subjects had impaired fasting glycemia (IFG, i.e., fasting glucose between 5.6-7.0 mmol/l). The EGIR-RISC study had undergone appropriate review by the European Commission research program and its ethics committee. Written consent was given by the patients for their information to be stored in the hospital database and used for research purposes, aligned with the analysis described herein. The current retrospective analysis did not require additional review by said ethics committee due to prior approval of future biomedical analyses when the EGIR-RISC study was initiated.

Research protocol
Electrical bioimpedance (to measure fat-free mass), routine clinical chemistry, OGTT, and HI clamp were performed as described [38]. Insulin sensitivity was expressed as MFFM, in units of µmol per min per kg of fat-free mass. Plasma free fatty acids (FFA) were measured in the fasting state and at timed intervals during the clamp; the values during the last 40 min of the clamp were averaged to express insulin inhibition of circulating FFA.

Metabolomic analysis
Biochemical profiling was performed using multiple platform (UHPLC and GC) mass spectrometry technology, as described [18,19,39]. Briefly, a broad array of small molecule metabolites, irrespective of class (e.g., amino acids, lipids, carbohydrates), was examined to measure biochemical changes within plasma samples collected after an overnight (10-12 hours) fast. The non-targeted process used single sample extraction followed by protein precipitation to recover a diverse range of molecules (e.g., polar, hydrophobic).

Metabolite identification
Metabolites were identified by automated comparison and spectra fitting to a chemical standard library of experimentally derived spectra as previously described [18,19,39]. Identification of known chemical entities was based on comparison with library entries of purified authentic chemical standards. 485 biochemicals were identified in this global biochemical profiling analysis, with 350 biochemicals measured in >50% of the entire data set. The latter grouping of 350 biochemicals was used in all of the statistical analyses.
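The detection-rate filter described above (retaining biochemicals measured in more than 50% of samples) can be sketched as follows. The data layout, with one list of values per metabolite and `None` marking a non-detect, and the function name are assumptions made for illustration.

```python
def filter_by_detection(data, min_fraction=0.5):
    """Keep metabolites detected in more than `min_fraction` of samples.

    data: dict mapping metabolite name -> list of values, where None
          marks a sample in which the metabolite was not detected.
    """
    kept = {}
    for name, values in data.items():
        detected = sum(v is not None for v in values)
        if detected / len(values) > min_fraction:
            kept[name] = values
    return kept


if __name__ == "__main__":
    toy = {
        "metabolite_a": [1.0, None, 2.0, 3.0],    # 75% detected: kept
        "metabolite_b": [None, None, 1.0, 2.0],   # exactly 50%: dropped
        "metabolite_c": [None, None, None, 1.0],  # 25% detected: dropped
    }
    print(sorted(filter_by_detection(toy)))
```

The comparison is strict, so a metabolite detected in exactly half of the samples is excluded, matching the ">50%" wording.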

Sample preparation
Upon receipt of fasted, baseline plasma samples from HI clamps, aliquots were prepared and immediately frozen at −80°C until time of analysis. At time of analysis, samples were thawed on ice and 100 µl was extracted using an automated MicroLab STAR® system (Hamilton Company, Salt Lake City, UT). The samples were extracted using a single extraction with 400 µl of methanol, containing the recovery standards: tridecanoic acid, fluorophenylglycine, chlorophenylalanine and d6-cholesterol. The solvent extraction step was performed by shaking for two minutes using a Geno/Grinder 2000 (Glen Mills Inc., Clifton, NJ). After extraction, the sample was centrifuged and supernatant removed using the MicroLab STAR® robotics system. The extract supernatant was split into four equal aliquots: two for UHPLC/MS, one for GC/MS and one reserve aliquot. Aliquots were placed on a TurboVap® (Zymark) to remove solvent, and dried under vacuum overnight. Samples were maintained at 4°C throughout the extraction process. For UHPLC/MS analysis, extract aliquots were reconstituted in either 0.1% formic acid for positive ion UHPLC/MS, or 6.5 mM ammonium bicarbonate pH 8.0 for negative ion UHPLC/MS. For GC/MS analysis, aliquots were derivatized using equal parts N,O-bistrimethylsilyltrifluoroacetamide and a solvent mixture of acetonitrile:dichloromethane:cyclohexane (5:4:1) with 5% triethylamine at 60°C for 1 hour. The derivatization mixture also contained a series of alkyl benzenes for use as retention time markers.

GC/MS and UHPLC/MS/MS analysis
UHPLC/MS was carried out using a Waters Acquity UHPLC (Waters Corporation, Milford, MA) coupled to an LTQ mass spectrometer (Thermo Fisher Scientific Inc., Waltham, MA) equipped with an electrospray ionization source. Two separate UHPLC/MS injections were performed on each sample: one optimized for positive ions and one for negative ions. The positive ion analyses were performed first, followed by negative ion analyses. The mobile phase for positive ion analysis consisted of 0.1% formic acid in H2O (solvent A) and 0.1% formic acid in methanol (solvent B), while the mobile phase for negative ion analysis consisted of 6.5 mM ammonium bicarbonate, pH 8.0 (solvent A) and 6.5 mM ammonium bicarbonate in 95% methanol (solvent B). The acidic extracts were monitored for positive ions and the basic extracts were monitored for negative ions in independent injections using separate acid/base dedicated 2.1×100 mm Waters BEH C18 1.7 µm particle columns heated to 40°C. The extracts were loaded via a Waters Acquity autosampler and gradient eluted (0% B to 98% B, with an 11 minute runtime) directly into the mass spectrometer at a flow rate of 350 µl/min. The LTQ alternated between full scan mass spectra (99-1000 m/z) and data dependent MS/MS scans, which used dynamic exclusion.
The derivatized samples for GC/MS were analyzed on a Thermo-Finnigan Trace DSQ fast-scanning single-quadrupole MS operated at unit mass resolving power. The GC column was 20 m × 0.18 mm with a 0.18 µm film phase consisting of 5% phenyldimethyl silicone. The temperature program started with an initial oven temperature of 60°C and was ramped to 340°C, with helium as the carrier gas. The MS was operated using electron impact ionization with a 50-750 amu scan range and was tuned and calibrated daily for mass resolution and mass accuracy.

Data normalization
Samples were analyzed over the course of two weeks. Each run day was balanced for age, BMI, gender, OGTT, and insulin-mediated total glucose disposal (MFFM). Within each day's run, samples were completely randomized to avoid group block effects. The raw area counts for each metabolite in each sample were normalized to correct for variation resulting from instrument inter-day tuning differences. For each metabolite, the raw area counts were divided by its median value for each run day, therefore setting the medians equal to 1 for each day's run. This correctly preserves all variation between samples, yet allows metabolites of widely different raw peak areas to be compared directly on a similar graphical scale. Missing values were assumed to result from areas falling below limits of detection. For each metabolite, missing values were imputed with its observed minimum after the normalization step.
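The two normalization steps described above (run-day median scaling, then minimum imputation) can be sketched for a single metabolite as follows. This is a minimal illustration, not the pipeline used in the study: the function name and the list-based data layout are assumptions, `None` stands for a missing value, and at least one value is assumed to be observed.

```python
from statistics import median


def normalize_and_impute(raw, run_day):
    """Median-scale one metabolite per run day, then impute missing values.

    raw:     raw area counts for one metabolite, None where missing.
    run_day: run-day label for each sample (same length as raw).
    Returns a list in which each run day's median equals 1 and missing
    values are replaced by the minimum scaled value observed overall.
    """
    scaled = [None] * len(raw)
    for day in set(run_day):
        idx = [i for i, d in enumerate(run_day)
               if d == day and raw[i] is not None]
        if not idx:
            continue  # nothing observed on this day
        day_median = median(raw[i] for i in idx)
        for i in idx:
            scaled[i] = raw[i] / day_median
    # Imputation happens after scaling, as described in the text.
    floor = min(v for v in scaled if v is not None)
    return [floor if v is None else v for v in scaled]


if __name__ == "__main__":
    areas = [100, 200, 300, None, 10, 20, 30]
    days = [1, 1, 1, 1, 2, 2, 2]
    print(normalize_and_impute(areas, days))
    # [0.5, 1.0, 1.5, 0.5, 0.5, 1.0, 1.5]
```

In the toy input the second run day reads ten times lower than the first, and the scaling removes that offset while keeping the within-day spread.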

Data extraction and quality assurance
The data extraction of raw mass spectra data files yielded information that was loaded into a relational database and manipulated without resorting to BLOB manipulation. Once in the database the information was examined and appropriate QC limits were imposed. Peaks were identified using Metabolon's proprietary peak integration software, and component parts were stored in a separate and specifically designed complex data structure.
The median relative standard deviation (MRSD), a quality assurance metric of quantification and measure of instrument variability, was determined to be 8% for a panel of 30 internal standards. Overall process variability (i.e., extraction, recovery, resuspension, and instrument performance) for endogenous biochemicals within technical replicate plasma samples was calculated to be 15% MRSD. These SD values reflected acceptable levels of variability for overall process and instrumentation of the analytical platform.
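As an illustration of the metric, the median relative standard deviation can be computed by taking the RSD of each compound across replicate injections and then the median across compounds. The sketch below assumes the sample (n−1) standard deviation and a dictionary layout, neither of which is specified in the text; the function name is hypothetical.

```python
from statistics import mean, median, stdev


def median_rsd(replicates):
    """Median relative standard deviation (in percent) across compounds.

    replicates: dict mapping compound name -> list of replicate
                measurements (at least two per compound).
    """
    rsds = [100.0 * stdev(values) / mean(values)
            for values in replicates.values()
            if len(values) >= 2]
    return median(rsds)


if __name__ == "__main__":
    toy = {
        "standard_1": [90, 100, 110],   # RSD 10%
        "standard_2": [80, 100, 120],   # RSD 20%
        "standard_3": [95, 100, 105],   # RSD 5%
    }
    print(median_rsd(toy))
```

Using the median rather than the mean keeps a few poorly behaved compounds from dominating the summary.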
A variety of data curation procedures were carried out to ensure that a high quality data set was made available for statistical analysis and data interpretation. The QC and curation processes were designed to ensure accurate and consistent identification of true chemical entities, and to remove those representing system artifacts, mis-assignments, and background noise. Metabolon data analysts use proprietary visualization and interpretation software to confirm the consistency of peak identification among the various samples. Library matches for each compound were checked for each sample and corrected if necessary. In addition to rigorous identification, the quality of the automated Metabolyzer integration (basis of quantitation) was verified for each biochemical.
For QA/QC purposes a number of additional samples were included with each day's analysis. Briefly, a selection of internal standards was added to every sample, immediately prior to injection into the instrument. These compounds were carefully chosen in order to not interfere with measurement of endogenous compounds. These QC samples were primarily used to evaluate process control for each study. Additionally, a small aliquot of each experimental sample was pooled together to serve as a technical replicate for duration of the run. This technical replicate sample was injected throughout the platform run day and across all run days, allowing variability in quantitation of all consistently detected biochemicals in the experimental samples to be monitored. With this monitoring, a metric on overall process variability was assigned for the platform's performance based on quantitation of metabolites in actual experimental samples (see results section).

Statistical Analysis
Data are given as median and [interquartile range]. Classification and Regression Trees (CART), Random Forest (RF) [40], multiple linear regression, correlation, and logistic regression analyses were carried out on untransformed data, whereas log-transformed data were used for t-testing. When data from NGT, IGT, or IFG categories were used in comparisons for classification by RF, the number of in-bag samples was set to 50% of the smallest sub-group to account for unbalanced sample sizes. For platform screening data and targeted analytical data, we used 50,000 and 1,000 trees, respectively. Random forest analysis was performed using the R package "randomForest" [41]. Partition analysis (JMP) was employed to find the metabolite value that best separated the MFFM value into two groups. Multiple logistic regression tested the independent association of metabolites with the bottom tertile of insulin sensitivity (IR); results are given as the odds ratio and 95% confidence interval (C.I.). Statistical analyses were performed using JMP (Version 8; SAS Institute Inc., Cary, NC, 1989-2009) and R (http://cran.r-project.org/).
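The balanced in-bag rule can be illustrated with a short sampling routine. The sketch below draws, for each tree, the same number of subjects from every class, set to 50% of the smallest class. It is written in Python purely for illustration (the study used R), the function name is hypothetical, and sampling with replacement within each class is an assumption.

```python
import random
from collections import defaultdict


def balanced_inbag_indices(labels, fraction=0.5, rng=None):
    """Draw one tree's in-bag sample with equal counts per class.

    The per-class count is `fraction` times the size of the smallest
    class. Sampling is with replacement (an assumption).
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    smallest = min(len(ix) for ix in by_class.values())
    n_per_class = max(1, int(fraction * smallest))
    inbag = []
    for ix in by_class.values():
        inbag.extend(rng.choices(ix, k=n_per_class))
    return inbag


if __name__ == "__main__":
    # Group sizes reported for the cohort: 256 NGT, 82 IGT, 61 IFG.
    labels = ["NGT"] * 256 + ["IGT"] * 82 + ["IFG"] * 61
    sample = balanced_inbag_indices(labels, rng=random.Random(0))
    print(len(sample))  # 30 per class, 90 in total
```

Equalizing the per-class draw keeps the majority NGT group from dominating each tree, at the cost of using fewer subjects per tree.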

Targeted analytical methods
For absolute quantitation, metabolites were analyzed by isotope dilution UHPLC-MS-MS (except for palmitoleic acid, palmitoyl-lyso-PC, and oleoyl-lyso-PC). 50 µl of EDTA plasma samples were spiked with internal standard solution and subsequently subjected to protein precipitation by mixing with 250 µl of methanol. Following centrifugation, aliquots of clear supernatant were injected onto an UHPLC-MS-MS system, consisting of a Thermo TSQ Quantum Ultra Mass Spectrometer and a Waters Acquity UHPLC system equipped with a column manager module and three different columns. Each sample was analyzed using three different chromatographic systems to cover the various analytes.
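The principle behind isotope dilution quantitation can be sketched as a ratio calculation: the analyte peak area is divided by the peak area of its isotope-labelled internal standard, and the ratio is scaled by the known concentration of that standard. The function below is a single-point simplification under stated assumptions; the calibration procedure actually used in the study is not described in the text, and the function name and the `response_factor` parameter are hypothetical.

```python
def isotope_dilution_conc(analyte_area: float,
                          istd_area: float,
                          istd_conc: float,
                          response_factor: float = 1.0) -> float:
    """Estimate analyte concentration from an internal-standard ratio.

    analyte_area:    peak area of the unlabelled analyte.
    istd_area:       peak area of the isotope-labelled internal standard.
    istd_conc:       known concentration of the internal standard in the
                     sample (result is returned in the same units).
    response_factor: relative response of analyte versus standard;
                     1.0 assumes identical response (an assumption).
    """
    if istd_area <= 0 or response_factor <= 0:
        raise ValueError("internal standard area and response factor must be positive")
    return (analyte_area / istd_area) * istd_conc / response_factor


if __name__ == "__main__":
    # Illustrative numbers only: half the standard's area at 10 units.
    print(isotope_dilution_conc(5000.0, 10000.0, 10.0))  # 5.0
```

In practice a multi-point calibration curve of area ratio against concentration would usually replace the single response factor.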