The authors have declared that no competing interests exist.
Conceived and designed the experiments: DS AKJ KIB APP JPB. Performed the experiments: DS LS KM FR. Analyzed the data: DS LS KM FR. Contributed reagents/materials/analysis tools: AKJ FR. Wrote the paper: DS LS KM AKJ KIB FR APP JPB.
This study aims to identify novel markers for gestational diabetes (GDM) in the biochemical profile of maternal urine using NMR metabolomics. It also catalogs the general effects of pregnancy and delivery on the urine profile. Urine samples were collected at three time points (visit V1: gestational week 8–20; V2: week 28±2; V3: 10–16 weeks post partum) from participants in the STORK Groruddalen program, a prospective, multiethnic cohort study of 823 healthy, pregnant women in Oslo, Norway, and analyzed using 1H-NMR spectroscopy. Metabolites were identified and quantified where possible. PCA, PLS-DA and univariate statistics revealed substantial differences between the time points, dominated by a steady increase of urinary lactose concentrations, and by an increase during pregnancy and subsequent dramatic reduction of several unidentified NMR signals between 0.5 and 1.1 ppm. Multivariate methods could not reliably identify GDM cases based on the WHO or graded criteria based on IADPSG definitions, indicating that the pattern of urinary metabolites above micromolar concentrations is not influenced strongly and consistently enough by the disease. However, univariate analysis suggests elevated mean citrate concentrations with increasing hyperglycemia. Multivariate classification with respect to ethnic background produced weak but statistically significant models. These results suggest that although NMR-based metabolomics can monitor changes in the urinary excretion profile of pregnant women, it may not be a prudent choice for the study of GDM.
Type 2 diabetes (T2DM) is one of the most challenging health problems in this century, and its prevalence is rising – in a worldwide perspective, it is projected that by 2030 more than 500 million people will suffer from diabetes. The escalating costs threaten the health care system of any nation, and complications associated with the disease are a major cause of disability, reduced quality of life, and death.
The STORK Groruddalen research program
Metabolomics, unlike more compound-specific analyses of clinical chemistry, is an approach that tries to model changes in broad profiles of metabolites and relate them to health and disease states.
Metabolomics has been successfully applied to the study of a wide range of diseases in humans and animal models, from type 1 and 2 diabetes to autism, asthma and cancer.
In this study we have tested whether urine NMR metabolomics can find biomarkers that identify women at risk of developing GDM in a large, multiethnic prospective cohort study. The analysis yielded a comprehensive overview of urinary metabolite concentrations during and after pregnancy which we report as a secondary objective. We have also studied the influence of ethnic background on the metabolite profile.
The STORK Groruddalen project has been described in detail previously.
Fasting morning midstream clean-catch urine samples were collected at three visits (V1: gestational week 8–20; V2: week 28±2; V3: 10–16 weeks post partum), and routine tests for nitrite, proteinuria and glucosuria were performed using dipsticks. The remainder was aliquoted and stored at −80°C. Albumin and creatinine concentrations were determined using one aliquot of each sample, while another was reserved for the NMR analysis described in the present article. At visit V2 a 75 g oral glucose tolerance test (OGTT) was performed, measuring fasting (FPG) and 2-hour venous plasma glucose (2-h PG). The participants who were diagnosed with GDM according to WHO definitions (see below), the current standard in clinical practice in Norway, received lifestyle advice and were referred to their GP or specialist care for follow-up, where few required insulin. Note that this would only influence a small number of observations at the last visit V3, since V1 and V2 were completed before the diagnosis.
GDM was diagnosed independently according to two separate sets of criteria: The first set are the criteria of the World Health Organization (WHO) which define GDM as FPG ≥7.0 or 2-h PG ≥7.8 mmol/L.
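The WHO rule quoted above can be written as a simple either-or check. The following is a minimal sketch; the function name `who_gdm` and its argument names are hypothetical and only the two thresholds come from the text.

```python
def who_gdm(fpg_mmol_l: float, two_h_pg_mmol_l: float) -> bool:
    """Return True if either WHO threshold quoted in the text is met.

    fpg_mmol_l:      fasting plasma glucose in mmol/L
    two_h_pg_mmol_l: 2-hour plasma glucose after the 75 g OGTT in mmol/L
    """
    return fpg_mmol_l >= 7.0 or two_h_pg_mmol_l >= 7.8


# Illustrative values only, not data from the study.
print(who_gdm(5.1, 8.2))  # the 2-h PG threshold is met
print(who_gdm(4.8, 6.9))  # neither threshold is met
```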
The WHO criteria identified 13% of the STORK participants as GDM cases. The graded criteria find a GDM prevalence of 32%, further subdivided into 26% with mild (G1) and 6% with more pronounced hyperglycemia (G2).
Ethnic origin was defined by country of birth of the participant or her mother, whichever was more relevant
The women were given oral and written information, available in eight languages, when attending the Child Health Clinics. Participation was based on written consent. The Regional Ethics committee and The Norwegian Data Inspectorate have approved the study protocol of the STORK Groruddalen research program. The Norwegian Directorate of Health accepted the storage of biological material.
Urine samples were thawed, and 900 µl of sample were buffered with 100 µl of a KH2PO4/KOH solution at pH 7.4 in pure D2O, containing NaN3 to inhibit bacterial growth and trimethylsilyl propanoic acid (TSP) as a frequency and concentration reference. The buffered samples were centrifuged at 13,400
Two-dimensional spectra were acquired for a selection of representative samples in order to facilitate compound identification.
Eventually, after accounting for missing samples and removing a small number of low-quality spectra, a total of 1,911 urine profiles from 790 of the 823 participants (667, 671 and 573 from visits V1, V2 and V3, respectively) were eligible for further analysis. Among these were 572 matched pairs between visits V1 and V2, 509 between V2 and V3, and 494 between V1 and V3. There were 454 complete series with high-quality spectra from all three visits.
All spectra were preprocessed with an in-house program written in GNU Octave
Further processing and analysis were carried out with the statistics environment R.
Metabolites were identified by using published literature
Finally, both the spectra and the concentration variables were individually normalized to the absolute creatinine concentration of the respective urine sample. As an internal consistency check, the creatinine concentrations as determined by NMR were compared with measurements performed on a Roche Modular (Roche Diagnostics Ltd., Burgess Hill, UK) at the Central Laboratory, Oslo University Hospital, Aker, and found to be in good agreement (R2>95%). Citrate concentrations of a small number of representative samples were also validated against enzymatic measurements
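The normalization step, followed by the log transformation used later for the statistics, could be sketched as below. This is an illustration under assumptions: the function names, the base-10 logarithm and the `floor` used to handle zero concentrations are hypothetical choices, since the text does not specify them.

```python
import math


def normalize_to_creatinine(conc_mM: dict, creatinine_mM: float) -> dict:
    """Divide each absolute concentration by the sample's creatinine level."""
    if creatinine_mM <= 0:
        raise ValueError("creatinine concentration must be positive")
    return {name: value / creatinine_mM for name, value in conc_mM.items()}


def log_transform(ratios: dict, floor: float = 1e-6) -> dict:
    """Log-transform the ratios.

    `floor` is a hypothetical guard against log(0) for undetectable
    compounds; the article does not state how zeros were treated.
    """
    return {name: math.log10(max(value, floor)) for name, value in ratios.items()}


# Illustrative sample, not data from the study (values in mM).
sample = {"citrate": 2.4, "lactose": 0.4}
ratios = normalize_to_creatinine(sample, creatinine_mM=10.0)
logs = log_transform(ratios)
print(ratios)
print(logs)
```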
The urine spectra and the concentration variables were then subjected to principal component analysis (PCA, implemented in the R package “pcaMethods”
To investigate the relations between the spectra and given endpoints, i.e. classifications by visit, diagnosis of GDM, or ethnic background, partial least-squares regression (PLS, using the R package “pls”
Univariate statistical summaries and tests were performed based on the creatinine-normalized, log-transformed concentration variables to support and expand upon any multivariate modeling. In particular, median concentrations and the interquartile range (IQR) of their distributions were calculated for all classes and groups encountered throughout this article, and presented in tables. Two-sample t-tests (or k-sample ANOVAs, as appropriate) were carried out to estimate the significance of group differences. As a special case, when comparing the progression of matched pairs of samples between the three visits, individual fold-change factors could be computed from the log differences, and paired rather than two-sample t-tests were appropriate. All p-values are reported without correction for multiple testing.
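The paired comparison described above can be illustrated with a short sketch. It computes the mean log difference between matched samples, converts it to a fold-change factor, and derives the paired t statistic. The function name is hypothetical, and the sign convention (a decrease by a factor f reported as −f) is an assumption inferred from the way fold changes are tabulated in this article. The p-value would additionally require the Student t distribution from a statistics library and is not computed here.

```python
import math
import statistics


def paired_log_fold_change(before, after):
    """Return (signed fold change, paired t statistic, degrees of freedom).

    `before` and `after` hold creatinine-normalized concentrations of one
    compound for the same participants at two visits (all values > 0).
    """
    diffs = [math.log(a) - math.log(b) for b, a in zip(before, after)]
    mean_diff = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
    ratio = math.exp(mean_diff)  # geometric mean of the individual ratios
    # Assumed convention: express decreases as a negative factor.
    signed_fc = ratio if ratio >= 1 else -1.0 / ratio
    return signed_fc, t_stat, len(diffs) - 1


# Illustrative values only, not data from the study.
before = [1.0, 2.0, 4.0, 3.0]
after = [2.1, 3.9, 8.4, 5.8]
fc, t_stat, df = paired_log_fold_change(before, after)
fc_rev, t_rev, _ = paired_log_fold_change(after, before)
print(fc, t_stat, df)
print(fc_rev, t_rev)
```

Working on log differences makes the fold change symmetric: swapping the two visits only flips the sign of the reported factor and of the t statistic.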
PCA of the urine spectra produced a weak clustering of samples from the three visits, but suffered from noise due to e.g. positional variations of compound resonances (data not shown). A PCA of the creatinine-normalized, log-transformed concentration variables, shown in
PCA scores plot from creatinine-normalized, log-transformed concentration variables, showing the first and second principal component, i.e. the two linear combinations of the original variables that contain the largest and second-largest overall variation (24% and 10%, respectively). Note the clustering of samples from the three visits (V1, gestational week 8–20: red circles; V2, week 26–30: green triangles; and V3, 10–16 weeks post partum: filled blue circles). Red lines connect corresponding samples from visits V1 and V2; blue lines from V2 and V3. Solid black lines represent the density of the scores from the three visits. The overlap between visit V2 and V3 appears to be the smallest.
The classification potential was demonstrated by PLS-DA on a dummy matrix of all three visits simultaneously: Using 5 components, the model based on the concentration variables correctly classified 86% of the samples compared to 31% in the permutation test. The model based on the spectra, using 7 components, achieved a very similar 85% correct classification.
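The idea of comparing a classification rate with a permutation baseline can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier is swapped in for PLS-DA, accuracy is computed on the training data rather than by cross-validation, and the data are simulated; none of this reproduces the authors' procedure, and all names are hypothetical.

```python
import random


def centroids(X, y):
    """Mean vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(cents, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    return min(cents, key=lambda lab: sum((a - b) ** 2 for a, b in zip(cents[lab], row)))


def accuracy(X, y):
    """Fraction of samples assigned to their own label."""
    cents = centroids(X, y)
    hits = sum(predict(cents, row) == label for row, label in zip(X, y))
    return hits / len(y)


def permutation_baseline(X, y, n_perm=200, seed=0):
    """Mean accuracy after shuffling the labels, i.e. the chance level."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        scores.append(accuracy(X, shuffled))
    return sum(scores) / n_perm


# Simulated two-variable data with three well-separated "visits".
rng = random.Random(1)
centers = {"V1": (0.0, 0.0), "V2": (3.0, 0.0), "V3": (0.0, 3.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(40):
        X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
        y.append(label)

observed = accuracy(X, y)
baseline = permutation_baseline(X, y)
print(f"observed: {observed:.2f}, permuted labels: {baseline:.2f}")
```

With three classes of similar size, the permuted-label rate should land near one third, which is the kind of reference value the observed classification rate is compared against.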
In order to track the specific changes between the visits, pairwise PLS-DA was carried out using both the spectra and the concentration values. The results of the latter are shown in
Proton NMR spectra of all three urine samples from one healthy participant, showing the region between 0.5 and 0.9 ppm; normalized to the creatinine concentration, BXR baseline correction not yet applied in order to preserve the shape of the broader peaks. The four highlighted signals increase from visit V1 (red line, gest. week 8–20) to V2 (green line, gest. week 26–30) and then disappear at V3 (blue line, 10–16 weeks post partum).
Scatter plot of concentrations (relative to creatinine concentration; log axes) of lactose and an unidentified substance with an NMR signal at 0.62 ppm. Red circles, green triangles and filled blue circles for visit V1 (gestational week 8–20), V2 (gestational week 26–30) and V3 (10–16 weeks post partum), respectively. Note how these two compounds alone reproduce a clustering similar to that in
Visit | Q2 and R2 values of the first 3 components | NMC at 5 components and permutation test | Involved compounds |
V1 → V2 | 0.526, 0.663, 0.679/0.534, 0.676, 0.700 | 89.3 (84.0–93.8)/662 (625–695) | Unknown, 0.62 ppm↑; Unknown, 0.78 ppm↑; Alanine↑; Glycine↑; Lactose↑ |
V2 → V3 | 0.791, 0.868, 0.883/0.794, 0.873, 0.891 | 9.0 (8.0–10.0)/607 (581–633) | Unknown, 0.62 ppm↓; Unknown, 0.78 ppm↓; Alanine↓; Glycine↓; Lactose↑ |
V1 → V3 | 0.690, 0.782, 0.803/0.695, 0.788, 0.815 | 35.5 (32.3–39.0)/601 (578–627) | Unknown, 0.62 ppm↓; Unknown, 0.78 ppm↓; Alanine↓; Glycine↓; Lactose↑ |
A high ratio between Q2 and R2 confirms the validity of the models.
The number of misclassifications (NMC) of the PLS-DA models relative to the random result from permutation testing serves as a performance estimate; 95% CI of the estimates in parentheses.
Spectral signals that could not be assigned to known metabolites are referred to as “Unknown”, along with the locations of their NMR signals. Arrows denote relative increase or decrease between visits. See also
Finally, univariate statistics were employed to quantify the changes.
Compound | Median concentration: Visit V1 | Median concentration: Visit V2 | Median concentration: Visit V3 | Rel. individual FC: V1 → V2 | Rel. individual FC: V2 → V3 |
Urea | 2.5 (1.9–3.3) | 2.4 (1.9–3.1) | 2.6 (2–3.3) | −1.01 (0.64) | 1.07 (0.0025) |
Unknown multiplet, 0.55 ppm | 2.3 (1.9–2.9) | 4.9 (4.1–5.9) | 1.7 (1.3–2.2) | 2.12 (7e-200) | −2.88 (2.7e-194) |
Unknown multiplet, 0.62 ppm | 4.2 (3.4–5.3) | 9.4 (7.9–11) | 1.8 (1.4–2.3) | 2.2 (1.4e-219) | −5.14 (2.8e-283) |
Unknown multiplet, 0.78 ppm | 0.2 (0.091–0.34) | 1.2 (0.92–1.6) | 0.026 (0–0.084) | 6.11 (1.4e-151) | −21.8 (1e-125) |
Unknown multiplet, 0.90 ppm | 0.28 (0–1.1) | 2.6 (1.3–4.4) | 0 (0–0.3) | 3.42 (1.1e-42) | −4.35 (8e-24) |
Unknown doublet, 0.75 ppm | 0.047 (0.0085–0.15) | 0.04 (0.006–0.13) | 0.055 (0.01–0.17) | −1.12 (0.22) | 1.45 (0.00027) |
Valine | 0.0059 (0.004–0.008) | 0.006 (0.004–0.008) | 0.0041 (0.003–0.006) | −1.02 (0.29) | −1.45 (2.5e-34) |
Leucine | 0.004 (0.003–0.005) | 0.0045 (0.003–0.006) | 0.0025 (0.002–0.004) | 1.12 (2.1e-11) | −1.83 (8.2e-89) |
Unknown doublet, 1.08 ppm | 3.5 (2.3–5) | 5.4 (3.6–8.1) | 2.1 (1.4–3.1) | 1.61 (5.9e-68) | −2.63 (1.4e-106) |
Unknown doublet, 1.11 ppm | 11 (7.5–16) | 19 (14–25) | 2.9 (1.9–4.4) | 1.78 (3.1e-114) | −6.66 (1.2e-231) |
3-Aminoisobutyrate | 0.0098 (0.004–0.022) | 0.011 (0.0041–0.024) | 0.008 (0.003–0.019) | 1.08 (0.14) | −1.28 (0.00018) |
Unknown doublet, 1.24 ppm | 2.9 (2.2–4) | 4.4 (3–6.2) | 3.2 (2.3–4.5) | 1.43 (1.6e-38) | −1.33 (2.8e-20) |
Unknown doublet, 1.26 ppm | 1.3 (0.78–1.9) | 1.9 (1.3–2.8) | 0.88 (0.41–1.4) | 1.55 (1.2e-40) | −2.31 (6.2e-65) |
3-Hydroxyisovalerate | 0.0092 (0.007–0.012) | 0.011 (0.0084–0.014) | 0.0054 (0.004–0.007) | 1.2 (5.8e-28) | −2.1 (7.9e-145) |
2-Hydroxyisobutyrate | 0.0071 (0.006–0.008) | 0.0076 (0.006–0.009) | 0.0054 (0.004–0.006) | 1.11 (1.0e-14) | −1.44 (2.5e-87) |
comb. Lactate/Threonine | 29 (20–43) | 56 (39–83) | 8.7 (5.8–13) | 1.89 (1.6e-89) | −6.3 (2.6e-202) |
Alanine | 0.04 (0.029–0.054) | 0.065 (0.046–0.09) | 0.02 (0.015–0.028) | 1.62 (1.7e-84) | −3.22 (1.5e-191) |
Lysine | 0.02 (0.011–0.034) | 0.016 (0.009–0.026) | 0.0093 (0.005–0.016) | −1.26 (4.7e-08) | −1.64 (6.8e-23) |
Acetaminophen metabolites | 9.3 (6.5–14) | 10 (6.7–14) | 5.8 (3.5–9.4) | 1.04 (0.27) | −1.56 (1.1e-14) |
Acetone | 0.003 (0.002–0.005) | 0.004 (0.003–0.006) | 0.003 (0.002–0.005) | 1.1 (0.38) | −1.09 (0.49) |
N-Acetylglutamine | 0.055 (0.04–0.08) | 0.048 (0.03–0.07) | 0.07 (0.045–0.09) | −1.29 (1.5e-13) | 1.6 (5.6e-34) |
Unknown singlet, 2.35 ppm | 17 (9.4–28) | 16 (8.1–26) | 21 (11–31) | −1.13 (0.00072) | 1.3 (3.7e-11) |
Citrate | 0.24 (0.18–0.3) | 0.25 (0.18–0.32) | 0.15 (0.1–0.21) | 1.03 (0.077) | −1.64 (2.7e-56) |
Dimethylamine | 0.041 (0.037–0.046) | 0.047 (0.043–0.052) | 0.042 (0.038–0.048) | 1.14 (1.9e-18) | −1.08 (1.8e-06) |
Unknown singlet, 2.78 ppm | 3 (2.4–3.9) | 2.9 (2.4–3.9) | 3.6 (2.9–4.8) | −1.01 (0.68) | 1.26 (6.8e-17) |
Trimethylamine N-oxide | 0.038 (0.025–0.064) | 0.037 (0.022–0.064) | 0.041 (0.027–0.073) | −1.12 (0.019) | 1.22 (0.00055) |
Glycine | 0.24 (0.17–0.36) | 0.32 (0.23–0.45) | 0.11 (0.056–0.18) | 1.32 (3.9e-41) | −3.06 (6.6e-118) |
Creatine | 0.079 (0.033–0.15) | 0.086 (0.042–0.16) | 0.05 (0.021–0.13) | 1.16 (0.00022) | −1.56 (2.7e-17) |
Creatinine | 10 (7.3–14) | 8.5 (6.3–12) | 12 (8.8–16) | −1.19 (8.6e-13) | 1.33 (1.5e-25) |
Trigonelline | 0.011 (0.0066–0.018) | 0.012 (0.0068–0.023) | 0.014 (0.0073–0.027) | 1.11 (0.017) | 1.14 (0.012) |
Lactose | 0.04 (0.025–0.06) | 0.1 (0.073–0.14) | 0.24 (0.14–0.39) | 2.47 (6.4e-122) | 2.11 (2.8e-51) |
1-Methylnicotinamide | 0.009 (0.007–0.012) | 0.013 (0.0093–0.017) | 0.0053 (0.003–0.008) | 1.41 (4.3e-42) | −2.58 (5.4e-111) |
Unknown singlet, 4.51 ppm | 0 (0–0.062) | 0 (0–0.25) | 0 (0–0.17) | 1.1 (0.66) | −1.41 (0.14) |
Ascorbate | 0.006 (0.004–0.011) | 0.01 (0.006–0.016) | 0.006 (0.004–0.011) | 1.58 (9.6e-14) | −1.47 (6.9e-10) |
Glucose | 0.1 (0.079–0.13) | 0.14 (0.11–0.18) | 0.079 (0.064–0.11) | 1.39 (6.3e-35) | −1.77 (2.3e-64) |
Acetaminophen glucuronide | 0.033 (0.023–0.046) | 0.039 (0.028–0.053) | 0.029 (0.022–0.043) | 1.18 (3.9e-08) | −1.18 (1.3e-05) |
Unknown multiplet, 5.02 ppm | 0.89 (0.43–1.8) | 0.93 (0.46–1.8) | 1 (0.54–1.9) | 1.07 (0.18) | 1.12 (0.044) |
Unknown doublet, 5.08 ppm | 0.089 (0–0.26) | 0.084 (0–0.29) | 0.12 (0–0.29) | −1.02 (0.8) | 1.02 (0.87) |
Unknown doublet, 5.20 ppm | 1.8 (1.3–2.5) | 2.5 (1.9–3.6) | 1.2 (0.86–1.7) | 1.38 (6.3e-34) | −2.1 (1.2e-75) |
Sugar doublets, 5.23 ppm | 7.8 (6.1–10) | 16 (12–20) | 22 (15–33) | 1.98 (4.6e-120) | 1.27 (3.9e-11) |
Unknown doublet, 5.30 ppm | 0.53 (0.34–0.81) | 1.2 (0.71–1.7) | 0.66 (0.47–0.93) | 1.99 (7.6e-78) | −1.61 (1.1e-45) |
Unknown doublet, 5.42 ppm | 0.29 (0.077–0.83) | 0.52 (0.18–1.2) | 0.42 (0.12–1.1) | 1.56 (3.3e-07) | −1.12 (0.22) |
1,6-Anhydroglucose | 0.003 (0.001–0.006) | 0.005 (0.002–0.01) | 0.003 (0.002–0.007) | 1.64 (1.1e-14) | −1.4 (2.3e-08) |
Unknown doublet, 6.58 ppm | 0.015 (0–0.17) | 0.013 (0–0.18) | 0.027 (0–0.22) | 1.0 (0.99) | 1.08 (0.59) |
4-Hydroxyphenylacetate | 0.009 (0.007–0.01) | 0.01 (0.08–0.014) | 0.01 (0.08–0.014) | 1.07 (0.028) | −1.01 (0.72) |
Tyrosine | 0.012 (0.008–0.016) | 0.013 (0.0092–0.019) | 0.0051 (0.003–0.008) | 1.16 (1e-12) | −2.9 (1.2e-91) |
N-Phenylacetylglycine | 0.074 (0.047–0.11) | 0.065 (0.038–0.099) | 0.09 (0.054–0.13) | −1.15 (0.00018) | 1.37 (3e-14) |
Acetaminophen sulfate | 0.03 (0.023–0.039) | 0.033 (0.025–0.042) | 0.03 (0.023–0.04) | 1.09 (0.00011) | −1.01 (0.72) |
Hippurate | 0.21 (0.13–0.32) | 0.22 (0.14–0.33) | 0.23 (0.14–0.36) | 1.02 (0.49) | 1.01 (0.68) |
Histidine region | 0.058 (0.052–0.065) | 0.068 (0.061–0.075) | 0.053 (0.048–0.06) | 1.17 (1.2e-76) | −1.26 (6.3e-101) |
Formate | 0.035 (0.026–0.052) | 0.052 (0.038–0.073) | 0.016 (0.011–0.025) | 1.51 (1.9e-59) | −3.28 (3.4e-147) |
While the median (IQR) values are reported on their natural scale, all parametric statistical tests were carried out using the log-transformed, normally distributed variables.
Visit V1: gestational week 8–20; V2: gestational week 26–30; V3: 10–16 weeks post partum.
Urea is affected by NMR water suppression.
Broader spectral signal; measured before subtracting BXR baseline.
Compound very dilute to undetectable in a substantial number of samples; reported concentration may represent noise.
Concentration of creatinine is reported as absolute mM before normalization.
No conclusive identification, quantified as stated.
Concentrations of unidentified signals in arbitrary units, but nonetheless individually normalized to creatinine.
The multivariate analysis with respect to GDM yielded few positive results. All Q2 were negative for both sets of diagnostic criteria, with the following exceptions: PLS-DA of the concentration variables according to the WHO criteria resulted in barely positive Q2 = 2% at visit V1, with the unidentified signal at 1.11 ppm and citrate contributing most to the loading weights. Using a dummy matrix of the graded criteria, PLS-DA yielded Q2 = 2% for the pronounced hyperglycemia class G2 at visit V2, involving citrate, glucose and the unidentified signal at 1.08 ppm.
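A negative Q2 may seem counterintuitive, so a short sketch may help. It uses the common definition Q2 = 1 − PRESS/TSS, where PRESS is the sum of squared cross-validated prediction errors and TSS the total sum of squares around the mean; the article does not spell out the formula it used, so this definition and the function name are assumptions. Q2 drops below zero whenever the cross-validated predictions are worse than simply predicting the mean response.

```python
def q2(y_true, y_cv_pred):
    """Cross-validated coefficient of determination, 1 - PRESS / TSS."""
    mean_y = sum(y_true) / len(y_true)
    press = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_cv_pred))
    tss = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - press / tss


# Dummy-coded class membership (1 = case, 0 = control); illustrative only.
y = [0, 0, 0, 1]
good_predictions = [0.1, 0.0, 0.2, 0.8]  # close to the true labels
poor_predictions = [0.6, 0.1, 0.5, 0.2]  # worse than predicting the mean

print(q2(y, good_predictions))
print(q2(y, poor_predictions))
```

R2 follows the same formula with fitted instead of cross-validated predictions, which is why Q2 cannot exceed R2 by much and why a Q2 close to R2 indicates little overfitting.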
A subsequent univariate analysis supported and quantified these findings.
Visit | Compound | Median concentration (IQR) [mM/mM Creatinine], WHO: healthy | Median concentration (IQR) [mM/mM Creatinine], WHO: GDM | t-test p-value |
V1 | Citrate | 0.23 (0.18–0.29) | 0.28 (0.19–0.34) | 2e-4 (↑) |
V1 | Unknown doublet, 1.08 ppm | 3.4 (2.2–4.8) | 4 (2.5–6.4) | 0.007 (↑) |
V1 | Unknown doublet, 1.11 ppm | 11 (7.5–15) | 14 (8.9–18) | 9e-4 (↑) |
V2 | Citrate | 0.24 (0.17–0.31) | 0.3 (0.22–0.4) | 7e-5 (↑) |
V2 | Unknown doublet, 1.08 ppm | 5.4 (3.5–7.9) | 5.6 (4.2–10) | 0.01 (↑) |
V2 | Unknown doublet, 1.11 ppm | 19 (14–25) | 24 (17–29) | 0.001 (↑) |
Visit V1: gestational week 8–20; V2: gestational week 26–30.
Concentrations of unidentified signals in arbitrary units, but nonetheless normalized to creatinine.
Visit | Compound | Median concentration (IQR) [mM/mM Creatinine], healthy (G0) | Median concentration (IQR) [mM/mM Creatinine], GDM, mild (G1) | Median concentration (IQR) [mM/mM Creatinine], GDM (G2) | ANOVA p-value |
V1 | Glucose | 0.1 (0.078–0.13) | 0.1 (0.083–0.14) | 0.13 (0.096–0.19) | 6e-6 |
V1 | Lysine | 0.019 (0.01–0.031) | 0.025 (0.012–0.039) | 0.026 (0.018–0.044) | 4e-4 |
V1 | Citrate | 0.23 (0.18–0.29) | 0.24 (0.17–0.32) | 0.29 (0.21–0.38) | 0.04 |
V1 | Unknown doublet, 1.08 ppm | 3.3 (2.1–4.7) | 3.9 (2.5–5.2) | 4.4 (2.9–6.8) | 9e-4 |
V1 | Unknown doublet, 1.11 ppm | 11 (7.6–15) | 12 (8.1–17) | 14 (9.1–15) | 0.01 |
V2 | Glucose | 0.14 (0.11–0.18) | 0.14 (0.1–0.19) | 0.18 (0.15–0.27) | 6e-6 |
V2 | Citrate | 0.24 (0.17–0.31) | 0.26 (0.19–0.34) | 0.35 (0.29–0.45) | 3e-5 |
V2 | Unknown doublet, 1.08 ppm | 5.3 (3.4–7.7) | 5.6 (4–8.4) | 7.8 (4.5–13) | 2e-4 |
V2 | Unknown doublet, 1.11 ppm | 19 (14–25) | 19 (15–26) | 21 (16–30) | 0.03 |
Visit V1: gestational week 8–20; V2: gestational week 26–30.
Classification by graded criteria based on modified IADPSG definitions and the HAPO study: Healthy, normoglycemic (G0); GDM with relatively mild hyperglycemia (G1) or with more pronounced hyperglycemia (G2). Thresholds are defined in the Materials section.
Concentrations of unidentified signals in arbitrary units, but nonetheless normalized to creatinine.
Expanding on the differences observed for the graded classes,
Median concentration (±95% CI of the median as dashed lines) of urine citrate concentration relative to creatinine levels at the three visits (V1: gestational week 8–20; V2: gestational week 26–30; V3: 10–16 weeks post partum), shown separately in red, green and blue, respectively, for the three graded classes based on modified IADPSG definitions and the HAPO study (G0: healthy, normoglycemic; G1: GDM with relatively mild hyperglycemia; G2: GDM with more pronounced hyperglycemia). Insets show the mean patient-wise relative fold-change (±95% CI of the mean, based on log values) between visits V1 and V2 (panel A), and V2 and V3 (panel B), respectively. Note the sharper rise and subsequent fall of urinary citrate associated with the severity of GDM.
PLS-DA using a dummy matrix of ethnic background categories (excluding the very small number of South Americans) was carried out at the three visits and resulted in overall significant models. However, a closer inspection of the cross-validated predictions (see
Group | Visit | Q2, R2 at 6 comp. | NMC, permutation test at 6 comp. |
West | V1 | 0.305/0.392 | 69 (63–74)/161 (150–174) |
West | V2 | 0.330/0.410 | 77 (73–83)/165 (150–179) |
West | V3 | 0.317/0.404 | 77 (72–82)/138 (126–146) |
S.Asia | V1 | 0.108/0.270 | 109 (105–116)/122 (111–130) |
S.Asia | V2 | 0.112/0.242 | 122 (118–126)/121 (114–130) |
S.Asia | V3 | 0.210/0.315 | 90 (85–95)/105 (97–113) |
E.Asia | V1 | 0.087/0.149 | 37/35 (32–37) |
E.Asia | V2 | 0.087/0.162 | 31/30 (26–31) |
E.Asia | V3 | 0.066/0.113 | 34/32 (29–34) |
ME.CA.NA. | V1 | 0.027/0.100 | 102/85 (80–91) |
ME.CA.NA. | V2 | 0.066/0.164 | 99 (97–100)/87 (80–94) |
ME.CA.NA. | V3 | 0.038/0.149 | 84 (83–85)/73 (67–78) |
Sub-Sahara | V1 | 0.036/0.098 | 52/48 (45–52) |
Sub-Sahara | V2 | 0.062/0.112 | 48/44 (41–47) |
Sub-Sahara | V3 | 0.045/0.128 | 37/35 (32–37) |
Excluding South America, n = 12.
S.Asia: South Asia;
E.Asia: East Asia;
ME.CA.NA.: Middle East, Central Asia and North Africa.
Using the categories “Western”, “South Asian” and “Other”, follow-up ANOVAs were performed on all concentration variables at all visits. Significant results are shown in
Visit | Compound | Median concentration (IQR) [mM/mM Creatinine], West | Median concentration (IQR) [mM/mM Creatinine], South Asia | Median concentration (IQR) [mM/mM Creatinine], All Others | ANOVA p-value |
V1 | Formate | 0.03 (0.024–0.04) | 0.04 (0.028–0.059) | 0.043 (0.029–0.061) | 2e-9 |
V1 | Alanine | 0.034 (0.027–0.047) | 0.045 (0.032–0.065) | 0.044 (0.031–0.058) | 5e-8 |
V1 | 3-Hydroxyisovalerate | 0.008 (0.0062–0.011) | 0.01 (0.0075–0.013) | 0.01 (0.0079–0.013) | 2e-8 |
V1 | 1,6-Anhydroglucose | 0.0035 (0.002–0.009) | 0.003 (0.002–0.006) | 0.002 (0.001–0.004) | 6e-8 |
V1 | Lactate/Threonine | 25 (17–34) | 34 (23–57) | 36 (23–55) | 4e-12 |
V1 | Unknown multiplet, 0.55 ppm | 2.1 (1.8–2.5) | 2.4 (1.9–3.1) | 2.5 (2–3.1) | 7e-9 |
V2 | Formate | 0.043 (0.034–0.058) | 0.061 (0.042–0.08) | 0.066 (0.046–0.094) | 1e-20 |
V2 | Alanine | 0.059 (0.042–0.08) | 0.073 (0.051–0.093) | 0.069 (0.047–0.1) | 1e-5 |
V2 | 3-Hydroxyisovalerate | 0.0094 (0.008–0.012) | 0.012 (0.01–0.016) | 0.012 (0.01–0.015) | 3e-12 |
V2 | 1,6-Anhydroglucose | 0.006 (0.0032–0.012) | 0.004 (0.002–0.007) | 0.004 (0.002–0.007) | 8e-6 |
V2 | Lactate/Threonine | 49 (36–70) | 64 (42–90) | 62 (43–93) | 3e-6 |
V2 | Unknown multiplet, 0.55 ppm | 4.5 (3.9–5.4) | 5.4 (4.3–6.4) | 5.3 (4.4–6.6) | 5e-10 |
V3 | Formate | 0.013 (0.0093–0.019) | 0.02 (0.014–0.029) | 0.02 (0.013–0.029) | 2e-11 |
V3 | Alanine | 0.016 (0.013–0.023) | 0.024 (0.019–0.031) | 0.022 (0.017–0.031) | 8e-14 |
V3 | 3-Hydroxyisovalerate | 0.0048 (0.003–0.007) | 0.006 (0.004–0.008) | 0.006 (0.004–0.008) | 3e-6 |
V3 | 1,6-Anhydroglucose | 0.0041 (0.0017–0.01) | 0.003 (0.001–0.006) | 0.003 (0.001–0.005) | 2e-6 |
V3 | Lactate/Threonine | 6.9 (4.7–11) | 9.8 (7.3–14) | 11 (7.1–15) | 5e-11 |
V3 | Unknown multiplet, 0.55 ppm | 1.5 (1.2–2) | 1.9 (1.4–2.3) | 1.8 (1.4–2.3) | 5e-4 |
Concentrations of the unidentified signal and the combined lactate and threonine variable in arbitrary units, but nonetheless normalized to creatinine.
Finally, the previous ANOVAs with respect to GDM according to the graded classes were repeated separately for the aforementioned three ethnic categories. Compared to
Although its primary aim was the detection of patterns in the urinary metabolome related to GDM, the most immediate finding of the metabolomics efforts that are part of the STORK Groruddalen cohort study is that the changes of the composition of urine during and after pregnancy are substantial enough to clearly differentiate between sampling time points.
The most prominent developments were the steady increase of urinary lactose, and the increase-decrease pattern of a number of NMR signals between 0.55 and 1.10 ppm. Resonances in this region have sometimes been associated with bile acids
The general increase of lactose is a well-known phenomenon and is linked to lactation and the prolactin levels in blood. Appreciable lactose concentrations in urine are usually first observed at the end of the second trimester, between gestational week 20 and 28, followed by a steady increase over the remainder of the pregnancy and another sharp rise in the days after delivery.
Besides lactose and the presumed hormones, many other concentration variables showed statistically significant developments between the visits. However, since the absolute creatinine concentration, which was used for normalization, varied by 20–30% between visits, it is not clear which of the smaller changes are specific to pregnancy and which are due to dilution effects. Nonetheless, it is common clinical practice to relate analyte concentrations in individual urine samples to creatinine, and doing so facilitates the comparison of our findings with other reports.
The metabolomics approach has been previously applied in pregnancy research and has successfully identified biomarker candidates
In a broader perspective, type 1 and 2 diabetes have been studied rather extensively.
Most pertinent to our own work, perhaps, are the studies by Zhao et al.
It has also been reported previously that ethnic and geographic background can have a large effect on the urinary excretion profile as measured by NMR.
NMR-based metabolomics has demonstrated only limited usefulness in the study of a condition with, in the majority of cases, only mild metabolic changes, and several factors aggravate the situation: Even though NMR profiling has a high reproducibility even across laboratories, is non-selective, non-targeted and yet quantitative, its lower sensitivity compared to mass spectrometry means that only a subset of the metabolome can be surveyed. Furthermore, disease-associated concentration changes –or patterns thereof– must be larger than unrelated intra-individual and intra-group variations in order to be recognized by univariate or multivariate statistical methods. The type of study influences the amount of such variations: Case-control or animal studies, for example, typically aim for a high degree of homogeneity which facilitates biomarker detection but may limit their generalizability. Our material from a cohort study, on the other hand, gives a more realistic representation of the population at large but consequently contains more biological variation. The fact that urine also exhibits stronger variations in ionic strength than other matrices, leading to noise in the form of peak shift variations, does not simplify matters. We addressed this complication by primarily working with a matrix of concentrations instead of the raw spectra. Note, however, that all multivariate analyses also were carried out on the spectra. They never outperformed the matrix (data not shown). Apart from analytical difficulties, two opposing effects present challenges to the detection of disease states from urine: While diurnal and dietary variation may increase individual variation that is unrelated to the hypothesis being tested, the body’s homeostasis and renal regulation may mask the impact of the disease on the excretion profile.
Note, however, that these considerations do not categorically invalidate urinary NMR metabolomics. The robustness and ease of use of NMR, and in particular the non-invasive nature of urine sampling, make this approach attractive. And while, by its very definition, the search for biomarkers cannot guarantee positive results, the successful classification of the three visits clearly demonstrates that this is possible even in the face of the adverse influences listed above.
The immediate result of the present study is that urine-based NMR metabolomics can differentiate between time points during and after pregnancy and thus track its development, but that it could not identify reliable biomarkers for gestational diabetes mellitus (GDM) in a large, multiethnic population: The pattern of urinary metabolites, at least above micromolar concentrations, is not influenced strongly and consistently enough by the condition. Nonetheless, an increase of excreted citrate correlated with the severity of GDM was observed, consistent with earlier findings.
The authors would like to express their gratitude to Eberhard Humpfer from Bruker BioSpin GmbH, Rheinstetten, Germany, for his invaluable assistance in setting up the NMR profiling methods and protocols.