Early Antenatal Prediction of Gestational Diabetes in Obese Women: Development of Prediction Tools for Targeted Intervention

All obese women are categorised as being of equally high risk of gestational diabetes (GDM) whereas the majority do not develop the disorder. Lifestyle and pharmacological interventions in unselected obese pregnant women have been unsuccessful in preventing GDM. Our aim was to develop a prediction tool for early identification of obese women at high risk of GDM to facilitate targeted interventions in those most likely to benefit. Clinical and anthropometric data and non-fasting blood samples were obtained at 15+0–18+6 weeks’ gestation in 1303 obese pregnant women from UPBEAT, a randomised controlled trial of a behavioural intervention. Twenty one candidate biomarkers associated with insulin resistance, and a targeted nuclear magnetic resonance (NMR) metabolome were measured. Prediction models were constructed using stepwise logistic regression. Twenty six percent of women (n = 337) developed GDM (International Association of Diabetes and Pregnancy Study Groups criteria). A model based on clinical and anthropometric variables (age, previous GDM, family history of type 2 diabetes, systolic blood pressure, sum of skinfold thicknesses, waist:height and neck:thigh ratios) provided an area under the curve of 0.71 (95%CI 0.68–0.74). This increased to 0.77 (95%CI 0.73–0.80) with addition of candidate biomarkers (random glucose, haemoglobin A1c (HbA1c), fructosamine, adiponectin, sex hormone binding globulin, triglycerides), but was not improved by addition of NMR metabolites (0.77; 95%CI 0.74–0.81). Clinically translatable models for GDM prediction including readily measurable variables e.g. mid-arm circumference, age, systolic blood pressure, HbA1c and adiponectin are described. Using a ≥35% risk threshold, all models identified a group of high risk obese women of whom approximately 50% (positive predictive value) later developed GDM, with a negative predictive value of 80%. Tools for early pregnancy identification of obese women at risk of GDM are described which could enable targeted interventions for GDM prevention in women who will benefit the most.


Introduction
Recent estimates suggest that 7 million women were obese in the UK in 2014, and that by 2025, 1 in 5 women in the world will be similarly affected. [1] Obesity is a major risk factor for gestational diabetes (GDM), increasing the likelihood of the disorder 3-5 fold. [2] Women with GDM require intensive antenatal care to achieve optimal blood glucose control and to identify other common obstetric complications, particularly fetal macrosomia and large for gestational age (LGA) infants. [3] The recent demonstration in a nulliparous prospective cohort of more than 4000 women that diagnosis of GDM is preceded by excessive fetal growth occurring between 20-28 weeks' gestation, and that this is compounded by maternal obesity, provides a clear rationale for early pregnancy risk identification and intervention to prevent GDM and associated fetal growth. [4] The identification of insulin resistance in the absence of overt diabetes in early pregnancy in obese women provides further reason for targeting treatment to obese women early in gestation [5]. This recognition has led to several recent randomised controlled trials (RCTs) of early interventions in unselected obese women to prevent GDM, including dietary and physical activity advice and pharmacological (metformin) approaches, but the majority have been unsuccessful. [6][7][8][9] At present all obese pregnant women are considered to be equally at high risk of developing GDM, whereas approximately only 15-30% (depending on criteria for diagnosis) will develop the disorder. [2] Prediction tools as a means to stratify disease risk are increasingly used in medical [10] and obstetric practice [11] with a focus on precision prevention and treatment for at risk sub-groups. Correctly identifying obese women with heightened risk of GDM early in pregnancy would enable targeted intervention in women most likely to benefit.
There is no accepted strategy to identify obese women at high risk of GDM early in pregnancy. Current clinical risk assessment, such as that recommended by UK National Institute for Health and Care Excellence guidelines, to determine which women should have an oral glucose tolerance test (OGTT) later in pregnancy includes obesity as a risk factor (!30kg/m 2 ). This screening criterion is clearly not applicable for assessment of risk amongst obese women. Previously reported early pregnancy prediction tools for GDM, as yet not adopted in clinical practice, have been constructed in populations unselected for body mass index (BMI). [12][13][14][15][16][17][18] With the inclusion of weight or BMI in all tools, performance amongst obese women is likely to be limited.
We have previously established proof of principle for a prediction algorithm combining clinical variables and biomarkers from 106 obese pregnant women who were recruited to the pilot study of UPBEAT, an RCT of a behavioural (diet and physical activity) intervention. [19] The aim of the present study was to develop a simple, robust and easily accessible GDM prediction tool designed specifically for obese women using the entire UPBEAT cohort, with the intention of facilitating early intervention in those women at the highest risk of the disorder. To achieve this aim we measured 21 biomarkers of biological relevance to GDM and a targeted metabolome of 158 metabolites in early pregnancy samples from 1303 women who were obese. Using statistical modelling to combine clinical variables and the best performing biomarkers we developed several prediction tools with potential for early pregnancy stratification for GDM risk.

Study design
This was a prospective cohort study using clinical data and samples from the UPBEAT trial (ISRCTN 89971375), a multi-centre RCT of a complex dietary and physical activity intervention designed primarily to prevent GDM in obese women, and LGA in their offspring. [8] All participants, including women aged 16 and 17 years (assessed as competent applying Fraser guidelines), provided informed written consent prior to taking part. This process together with all other aspects of the study was approved by the NHS Research Ethics Committee (UK Integrated Research Application System; reference 09/H0802/5). In brief, the UPBEAT cohort comprised 1555 women recruited between 2009 and 2014. Women >16 years of age with a BMI of !30kg/m 2 and a singleton pregnancy were randomised between 15 +0 and 18 +6 weeks' gestation (trial entry) to either standard antenatal care or a physical activity and dietary behavioural intervention superimposed on standard antenatal care. [8,20] For the purposes of this analysis the trial was treated as a cohort study as the primary outcomes (GDM and LGA infants) did not differ between control and intervention groups. [8] Participants Participants were women recruited to the UPBEAT trial with available OGTT data. The trial protocol stated that an OGTT would be performed between 27 +0 and 28 +6 weeks'. We adopted a clinically pragmatic approach and included all OGTTs in a wider time frame (23+0-32 +6 weeks'; mean 27 +5 ). Two individuals were excluded; one because of a positive early OGTT (13 +5 weeks'), and the second because of an uninterpretable OGTT result.

Procedures
At trial entry (mean 17 +0 weeks') clinical data including socio-demographic and clinical characteristics, medical and family history, and information about the index pregnancy were recorded and non-fasting blood samples taken. Blood (whole blood, plasma and serum) was kept on ice, processed within 2 hours and stored at -80˚C. The diagnosis of GDM was according to IADPSG (International Association of Diabetes and Pregnancy Study Groups) criteria, with one or more positive plasma glucose values; fasting !5.1 mmol/l, 1 hour !10.0 mmol/l, 2 hour !8.5 mmol/l, following a 75g oral glucose load. [21] Three sets of analyses contributed to development of the prediction tools; Model 1-clinical and demographic variables (clinical tool), Model 2-the clinical tool with addition of candidate biomarkers, and Model 3-addition of a targeted nuclear magnetic resonance (NMR) metabolome to Model 2. The rationale was to develop the 'simplest' accurate tool. The selection of clinical variables was based on a-priori knowledge of plausible association with GDM including age, ethnicity, socioeconomic status (Index of Multiple Deprivation [8]), parity, BMI, previous GDM, family history (first degree relative with hypertension, ischaemic heart disease, GDM or type 2 diabetes mellitus), polycystic ovarian syndrome (self-reported) and smoking (at trial entry). Maternal anthropometric data and blood pressure (BP) measurements were undertaken by staff trained in these measurements. Maternal skinfold thicknesses (triceps, biceps, suprailiac and subscapular) were measured in triplicate, using Harpenden skinfold Calipers (Holtain Ltd, Felin-y-Gigfran, Crosswell, UK). [22] The mean of the three measurements was used for analyses. Maternal circumferences (waist, hip, thigh, neck, mid-arm and wrist) were measured using a calibrated plastic tape; waist-midway between iliac crest and inferior margin of lowest rib; hip-maximum diameter over buttocks; thigh-maximum diameter; neckmidway between mid-cervical spine and mid-anterior neck; mid-arm-diameter midway between elbow and edge of the acromion with arm held straight; wrist-narrowest point around wrist inferior to radial promontory. BP was recorded using the pregnancy validated Microlife BP3BT0-A blood pressure monitor (Microlife, Widnau, Switzerland).
The candidate biomarkers included 21 analytes with a-priori associations with insulin resistance, GDM or type 2 diabetes mellitus [23]: adipokines (adiponectin and leptin); inflammatory and endothelial markers (interleukin-6, high sensitivity C-reactive protein and tissue plasminogen activator antigen); lipids (triglycerides, total cholesterol, LDL cholesterol and HDL cholesterol); liver associated markers (aspartate aminotransferase, alanine aminotransferase, gamma-glutamyl transferase, sex hormone binding globulin (SHBG), and ferritin); markers of glucose homeostasis (glucose, insulin, haemoglobin A1c (HbA1c), Cpeptide, and fructosamine) and other (vitamin D and human placental lactogen). Analytical methodologies are reported in S1 Table. One hundred and fifty eight metabolites were also measured in serum using an NMR targeted metabolome platform (Brainshake Ltd, http:// brainshake.fi/) including 138 lipid measures (lipoprotein particle subclasses, particle size, cholesterols, fatty acids, apolipoproteins, glycerides and phospholipids), and 20 low-molecular weight metabolites including branched chain and aromatic amino acids, glycolysis metabolites, and ketone bodies. A full list of metabolites is presented in S2 Table. This highthroughput NMR metabolomics approach has been widely used in epidemiological studies, [24][25][26][27] and experimental details (sample preparation and analysis) have been previously described. [27] All blood samples were processed by laboratory technicians blinded to participant data.

Statistical analysis
Statistical analysis was performed using Stata software, version 14.0 (StataCorp LP, College Station, Texas). Distributions of all potential predictors were checked for normality. As women were recruited between 15 +0 -18 +6 weeks' gestation, appropriate clinical factors and all biochemical variables were checked for variation and transformed into gestational age corrected centiles where required (xriml command, Stata [28]). Summary statistics between those who developed GDM and those who did not were compared using either Student's t test or Mann Whitney tests for continuous data as appropriate and chi-squared tests for categorical data. Candidate biomarkers with a non-parametric distribution were log transformed (base2). Following these transformations regression model assumptions (linear associations) were checked. HDL cholesterol showed a non-linear association and was transformed into a categorical variable using a clinically meaningful threshold. [29] Three prediction models were developed. Univariate logistic regression was performed on all factors and a pre-defined p value threshold of 0.1 was used to identify predictors for testing in the multivariate models. Clinical variables below the p value threshold were utilised to construct Model 1. Next, the candidate biomarkers identified in univariate regression were incorporated with the selected clinical variables of Model 1, creating Model 2. Finally, selected clinical and candidate biomarker variables from Model 2 were 'offered' alongside all identified NMR metabolites to generate Model 3. Forward stepwise logistic regression was used for the development of these models.
Predictive accuracy of the three models was assessed (and compared between models) using the Area Under the Receiver Operator Characteristics Curve (AUC). Model calibration was assessed by discrimination of actual versus predicted GDM risk at differing levels of predicted risk. The test performances of the models were assessed using sensitivity, specificity and positive and negative predictive values at different risk thresholds.

Translational prediction models for clinical use
In addition to Models 1-3, we explored a range of clinically translatable models (Models 4-8). The variables included were selected from Models 1-3 or correlated measures. Selection was on the basis of established laboratory assays and ease of measurement in the clinic. No stepwise procedures were undertaken for these models.

Missing data
Missing data for clinical variables was minimal (<1.5%) except for family history of GDM (4%). Candidate biomarkers were available in 73% (n = 953) and NMR metabolites in 69% (n = 895). Most missing blood biomarker data was because participants did not provide a blood sample. All models were constructed using complete data based on each group of factors (clinical, candidate biomarker and metabolome). The sample for Model 1 (clinical model) and Model 3 (including candidate biomarker and metabolome) were 1267 and 770 respectively. All other models used a single data set with complete data for the main clinical and candidate biomarkers (Model 2, 4-8, n = 805). We explored the possibility of bias due to missing data by comparing associations of clinical predictors with GDM in the sample with maximal data (Model 1, clinical factors) to the same associations assessed in the complete case samples (Model 2 and Model 3, biomarkers and metabolome respectively).

Validation of the prediction model
Two methods of ten-fold cross validation were used for internal validation of the different models. [30,31]

Sensitivity analyses
As women with a previous history of GDM are frequently considered as a high risk sub-group necessitating specific management, [3] multivariable and discrimination analyses were repeated for Models 1-3 following removal of women with previous GDM (n = 25).

Results
Of the 1555 participants in the UPBEAT trial, 1303 were included in this study (median BMI 35 kg/m 2 ). Of these, 337 (25.9%) developed GDM (Fig 1). The diagnosis of GDM in the majority of women was based on elevated fasting glucose (72%). A further 24% and 4% were because of raised 1-hour and 2-hour post-load glucose respectively (Fig 2). Women with GDM were older than women who did not develop GDM, and more likely to have had GDM in a previous pregnancy or a first-degree relative with type 2 diabetes mellitus. BMI and BP (systolic and diastolic) were higher in those with GDM, as were skinfold thicknesses and neck, waist, hip, wrist and mid-arm circumferences (Table 1). In univariate analysis, most of the candidate biomarkers and many of the NMR metabolites were associated with GDM (Table 2; S3 Table). GDM related NMR metabolites included lipoprotein particle subclasses, some fatty acids, amino acids and ketone bodies.
Models 1 (clinical factors), Model 2 (plus candidate biomarkers) and Model 3 (Model 2 plus metabolome) are shown in Table 3. A clinical tool including previous GDM, age, systolic BP, sum of maternal skinfold thicknesses and anthropometric ratios (waist:height and neck: thigh) showed good discrimination (AUC 0.71, 95% CI 0.68-0.74). This improved with addition of candidate biomarkers to 0.77 (95% CI 0.73-0.80) (p <0.001 vs Model 1). Candidate biomarkers contributing to this model were HbA1c, glucose, fructosamine, triglycerides, adiponectin and SHBG. The contribution of some clinical factors selected in Model 1 was attenuated by addition of these biomarkers. The addition of the NMR metabolites (Model 3) did not improve upon the performance of Model 2 (AUC 0.77, 95% CI 0.74-0.81; p = 0.22 vs Model 2). All three models were well calibrated (S4 Table).
Models that would easily translate to the clinical setting were also explored. These focused on readily attainable clinical variables, and biomarkers with established and inexpensive assays. Models showed good levels of discrimination (AUC >0.70) ( Table 3). Models also showed a high level of internal validity (Table 3).
Sensitivity, specificity and positive and negative predictive values were estimated at different risk thresholds for all models. Different thresholds were explored to balance sensitivity and specificity. Setting the estimated risk for GDM at !35% as identifying the high-risk sub-group, approximately 50% of this group progressed to GDM, with 80% of those not developing GDM correctly identified as not at risk (Table 4).
When compared to the analyses in the maximal number (n = 1267), the selected variables, their magnitude of associations and the AUC were similar in the sub-groups that were included in Model 2 (n = 805) (S5 Table). In the sub-group used in Model 3 (n = 770) the magnitudes of associations of anthropometric predictors with GDM were stronger and an additional predictor (waist:thigh) appeared in the clinical model for this sub-sample (S5 Table). In both these sub-samples previous history of GDM did not appear as a predictor despite its strong magnitude of association in the larger sample of 1267 women. This is likely related to the smaller numbers of women with previous history of GDM in models 2 and 3 (n = 14, and n = 13 respectively). Sensitivity analysis removing women with a previous history of GDM from Models 1-3 identified similar clinical predictors, candidate biomarkers and NMR metabolites. These models included an additional anthropometric measure (waist:thigh), strengthened the association of previously identified anthropometric measures, and identified further candidate biomarkers and metabolites from the metabolome (S6 Table). The AUC were similar to those found in the primary analysis in Models 1-3.

Discussion
Current clinical guidelines for GDM risk assessment do not differentiate between obese women with differing metabolic risk. In this, the largest and most comprehensive study to date to investigate early pregnancy risk factors for later onset of GDM in obese women, we have used an extensive range of clinical and biomarker variables to develop prediction tools to identify obese women at high risk of GDM. The model with best discrimination combined four clinical characteristics with six candidate biomarkers. Models that focused on a few clinical factors and biomarkers readily available in clinical practice, and with minimal cost, also performed well. In addition, we identified a model that does not require blood sampling, which could be developed for low and middle income countries where the prevalence of GDM and obesity is rapidly increasing. [32] Clinical use of tests such as these which provide risk assessment at the first antenatal visit would enable prompt intervention (behavioural or pharmacological) in at risk women. Women identified as being at low risk of developing GDM should be managed according to clinical guidelines for all pregnant obese women, including dietary advice, and be alerted to potential risks during pregnancy and beyond. The decision to focus on obese pregnant women was predicated by the increasing prevalence of obesity and GDM, the lack of predictive algorithms specific to obese women and because mechanistic pathways leading to GDM may differ in obese compared with normal weight women. Obesity related GDM is initially associated with insulin resistance [33] whereas in lean women an inadequate insulin secretory response to the physiological state of insulin resistance in pregnancy is considered predominant. [34] The recognition that the metabolic defects of maternal obesity are potentially modifiable has stimulated several well conducted RCTs that have tested behavioural (dietary and/or physical activity changes) and a pharmacological intervention (metformin) in early pregnancy to prevent GDM and associated adverse pregnancy outcomes. [6][7][8][9] Most have been ineffective, which we hypothesise is because treatment is more likely to benefit only the sub-group with the highest metabolic risk. Having demonstrated the ability to better identify obese women at risk of GDM there is now the potential through new RCTs to address this hypothesis.   As guidelines frequently recommend specific care pathways for women with a previous history of GDM we performed sensitivity analysis excluding these women. Since GDM discrimination remained similar for later GDM across the three comprehensive prediction models (Models 1-3), there was no rationale to support exclusion of previous GDM.
BMI which is recognised, especially in pregnancy, to be a poor index of fat mass, was superseded in the statistical models by other anthropometric measures, three of which were independent predictors of GDM. To our knowledge these simple measures, whilst recognised in a few earlier reports, [35][36][37] have been largely ignored in assessment of GDM risk. In contrast to clinical guidelines for GDM in the whole population, we found no evidence of a strong role of ethnicity in this mixed ethnic population.
To our knowledge this study is the first to address a wide range of biomarkers in prediction of GDM in obese women. As anticipated, most candidate biomarkers were individually associated with the disorder. The identification of adiponectin in Models 2 and 3 concurs with a   recent review which reported that this adipokine is a predictor of GDM in women of mixed BMI. [38] Of the NMR metabolites univariably associated with GDM, 56 were lipoprotein particle subclasses, with the majority being very large density lipoprotein particle subtypes. Branched chain amino acids, previously associated with insulin resistance and type 2 diabetes mellitus, [39,40] were also associated. As only 2 of these metabolites were selected for inclusion in the combined model, and did not improve the test performance, this metabolome is unlikely to provide valuable biomarkers for clinical GDM prediction. Current practice for evaluating GDM risk in pregnant women does not include risk stratification amongst obese women, treating all those with a BMI of !30kg/m 2 as high risk. With escalating rates of obesity worldwide, these models will become increasingly unuseful. The clinical risk factors in the simple tools are quick to measure with minimal training, and the biomarkers, HbA1c and adiponectin are readily accessible for routine clinical laboratory measurement. An added advantage is that the samples were non-fasted; there are practical and ethical issues in asking women in early pregnancy to attend a clinic appointment in the fasted state, and this is not current practice in the UK. We acknowledge that a fasting sample may have provided additional predictive potential, however the original study protocol was designed pragmatically to allow simple clinical translatability.
Following a recent report of strong associations between abnormal OGTTs, particularly raised fasting glucose, in obese women in early pregnancy and measures of insulin resistance [5], an alternative to the use of an early prediction tool as described here might be to bring forward the OGTT or simply measure fasting glucose. Other than the practicalities of fasting, an early OGTT has yet to be validated in regard to maternal and neonatal outcomes or GDM, as traditionally diagnosed. Moreover, several studies in BMI heterogeneous populations have suggested that an early abnormal glucose test is not adequately sensitive or specific to replace a later OGTT for the diagnosis of GDM. [41,42] All such approaches require further validation and randomised controlled trials to determine the efficacy of either lifestyle or pharmacological interventions in early pregnancy targeted to those identified at risk.

Study strengths and limitations
Strengths include novelty, large sample size and mixed ethnicity of the study population, as well as identification of a range of models with immediate clinical applicability. One limitation in terms of current practice could be the use of IADPSG criteria to define GDM, which although recommended by the World Health Organization [43] is not universally adopted. [3] However, when tested using the current UK diagnostic criteria [3], performance of the prediction tools was comparable. Up to 40% of participants had missing data on one or more of the blood based biomarkers or clinical characteristics and sensitivity analysis did demonstrate some differences in smaller sub-samples, but performance was broadly similar. Because of the uniqueness of the data collection, external validation of the full model was not feasible but validity was strongly supported by internal validation, using two complementary methods. [30,31] We recognise that the study population may not be representative of a 'normal' obstetric population as they were participants in an RCT, an important reason for pursuing external validation of the predictive models. Future evalution studies could also include validation at earlier gestations. Previous GDM did not contribute to Models 2 and 3, but as only 25 women in the cohort had GDM previously, of whom 11 did not develop the disorder in the index pregnancy, a previous history of GDM should be considered for inclusion in future validation studies. Whilst ethnicity was not selected as a predictor in our mixed ethnic population, repetition in specific ethnic groups would be valuable.
In summary, we have demonstrated a method to more accurately identify obese women at high risk of GDM than currently practised. To date, no early pregnancy intervention in obese women has successfully reduced the risk of GDM. The use of the tools described has the potential to enable targeted intervention for those at highest risk and therefore likely to benefit most.
Supporting Information S1