Fig 1.
Relationship between GlycA and its constituent glycoproteins.
A) Heatmap of the Pearson correlations between GlycA, AAT, AGP, HP, and TF in the 626 DILGOM07 participants with matched NMR metabolite measurements and glycoprotein assay data, after log transformation and standardisation. Rows and columns are ordered by decreasing correlation coefficient with GlycA. B) Heatmap of the log-transformed and standardised concentrations of GlycA and each glycoprotein. Columns correspond to DILGOM07 participants, hierarchically clustered (average linkage) by the Euclidean distance between their GlycA and glycoprotein measurements. Rows are ordered as in panel A.
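The preprocessing and ordering described for panel A can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and toy values are hypothetical, and the natural log is assumed (the choice of base does not affect the standardised values). The hierarchical clustering of panel B is not covered here.

```python
import math
import statistics


def log_standardise(values):
    """Log-transform, then scale to mean 0 and standard deviation 1."""
    logged = [math.log(v) for v in values]  # assumption: natural log
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)
    return [(x - mean) / sd for x in logged]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def order_by_correlation(data, reference="GlycA"):
    """Return variable names sorted by decreasing correlation with `reference`."""
    z = {name: log_standardise(vals) for name, vals in data.items()}
    r = {name: pearson(z[reference], z[name]) for name in z}
    return sorted(z, key=lambda name: r[name], reverse=True), r


# Toy concentrations (hypothetical, for illustration only).
toy = {
    "GlycA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "AAT": [1.1, 1.9, 3.2, 3.9, 5.1],
    "TF": [5.0, 3.0, 4.0, 2.0, 1.0],
}
order, r = order_by_correlation(toy)
```

Here `order` lists the reference variable first, followed by the remaining variables from most to least correlated with it.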
Fig 2.
Comparison of imputation models to glycoprotein immunoassays in the 626 DILGOM07 participants with matched glycoprotein assay and metabolite quantification by NMR metabolomics.
A) Comparison of the imputed glycoprotein levels (y-axes) to the immunoassayed glycoprotein levels (x-axes) after log transformation and standardisation. The r² value indicates the proportion of variance in the assayed glycoprotein explained by the respective imputation model. B) Boxplots of the Spearman correlations between the imputed and observed concentrations obtained in the 10-fold cross-validation procedure used for model training. Red triangles show the Spearman correlation between the predicted and observed concentrations in panel A (detailed in S2 Table).
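The two agreement measures named in this legend can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: r² is taken here to be the squared Pearson correlation between assayed and imputed values (the legend does not say how it was computed), and Spearman correlation is computed as the Pearson correlation of average ranks. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(observed, imputed):
    """Spearman rank correlation between observed and imputed values."""
    return pearson(average_ranks(observed), average_ranks(imputed))


def r_squared(observed, imputed):
    """Assumption: r² as the squared Pearson correlation."""
    return pearson(observed, imputed) ** 2
```

In a cross-validation setting, `spearman` would be evaluated once per held-out fold, which yields the set of values a boxplot such as panel B summarises.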
Fig 3.
Glycoprotein associated risks of disease and mortality.
Comparison of Cox proportional hazard ratios (triangles) for the first diagnosis occurrence (hospitalisation or mortality) conferred per standard deviation increase of AAT, HP, AGP, or GlycA in inverse-variance weighted fixed-effects meta-analysis of DILGOM07 and FINRISK97. Bars around each hazard ratio indicate the 95% confidence interval. Diagnosis data were analysed for a total of 351 outcomes with >20 events in both DILGOM07 and FINRISK97 over a matched 8-year follow-up period. Models were fitted using age as the time scale and adjusting for sex, smoking status, BMI, blood pressure, alcohol consumption, prevalent disease prior to baseline (Methods), and previously identified biomarkers for 5-year risk of all-cause mortality (citrate, albumin, and VLDL particle size). Only outcomes with a significant and replicable association with at least one of AAT, HP, or AGP are shown (Storey-Tibshirani FDR-adjusted P-value < 0.05/3, adjusting for the three glycoproteins, in DILGOM07, FINRISK97, and meta-analysis). Associations that were significant and replicable are shown with solid hazard ratios and 95% confidence intervals. The alphanumeric codes in square brackets indicate the ICD10 code or ICD10 disease group for each diagnosis. The numbers of events in DILGOM07 and FINRISK97 are shown to the left of each hazard ratio for each outcome. Different numbers of events for the same outcome between biomarkers arise from differences in the number of samples for which each glycoprotein was successfully imputed (Methods). Hazard ratios fitted separately in DILGOM07 and FINRISK97, along with a comparison to the hazard ratios calculated from the immunoassayed AAT, HP, and AGP measurements, can be found in S2 Fig. Hazard ratios for all tested outcomes are detailed in S3 Table.
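For readers unfamiliar with the pooling step, inverse-variance weighted fixed-effects meta-analysis of two cohort-level hazard ratios can be sketched as follows. This is a generic textbook formulation, not the authors' code: the function name and the example estimates are hypothetical, and the P-value assumes a two-sided normal (Wald) test on the pooled log hazard ratio.

```python
import math


def fixed_effects_meta(log_hrs, ses):
    """Pool per-cohort log hazard ratios with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]       # w_i = 1 / SE_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, log_hrs)) / total
    se = math.sqrt(1.0 / total)                   # SE of the pooled log HR
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))        # two-sided normal P-value
    return {
        "hr": math.exp(beta),
        "se": se,
        "ci_low": math.exp(beta - 1.96 * se),     # 95% confidence interval
        "ci_high": math.exp(beta + 1.96 * se),
        "p": p,
    }


# Hypothetical per-cohort estimates (not values from the study).
result = fixed_effects_meta(
    log_hrs=[math.log(1.2), math.log(1.5)],
    ses=[0.1, 0.1],
)
```

With equal standard errors the pooled log hazard ratio is the simple mean of the two cohort estimates; a cohort with a smaller standard error would otherwise receive proportionally more weight.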
Table 1.
Cohort characteristics.
Fig 4.
Comparison of biomarkers across all outcomes in meta-analysis of DILGOM07 and FINRISK97.
A) Quantile-quantile plots of the distribution of hazard ratio P-values (y-axis) compared to the distribution of P-values expected under the null hypothesis that the corresponding biomarker is not associated with any outcome (x-axis). Hazard ratio P-values are shown after adjustment for multiple testing using the Storey-Tibshirani FDR method. The dashed line indicates where P-values would fall if the observed distribution were identical to the null distribution. Points above the red dashed line indicate hazard ratios with FDR-adjusted P < 0.05 in the meta-analysis, while points above the blue dashed line indicate hazard ratios with FDR-adjusted P < 0.05/3 in the meta-analysis. B) Density plots comparing each biomarker’s distribution of hazard ratio standard errors across all outcomes. C) Density plots comparing each biomarker’s distribution of hazard ratios across all outcomes.
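The coordinates of a quantile-quantile plot like panel A can be computed as sketched below. This is an illustrative construction under assumptions the legend does not confirm: the expected P-values are taken as the uniform quantiles i/(n+1), both axes are placed on a -log10 scale, and all input P-values are strictly positive. The function name is hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs on the -log10 scale."""
    n = len(p_values)
    observed = sorted(p_values)
    # Assumption: under the null, the i-th smallest of n P-values is
    # expected at the uniform quantile i / (n + 1).
    expected = [(i + 1) / (n + 1) for i in range(n)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]


# P-values that match the null quantiles exactly lie on the line y = x.
points = qq_points([0.75, 0.25, 0.5])
```

Points that rise above the line of equality correspond to P-values smaller than expected under the null, which is the pattern the legend describes for associated biomarkers.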
Table 2.
Highlighted gene sets significantly enriched for genes associated with AAT.