Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants

doi:10.1371/journal.pone.0213653

Fig 1.

An illustrative schematic for AutoPrognosis.

In this depiction, AutoPrognosis constructs an ensemble of three ML pipelines. Pipeline 1 uses the MissForest algorithm to impute missing data, and then compresses the data into a lower-dimensional space using the principal component analysis (PCA) algorithm, before using the random forest algorithm to issue predictions. Pipelines 2 and 3 use different algorithms for imputation, feature processing, classification and calibration. AutoPrognosis uses the algorithm in [19] to make decisions on what pipelines to select and how to tune the pipelines’ parameters.
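The pipeline structure described in this caption can be sketched with scikit-learn. This is only an illustration under stated assumptions, not the AutoPrognosis implementation: `IterativeImputer` with a random forest regressor stands in for MissForest, the contents of pipelines 2 and 3 are invented for the example, the ensemble weights are fixed by hand rather than chosen by the algorithm in [19], and the calibration stage is omitted. The data are synthetic.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (needed to import IterativeImputer)
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.impute import IterativeImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 200 participants, 6 variables, ~10% missing entries.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan

# Pipeline 1: MissForest-style imputation -> PCA -> random forest.
pipeline_1 = make_pipeline(
    IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=10, random_state=0),
        max_iter=3,
        random_state=0,
    ),
    PCA(n_components=3),
    RandomForestClassifier(n_estimators=50, random_state=0),
)

# Pipelines 2 and 3: hypothetical alternatives chosen only for illustration.
pipeline_2 = make_pipeline(
    SimpleImputer(strategy="mean"), StandardScaler(), LogisticRegression()
)
pipeline_3 = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(n_estimators=50, random_state=0),
)

pipelines = [pipeline_1, pipeline_2, pipeline_3]
weights = np.array([0.5, 0.3, 0.2])  # assumed weights, not learned

# Fit each pipeline, then combine predicted risks as a weighted average.
for p in pipelines:
    p.fit(X, y)
risk = sum(w * p.predict_proba(X)[:, 1] for w, p in zip(weights, pipelines))
```

Here `risk` holds one ensemble probability per participant. In the system the caption describes, both the choice of pipelines and their hyperparameters are decided automatically rather than fixed in advance as they are in this sketch.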

Table 1.

List of algorithms included in AutoPrognosis.

Table 2.

Performance of all prediction models under consideration.

Table 3.

Variable ranking by their contribution to the predictions of AutoPrognosis.

Table 4.

Performance of AutoPrognosis in the diabetic patient subgroup.

Table 5.

Variable ranking for the diabetic population.

Fig 2.

Predictive ability of the UK Biobank variables for men and women.

Each point represents a variable in the UK Biobank ordered by the ability to predict CVD events for men and women. Predictions based solely on age achieved an AUC-ROC of 0.632 ± 0.003 for men and 0.665 ± 0.002 for women. We report the AUC-ROC from models trained with individual variables in addition to age, and only display variables that achieved a statistically significant improvement in AUC-ROC compared to predictions based on age only. Each color represents a different variable category. Variables deviating from the (dotted gray) regression line have an AUC-ROC that differs between men and women more than expected in view of the overall association between the two genders, suggesting a stronger relative importance in one gender group.
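The comparison behind each point, an age-only model against a model using age plus one further variable, can be illustrated with a small standard-library sketch. Everything here is an assumption made for the example: the data are synthetic, the logistic regression is fitted by plain gradient descent, and `auc_roc`, `fit_logistic` and `score` are hypothetical helper names. The study's own models and significance testing are not reproduced, and the figure repeats this comparison separately for men and women.

```python
import math
import random

def auc_roc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_logistic(rows, labels, lr=0.1, epochs=200):
    """Batch gradient descent for logistic regression; returns [bias, w1, ...]."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def score(w, rows):
    """Linear score; monotone in predicted risk, so it yields the same AUC-ROC."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) for x in rows]

# Synthetic cohort: standardized age plus one extra standardized variable.
random.seed(0)
data, events = [], []
for _ in range(400):
    age, var = random.gauss(0, 1), random.gauss(0, 1)
    p = 1.0 / (1.0 + math.exp(-(0.6 * age + 1.0 * var - 1.0)))
    data.append((age, var))
    events.append(1 if random.random() < p else 0)

train_x, test_x = data[:300], data[300:]
train_y, test_y = events[:300], events[300:]

# Model A uses age only; model B uses age plus the extra variable.
w_age = fit_logistic([(a,) for a, _ in train_x], train_y)
w_both = fit_logistic(train_x, train_y)

auc_age = auc_roc(test_y, score(w_age, [(a,) for a, _ in test_x]))
auc_both = auc_roc(test_y, score(w_both, test_x))
print(f"age only: {auc_age:.3f}   age + variable: {auc_both:.3f}")
```

Plotting the second AUC-ROC for men against the same quantity for women, one point per variable, would give a scatter of the kind the caption describes.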
