Modeling health risks using neural network ensembles

doi:10.1371/journal.pone.0308922

Fig 1.

An example of training a classification network.

In the left panel, each dot represents the weight and waist circumference measurements of one NHANES participant; thousands of participants are shown. The dot colors (blue/orange) indicate each participant's ground-truth class (negative/positive for a given health condition, e.g., hypertension). In the right panel, a neural network classifies previously unseen data points as positive or negative according to a separating hypersurface in a space with as many dimensions as there are input biomarkers.

Fig 2.

Visual overview of the proposed neural network ensemble.

Fig 3.

A network’s categorical output can be converted into a risk map.

The Softmax network output (A) can be interpreted as a surface defined over the multi-dimensional space of input biomarkers (C). We can then slice that surface into regions based on percentiles of the Softmax output. In each of these “risk regions” we calculate the condition prevalence as the number of positive participants in the region divided by the total number of participants in the region (D). The multi-dimensional shape of the risk regions is defined by the network output, and the actual risk associated with each region is the condition prevalence in that region.
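
The percentile slicing and prevalence counting described in this caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `risk_region_prevalence`, the default percentile cut points (chosen to mirror the 10/20/40/20/10% bands named in the Fig 4 caption), and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def risk_region_prevalence(scores, labels, percentile_edges=(10, 30, 70, 90)):
    """Bin participants by percentile of model score and return per-bin prevalence.

    scores: 1-D array of Softmax outputs for the positive class (hypothetical input).
    labels: 1-D array of ground-truth labels (1/True = positive, 0/False = negative).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Score values at the requested percentiles define the region boundaries.
    cuts = np.percentile(scores, percentile_edges)
    # Region 0 holds the lowest scores; region len(cuts) holds the highest.
    region = np.digitize(scores, cuts)
    n_regions = len(cuts) + 1
    totals = np.bincount(region, minlength=n_regions)
    positives = np.bincount(region, weights=labels, minlength=n_regions)
    # Prevalence = positives / total in each region; empty regions stay NaN.
    prevalence = np.full(n_regions, np.nan)
    occupied = totals > 0
    prevalence[occupied] = positives[occupied] / totals[occupied]
    return region, prevalence

# Synthetic stand-in data (not NHANES): a higher score means a higher chance of a positive label.
rng = np.random.default_rng(0)
scores = rng.uniform(0.0, 1.0, size=5000)
labels = rng.uniform(0.0, 1.0, size=5000) < scores
region, prevalence = risk_region_prevalence(scores, labels)
for i, p in enumerate(prevalence):
    print(f"region {i}: prevalence = {p:.3f}")
```

The region boundaries come only from the model output, while the risk assigned to each region comes only from observed labels, which matches the separation the caption describes.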

Fig 4.

Visualization of 3D health risk maps via 2D slices for {percent body fat, waist circumference, thigh circumference} and {percent body fat, waist circumference, weight} input sets.

The dark maroon regions correspond to the top 10% of Softmax model outputs; light maroon is the next 20%, white is the middle 40%, light green is the next 20%, and dark green is the lowest 10%.

Table 1.

Performance of different neural network architectures vs. number of input features using AUROC (test) as the evaluation metric.

Table 2.

Comparison of health risk prediction performance using BMI-only input (baseline) and multiple input features for diabetes, hypertension, and any-condition models.

Table 3.

Softmax output thresholding to establish binary positive and negative outputs.

Fig 5.

BMI vs. the distribution of Softmax outputs conditioned on BMI for two different models: BMI-only input (A), and all 8 inputs (B). Both models predict the union (any) of nine common health conditions. (C) highlights the large stratification of Softmax outputs at a BMI of 30 for the 8-input model.

Fig 6.

Any-condition prevalence is highly correlated with age.

Table 4.

Comparison of health risk prediction performance using BMI-only input (baseline) and multiple input features for diabetes, hypertension, and condition-agnostic (“Any”) models for different age groups.

Fig 7.

Visualization of improved ensemble generalization compared to single models.

The separating surface is similar between single networks and ensembles, but ensembles yield smoother, monotonic risk regions as we move immediately outside the region of high data density.

Fig 8.

Average performance asymptotically increases (orange) and variability in performance asymptotically decreases (blue) as neural network models are added to the ensemble, as measured across 64 trials for each ensemble size.
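
A toy simulation can show why both trends are expected when member outputs are combined. The sketch below is not the authors' experiment. It assumes the ensemble averages its members' outputs, and it models each member as the true label plus independent Gaussian noise, which is a strong simplification of real networks. The names `auroc` and `simulate`, the noise level, and the population size are hypothetical; only the 64 trials per ensemble size come from the caption.

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based (Mann-Whitney) AUROC; assumes there are no tied scores."""
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def simulate(ensemble_sizes=(1, 2, 4, 8, 16), n_trials=64, n_people=2000, seed=0):
    """Return {ensemble size: (mean AUROC, std of AUROC)} over n_trials toy ensembles."""
    rng = np.random.default_rng(seed)
    labels = rng.uniform(size=n_people) < 0.3  # synthetic labels, about 30% positive
    results = {}
    for k in ensemble_sizes:
        aucs = []
        for _ in range(n_trials):
            # Assumption: each toy member outputs the true label plus independent noise.
            members = labels[None, :] + rng.normal(0.0, 2.0, size=(k, n_people))
            # Assumption: the ensemble output is the mean of its members' outputs.
            ensemble_score = members.mean(axis=0)
            aucs.append(auroc(ensemble_score, labels))
        results[k] = (float(np.mean(aucs)), float(np.std(aucs)))
    return results

results = simulate()
for k, (mean_auc, std_auc) in results.items():
    print(f"ensemble size {k:2d}: mean AUROC = {mean_auc:.3f}, std = {std_auc:.4f}")
```

Under these assumptions, averaging k members shrinks the noise standard deviation by a factor of sqrt(k), so the mean AUROC rises with diminishing returns while the trial-to-trial spread narrows.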
