A multi-parameterized artificial neural network for lung cancer risk prediction

doi:10.1371/journal.pone.0205264

Table 1.

The demographics of the NHIS dataset that was used in our ANN.

We show means and standard deviations for the continuous variables, means for the binary variables, and the percentage for each race.

Fig 1.

A sketch of our ANN.

Each line is a weight connecting one layer to the next, and each circle is an input, a neuron, or an output. The bias terms are analogous to intercepts, and they improve the model’s performance.
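
The structure described in this caption can be sketched as a small forward pass. This is only an illustration, not the network from the paper: the layer sizes, the sigmoid activation, and every weight and bias value below are assumptions chosen to keep the example short.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumption; the caption does not name one)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One layer: weights[j][i] is the line from input i to neuron j,
    and biases[j] plays the role of neuron j's intercept."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through each (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 1 output.
toy_layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, -0.1]),
    ([[1.0, -1.0]], [0.2]),
]
toy_risk = forward([1.0, 0.0, 0.5], toy_layers)[0]
```

With all weights set to zero, the output depends only on the bias, which is the sense in which a bias term acts like an intercept.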

Table 2.

A description of the inputs used in our ANN.

Fig 2.

The sensitivity and specificity for the training and validation datasets as functions of the cutoff values.
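
As a rough illustration of how such curves are computed, the sketch below evaluates sensitivity and specificity over a grid of cutoff values. The scores and labels are made-up toy data, and the convention that a score at or above the cutoff counts as a positive prediction is an assumption.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff are called positive.

    labels are 1 for cancer and 0 for no cancer; both classes must be present.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model outputs and true labels, for illustration only.
demo_scores = [0.05, 0.10, 0.20, 0.40, 0.60, 0.70, 0.85, 0.95]
demo_labels = [0, 0, 0, 1, 0, 1, 1, 1]

# One (cutoff, sensitivity, specificity) row per cutoff from 0.0 to 1.0.
cutoff_curve = [
    (c / 10, *sensitivity_specificity(demo_scores, demo_labels, c / 10))
    for c in range(11)
]
```

Raising the cutoff trades sensitivity for specificity, which is why the two curves move in opposite directions.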

Fig 3.

An ROC plot for our ANN’s training and validation datasets.
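
For readers unfamiliar with ROC plots, here is a minimal sketch of how the curve and the area under it can be obtained from model scores. The data are hypothetical, and the trapezoidal rule is one common way to compute the area, not necessarily the one used in the paper.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points, sweeping the
    cutoff from the highest score down. Tied scores are handled together."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores and labels (1 = cancer, 0 = no cancer).
roc_scores = [0.05, 0.10, 0.20, 0.40, 0.60, 0.70, 0.85, 0.95]
roc_labels = [0, 0, 0, 1, 0, 1, 1, 1]
roc_auc = auc(roc_points(roc_scores, roc_labels))
```

An area of 1.0 corresponds to perfect separation of the two classes, and 0.5 to a model that ranks cases no better than chance.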

Fig 4.

An ROC plot for our ANN’s training and validation datasets, together with the performance of the Random Forest and Support Vector Machine models.

Fig 5.

Cumulative distribution functions for high risk (solid lines) and low risk (dashed lines), shown for the population without cancer (orange) and the population with cancer (blue) in the validation dataset.

Allowing for a 1% misclassification rate (black line), we can divide individual cancer risk into three categories: high (red), medium (yellow), and low (green; this band is too narrow to be visible at the left edge of the figure).
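
One way to read this three-way split is sketched below. It is an interpretation, not the paper's stated procedure: the assumption is that the low cutoff is the largest score at or below which no more than the tolerated fraction of people with cancer fall, and the high cutoff is the smallest score above which no more than the tolerated fraction of people without cancer fall. The function names and the toy scores are hypothetical.

```python
def fraction_at_or_below(values, x):
    """Empirical cumulative distribution function evaluated at x."""
    return sum(1 for v in values if v <= x) / len(values)


def fraction_above(values, x):
    """Complement of the empirical CDF at x."""
    return sum(1 for v in values if v > x) / len(values)


def risk_cutoffs(cancer_scores, healthy_scores, tolerance=0.01):
    """Return (low, high) cutoffs under the assumed reading of the caption."""
    candidates = sorted(set(cancer_scores) | set(healthy_scores))
    low, high = 0.0, 1.0
    for c in candidates:  # ascending: keep the largest cutoff that qualifies
        if fraction_at_or_below(cancer_scores, c) <= tolerance:
            low = c
    for c in reversed(candidates):  # descending: keep the smallest that qualifies
        if fraction_above(healthy_scores, c) <= tolerance:
            high = c
    return low, high


def categorize(score, low, high):
    """Assign a score to the low, medium, or high risk band."""
    if score <= low:
        return "low"
    if score > high:
        return "high"
    return "medium"


# Toy scores; a 25% tolerance is used only because the sample is tiny.
toy_cancer = [0.2, 0.4, 0.6, 0.8]
toy_healthy = [0.1, 0.3, 0.5, 0.7]
toy_low, toy_high = risk_cutoffs(toy_cancer, toy_healthy, tolerance=0.25)
```

With a realistic sample size, the default `tolerance=0.01` corresponds to the 1% misclassification rate mentioned in the caption.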

Table 3.

Risk stratification of the NHIS 2016 data by our ANN.

Table 4.

The various screening methods, with their sensitivities and specificities.
