Machine learning models can predict subsequent publication of North American Spine Society (NASS) annual general meeting abstracts

doi:10.1371/journal.pone.0289931

Table 1.

Variables used as inputs (i.e., features) to the machine learning models.

Table 2.

Demographic breakdown of presented and published abstracts across the NASS AGM 2013–2015 for the entire dataset.

Fig 1.

Network plot representing the correlation of features in the training set.

Colour represents the direction of the correlation according to the scale on the right-hand side. Line thickness and proximity of features represent the strength of correlation. Asterisks (*) denote categorical features.

Fig 2.

Receiver operating characteristic (ROC) curves for the models during training and testing.

A larger area under the ROC curve (AUC) indicates better model performance.
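
As a rough illustration of what the area under the ROC curve measures, the sketch below computes it as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the example labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC as the probability that a randomly chosen positive
    case is scored higher than a randomly chosen negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = published, 0 = not published.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.2, 0.1]
print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.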

Fig 3.

Confusion matrices of various algorithms applied to the testing data.

Each matrix should be read like a 2-by-2 epidemiologic table, with true positives and true negatives in the top-left and bottom-right corners and false positives and false negatives in the top-right and bottom-left corners, respectively.
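
The layout described in this caption can be sketched as follows, assuming rows hold the predicted class and columns the actual class, which is the arrangement that places true positives top left and false positives top right. The function name `confusion_matrix_2x2` and the example data are hypothetical.

```python
def confusion_matrix_2x2(actual, predicted):
    """Return [[TP, FP], [FN, TN]] for binary labels (1 = positive).

    Rows are the predicted class and columns the actual class, so the
    diagonal holds the correct predictions.
    """
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1
        elif p == 0 and a == 1:
            fn += 1
        else:
            tn += 1
    return [[tp, fp], [fn, tn]]


# Hypothetical example: 1 = published, 0 = not published.
actual = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
for row in confusion_matrix_2x2(actual, predicted):
    print(row)
```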

Table 3.

The mean of the resampled accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) during model training, cross-validation, and testing.
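
For readers less familiar with these metrics, the sketch below shows how all of them except AUC follow from the four cells of a 2-by-2 confusion matrix. The function name `classification_metrics` and the counts used are hypothetical and are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from 2-by-2 counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # all correct predictions
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The sketch assumes every denominator is non-zero; a model that never predicts the positive class, for example, would leave PPV undefined.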

Fig 4.

Bar plot representing the top ten most important features used by the random forest model.

Importance is expressed as a percentage (%) of the importance of the most important feature.
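
The normalization described here can be sketched as below: each raw importance score is divided by the largest score and multiplied by 100, and the ten highest values are kept. The function name `top_relative_importance`, the feature names, and the raw scores are hypothetical; how the raw scores were produced by the random forest is not specified in this caption.

```python
def top_relative_importance(importances, k=10):
    """Scale raw importances so the largest equals 100% and return the
    top-k (feature, percentage) pairs in descending order."""
    largest = max(importances.values())
    if largest <= 0:
        raise ValueError("the largest importance must be positive")
    scaled = {name: 100.0 * value / largest
              for name, value in importances.items()}
    ranked = sorted(scaled.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical raw importances, for illustration only.
raw = {"feature_a": 0.50, "feature_b": 0.25, "feature_c": 0.10}
for name, pct in top_relative_importance(raw):
    print(f"{name}: {pct:.1f}%")
```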
