Performance of deep learning to detect mastoiditis using multiple conventional radiographs of mastoid

doi:10.1371/journal.pone.0241796

Fig 1.

Typical images of each labeling category.

(a, b) The AP view (a) and lateral view (b) show bilateral, clear mastoid air cells (red circles) with a honeycomb pattern, consistent with category 0. (c, d) The right ear (red circles) on the AP view (c) and lateral view (d) shows slightly increased haziness in the mastoid air cells, suggesting category 1; the left ear (white circles) on both views shows bony defects with air cavities, suggesting category 3. (e, f) The AP view (e) and lateral view (f) show bilateral, total haziness and sclerosis of the mastoid air cells (red circles), suggesting category 2.

Fig 2.

Location of center points for right and left cropping in AP view.

The yellow dots mark the center points used to crop the right and left ears.
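
Cropping around a center point can be sketched as follows. This is a minimal illustration, not the authors' preprocessing: the function name `crop_around_center`, the patch size, the image size, and the center coordinates are all hypothetical, and zero-padding at the borders is an assumption.

```python
import numpy as np

def crop_around_center(image, center_xy, size):
    """Return a size x size patch centered on center_xy (x, y).

    The image is zero-padded first so that crops near the border
    still have the requested size (an assumption of this sketch).
    """
    half = size // 2
    padded = np.pad(image, half, mode="constant", constant_values=0)
    cx, cy = center_xy
    # After padding, original pixel (row cy, col cx) sits at
    # (cy + half, cx + half), so the patch starts at (cy, cx).
    return padded[cy:cy + size, cx:cx + size]

# Hypothetical AP radiograph and hypothetical center points (x, y).
ap_view = np.arange(100 * 120, dtype=np.float32).reshape(100, 120)
right_center, left_center = (30, 50), (90, 50)

right_patch = crop_around_center(ap_view, right_center, 32)
left_patch = crop_around_center(ap_view, left_center, 32)
```

Each patch keeps the chosen center point at its middle pixel, so the two ears can be passed to a network as separate, equally sized inputs.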

Fig 3.

Network architectures for predicting mastoiditis.

The CNNs (convolutional neural networks) for a single view (a) are trained separately on the AP and lateral views. The CNN for multiple views (b) is trained on the AP and lateral views simultaneously. The layers after Log-Sum-Exp pooling are annotated with their dimensions: [1] denotes a 1×1 vector and [2] denotes a 1×2 vector.
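
As a rough illustration of the pooling step, the sketch below uses one common formulation of Log-Sum-Exp pooling, (1/r) · log(mean(exp(r · x))), which moves from average pooling toward max pooling as r grows. The caption does not state which variant or which value of r was used, so `lse_pool`, `r=10.0`, the random feature maps, and the concatenation of the two per-view scores into a 1×2 vector are assumptions made for illustration only.

```python
import numpy as np

def lse_pool(feature_map, r=10.0):
    """Log-Sum-Exp pooling of a 2-D map to one scalar (r is hypothetical)."""
    x = np.asarray(feature_map, dtype=np.float64).ravel()
    m = x.max()
    # Subtract the maximum before exponentiating for numerical stability.
    return m + np.log(np.mean(np.exp(r * (x - m)))) / r

# Hypothetical final feature maps for the two views.
rng = np.random.default_rng(0)
ap_map = rng.normal(size=(8, 8))
lateral_map = rng.normal(size=(8, 8))

ap_score = np.array([[lse_pool(ap_map)]])            # shape (1, 1), i.e. [1]
lateral_score = np.array([[lse_pool(lateral_map)]])  # shape (1, 1), i.e. [1]

# Assumed combination: place the two view scores side by side.
both_views = np.concatenate([ap_score, lateral_score], axis=1)  # shape (1, 2), i.e. [2]
```

The pooled value always lies between the mean and the maximum of the map, which is why this operator is often chosen when a small abnormal region should influence the score without a single pixel dominating it.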

Fig 4.

The receiver operating characteristic (ROC) curves for the validation set and the three test sets.

The areas under the ROC curves (AUCs) obtained with multiple views are statistically significantly higher than those obtained with a single view (AP view only or lateral view only) in the validation set and in all test sets.
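
For readers who want to see what an AUC measures, the sketch below computes it as the probability that a randomly chosen abnormal case receives a higher score than a randomly chosen normal case, counting ties as one half. The function name `roc_auc` and the example labels and scores are hypothetical. The sketch does not test whether two AUCs differ significantly; the comparison test behind that claim is not given in this caption.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the fraction of (abnormal, normal) pairs ranked correctly."""
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=np.float64)
    pos = scores[labels]    # scores of abnormal cases
    neg = scores[~labels]   # scores of normal cases
    diff = pos[:, None] - neg[None, :]
    # A correctly ordered pair counts 1, a tie counts 0.5.
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

# Hypothetical labels (0 = normal, 1 = abnormal) and model scores.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)  # 3 of 4 pairs ordered correctly
```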

Table 1.

Baseline characteristics of all data sets.

Table 2.

Comparison of diagnostic performance between the algorithm using a single view and the algorithm using multiple views in each data set, based on labels from conventional radiography.

Fig 5.

Confusion matrices between predicted labels and gold-standard labels based on temporal bone CT.

Predicted labels are either normal (label 0) or abnormal (label 1) because the deep learning algorithm was trained on dichotomized data (i.e., normal or abnormal).
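
A 2×2 confusion matrix for such dichotomized labels can be tallied as in the sketch below, with sensitivity and specificity read off from it. The label vectors are invented for illustration and are not data from the study; `confusion_matrix_2x2` is a hypothetical helper name.

```python
import numpy as np

def confusion_matrix_2x2(true_labels, predicted_labels):
    """Rows are true labels, columns are predicted labels (0 = normal, 1 = abnormal)."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

# Hypothetical reference labels and predictions.
true_labels = [0, 0, 1, 1, 1, 0, 1, 0]
predicted_labels = [0, 1, 1, 1, 0, 0, 1, 0]

cm = confusion_matrix_2x2(true_labels, predicted_labels)
tn, fp, fn, tp = cm.ravel()
sensitivity = tp / (tp + fn)  # abnormal cases correctly flagged
specificity = tn / (tn + fp)  # normal cases correctly cleared
```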

Table 3.

Comparison of diagnostic performance on the gold-standard test set between the deep learning algorithm (using multiple views) and radiologists, based on labels from the reference standard (temporal bone CT).

Fig 6.

Confusion matrices between the predicted labels of a deep learning algorithm and labels based on conventional radiography.

In all test sets, the proportion of incorrectly diagnosed cases was larger in the group labeled mild (category 1) than in the group labeled severe (category 2).

Table 4.

Diagnostic performance of the deep learning algorithm in all test sets, based on labels from conventional radiography.

Fig 7.

Class activation mappings of true positive (a), true negative (b), false positive (c), false negative (d), and postoperative state (e) examples.

(a) Lesion-related regions in the mastoid air cells are detected on both the AP view and the lateral view. (b) No specific region is detected on either view. (c) A false lesion-related region is detected on the lateral view. (d) Equivocal haziness is suspected on both views, and the algorithm diagnosed this case as normal. An AP view showing both sides (upper right) shows marked asymmetry, suggesting an abnormality on the right side. (e) Both views show lesion-related regions.
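
The caption does not say how these maps were produced. As a rough sketch of the general idea, the classic class activation map is a weighted sum of the last convolutional feature maps, using the classifier weights for the class of interest, rescaled to [0, 1] for display. The function name, the tiny feature maps, and the weights below are hypothetical, and the authors' network may derive its maps differently.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Weighted sum over channels of features (K, H, W), rescaled to [0, 1]."""
    cam = np.tensordot(class_weights, features, axes=1)  # shape (H, W)
    span = cam.max() - cam.min()
    if span == 0:
        return np.zeros_like(cam)  # flat map: nothing to highlight
    return (cam - cam.min()) / span

# Hypothetical feature maps: 3 channels on a 4 x 4 grid.
features = np.zeros((3, 4, 4))
features[0, 1, 2] = 5.0   # channel 0 responds strongly at row 1, col 2
features[2, 3, 0] = 2.0   # channel 2 responds at row 3, col 0
class_weights = np.array([1.0, 0.0, -1.0])  # hypothetical "abnormal" weights

cam = class_activation_map(features, class_weights)
hottest = np.unravel_index(np.argmax(cam), cam.shape)  # most activated location
```

In practice the low-resolution map is upsampled to the radiograph size and overlaid as a heat map, which is what makes the highlighted regions in panels (a) through (e) comparable with the visible anatomy.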
