Fig 1.
A U-Net constructed with an EfficientNet-B0-based encoder and a symmetrical decoder is trained separately to minimize each of the following losses: (i) BCE, (ii) Weighted BCE-Dice, (iii) Focal, (iv) Tversky, and (v) Focal Tversky. The trained models predict lung masks in the Montgomery TB CXR collection. The predictions of the top-3 performing models are bitwise-ANDed to produce the final lung mask.
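The bitwise-AND step in Fig 1 can be illustrated with a minimal NumPy sketch. It assumes each model outputs a per-pixel probability map and that the maps are binarized at a threshold of 0.5 before combining; the function name `and_ensemble` and the threshold value are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def and_ensemble(prob_maps, threshold=0.5):
    """Binarize each model's probability map and AND the masks pixel-wise.

    prob_maps: iterable of arrays with identical shape (H, W).
    threshold: assumed binarization cut-off (hypothetical value).
    """
    masks = [np.asarray(p) >= threshold for p in prob_maps]
    # A pixel is kept only if every model labels it as lung.
    return np.logical_and.reduce(masks).astype(np.uint8)

# Toy 2x2 probability maps standing in for the top-3 models' outputs.
a = np.array([[0.9, 0.2], [0.8, 0.6]])
b = np.array([[0.7, 0.9], [0.4, 0.6]])
c = np.array([[0.6, 0.1], [0.9, 0.5]])
final_mask = and_ensemble([a, b, c])
```

Because a pixel survives only when all three models agree, this kind of ensemble tends to favor precision over recall at the lung boundary.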
Fig 2.
The EfficientNet-B0-based encoder is truncated at the block-5c-add layer and appended with the classification layers to output multi-class prediction probabilities. GAP denotes the global average pooling layer and DCL denotes the deepest convolutional layer in the trained models. A separate classification model is trained to minimize each of the loss functions discussed in this study. The top-K (K = 3, 5) performing models are used to construct prediction-level and model-level ensembles.
Table 1.
Segmentation performance achieved by the individual models and the bitwise-ANDed ensemble of the top-3 performing models.
Fig 3.
Confusion matrix, AUROC, and AUPRC curves obtained using the model that is trained to minimize the calibrated CCE loss function.
Table 2.
Classification performance achieved by the classification models that are trained using the loss functions discussed in this study.
Table 3.
Performance metrics achieved by the prediction-level ensembles using the top-K (K = 3, 5) models.
Fig 4.
Confusion matrix, AUROC, and AUPRC curves obtained by the weighted averaging ensemble of the top-5 performing models.
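A weighted averaging ensemble of this kind can be sketched as follows. The sketch assumes each model produces softmax probabilities of shape (n_samples, n_classes) and that the weights are given; how the study selects its weights is not stated in these captions, so the values below are placeholders and the function name `weighted_average_ensemble` is hypothetical.

```python
import numpy as np

def weighted_average_ensemble(probs, weights):
    """Combine per-model class probabilities with a weighted average.

    probs:   array-like of shape (n_models, n_samples, n_classes).
    weights: array-like of shape (n_models,); normalized to sum to 1 here.
    Returns the averaged probabilities and the predicted class indices.
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Contract the model axis: (n_models,) x (n_models, N, C) -> (N, C).
    avg = np.tensordot(weights, probs, axes=1)
    return avg, avg.argmax(axis=1)

# Two toy models, one sample, three classes; weights are placeholders.
toy_probs = [[[0.6, 0.3, 0.1]],
             [[0.2, 0.5, 0.3]]]
avg, pred = weighted_average_ensemble(toy_probs, weights=[3, 1])
```

Setting all weights equal reduces this to simple averaging of the model predictions.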
Fig 5.
Confusion matrix, AUROC, and AUPRC curves obtained through the weighted averaging ensemble of the predictions of the top-3 and top-5 model-level ensembles.
Table 4.
Classification performance achieved by model-level ensembles.
Table 5.
Comparison of the proposed approach with the SOTA literature.
Fig 6.
Grad-CAM-based localization of the disease ROIs.
(a) and (h) denote instances of CXR with expert annotations showing bacterial and viral pneumonia manifestations, respectively. The sub-parts (b), (c), (d), (e), (f), and (g) show Grad-CAM-based ROI localization achieved using the models trained with calibrated CCE, CCE with entropy-based regularization, calibrated negative entropy, label-smoothed categorical focal, calibrated categorical Hinge loss functions, and the top-5 model-level ensemble, respectively, highlighting regions of bacterial pneumonia manifestations. The sub-parts (i), (j), (k), (l), (m), and (n) show the localization achieved using the models in the same order as above, highlighting viral pneumonia manifestations.
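For readers unfamiliar with Grad-CAM, the heatmap computation can be sketched in a framework-agnostic way. The sketch assumes the feature maps of the deepest convolutional layer and the gradients of the target class score with respect to those maps have already been extracted from the model, both with shape (H, W, K); the function name `grad_cam` and the toy inputs are illustrative only.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap.

    activations: (H, W, K) feature maps from the deepest conv layer.
    gradients:   (H, W, K) gradients of the class score w.r.t. those maps.
    """
    # Channel weights: global average of the gradients over spatial dims.
    weights = gradients.mean(axis=(0, 1))                 # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    peak = cam.max()
    # Scale to [0, 1] unless the map is all zeros.
    return cam / peak if peak > 0 else cam

# Toy inputs: two 2x2 channels with opposite-signed gradients.
acts = np.stack([np.array([[1.0, 2.0], [3.0, 4.0]]), np.ones((2, 2))], axis=-1)
grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))], axis=-1)
heatmap = grad_cam(acts, grads)
```

In practice the low-resolution heatmap is upsampled to the input image size and overlaid on the CXR to highlight the regions that contributed most to the predicted class.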