Fig 1.
Sample images from the Kaggle [41] dataset: (a) Negative for pneumonia (b) Typical (c) Indeterminate (d) Atypical.
Table 1.
Distribution of image classes.
Fig 2.
Each EfficientNetn (where n = 1, 2, 3, …, 10) has been trained on a different subset of the training data. The variants of YOLOv5an (where n = 1, 2) have been trained in the same manner.
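One way to give each ensemble member a different training subset is a k-fold style split, where model n is trained on every fold except fold n. The caption does not say how the subsets were built, so the sketch below is only an illustration under that assumption; the helper name `kfold_subsets` and the sample count are hypothetical.

```python
def kfold_subsets(num_samples, k):
    """Return k (train_indices, held_out_indices) pairs.

    Assumption: subsets are formed by leaving out one fold per model.
    The source only states that each model sees a different subset.
    """
    indices = list(range(num_samples))
    # Interleaved folds: fold i holds indices i, i + k, i + 2k, ...
    folds = [indices[i::k] for i in range(k)]
    subsets = []
    for held_out in range(k):
        train = sorted(
            idx
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for idx in fold
        )
        subsets.append((train, folds[held_out]))
    return subsets


# Ten subsets, one per ensemble member (20 samples is a toy size).
subsets = kfold_subsets(num_samples=20, k=10)
```

Each pair in `subsets` would feed one model: the first list for training, the second for validation.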
Fig 3.
The original image (a) and the results of the selected image pre-processing techniques: (b) Unsharp Masking and Histogram Equalisation (c) CLAHE (d) Pseudo-Coloring.
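Of the techniques named in this caption, global histogram equalisation is the simplest to show in a few lines. The sketch below uses the textbook cumulative-histogram mapping on an 8-bit greyscale image stored as a list of rows. It is not the authors' pipeline: unsharp masking, CLAHE and pseudo-colouring are omitted, and the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Globally equalise an 8-bit greyscale image given as a list of rows."""
    flat = [pixel for row in image for pixel in row]

    # Histogram of intensity values.
    hist = [0] * levels
    for pixel in flat:
        hist[pixel] += 1

    # Cumulative distribution of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:
        # Constant image: nothing to stretch, return a copy.
        return [row[:] for row in image]

    # Map each intensity so the occupied range spans 0 .. levels - 1.
    lut = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[pixel] for pixel in row] for row in image]


# A low-contrast toy image is stretched to the full intensity range.
toy = [[50, 50], [100, 150]]
equalized = equalize_histogram(toy)
```

CLAHE applies the same idea per tile with a clip limit on the histogram, which is why it tends to preserve local detail better than this global version.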
Fig 4.
Image augmentation, from top to bottom: original, horizontal flip, vertical flip, saturation, contrast, rotation, shear, zoom and shift.
Fig 5.
Regions of interest for opacity localisation.
Fig 6.
Proposed neural network model.
Table 2.
Hyperparameter values used in proposed architecture.
Table 3.
Model performance metrics.
Table 4.
Combined confusion matrix.
Table 5.
Mean average precision.
Fig 7.
Results of opacity localisation: (a) Ground Truth (b) WBF Top 3 Bounding Boxes (c) WBF Top 5 Bounding Boxes (d) WBF Top 10 Bounding Boxes (e) WBF All Bounding Boxes with lowered threshold.
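The panels above compare different numbers of boxes kept after weighted boxes fusion (WBF). As a rough illustration of the idea, the sketch below clusters overlapping boxes by IoU, replaces each cluster with a confidence-weighted average box, and keeps the top-k results. It is a simplified version written for this note: the published WBF algorithm also rescales confidence by the number of contributing models, which is left out here, and the threshold value and function names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse(cluster):
    """Confidence-weighted average box and mean score of one cluster."""
    total = sum(score for _, score in cluster)
    coords = [sum(box[i] * score for box, score in cluster) / total
              for i in range(4)]
    return coords, total / len(cluster)


def weighted_boxes_fusion(boxes, scores, iou_thr=0.55, top_k=None):
    """Simplified WBF: returns a list of (box, score), best first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    clusters, fused = [], []
    for i in order:
        box, score = boxes[i], scores[i]
        for c, (fused_box, _) in enumerate(fused):
            if iou(box, fused_box) > iou_thr:
                # Join the first sufficiently overlapping cluster.
                clusters[c].append((box, score))
                fused[c] = fuse(clusters[c])
                break
        else:
            # No match: start a new cluster with this box.
            clusters.append([(box, score)])
            fused.append(fuse(clusters[-1]))
    fused.sort(key=lambda item: item[1], reverse=True)
    return fused[:top_k] if top_k else fused


# Two overlapping predictions and one separate prediction.
boxes = [[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]]
scores = [0.9, 0.7, 0.6]
all_fused = weighted_boxes_fusion(boxes, scores)
top_one = weighted_boxes_fusion(boxes, scores, top_k=1)
```

Raising `top_k` or lowering the score threshold upstream admits more low-confidence boxes.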
Table 6.
Mean average precision for the training dataset.
Table 7.
Comparison with top scoring methodologies on [41].
Table 8.
Comparison of proposed methodology with existing techniques on RSNA dataset.
Fig 8.
Gradient-weighted Class Activation Mapping (Grad-CAM), from left to right: negative for pneumonia, typical appearance, indeterminate and atypical appearance. The heat maps of the activations show that the area of high activation shrinks in diseased images compared with healthy images.