Fig 1.
Overview of the process of collecting and building the medical image dataset.
The process consists of five steps: data collection from PACS and HIS, PA-view filtering, XML parsing, data matching, and data annotation.
Fig 2.
The description in a typical radiology report in Vietnam.
The description is divided into four main categories: chest wall, pleura, lungs (parenchyma) and cardiac.
Fig 3.
Radiology report extraction process for CXR examinations collected from HIS [38].
The original Vietnamese counterparts are put inside square brackets.
Fig 4.
Algorithm for matching a DICOM file obtained from PACS with a radiology report collected from HIS.
Fig 5.
Semi-automated data annotation pipeline.
The system consists of four steps: the first three are automatic and the last is carried out manually.
Table 1.
Examples of Vietnamese keywords indicating abnormalities in the chest wall, pleura, parenchyma, and cardiac classes, and abnormalities outside these four groups.
English translations are enclosed in square brackets.
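A keyword lookup of the kind Table 1 suggests could be sketched as follows. This is a minimal illustration, not the authors' labeling tool: the keyword lists are hypothetical placeholders rather than the entries of Table 1, the class names are assumed identifiers, and negation ("no pleural effusion") is deliberately not handled.

```python
import unicodedata
from typing import Dict, List

# HYPOTHETICAL keyword lists for illustration; they are NOT the entries of Table 1.
KEYWORDS: Dict[str, List[str]] = {
    "chest_wall": ["gãy xương"],           # [bone fracture]
    "pleura": ["tràn dịch màng phổi"],     # [pleural effusion]
    "parenchyma": ["đám mờ"],              # [opacity]
    "cardiac": ["bóng tim to"],            # [enlarged cardiac shadow]
}


def normalize(text: str) -> str:
    """Lower-case and NFC-normalize so composed and decomposed
    Vietnamese diacritics compare equal."""
    return unicodedata.normalize("NFC", text).lower()


def label_description(text: str) -> Dict[str, int]:
    """Return 1 for every class with at least one keyword in the text, else 0."""
    cleaned = normalize(text)
    return {
        cls: int(any(normalize(kw) in cleaned for kw in kws))
        for cls, kws in KEYWORDS.items()
    }
```

Normalizing to NFC before matching matters for Vietnamese text, because the same accented character can be stored either as one code point or as a base letter plus combining marks.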
Table 2.
Number of instances containing each of the five labeled observations in the training set, the validation set, and the whole dataset.
Table 3.
Evaluation results of the proposed labeling tool.
Evaluation was performed on 3001 samples of the validation set.
Table 4.
Experimental results with different pre-training datasets and loss functions.
The model pre-trained on the CheXpert dataset with the Asymmetric loss function yields the best performance.
Table 5.
Experimental results with different backbones and input sizes.
The model with the EfficientNet-B2 architecture and an input size of 768 delivers the best performance.
Table 6.
Performance of EfficientNet-B2 on five classes.
Fig 6.
The pleura class achieved the highest AUC, 0.96 (95% CI 0.94, 0.97), whereas the chest wall class had the lowest, 0.81 (95% CI 0.75, 0.85).
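The caption does not state how the 95% confidence intervals were obtained. One common approach is a percentile bootstrap over the evaluation samples, sketched below under that assumption; the function names and the default of 1000 resamples are illustrative choices, not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

The pairwise AUC is quadratic in the number of samples, so for a validation set of a few thousand images a rank-based implementation would be preferable in practice.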
Table 7.
The mappings between CheXpert data labels (14 classes) and the proposed set of labels (5 classes).
P and N refer to positive and negative, respectively.
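Collapsing fine labels into coarse ones can be sketched as a dictionary lookup. The mapping below is a hypothetical, partial stand-in for Table 7 (only four of the 14 CheXpert labels are shown, and the name `other` for the fifth class is an assumption). The sketch also assumes a coarse class is positive when any fine label mapped to it is positive, and it ignores CheXpert's uncertain labels.

```python
from typing import Dict, List

# HYPOTHETICAL partial mapping for illustration; Table 7 defines the real one.
FINE_TO_COARSE: Dict[str, str] = {
    "Fracture": "chest_wall",
    "Pleural Effusion": "pleura",
    "Consolidation": "parenchyma",
    "Cardiomegaly": "cardiac",
}

# "other" is an assumed name for abnormalities outside the four groups.
COARSE_CLASSES: List[str] = ["chest_wall", "pleura", "parenchyma", "cardiac", "other"]


def to_coarse(fine: Dict[str, str]) -> Dict[str, str]:
    """Map fine labels ("P"/"N") to coarse classes; a coarse class is "P"
    if any fine label mapped to it is "P" (assumed aggregation rule)."""
    coarse = {cls: "N" for cls in COARSE_CLASSES}
    for name, value in fine.items():
        target = FINE_TO_COARSE.get(name)  # unmapped labels are ignored here
        if target is not None and value == "P":
            coarse[target] = "P"
    return coarse
```

For example, `to_coarse({"Fracture": "P", "Cardiomegaly": "N"})` marks only `chest_wall` as positive.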
Table 8.
Comparison of coarse and fine classification on CheXpert.
Fig 7.
Original images and respective Grad-CAMs.
The first two images show a nondisplaced collarbone fracture, while the last two show pleural effusion. Both pathologies were correctly highlighted.