Fig 1.
Landmarks of the lateral semicircular canals.
(A) Left anterior LSCC landmark. (B) Left posterior LSCC landmark. (C) Right anterior LSCC landmark. (D) Right posterior LSCC landmark.
Fig 2.
Inference part of the landmark localization models.
(A) Fully connected inference. (B) DSNT inference.
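As an illustration of panel (B), the following is a minimal NumPy sketch of a differentiable spatial to numerical transform (DSNT) applied to a 3D heatmap. It assumes a softmax normalization and the usual DSNT coordinate grid in (-1, 1); the function name `dsnt_3d` and the axis ordering are illustrative assumptions, not details taken from the models in this work.

```python
import numpy as np

def dsnt_3d(logits):
    """Return the expected coordinate of a 3D heatmap, one value per axis, in (-1, 1)."""
    # Softmax over all voxels turns raw scores into a probability map.
    flat = logits.reshape(-1).astype(float)
    probs = np.exp(flat - flat.max())
    probs /= probs.sum()
    probs = probs.reshape(logits.shape)

    coords = []
    for axis, size in enumerate(logits.shape):
        # Normalized voxel-centre grid for 0-based index i: (2i - (n - 1)) / n.
        grid = (2.0 * np.arange(size) - (size - 1)) / size
        # Marginal probability along this axis (sum over the other axes).
        other = tuple(a for a in range(logits.ndim) if a != axis)
        marginal = probs.sum(axis=other)
        # Expected coordinate = probability-weighted mean of the grid.
        coords.append(float((marginal * grid).sum()))
    return np.array(coords)
```

A normalized coordinate c along an axis of length n maps back to a fractional voxel index via (c * n + n - 1) / 2, which is why this style of inference can yield sub-voxel predictions.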
Fig 3.
Downsampling network (Down-Net).
Fig 4.
Convolutional only network (ConvOnly-Net).
Fig 5.
U network (U-Net).
Fig 6.
Spatial configuration network (SCN).
Fig 7.
Cascaded pyramid network (CPN).
RB: residual block.
Fig 8.
Four-stage landmark detection framework.
In the first stage, the full images are resized and the approximate center of the landmark locations is determined. In the second stage, this center is used as the center of a window of arbitrary size for crop size optimization. The third stage uses the optimal crop window (obtained from stage 2) to train deep learning models. In the fourth and final stage, the best models selected from the previous stage are fine-tuned for robustness validation.
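The cropping step shared by stages 2 and 3 can be sketched as follows. This is a hedged example only: the function name `crop_around`, the integer voxel center, and the choice to zero-pad where the window extends beyond the image are assumptions made for illustration and are not specified in the caption.

```python
import numpy as np

def crop_around(volume, center, size):
    """Crop a window of shape `size` centred on `center` (voxel indices).

    Parts of the window that fall outside the volume are filled with zeros
    (an assumed border policy).
    """
    center = np.asarray(center, dtype=int)
    size = np.asarray(size, dtype=int)
    start = center - size // 2
    stop = start + size

    out = np.zeros(tuple(size), dtype=volume.dtype)

    # Part of the window that actually overlaps the volume.
    src_lo = np.maximum(start, 0)
    src_hi = np.minimum(stop, volume.shape)
    if np.any(src_hi <= src_lo):
        return out  # window lies entirely outside the volume

    # Where that overlap lands inside the output window.
    dst_lo = src_lo - start
    dst_hi = dst_lo + (src_hi - src_lo)

    src = tuple(slice(int(a), int(b)) for a, b in zip(src_lo, src_hi))
    dst = tuple(slice(int(a), int(b)) for a, b in zip(dst_lo, dst_hi))
    out[dst] = volume[src]
    return out
```

For example, `crop_around(image, approx_center, (64, 64, 64))` would extract a cubic window around the stage 1 estimate, where `image`, `approx_center`, and the window size are placeholders.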
Fig 9.
Training and validation results for the approximate landmark detection stage.
Fig 10.
An example showing six outer layers removed from the image in the row+ direction.
Fig 11.
Changes in prediction error with iterative layer removal.
Image layers were iteratively replaced with zero values in each of six directions: row+, row-, column+, column-, slice+, and slice-. The vertical bars represent the number of layers removed when the prediction error increased by 10% and 50% relative to the original prediction error.
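The layer removal procedure can be sketched as below. Several details are assumptions for illustration: the axis order (row, column, slice), the convention that "+" refers to the high-index end of an axis, and the hypothetical callable `predict_error`, which stands in for running a trained model on the modified image and measuring its localization error.

```python
import numpy as np

AXES = {"row": 0, "column": 1, "slice": 2}  # assumed axis order

def zero_outer_layers(volume, direction, n_layers):
    """Return a copy of `volume` with `n_layers` outer layers set to zero.

    `direction` is one of "row+", "row-", "column+", "column-", "slice+", "slice-".
    """
    out = volume.copy()
    if n_layers <= 0:
        return out
    axis = AXES[direction[:-1]]
    index = [slice(None)] * volume.ndim
    if direction.endswith("+"):
        # Assumed: "+" removes layers from the high-index end.
        index[axis] = slice(max(volume.shape[axis] - n_layers, 0), None)
    else:
        index[axis] = slice(0, n_layers)
    out[tuple(index)] = 0
    return out

def layers_until_error_rise(volume, direction, predict_error, rise=0.10):
    """Smallest number of removed layers at which the error grows by `rise`.

    Returns None if the threshold is never reached.
    """
    base = predict_error(volume)
    axis = AXES[direction[:-1]]
    for n in range(1, volume.shape[axis] + 1):
        err = predict_error(zero_outer_layers(volume, direction, n))
        if err >= base * (1.0 + rise):
            return n
    return None
```

Calling `layers_until_error_rise` with `rise=0.10` and `rise=0.50` for each of the six directions would give the two layer counts per direction that the vertical bars describe.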
Fig 12.
Crop size optimization.
Table 1.
Landmark localization results of the networks with fully connected inference. Best results are shown in bold.
Table 2.
Landmark localization results of the networks with DSNT inference. Best results are shown in bold.
Table 3.
Landmark localization results of the optimized and cross-validated models.
Fig 13.
Bland–Altman plots for Down-Net with FC inference on the additional dataset.
Differences (in millimeters) between the model’s predictions and the human expert’s annotations are shown across all three axes. The bias (mean difference) is shown as a blue dashed line, and the limits of agreement (mean ± 1.96 × SD) are shown as red and green lines.
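The bias and limits of agreement described in this caption follow the standard Bland–Altman definitions and can be computed as in the sketch below. The function name `bland_altman_limits` and the array layout (one row per case, one column per axis) are illustrative assumptions; the sample standard deviation (ddof=1) is also an assumption, since the caption does not state which estimator was used.

```python
import numpy as np

def bland_altman_limits(pred_mm, ref_mm):
    """Return (bias, lower limit, upper limit) of the differences pred - ref.

    Inputs are arrays of coordinates in millimeters, shaped (cases,) or
    (cases, axes); statistics are computed over cases, separately per axis.
    """
    diff = np.asarray(pred_mm, dtype=float) - np.asarray(ref_mm, dtype=float)
    bias = diff.mean(axis=0)          # mean difference
    sd = diff.std(axis=0, ddof=1)     # sample standard deviation (assumed)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```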
Fig 14.
Bland–Altman plots for U-Net with DSNT inference on the additional dataset.
Differences (in millimeters) between the model’s predictions and the ground truth are shown across all three axes. The bias (mean difference) is shown as a blue dashed line, and the limits of agreement (mean ± 1.96 × SD) are shown as red and green lines.
Table 4.
Landmark localization results for the additional dataset. The performance indices shown are: median, minimum, maximum, first quartile (Q1), third quartile (Q3) and interquartile range (IQR). P-values < 0.05 provide evidence, at the 5% significance level, that the median error lies below 0.5 mm.
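The descriptive indices listed in this caption can be computed from a vector of per-case errors as sketched below. The function name `summarize_errors` is hypothetical, and the quartiles use NumPy's default linear interpolation, which may differ from the method used for the table. The hypothesis test behind the p-values is not named in the caption, so it is left out of this sketch.

```python
import numpy as np

def summarize_errors(errors_mm):
    """Return median, min, max, Q1, Q3 and IQR of localization errors (mm)."""
    e = np.asarray(errors_mm, dtype=float)
    q1, q3 = np.percentile(e, [25, 75])  # default linear interpolation
    return {
        "median": float(np.median(e)),
        "min": float(e.min()),
        "max": float(e.max()),
        "Q1": float(q1),
        "Q3": float(q3),
        "IQR": float(q3 - q1),
    }
```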
Fig 15.
Landmark localization results.
Blue and red points represent expert annotations and model predictions, respectively. The axial, coronal, and sagittal views are shown from left to right. (A) Anterior landmark with an error distance of 0.19 mm. (B) Posterior landmark with an error distance of 0.40 mm.
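An error distance of this kind is typically the Euclidean distance between the predicted and annotated positions in physical units. The sketch below assumes both positions are given as voxel indices and converted with a per-axis voxel spacing; the function name `error_distance_mm` and the spacing argument are illustrative assumptions rather than details from this work.

```python
import numpy as np

def error_distance_mm(pred_voxel, ref_voxel, spacing_mm):
    """Euclidean distance in millimeters between two voxel positions.

    `spacing_mm` is the physical voxel size along each axis.
    """
    delta = (np.asarray(pred_voxel, dtype=float)
             - np.asarray(ref_voxel, dtype=float)) * np.asarray(spacing_mm, dtype=float)
    return float(np.linalg.norm(delta))
```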