High-throughput adaptive sampling for whole-slide histopathology image analysis (HASHI) via convolutional neural networks: Application to invasive breast cancer detection

doi:10.1371/journal.pone.0196828

Table 1.

Set of hand-crafted features used for comparison against the CNN-based feature learning approach.


Fig 1.

Overview of the HASHI method.

The high-throughput adaptive sampling for whole-slide histopathology images (HASHI) method uses CNNs for automated detection of invasive breast cancer (BCa) in whole-slide images (WSIs). Training data cohorts: Hospital of the Univ. of Pennsylvania (HUP) and Case Western Reserve Univ. (CWRU). Validation/Testing data cohorts: Cancer Institute of New Jersey (CINJ) and The Cancer Genome Atlas (TCGA).


Fig 2.

Illustration of the CNN architecture used to distinguish between invasive and non-invasive breast cancer (BCa) image tiles.

The architecture is a 2-layer CNN with 256 neurons in the first convolutional (C1) and subsampling/pooling (S2) layer and 256 neurons in the fully-connected (FC) layer (i.e. CS256-FC256). Among the architectures considered, this one was selected because it offers a good trade-off between classification performance and depth (it has fewer layers).
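
To make the CS256-FC256 layout concrete, the following minimal sketch traces tensor shapes and parameter counts through one convolutional layer, one pooling layer and one fully-connected layer. The caption does not give the patch size, kernel size, pooling size or number of output classes, so the defaults below are purely hypothetical placeholders, as is the function name `cs256_fc256_shapes`. Reading "256 neurons" in C1 as 256 feature maps is also an assumption.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


def cs256_fc256_shapes(patch=101, channels=3, kernel=8, pool=2,
                       units=256, classes=2):
    """Trace shapes through C1 -> S2 -> FC -> output.

    All default values are hypothetical; only the 256 units in C1/S2
    and in FC come from the caption.
    """
    conv = conv_output_size(patch, kernel)   # C1 spatial size
    pooled = conv // pool                    # S2 spatial size (non-overlapping pooling)
    flat = pooled * pooled * units           # inputs to the FC layer
    return {
        "C1": (conv, conv, units),
        "S2": (pooled, pooled, units),
        "FC": (units,),
        "output": (classes,),
        # weights plus one bias per unit
        "conv_params": kernel * kernel * channels * units + units,
        "fc_params": flat * units + units,
    }


if __name__ == "__main__":
    for name, value in cs256_fc256_shapes().items():
        print(name, value)
```

With these placeholder settings nearly all parameters sit in the fully-connected layer, which is typical of a single-convolution design.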


Table 2.

Breast cancer data cohorts used for training, validation and testing in the experimental evaluation.


Table 3.

Comparison between CNN models and state-of-the-art hand-crafted features trained with D₃ and evaluated on D₄ in terms of AUC.


Fig 3.

Comparison between sampling methods (regular and dense) with HASHI using gradient-based quasi-Monte Carlo sampling (grad-qmc-halton) [59, 61].

A previously unseen WSI (A) with its corresponding ground truth annotation from an expert pathologist (B). The probability maps obtained using regular sampling with a step size equal to the patch size (C) and regular dense sampling with a step size of 1 pixel (D). HASHI involves an iterative process of extracting patch samples (E-H) and obtaining the corresponding probability maps (I-L), shown here for the 1st (E, I), 2nd (F, J), 8th (G, K) and 20th (H, L) iterations.
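
The difference in cost between the sampling schemes can be illustrated with a short sketch. It counts the patches visited by regular sampling at a given step size and generates quasi-random sample locations from a 2-D Halton sequence, which is the low-discrepancy sequence the name grad-qmc-halton suggests. The gradient-driven refinement that HASHI applies between iterations is not shown, and the function names and image dimensions are assumptions made for illustration only.

```python
def halton(index, base):
    """Return element `index` (1-based) of the Halton sequence in `base`."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_points(n, height, width):
    """First n 2-D Halton points (bases 2 and 3), scaled to pixel coordinates."""
    return [(int(halton(i, 2) * height), int(halton(i, 3) * width))
            for i in range(1, n + 1)]


def grid_positions(height, width, patch, step):
    """Number of patch positions visited by regular sampling with a given step."""
    rows = (height - patch) // step + 1
    cols = (width - patch) // step + 1
    return rows * cols


if __name__ == "__main__":
    h, w, patch = 1000, 1000, 100   # hypothetical image and patch sizes
    print("step = patch size:", grid_positions(h, w, patch, patch))
    print("step = 1 pixel:   ", grid_positions(h, w, patch, 1))
    print("first Halton samples:", halton_points(5, h, w))
```

Dense sampling grows with the number of pixels, whereas a quasi-Monte Carlo scheme lets the number of samples be chosen independently of the image size.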


Fig 4.

Quantitative evaluation of the different sampling strategies in terms of average Dice coefficient (y-axis) versus the number of samples (x-axis) used.

All strategies were trained with D₃ and evaluated with D₅.
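
For reference, the Dice coefficient between a predicted region A and an annotated region B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for two binary masks given as nested lists. The function name and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
def dice_coefficient(predicted, reference):
    """Dice overlap between two binary masks of identical shape."""
    pred = [bool(v) for row in predicted for v in row]
    ref = [bool(v) for row in reference for v in row]
    overlap = sum(p and r for p, r in zip(pred, ref))
    total = sum(pred) + sum(ref)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_coefficient(predicted, reference))
```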


Fig 5.

GPU memory requirements (in megabytes) for different input image dimensions (height × width × channels) in the experiments with the FCN based on the CNN₂ model.


Table 4.

Invasive BCa detection performance of HASHI and the equivalent FCN architecture on D₅ in terms of Dice coefficient.


Table 5.

Invasive BCa detection performance of our method on the D_test testing dataset in terms of Dice, PPV, NPV, TPR, TNR, FPR, FNR.
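
The measures listed here all follow from the four confusion-matrix counts: true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). The sketch below uses the standard definitions. The function name, the example counts and the choice to return NaN for an undefined ratio are assumptions, and whether the paper aggregates counts per pixel or per slide is not stated in the caption.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary detection measures from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Assumed convention: undefined ratios are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "Dice": ratio(2 * tp, 2 * tp + fp + fn),
        "PPV": ratio(tp, tp + fp),   # positive predictive value (precision)
        "NPV": ratio(tn, tn + fn),   # negative predictive value
        "TPR": ratio(tp, tp + fn),   # true positive rate (sensitivity)
        "TNR": ratio(tn, tn + fp),   # true negative rate (specificity)
        "FPR": ratio(fp, fp + tn),   # false positive rate = 1 - TNR
        "FNR": ratio(fn, fn + tp),   # false negative rate = 1 - TPR
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in detection_metrics(tp=8, fp=2, tn=85, fn=5).items():
        print(f"{name}: {value:.3f}")
```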


Fig 6.

Performance comparison between HASHI and M₁ in terms of Dice coefficient in the independent D_test test data cohort by varying the classification threshold of the invasive BCa probability map.
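
Varying the classification threshold amounts to binarising the probability map at each candidate value and re-scoring the result against the annotation. A minimal sketch, assuming the probability map and the ground truth are flattened to equal-length lists and that a pixel counts as invasive when its probability is at least the threshold (the function name and the example values are hypothetical):

```python
def dice_by_threshold(probabilities, reference, thresholds):
    """Dice score of the thresholded probability map for each threshold."""
    scores = {}
    for t in thresholds:
        tp = fp = fn = 0
        for p, r in zip(probabilities, reference):
            predicted = p >= t          # assumed decision rule
            if predicted and r:
                tp += 1
            elif predicted and not r:
                fp += 1
            elif r:
                fn += 1
        denominator = 2 * tp + fp + fn
        # Assumed convention: score 1.0 when neither mask has positives.
        scores[t] = 2 * tp / denominator if denominator else 1.0
    return scores


if __name__ == "__main__":
    probabilities = [0.9, 0.6, 0.4, 0.1]   # made-up values
    reference = [1, 1, 0, 0]
    for t, score in dice_by_threshold(probabilities, reference,
                                      [0.3, 0.5, 0.95]).items():
        print(t, score)
```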


Fig 7.

Results of the invasive BCa probability maps (second and fourth rows) predicted by HASHI on representative WSIs from D_test compared to the ground truth annotations from expert pathologists (first and third rows).

Red regions represent locations identified by HASHI as having a high likelihood of cancer presence while the blue regions represent the lowest likelihood of cancer presence.
