Fig 1.
Region proposal network (RPN) shares the same base CNN with a fast R-CNN network. The region proposal is generated by sliding a small convolutional network over the shared feature maps, and these proposals are used to produce final detection results.
Fig 2.
Base network is truncated from a standard network. The detection layer computes confident scores for each class and offsets to default boxes.
Fig 3.
The diagram shows the basic building block of ResNet [55] and DetNet [66]. (a) After each ResNet block, the resolution is reduce in half. (b) The DetNet preserves the feature map resolution and increases the receptive field by using dilated convolutions.
Fig 4.
The architecture has three modules: Anchor refinement module (ARM), transfer connection block (TCB), and object detection module (ODM).
Fig 5.
Sample frames from different colonoscopy.
(a) has a higher resolution and a warm color temperature; (b) has lower resolution and a green tone; (c) is more natural in color tone but has a transparent cover around the frame edges.
Fig 6.
From frame 1 to frame 146, the camera shows unnoticeable movement.
Fig 7.
Some bad examples of colonoscopy frames.
Fig 8.
Six sample frames from the generated dataset.
Table 1.
Dataset organization.
Table 2.
Experiment setup.
Fig 9.
Three examples of the detection results with the predicted classes and confidence scores.
Table 3.
Results for frame-based two-class polyp detection.
Fig 10.
True positive (green plot) and false positive (red plot) count w.r.t. confidence.
We discard any predictions with a confidence score below 0.01 since they tend to be random predictions.
Table 4.
Result for training on KUMC and testing on the full combined test set.
Table 5.
Result for frame-based one-class polyp detection.
Table 6.
Frame-based one-class polyp detection results for each class.
Fig 11.
Percentage of the dominant class.
Detectors predict the polyp category in each individual frame. The category with more than 50% of all frames is the dominant category for that video sequence. The charts show the percentage of frames classified as the dominant class in each test sequence. (ad) and (hp) on the bottom means ground truth class adenomatous and hyperplastic respectively. Correct predictions are in green and misclassifications are in red.
Table 7.
Result for sequence-based two-class polyp classification.