Fig 1.
Two primary types of intra-class differences inherent in metal surface defect data that significantly affect metal surface defect detection.
First, the semantic intra-class difference is illustrated by three support-and-query pairs, which show marked variation within the same defect category. This variation often stems from diverse manufacturing processes, lighting conditions, or noise interference, and leads to distinct defect appearances on surfaces such as Steel, Rail, or Aluminum (Al). Second, the distortion intra-class difference is illustrated by another three support-and-query pairs, in which the variation is caused by optical lens distortion or the perspective of image capture. This can alter the shape, scale, and orientation of defect instances, further complicating detection and classification.
Fig 2.
Comparison of traditional approaches and our multi-prototype learning.
First, traditional models extract features from image-based prototypes. In contrast, LDMP-RENet employs local descriptor-based multi-prototypes to represent more implicit local relations. Second, our model generates features in both a local view (by the Reasoning operation) and a global view (by the Excitation operation and the Global Edge Information operation). The two differences are addressed separately after acquiring the local-view graph space features (represented by yellow squares) and the global-view features (represented by blue squares).
Fig 3.
LDMP-RENet for 5-shot segmentation.
(1) represents the process of Multi-Prototype Reasoning. (2) denotes the Multi-Prototype Excitation. Given the outputs of these two steps, (3) the Information Fusion Module produces the prediction. Finally, we utilize BCE loss to train our model.
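The BCE training objective named in this caption can be sketched as follows. This is a minimal, framework-free illustration that assumes the prediction is a per-pixel foreground probability and the target is a binary mask, both flattened to lists; the function name `bce_loss` and the clamping constant `eps` are hypothetical and not taken from the paper.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted foreground probabilities
    and a binary ground-truth mask, both given as flat lists.

    `eps` is an illustrative clamping constant (an assumption, not from the paper).
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)
```

A maximally uncertain prediction of 0.5 at every pixel gives a loss of ln 2, and the loss decreases as the predicted probabilities move toward the ground-truth mask.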
Fig 4.
It activates the foreground multi-prototype via channel and spatial attention and then yields the activated multi-prototype.
Table 1.
Comparison with state-of-the-art metal surface defect FSS and amateur networks on Surface Defect-4i in mIoU and FB-IoU under 1-shot and 5-shot settings. The best and second-best results are highlighted in bold and underline, respectively.
Table 2.
Comparison with state-of-the-art metal surface defect FSS and amateur networks on FSSD-12 in mIoU and FB-IoU under 1-shot and 5-shot settings. The best and second-best results are highlighted in bold and underline, respectively.
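For readers unfamiliar with the two metrics reported in these tables, the sketch below follows the convention commonly used in few-shot segmentation: mIoU averages the per-class foreground IoU, while FB-IoU averages the foreground and background IoU with class labels ignored. Whether the paper accumulates counts in exactly this way is an assumption, and the names `binary_iou_counts` and `evaluate` are hypothetical.

```python
from collections import defaultdict


def binary_iou_counts(pred, gt, positive=1):
    """Return (intersection, union) pixel counts for one label value."""
    inter = sum(1 for p, g in zip(pred, gt) if p == positive and g == positive)
    union = sum(1 for p, g in zip(pred, gt) if p == positive or g == positive)
    return inter, union


def ratio(inter, union):
    """IoU from accumulated counts; defined as 0.0 when the union is empty."""
    return inter / union if union > 0 else 0.0


def evaluate(episodes):
    """episodes: iterable of (class_id, pred_mask, gt_mask), masks as flat 0/1 lists.

    Returns (mIoU, FB-IoU) under the assumed accumulation scheme.
    """
    per_class = defaultdict(lambda: [0, 0])  # class_id -> [intersection, union]
    fg = [0, 0]  # foreground counts over all episodes, classes ignored
    bg = [0, 0]  # background counts over all episodes, classes ignored
    for class_id, pred, gt in episodes:
        i, u = binary_iou_counts(pred, gt, positive=1)
        per_class[class_id][0] += i
        per_class[class_id][1] += u
        fg[0] += i
        fg[1] += u
        bi, bu = binary_iou_counts(pred, gt, positive=0)
        bg[0] += bi
        bg[1] += bu
    class_ious = [ratio(i, u) for i, u in per_class.values()]
    miou = sum(class_ious) / len(class_ious) if class_ious else 0.0
    fb_iou = (ratio(*fg) + ratio(*bg)) / 2
    return miou, fb_iou


# Two toy episodes from two hypothetical classes "a" and "b".
toy_episodes = [
    ("a", [1, 1, 0, 0], [1, 0, 0, 0]),
    ("b", [0, 1, 1, 0], [0, 1, 1, 0]),
]
toy_miou, toy_fb_iou = evaluate(toy_episodes)
```

On the toy episodes, class "a" has IoU 1/2 and class "b" has IoU 1, so mIoU is 0.75; the pooled foreground IoU is 3/4 and the pooled background IoU is 4/5, so FB-IoU is 0.775.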
Fig 5.
Qualitative results of the baseline and the components of our LDMP-RENet on Surface Defect-4i.
Unlike MPE, MPE* does not incorporate global edge information.
Fig 6.
Qualitative outcomes for the baseline, CPANet, TGRNet and our LDMP-RENet in the 1-shot setting.
The left panel is from Surface Defect-4i, and the right one is from FSSD-12. From top to bottom, the rows show the support images with ground-truth (GT) masks, query images with GT masks, CPANet results, TGRNet results, and our results. The 1st to 3rd and 6th to 9th columns correspond to the semantic intra-class difference, whereas the 4th and 5th columns illustrate the distortion intra-class difference.
Table 3.
Comparison of model performance on Surface Defect-4i. “FLOPs” indicates the computational overhead. “#Params.” indicates the number of learnable parameters.
Fig 7.
Qualitative analysis of LDMP-RENet with different noise on Surface Defect-4i.
Table 4.
Ablation studies of each component on Surface Defect-4i. Unlike MPE, MPE* does not incorporate global edge information.
Table 5.
Ablation studies on GCN and MPR on Surface Defect-4i. GCN indicates the conventional graph reasoning.
Table 6.
1-shot mIoU and FB-IoU of the ablation study for ResNet and VGG.
Fig 8.
Ablation experiment of K-shot on Surface Defect-4i.
(a) and (b) denote the K-shot performance of VGG-16 and ResNet-50, respectively.
Table 7.
Computational costs of LDMP-RENet on Surface Defect-4i.
Fig 9.
Pipeline working mode based on LDMP-RENet.