EFEN-YOLOv8: Surface defect detection network based on spatial feature capture and multi-level weighted attention

doi:10.1371/journal.pone.0339617

Fig 1.

EFEN-YOLOv8 architecture.

Heat maps illustrate the effect of each improvement component. The enhanced model captures defect information at shallow layers while maintaining focus on defect features through multi-scale fusion and attention mechanisms.


Fig 2.

SAConv module architecture and computational flow.

The module employs multi-scale kernel operations followed by adaptive pooling and attention mechanisms for enhanced shallow feature extraction.


Fig 3.

Geometric configuration for -FEIoU computation.

The predicted bounding box (blue), ground truth box (red), and minimum enclosing rectangle (yellow) define the spatial relationships used in loss calculation. Parameters b, w, and h represent box centers, widths, and heights respectively.
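The geometric quantities named in this caption can be sketched in a few lines. The following is a minimal illustration, assuming boxes are given as centre coordinates plus width and height; the names `Box` and `iou_geometry` are hypothetical, and the function computes only the standard IoU, the minimum enclosing rectangle, and the centre distance, not the loss defined in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box given by its centre (cx, cy), width w and height h."""
    cx: float
    cy: float
    w: float
    h: float

    def corners(self):
        # Convert centre/size form to (x1, y1, x2, y2) corner form.
        return (self.cx - self.w / 2, self.cy - self.h / 2,
                self.cx + self.w / 2, self.cy + self.h / 2)


def iou_geometry(pred: Box, gt: Box) -> dict:
    """Return IoU, enclosing-rectangle size and centre distance for two boxes."""
    px1, py1, px2, py2 = pred.corners()
    gx1, gy1, gx2, gy2 = gt.corners()

    # Overlap region (zero if the boxes do not intersect).
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pred.w * pred.h + gt.w * gt.h - inter

    # Minimum enclosing rectangle of both boxes.
    enc_w = max(px2, gx2) - min(px1, gx1)
    enc_h = max(py2, gy2) - min(py1, gy1)

    return {
        "iou": inter / union if union > 0 else 0.0,
        "enclosing_w": enc_w,
        "enclosing_h": enc_h,
        "center_dist": math.hypot(pred.cx - gt.cx, pred.cy - gt.cy),
    }


result = iou_geometry(Box(2, 2, 2, 2), Box(3, 3, 2, 2))
```

IoU-family losses typically combine these terms, for example by normalising the centre distance with the diagonal of the enclosing rectangle; how they are combined here is defined in the paper itself.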


Fig 4.

LSKA module architecture and computational flow.

The module processes input features through cascaded depth-wise convolutions: standard DW-Conv followed by dilated DW-D-Conv operations. Results are concatenated with the original feature map after convolution to produce the final attended output. Parameters: C denotes input channels, H and W represent spatial dimensions, d controls dilation rate, and k defines maximum receptive field extent.
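How a cascade of a standard and a dilated depth-wise convolution reaches a large receptive field can be checked with a short calculation. The sketch below uses the general rule for stride-1 convolutions applied in sequence, where each layer adds (kernel size - 1) x dilation to the receptive field. The function name and the kernel sizes in the example are assumptions for illustration, not values taken from the paper.

```python
def stacked_receptive_field(layers):
    """Receptive field (along one axis) of stride-1 convolutions applied in sequence.

    layers: iterable of (kernel_size, dilation) pairs, in application order.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical configuration: a 5-tap DW-Conv followed by a 7-tap DW-D-Conv
# with dilation d = 3.
example_rf = stacked_receptive_field([(5, 1), (7, 3)])
```

With these assumed sizes, two small kernels cover a span of 23 positions, which is the motivation for decomposing one large kernel into a cascade.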


Fig 5.

WASPP module architecture and multi-scale feature integration.

The module employs parallel convolutional branches with varying receptive fields, followed by adaptive weighting mechanisms and feature concatenation. Each pathway contributes scale-specific information that is selectively emphasized through sigmoid-based attention before final fusion.
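The weighting-then-concatenation step described here can be illustrated with a toy example. This is a minimal sketch under strong simplifying assumptions: each branch output is a flat list of numbers, and each branch has a single scalar logit, whereas the actual module presumably operates on learned tensor-valued weights. The names `sigmoid` and `weighted_concat` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weighted_concat(branches, logits):
    """Scale each branch by sigmoid(logit), then concatenate the results.

    branches: list of feature lists, one per receptive-field branch.
    logits:   one scalar per branch (an assumption; stands in for learned weights).
    """
    if len(branches) != len(logits):
        raise ValueError("need exactly one logit per branch")
    fused = []
    for features, logit in zip(branches, logits):
        gate = sigmoid(logit)  # weight in (0, 1) for this branch
        fused.extend(gate * value for value in features)
    return fused


fused = weighted_concat([[1.0, 2.0], [4.0]], [0.0, 0.0])
```

A logit of 0 gives a gate of 0.5, so every branch is halved before concatenation; larger logits let a branch pass almost unchanged, and strongly negative ones suppress it.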


Fig 6.

Representative defect categories in the NEU-DET dataset.

Each class exhibits distinct morphological characteristics and varying degrees of visual complexity, with irregular spatial distributions that challenge detection algorithms.


Fig 7.

Defect category distribution in the GC10-DET dataset.

The ten defect classes represent diverse steel surface anomalies with varying scales, textures, and morphological characteristics.


Table 1.

mAP values under different losses with a training-to-testing ratio of 9:1.


Table 2.

mAP values under different losses with a training-to-testing ratio of 8:2.


Table 3.

Effects of different LSKA convolution kernels on the NEU-DET dataset.


Table 4.

Effects of different LSKA convolution kernels on the GC10-DET dataset.


Table 5.

Ablation results for each module.


Fig 8.

Comparative feature extraction visualization through HiResCAM analysis.

Heat maps demonstrate superior defect localization capabilities of our proposed architecture compared to baseline YOLOv8n across representative defect categories, revealing enhanced sensitivity to subtle anomalies and improved spatial feature extraction.


Table 6.

Performance comparison with state-of-the-art detection methods on the NEU-DET dataset.


Table 7.

Generalization performance comparison across different detection architectures.


Fig 9.

Comparative visualization of detection performance across different methods on industrial defect samples.


Table 8.

Statistical significance analysis of ablation components across 5 random splits on the NEU-DET dataset.

The reported indicators are the mean, standard deviation, 95% confidence interval, improvement over baseline, and p-value of the mAP scores.
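As a rough guide to how such indicators are derived from a handful of splits, the sketch below computes the mean, sample standard deviation, a 95% confidence interval, and the improvement over a baseline mean. It assumes a Student t interval with 4 degrees of freedom (5 splits), which may differ from the procedure used in the paper, and it omits the p-value because that needs a t-distribution CDF outside the standard library. The function name `summarize` and the scores are placeholders, not results from the paper.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 4 degrees of freedom (5 splits).
T_CRIT_95_DF4 = 2.776


def summarize(scores, baseline_mean=None):
    """Mean, sample std, 95% t-interval and optional gain over a baseline mean."""
    n = len(scores)
    if n != 5:
        raise ValueError("this sketch hard-codes the critical value for 5 splits")
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample standard deviation (n - 1)
    half_width = T_CRIT_95_DF4 * std / math.sqrt(n)
    summary = {
        "mean": mean,
        "std": std,
        "ci95": (mean - half_width, mean + half_width),
    }
    if baseline_mean is not None:
        summary["delta"] = mean - baseline_mean
    return summary


# Placeholder mAP values (in percent) for illustration only.
stats = summarize([70.0, 71.0, 72.0, 73.0, 74.0], baseline_mean=70.0)
```

With only five splits the t interval is noticeably wider than a normal-approximation interval, which is why the degrees of freedom matter when comparing reported confidence intervals.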


Table 9.

Statistical analysis of experimental results across 5 random splits.

The reported indicators are the mean, standard deviation, and 95% confidence interval of the mAP scores.


Fig 10.

Confusion matrices demonstrating classification performance under different training-testing data splits.
