Fig 1.
The proposed industrial defect detection architecture integrates semantic guidance and hierarchical attention.
It combines a query enhancement mechanism and a multi-scale feature fusion module to improve detection accuracy and structural modeling capability.
Fig 2.
Framework of the proposed query enhancement mechanism with semantic guidance (QEM-SG), which refines the initial query representation through multi-source semantic fusion and residual feedback.
The semantic pathways are implemented by lightweight attention blocks operating on query tokens, and semantic priors are derived from multi-scale encoder features via pooling and linear projection within the same network. This design enhances the modeling capability of multi-scale defect features and improves the consistency of attention responses.
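The query enhancement path described in this caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the pooling is global average pooling, that each scale has its own projection matrix, and that the attention block is a single-head cross-attention from query tokens to the pooled priors, followed by a residual add. The function names `semantic_priors` and `enhance_queries` and all shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def semantic_priors(feature_maps, proj_weights):
    """Pool each encoder map of shape (C_i, H_i, W_i) and project it to dimension d.

    Assumption: global average pooling and one (C_i, d) projection per scale.
    Returns an array of shape (num_scales, d).
    """
    priors = []
    for fmap, weight in zip(feature_maps, proj_weights):
        pooled = fmap.mean(axis=(1, 2))  # (C_i,)
        priors.append(pooled @ weight)   # (d,)
    return np.stack(priors)


def enhance_queries(queries, priors):
    """Attend from query tokens (N, d) to priors (S, d), then add a residual."""
    d = queries.shape[-1]
    attn = softmax(queries @ priors.T / np.sqrt(d), axis=-1)  # (N, S)
    return queries + attn @ priors                            # (N, d)


# Toy example with three encoder scales and five query tokens.
rng = np.random.default_rng(0)
maps = [rng.normal(size=(c, s, s)) for c, s in [(8, 16), (16, 8), (32, 4)]]
weights = [rng.normal(size=(m.shape[0], 4)) for m in maps]
queries = rng.normal(size=(5, 4))
enhanced = enhance_queries(queries, semantic_priors(maps, weights))
```

The residual add keeps the original query content intact, so the priors act as a correction to the initial queries and do not replace them.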
Fig 3.
Overall architecture of the proposed Hierarchical Attention Fusion (HAF) structure, which integrates multi-scale convolutional encoding with an attention fusion mechanism.
The fused representation is passed to the Feature Aggregation Buffer for scale-aligned consolidation and then mapped by the Branch-wise Generator to form branch-specific candidate features. This structure is designed to enhance the representation capability of defect features and to improve semantic consistency modeling.
Fig 4.
Ten typical defects in NEU-DET.
Fig 5.
Ten typical defects of medium- and high-precision industrial surfaces in DAGM2007.
Fig 6.
Ten typical defects of printed circuit boards in PCB-DET.
Table 1.
Experimental Settings of the Proposed Model.
Table 2.
Comparison of defect detection results on the NEU-DET dataset.
Table 3.
Comparison of defect detection results on the DAGM2007 dataset.
Table 4.
Comparison of defect detection results on the PCB-DET dataset.
Fig 7.
Effect of the number of layers in the hierarchical attention module on the experimental results.
Table 5.
Ablation study of the proposed modules on three datasets.
Fig 8.
Qualitative experimental results on the NEU-DET dataset.
Fig 9.
Qualitative experimental results on the DAGM2007 dataset.
Fig 10.
Qualitative experimental results on the PCB-DET dataset.
Fig 11.
Grad-CAM experimental results on the PCB-DET dataset.
Fig 12.
Grad-CAM experimental results on the NEU-DET dataset.
Fig 13.
Reliability diagrams at 30% label noise on NEU-DET, DAGM2007, and PCB-DET.
Compared with the RT-DETR baseline, our QEM-SG + HAF method yields confidence–accuracy curves closer to the ideal diagonal, indicating better calibration under noisy annotations.
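As a rough illustration of how a reliability diagram of this kind is typically computed, the sketch below bins predictions by confidence and compares each bin's mean confidence with its empirical accuracy. It is not taken from the paper. It assumes equal-width bins and a per-detection `correct` flag (for example, whether a detection matches a ground-truth box under some IoU threshold). The expected calibration error summary and the function names are additions for illustration only.

```python
def reliability_bins(confidences, correct, n_bins=10):
    """Return (mean confidence, accuracy, count) for each equal-width bin.

    Empty bins are reported as (None, None, 0).
    """
    stats = [[0.0, 0, 0] for _ in range(n_bins)]  # confidence sum, hits, count
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        stats[idx][0] += conf
        stats[idx][1] += int(hit)
        stats[idx][2] += 1
    return [
        (s[0] / s[2], s[1] / s[2], s[2]) if s[2] else (None, None, 0)
        for s in stats
    ]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Count-weighted mean gap between accuracy and confidence over non-empty bins."""
    total = len(confidences)
    bins = reliability_bins(confidences, correct, n_bins)
    return sum(n / total * abs(acc - conf) for conf, acc, n in bins if n)


# Toy example: two confident hits, and one hit plus one miss at moderate confidence.
ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [1, 1, 1, 0])
```

Plotting each bin's accuracy against its mean confidence gives the curve. A perfectly calibrated detector lies on the diagonal, where the per-bin gap, and therefore the summary error, is zero.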