Fig 1.
Comparison of CCMIM with state-of-the-art methods.
(a) Results on the RDD2022 dataset. (b) Results on the SDNET2018 dataset. (c) Results on the CCCD dataset.
Fig 2.
CCMIM network architecture diagram.
The model captures and enhances local features through multiple MiM modules, while the SPT module performs multi-scale feature fusion and improves the model’s computational efficiency. The DDF module strengthens the fusion of fine-grained and coarse-grained features, enhancing feature discriminability.
Fig 3.
Illustration of the vision clue merge block.
Fig 4.
DDF network architecture diagram.
The DDF module integrates the ECA and DF mechanisms to improve feature fusion and discriminability.
Table 1.
Experimental environment configuration.
Table 2.
Hyperparameter configuration.
Table 3.
Comparison of results for different algorithms on the RDD2022 dataset.
The best results are displayed in bold.
Table 4.
Comparison of results for different algorithms on the SDNET2018 dataset.
The best results are displayed in bold.
Table 5.
Comparison of results for different algorithms on the CCCD dataset.
The best results are displayed in bold.
Fig 5.
Qualitative comparison results of CCMIM on the RDD2022 dataset.
Fig 6.
Qualitative comparison results of CCMIM on the SDNET2018 dataset.
Fig 7.
Qualitative comparison results of CCMIM on the CCCD dataset.
Fig 8.
Relationship between the number of model parameters, computational complexity, and mAP50 for CCMIM on the RDD2022, SDNET2018, and CCCD datasets.
(a) RDD2022 dataset, (b) SDNET2018 dataset, (c) CCCD dataset.
Table 6.
Ablation study.
Table 7.
Ablation study.