Fig 1.
Sample images from the TSAppleData dataset, showing various growth stages, lighting conditions, occlusion types, and shooting distances.
Fig 2.
Schematic diagram of a subset of the data augmentation methods used in online augmentation.
Fig 3.
Schematic diagram of the MSRRT-DETR overall architecture, where ResBlock denotes the residual module.
Fig 4.
Schematic diagram of ResNet architecture.
Fig 5.
Schematic diagram of MSBlock architecture.
Fig 6.
Schematic diagram of the SCSA attention mechanism structure.
Fig 7.
Schematic illustration of the Efficient RepGFPN architecture.
Fig 8.
Structural illustration of the CSPStage and Rep modules.
Table 1.
Ablation study on the TSAppleData dataset.
Table 2.
Performance comparison of different attention mechanisms on MSRRT-DETR evaluated on the TSAppleData dataset.
Fig 9.
Feature response heatmaps of different attention mechanisms in apple detection across multiple scenarios.
Fig 10.
Comparison of detection results before and after model improvement for apples at different growth stages and spatial distributions, along with typical false detection examples.
The first row shows the original images, the second and third rows show inference results from the improved and original models, respectively, and the fourth row presents enlarged views of the error regions from the original model (marked by gray boxes). Undetected targets are indicated by orange boxes, and false detections by orange circles.
Table 3.
Performance comparison of MSRRT-DETR versus mainstream object detection models on the TSAppleData dataset.
Fig 11.
Performance comparison of different object detection models in terms of FPS, mAP50, parameter count, and composite score.
The horizontal axis shows each model’s FPS, and the vertical axis shows its mAP50. Circle size indicates the model’s parameter count (model complexity), with larger circles representing more parameters. Circle color intensity represents the model’s composite score, with darker colors indicating better overall performance in both accuracy and speed.
Table 4.
Detailed statistics and characteristics of the apple datasets used in the generalization experiments.
Fig 12.
Sample images from each dataset used in the generalization experiments.
Table 5.
Cross-domain generalization performance comparison of different detection models across datasets.
Fig 13.
Performance comparison of various detection models on multi-source apple detection datasets (F1-score bar chart and mAP50 heatmap).