Fig 1.
The working process of the OFIDA.
Training DynamicFocusNet on the MS-COCO 2017 dataset to achieve accurate classification and localization of target objects (a). Evaluating the performance of DynamicFocusNet on the MS-COCO 2017 test set (b). Using the trained DynamicFocusNet to detect and localize target objects in original images (c), and applying a cropping technique to accurately separate the detected objects from the original images (d), enabling precise one-to-many image data augmentation of samples.
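The cropping step in (c)–(d) can be sketched as follows. This is a minimal NumPy illustration, assuming detections arrive as (x1, y1, x2, y2) pixel boxes; the function name and box format are illustrative, not OFIDA's actual interface:

```python
import numpy as np

def crop_objects(image, boxes):
    """Separate each detected object from the original image (one-to-many).

    image: H x W x C array; boxes: list of (x1, y1, x2, y2) pixel coordinates.
    Returns one cropped sample per detected object.
    """
    h, w = image.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Clamp each box to the image bounds before slicing.
        x1, y1 = max(0, x1), max(0, y1)
        x2, y2 = min(w, x2), min(h, y2)
        crops.append(image[y1:y2, x1:x2].copy())
    return crops

# One original image yields several augmented object samples.
image = np.zeros((100, 120, 3), dtype=np.uint8)
samples = crop_objects(image, [(10, 20, 50, 60), (70, 10, 110, 90)])
```

Each crop becomes an independent training sample, which is what makes the augmentation one-to-many.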
Fig 2.
Integrated view of the OFIDA framework and its modules.
Fig 3.
The framework of our head network.
Given a feature map X, RepConv performs parameter reorganization, yielding X′. The content-aware attention module (CAAM) then extracts content-aware category representations M from X′. The dynamic graph convolutional network (D-GCN) models global and local relations in M, generating a robust representation P with rich inter-category relational information. Object detection is performed by DETECT on X′, producing classification scores Cls and bounding-box regression results Bbox. Finally, the classification scores Cls are averaged with S, yielding the final scores Y for each category.
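The dataflow in Fig 3 can be sketched as below. This is a minimal NumPy stand-in: every module body is a placeholder, the tensor shapes are illustrative, and deriving the scores S from P is an assumption (the caption does not specify how S is obtained); only the wiring between modules follows the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 4  # number of categories (illustrative)

def rep_conv(x):
    # Placeholder for RepConv's parameter reorganization: identity here.
    return x

def caam(x_prime):
    # Placeholder CAAM: pool features into per-category representations M.
    return x_prime.mean(axis=(1, 2))

def d_gcn(m):
    # Placeholder D-GCN: mix category representations via a relation matrix.
    relation = np.full((C, C), 1.0 / C)
    return relation @ m

def detect(x_prime):
    # Placeholder DETECT head: classification scores and box regressions.
    return rng.random(C), rng.random((C, 4))

x = rng.random((C, 8, 8, 16))   # feature map X (illustrative shape)
x_prime = rep_conv(x)           # X' after parameter reorganization
m = caam(x_prime)               # content-aware category representations M
p = d_gcn(m)                    # relation-enriched representation P
s = p.mean(axis=1)              # per-category scores S from P (assumption)
cls, bbox = detect(x_prime)     # classification scores Cls, boxes Bbox
y = (cls + s) / 2               # final scores Y: average of Cls and S
```

The averaging at the end is the only fusion point between the detection branch and the relational branch, which is why Y carries both appearance and cross-category evidence.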
Table 1.
Parameter settings.
Fig 4.
Visual examples of the object-focused image data augmentation (OFIDA) algorithm: localization, classification, and separation of target regions from original images.
Table 2.
Performance comparison of the OFIDA and several SOTA data augmentation methods for image classification.
Table 3.
Performance evaluation of semantic segmentation on the PASCAL VOC 2012 validation set using mIoU.
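For reference, mean intersection-over-union (mIoU) averages the per-class IoU over all classes present. A minimal NumPy sketch, with illustrative class count and label maps:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """mIoU: average over classes of |pred ∩ gt| / |pred ∪ gt|."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0], [1, 1]])
gt = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, gt, num_classes=2)  # (1/2 + 2/3) / 2 = 7/12
```
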
Table 4.
Performance evaluation of semantic segmentation on the CITYSCAPES validation set using mIoU.
Table 5.
DynamicFocusNet performance evaluation on MS-COCO 2017 val set.