CRPGAN: Learning image-to-image translation of two unpaired images by cross-attention mechanism and parallelization strategy | PLOS One

Fig 1.

Several results of our proposed method on various tasks.
Here we show object transformation ((a), (b)) and image style transfer ((c)-(f)). In each example, the three images from left to right are the source image (which provides the main content), the target image (which provides style and high-level semantic information), and the translated image.

Fig 2.

CRPGAN network architecture.
As shown in (a) and (b), our model consists of two symmetric pyramids of GANs that gradually refine the generated images from global structure to local details. Training starts at ‘scale 0’ with the lowest-resolution image and the smallest generator. As the scale increases, the generator grows and the image resolution increases from low to high. (c) shows our Cross-CBAM attention mechanism, which extracts global and local image information.
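
The coarse-to-fine schedule described in this caption can be illustrated with a small sketch of the per-scale image sizes. This is only an illustration under stated assumptions, not the authors' code: the function name `pyramid_sizes`, the geometric downsampling rule, and the default `ratio` of 0.75 are hypothetical. The caption itself only states that training starts at scale 0 with the lowest resolution and proceeds toward higher resolutions.

```python
def pyramid_sizes(height, width, num_scales, ratio=0.75):
    """Return the (height, width) used at each scale, coarsest (scale 0) first.

    Assumption: each scale is a geometric downsampling of the full image,
    with `ratio` as a hypothetical per-scale shrink factor.
    """
    if num_scales < 1:
        raise ValueError("num_scales must be at least 1")
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")

    sizes = []
    for scale in range(num_scales):
        # Scale 0 is shrunk the most; the last scale keeps the full resolution.
        factor = ratio ** (num_scales - 1 - scale)
        sizes.append((max(1, round(height * factor)),
                      max(1, round(width * factor))))
    return sizes


if __name__ == "__main__":
    for scale, (h, w) in enumerate(pyramid_sizes(256, 256, 4, ratio=0.5)):
        print(f"scale {scale}: {h}x{w}")
```

With a 256x256 image, four scales, and `ratio=0.5`, the sketch yields 32x32, 64x64, 128x128, and 256x256, so each generator in the pyramid would see a progressively finer version of the same image.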

Fig 3.

Structural comparison of different orderings of channel and spatial attention.
(a) is the CBAM attention mechanism, which comprises a channel attention module and a spatial attention module. (b) is the mechanism with two channel attentions followed by two spatial attentions. (c) is the mechanism with two spatial attentions followed by two channel attentions. (d) is our CRCBAM attention mechanism.

Fig 4.

Channel attention and spatial attention mechanisms of CBAM.

Fig 5.

Visualization results for different orderings of channel and spatial attention.

Table 1.

Quantitative comparison of different orderings of channel and spatial attention in CRPGAN in terms of SIFID.
The best scores are in bold.

Fig 6.

Translation results of CRPGAN with various baselines on Summer↔Winter.
Among these methods, SinGAN is trained on a single target-domain image, TuiGAN and our CRPGAN are trained on two unpaired images, and the others are trained on the complete dataset.

Fig 7.

Translation results of CRPGAN with various baselines on Horse↔Zebra.

Fig 8.

Translation results of CRPGAN with various baselines on Photo→Van Gogh.

Table 2.

Average SIFID and training time of various baselines versus our method on general UI2I tasks.

Fig 9.

Results of painting-to-image translation.
We enlarge the region marked by the green box in the translated image in the second row to show more detail.

Fig 10.

Visual results for different values of the scale factor η.

Table 3.

Quantitative comparison of different scale factors η of CRPGAN in terms of SIFID.
The best scores are in bold.

Fig 11.

Visual results of parametric study.

Table 4.

Quantitative comparison of different scale factors η of CRPGAN in terms of SIFID.
The best scores are in bold.

Fig 12.

Visual results of ablation study.

Table 5.

Quantitative comparison results for the CRPGAN ablation study in terms of SIFID.
The best scores are in bold.

Fig 13.

Our model can accurately transfer animal hair color and opera face style features.
