Fig 1.

Example of the workflow for the MabCUT method.

Images are trained and generated in both the S → T and T → S directions, features are extracted using independent embedding blocks, and contrastive learning is performed by querying key points with the attention extractor.

Fig 2.

MabCUT framework.

The framework achieves bidirectional mapping through the mappings A: S → T and B: T → S, enabling I2IT between unpaired images while avoiding the strict cycle-consistency constraint. We define generators A and B, where A_enc and B_enc serve as encoders and A_dec and B_dec serve as decoders. A_enc and H_S form the embedding block that extracts features from several layers of the source domain, and B_enc and H_T form the embedding block for the target domain. The attention matrix selects features from multiple layers through queries, and the PatchNCE loss is computed on the selected features. Additionally, discriminators D_S and D_T compute the GAN loss.
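
The PatchNCE loss named in this caption can be illustrated for a single query feature. The following is a minimal sketch, not the authors' implementation: the function names, the plain-list feature vectors, and the temperature value tau = 0.07 are assumptions made for illustration. It treats the loss as a cross-entropy over cosine similarities in which the positive example occupies the first slot.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; an all-zero vector is left unchanged.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def patch_nce_loss(query, positive, negatives, tau=0.07):
    """InfoNCE-style loss for one query patch (tau is an assumed temperature)."""
    q = l2_normalize(query)
    # Logit 0 belongs to the positive patch, the rest to the negatives.
    logits = [dot(q, l2_normalize(positive)) / tau]
    logits += [dot(q, l2_normalize(n)) / tau for n in negatives]
    # Numerically stable log-sum-exp, then cross-entropy with target index 0.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

# The loss is small when the query matches its positive and large otherwise.
aligned = patch_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
mismatched = patch_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In a full model the loss would be averaged over all selected queries and over the chosen encoder layers; that aggregation is omitted here.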

Fig 3.

The operational principle of the global attention extractor.

Image features are extracted from S_s and A(S_s) using embedding blocks S and T, respectively. These features are then mapped to three-dimensional matrices F_s and F_t. Reshaping and transposing operations are applied to F_s to derive a two-dimensional attention matrix M_g, and N rows are selected based on the importance of each row. These rows are then matched with the value matrices of the target and source domains to find the key points, negative examples, and positive examples, from which the contrastive loss is calculated. The feature blocks inside the blue, red, and green boxes represent key points, positive examples, and negative examples, respectively.
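
The selection step described in this caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: the caption does not say how row importance is measured, so the sketch assumes that rows with lower entropy (more sharply focused attention) are more important, and all function names are hypothetical. A C x H x W feature map is flattened to HW tokens, a row-wise softmax over token similarities gives the attention matrix, and the N lowest-entropy rows are kept and applied to a value matrix.

```python
import math

def flatten_chw(fmap):
    # Reshape and transpose a C x H x W feature map into HW tokens of length C.
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][h][w] for c in range(C)] for h in range(H) for w in range(W)]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_matrix(tokens):
    # HW x HW matrix; each row is a softmax over dot-product similarities.
    return [softmax([sum(a * b for a, b in zip(q, k)) for k in tokens])
            for q in tokens]

def row_entropy(row):
    return -sum(p * math.log(p) for p in row if p > 0.0)

def select_queries(tokens, n):
    # Assumption: importance is ranked by entropy, lowest first.
    attn = attention_matrix(tokens)
    ranked = sorted(range(len(attn)), key=lambda i: row_entropy(attn[i]))
    idx = ranked[:n]
    return idx, [attn[i] for i in idx]

def route_values(selected_rows, values):
    # Each selected attention row forms a weighted sum of the value tokens.
    C = len(values[0])
    return [[sum(w * v[c] for w, v in zip(row, values)) for c in range(C)]
            for row in selected_rows]

fmap = [[[5.0, 0.1, 0.1]], [[0.0, 0.1, 0.0]]]   # toy map with C=2, H=1, W=3
tokens = flatten_chw(fmap)
idx, rows = select_queries(tokens, 1)
routed = route_values(rows, tokens)
```

The same selected rows would be applied to the value matrices of both domains, so that the routed source and target features at matching positions act as query and positive, with the other positions as negatives.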

Fig 4.

(a) ResNet generator structure. (b) Depthwise convolution. (c) Pointwise convolution.
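
Panels (b) and (c) together make up a depthwise separable convolution. The sketch below illustrates the two steps on nested lists; it is a hedged illustration rather than the generator's actual layers, and the function names, the absence of padding, stride 1, and the omission of bias terms are all assumptions. As in most deep learning frameworks, the "convolution" is computed as a cross-correlation.

```python
def depthwise_conv(x, kernels):
    """x: C x H x W, kernels: C x k x k. One filter per channel, no padding."""
    k = len(kernels[0])
    H, W = len(x[0]), len(x[0][0])
    out = []
    for channel, kernel in zip(x, kernels):
        out.append([[sum(channel[i + di][j + dj] * kernel[di][dj]
                         for di in range(k) for dj in range(k))
                     for j in range(W - k + 1)]
                    for i in range(H - k + 1)])
    return out

def pointwise_conv(x, weights):
    """x: C_in x H x W, weights: C_out x C_in. A 1x1 convolution mixing channels."""
    H, W = len(x[0]), len(x[0][0])
    return [[[sum(w_c * x[c][i][j] for c, w_c in enumerate(w))
              for j in range(W)] for i in range(H)]
            for w in weights]

def param_counts(c_in, c_out, k):
    # Weights of a standard k x k convolution versus the separable pair (no bias).
    standard = c_in * c_out * k * k
    separable = c_in * k * k + c_in * c_out
    return standard, separable

x = [[[1.0] * 3 for _ in range(3)], [[2.0] * 3 for _ in range(3)]]
ones = [[1.0] * 3 for _ in range(3)]
dw = depthwise_conv(x, [ones, ones])
pw = pointwise_conv(dw, [[1.0, 1.0], [1.0, -1.0]])
counts = param_counts(256, 256, 3)
```

For 256 input and output channels with a 3 x 3 kernel, the separable pair needs 67,840 weights against 589,824 for a standard convolution, which is the usual motivation for this design.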

Table 1.

Image data for the three datasets.

Fig 5.

Comparison results on the Horse2Zebra dataset.

Table 2.

FID and KID×100 scores on the Horse2Zebra dataset, with the best performance indicated in bold.
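
To make the KID×100 scaling concrete, the sketch below computes a single unbiased squared-MMD estimate with the cubic polynomial kernel commonly used for KID, then multiplies it by 100. It is a simplified illustration under assumptions: the function names are hypothetical, the inputs are toy two-dimensional vectors rather than Inception features, and the usual averaging over random subsets is omitted.

```python
def poly_kernel(x, y):
    # Cubic polynomial kernel k(x, y) = (x . y / d + 1)^3, d = feature dimension.
    d = len(x)
    return (sum(a * b for a, b in zip(x, y)) / d + 1.0) ** 3

def kid(real, fake):
    """Unbiased squared-MMD estimate between two sets of feature vectors."""
    m, n = len(real), len(fake)
    k_rr = sum(poly_kernel(real[i], real[j])
               for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    k_ff = sum(poly_kernel(fake[i], fake[j])
               for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    k_rf = sum(poly_kernel(r, f) for r in real for f in fake) / (m * n)
    return k_rr + k_ff - 2.0 * k_rf

def kid_x100(real, fake):
    # The scaling used when reporting KID×100.
    return 100.0 * kid(real, fake)

real = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
fake_close = [[0.05, 0.0], [0.0, 0.05], [0.1, 0.1]]
fake_far = [[3.0, 3.0], [3.1, 3.0], [3.0, 3.1]]
score_close = kid_x100(real, fake_close)
score_far = kid_x100(real, fake_far)
```

Lower values indicate that the generated features are distributed more like the real ones; because the estimator is unbiased, values slightly below zero can occur for very similar sets.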

Fig 6.

Comparison results on the Cat2Dog dataset.

Table 3.

FID and KID×100 scores on the Cat2Dog dataset.

Fig 7.

Comparison results on the Cityscapes dataset.

Table 4.

FID and KID×100 scores on the Cityscapes dataset.

Fig 8.

Qualitative ablation experiment.

Here, MabCUT denotes the results of the full model, and (A) to (E) show the I2IT results of each ablation module in sequence.

Table 5.

Quantitative comparison results from the ablation experiments.

The results demonstrate the effect of each of our contributions on I2IT.

Fig 9.

User study results.

We aggregate the rankings that users assigned to each model and compute the proportion of each rank, then plot these proportions to compare the quality of the models. The horizontal axis shows the percentage of each rank, and the vertical axis lists the models.
