Abstract
Deepfake technology poses a significant threat to information security, rendering deepfake detection research crucial. However, current detection methods experience a marked performance degradation in the presence of deep watermarking within images. In this paper, we propose a multi-module model, which integrates Efficient Multi-scale Attention within Xception as the detection module and introduces a feature dropout module to eliminate redundant image features. Experimental results demonstrate that when 50% and 100% of the images in the dataset contain MBRS watermarks, the accuracy (ACC) metrics of our model are comparable to those of existing baseline models. However, when 50% and 100% of the images contain FaceSigns watermarks, the ACC metrics of our model outperform those of other baseline models by approximately 10% and 20%, respectively.
Citation: Yu J, Liu X, Zan F, Peng Y (2025) Robust deepfake detector against deep image watermarking. PLoS One 20(12): e0338778. https://doi.org/10.1371/journal.pone.0338778
Editor: Sadiq H. Abdulhussain, University of Baghdad, IRAQ
Received: June 16, 2025; Accepted: November 27, 2025; Published: December 31, 2025
Copyright: © 2025 Yu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data underlying the results presented in the study are available from the DeepfakeBench open-source repository (https://github.com/SCLBD/DeepfakeBench/blob/main/README.md#2-download-data). The UADFV, Celeb-DF-v1, and Celeb-DF-v2 datasets used in this study are owned by their original third-party providers, and we do not have the permission to redistribute the datasets directly. All researchers may obtain the datasets by following the download instructions specified in the aforementioned repository link.
Funding: This research was funded by the Kunlun Talent High-end Innovation and Entrepreneurship Talent Project in Qinghai Province, 2023, received by XL; the Science and Technology Innovation Platform Construction Project of the National Sustainable Development Agenda Innovation Demonstration Zone in Hainan Prefecture, Qinghai Province (2024-HN-P03); the 2024 Graduate Innovation Project of Qinghai Nationalities University: “Research on Full-Process Deepfake Defense Methods” (09M2024007); and the Natural Science Foundation of Qinghai Province: “Research and Application of Key Technologies for Green Power Traceability and Heterogeneous Computing Power Integration in Qinghai Province” (2024-GX-A3). The funders had no role in study design, data collection and analysis, decision to publish, or manuscript preparation.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Recent advancements in deepfake technology have significantly enhanced the field of face synthesis. This progress is largely driven by breakthroughs in deep learning frameworks, such as generative adversarial networks (GANs) and variational autoencoders (VAEs). Currently, deepfake technology presents a substantial challenge to the credibility of social media content due to its low cost, ease of use, and high-quality generation capabilities. As a result, research aimed at countering deepfake technology has become critically important, with the identification of fabricated facial content online remaining a significant challenge.
Given the rapid advancement of deepfake technology, both academia and industry have actively pursued research on countermeasures, yielding several notable results. These efforts can be broadly categorized into three types: proactive forensics, passive forensics, and proactive defense techniques. Proactive forensics techniques involve the use of encoders to preprocess images initially. Following image manipulation, these techniques employ corresponding decoders to assess the authenticity of the images. Such approaches often rely on robust digital watermarking or steganographic embedding, such as reversible watermarking schemes resilient to geometric attacks [1], secret signal hiding based on color-space conversion [2], and semantic-aware protection methods guided by fine-grained human parsing in complex scenes [3]. Proactive defense methods focus on introducing adversarial perturbations to images, ensuring that, when manipulated, the images do not convey misleading information. At the video level, recent work has explored content-aware encryption of semantically critical segments using chaotic systems, thereby limiting the exploitable narrative structure of original footage [4]. Passive forensics leverages the robust feature-extracting capabilities of deep neural networks to learn the characteristics of forged images, facilitating their detection.
In the current domain of deepfake detection, passive forensics techniques do not require preprocessing of images and can directly detect manipulated content. These techniques have emerged as the primary method for identifying potentially forged content online, owing to their high adaptability and broad applicability. In contrast, some proactive forensics techniques employ deep learning-based digital watermarking, which involves embedding a specific watermark into images prior to their dissemination in order to protect particular objects. However, previous studies often treat these two techniques in isolation, overlooking the potential impact that deep watermarks in proactive forensics may have on the performance of passive detection algorithms. To address this issue, some researchers have proposed optimization modules to mitigate these negative effects [5]; however, their implementation necessitates modifications to pre-trained models. In contrast, optimizing the passive detection approach by refining training strategies allows a single model to adapt to image processing by various active defense forensics techniques without altering the model structure, thereby enhancing the overall system’s compatibility and generalization capabilities.
In this paper, we present a robust deepfake detection model that does not require prior knowledge of the data distribution containing the target deep image watermark. During the training phase, the model is trained solely on the original dataset, while during the inference phase, it is tested on data embedded with deep image watermarks to assess its performance. To evaluate the model’s robustness in predefined scenarios, we conduct multiple experiments using three major deepfake detection datasets. Under varying conditions of deep image watermarking distribution, we examine the model’s performance in scenarios where the data contain unknown deep image watermarks. The contributions of this paper are as follows:
- We explore the innovative approach of discarding redundant image features during detection and introduce a feature dropout module, which enhances the model’s robustness against unknown deep image watermark noise. The module is designed with a simple five-layer architecture that includes downsampling, upsampling, and skip connections to eliminate features that interfere with classification.
- Leveraging existing deep image watermarking models, we construct a deep image watermarking pool module to simulate real-world deep image watermarked data and evaluate our model using watermarked images.
- To further improve robustness, we integrate an Efficient Multi-scale Attention mechanism into the classification model, thereby enhancing its capability to detect watermarked data.
Related works
Deepfake generation
In current research and practical applications, facial deepfake technologies can be broadly categorized into four types: facial manipulation, face swapping, attribute editing, and full-face synthesis. Among these, face swapping and facial manipulation typically require both source and target facial images, generating manipulated outputs based on the attributes of the source face. Face swapping techniques generally involve either cropping-and-pasting replacements [6] or face identity replacement [7], with the goal of forging the identity of the target face. Facial manipulation methods, on the other hand, transfer the movements and expressions of the source face to the target face [8]. Although all of these approaches produce synthetic facial content, attribute editing modifies features such as hair color, skin tone, age, and gender, and can also add accessories like glasses, using generative models [9,10]. The aforementioned methods rely on real reference data, whereas full-face synthesis generates entirely synthetic faces from random noise, without any reference input [11]. As forgery techniques continue to evolve and produce increasingly realistic content, the development of robust detection technologies has become imperative.
Passive forensics
Leveraging the feature extraction capabilities of deep neural networks, contemporary deepfake detection methods typically build upon these architectures to identify forged characteristics within images and assess their authenticity. Zhang et al. [12] trained a model to detect sharp edges in the YCbCr color space for forgery identification. However, as deepfake generation techniques evolved, such overt artifacts became increasingly difficult to detect. Consequently, some researchers began treating forged images as originating from different distributions. For instance, Chen et al. [13] employed self-supervised learning to capture image consistency, while Wang et al. [14] utilized Transformer-based models to extract and compare local and global image features. Recognizing that social media compression severely degrades typical forgery traces, Liao et al. [15] proposed modeling facial muscle motion dynamics to detect deepfakes under low-quality conditions. Although these methods perform well within the same dataset, they often exhibit limited generalization across different datasets. To overcome this limitation, Shiohara et al. [16] introduced a self-mixing module to simulate forgery traces from multiple methods, and Liu et al. [17] developed a Fake Blender module based on self-mixed images. This module enhances the diversity of simulated forgeries and improves the generalization ability of detection algorithms.
Proactive forensics
Digital image watermarking ensures the security of visual content by embedding and extracting information within images through algorithmic techniques, playing a critical role in digital media protection. Recent work has extended deep watermarking to handle real-world distortions such as screen shooting through grayscale deviation simulation [18] and wavelet-based recovery architectures [19]. At the same time, researchers have explored adaptive steganography using texture-aware payload allocation [20] and revealed new vulnerabilities via concealed adversarial attacks that impair watermark extraction while preserving visual fidelity [21]. Following the success of the HiDDeN model [22], which enabled end-to-end deep watermark generation, researchers have further investigated the broader applications of deep image watermarking. To address HiDDeN’s limitations under JPEG compression, Jia et al. [23] proposed the MBRS model, which was trained using real JPEG algorithms. Fang et al. [24] enhanced watermarking performance by jointly optimizing transparency and robustness. In the domain of deepfake defense, Beuve et al. [25] utilized semi-fragile watermarks for tamper detection, Zhao et al. [26] employed robust watermarks for identity verification, and Wu et al. [27] developed SepMark, a model capable of detecting watermarks across varying network depths. Unlike earlier methods that focused on protecting individual faces, FaceSigns by Neekhara et al. [28] introduced multi-target protection by embedding watermarks into multiple faces within a single image.
Proactive defense
Similar to proactive forensics, proactive defense methods protect against deepfakes by introducing imperceptible image modifications that interfere with the generative process of deepfake models. One line of work aims to weaken the generation capabilities of manipulation systems. Yeh et al. [29] proposed the limit-aware self-guiding gradient sliding attack (LaS-GSA), which suppresses the generative ability of image-to-image translation GANs. He et al. [30] further used latent-space search to obtain images that remain robust under reconstruction, making them difficult to forge. Another line of research leverages adversarial perturbations for cross-model disruption. Huang et al. [31] introduced CMUA-Watermark by exploiting the transferability of adversarial examples, causing protected images to yield reconstructions with severe artifacts. More recently, Zhang et al. [32] proposed a unified framework that jointly embeds forensic watermarks and adversarial perturbations, providing both traceability and anti-forgery protection across diverse attack scenarios.
Previous research addressing deepfake countermeasures has predominantly focused on optimizing the performance of proactive forensics and passive forensics methods within self-defined scenarios. Passive forensics techniques have accounted for factors such as social media noise [33] and typical image quality degradation [34] during image transmission. Proactive forensics techniques, on the other hand, have considered challenges like image compression during distribution and the potential interference of watermarks with downstream tasks [5]. However, while proactive forensics methods may disrupt passive forensics, prior studies on passive forensics have seldom considered the impact of such interference from proactive forensics.
Methodology
Problem statement
Let E and D denote the encoder and decoder of the deep watermarking model, respectively, with x as the input image and w as the embedded information. Additionally, let L_I represent the image similarity loss function. For instance, in the HiDDeN model, L_I typically refers to the l2 loss used during training and optimization. Other models employ alternative loss functions, such as Mean Squared Error (MSE), Structural Similarity Index (SSIM), or Peak Signal-to-Noise Ratio (PSNR) loss. To enhance the realism of watermarked images, a discriminator assesses the output of the encoder, and its loss function, L_A, is commonly based on cross-entropy. L_W represents the Bit Error Rate (BER) of the extracted watermarks. The global loss function of such models can be expressed by Eq (1), although the specific details vary across different algorithms:

L = λ_I · L_I + λ_A · L_A + λ_W · L_W. (1)
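The global loss just described, a weighted combination of the image-similarity, adversarial, and bit-error terms, can be sketched as a one-line Python helper. The weight values below are illustrative placeholders, since each watermarking algorithm chooses its own weighting.

```python
def watermark_global_loss(l_image, l_adv, l_ber,
                          lam_i=1.0, lam_a=0.01, lam_w=1.0):
    """Weighted combination of the three watermark-training objectives.

    l_image: image similarity loss (e.g. l2, MSE, SSIM-based)
    l_adv:   discriminator (adversarial) loss
    l_ber:   watermark bit-error loss
    The lam_* weights are illustrative; real models tune them per algorithm.
    """
    return lam_i * l_image + lam_a * l_adv + lam_w * l_ber
```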
In images embedded with deep watermarking, the watermark information is inserted into the feature layers of the image, which are subsequently reconstructed into a watermarked image x_w through a deconvolution process. The embedded watermark can therefore be approximately expressed as x_w − x. Deep image watermarking typically does not produce a noticeable visual difference between the original and watermarked images. However, even minor alterations can affect the feature representations. As a result, conventional deepfake detection models may extract misleading features during the representation learning process, leading to erroneous predictions and a decline in detection accuracy.
Deep image watermarking pool
Current deepfake detection datasets contain only original or forged images, without incorporating deep image watermarks. However, since our model requires evaluation on watermarked data, environment simulation becomes essential. To facilitate this, we selected two models: MBRS, a general-purpose deep image watermarking model, and FaceSigns, a proactive forensic approach for deepfake detection. The watermarks embedded by FaceSigns remain intact even after processing by deepfake generation algorithms. After training both models, we integrated them into a unified deep image watermarking pool. This pool probabilistically adds watermarks to input images, where a proportion p of the dataset images is watermarked. During the watermarking process, the specific model used for embedding can be selected manually. Fig 1 illustrates the spatial-domain effects of watermarks added by the two methods. The watermark embedded by MBRS is nearly imperceptible in the spatial domain, whereas the watermark added by FaceSigns exhibits visible structural traces. Moreover, the module allows for embedding different types of watermarks across various samples in the dataset to enhance simulation fidelity. The watermark embedding process is defined by Eq (2).
Method MBRS yields an invisible watermark, whereas method FaceSigns introduces visible structural cues in the spatial domain.
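The pool's probabilistic embedding logic can be sketched as follows. This is a minimal stand-in that assumes the trained MBRS and FaceSigns encoders are exposed as plain callables (image in, watermarked image out); the class name and interface are hypothetical, not the actual module's API.

```python
import random

class WatermarkPool:
    """Probabilistically applies one of several watermark encoders.

    The real pool wraps trained MBRS and FaceSigns encoders; here each
    encoder is any callable mapping an image to a watermarked image.
    With probability (1 - p) an image passes through unchanged.
    """

    def __init__(self, encoders, p, seed=None):
        self.encoders = dict(encoders)      # name -> encoder callable
        self.p = p                          # proportion of images watermarked
        self.rng = random.Random(seed)

    def __call__(self, image, method=None):
        if self.rng.random() >= self.p:
            return image, None              # not watermarked
        # The embedding method can be fixed manually or drawn at random.
        name = method or self.rng.choice(sorted(self.encoders))
        return self.encoders[name](image), name
```

A usage sketch: `WatermarkPool({"mbrs": mbrs_enc, "facesigns": fs_enc}, p=0.5)` watermarks roughly half of the dataset, matching the 50% setting used in the experiments.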
Model architecture
As illustrated in Fig 2, the proposed model comprises two main components: the Feature Dropout (Fd) module and the classification module EMA-Xception. During the training phase, we first train the EMA-Xception network using the original dataset. Subsequently, to train the Fd module, we employ the watermark simulation module W(x) to simulate watermarked images. Afterward, the parameters of EMA-Xception are frozen, and its output is used to supervise the training of the Fd module. During inference, the model is evaluated on datasets processed by the deep image watermarking pool.
After two stages of training, our model maintains good performance regardless of whether the images contain watermarks or not.
Feature dropout module.
The architecture of the Fd module is illustrated in Fig 3. Most contemporary deep image watermarking models adopt an encoder-decoder framework, where watermark information is encoded as tensors and embedded into the image’s feature layers. The primary function of the Fd module is to eliminate redundant features from the original image, thereby reducing their interference with the classification model’s decision-making process. Similar to watermarking models, the module also employs an encoder-decoder structure. Following the U-Net design, a skip connection is incorporated between the encoder and decoder to facilitate information flow. Although fully convolutional networks and autoencoders are commonly used, deepfake detection tasks necessitate the retention of shallow features where forgery traces often reside. Conventional convolutional networks tend to discard these shallow features during processing; therefore, skip connections help preserve them, preventing the loss of critical forgery-related information.
This lightweight component consists of three core layers, including two encoding layers and one decoding layer. A skip connection is introduced between the first encoding layer (encode1) and the decoding layer (decode) to retain low-level features during feature suppression. The module processes input images while maintaining their spatial dimensions, ensuring that the output image size matches that of the input.
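A minimal PyTorch sketch of such an encoder-decoder with a skip connection is shown below. The channel widths and layer hyperparameters are assumptions for illustration, not the paper's exact configuration; what matters is the structure: two downsampling encoding layers, one upsampling decoding path, and a skip connection from encode1 that preserves shallow features.

```python
import torch
import torch.nn as nn

class FeatureDropout(nn.Module):
    """Sketch of the Fd module: encode1/encode2 downsample, decode upsamples,
    and a U-Net-style skip connection from encode1 retains shallow features
    where forgery traces often reside. Output matches the input size."""

    def __init__(self, ch=3, hidden=32):
        super().__init__()
        self.encode1 = nn.Sequential(
            nn.Conv2d(ch, hidden, 3, stride=2, padding=1), nn.ReLU())        # H -> H/2
        self.encode2 = nn.Sequential(
            nn.Conv2d(hidden, hidden * 2, 3, stride=2, padding=1), nn.ReLU())  # H/2 -> H/4
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(hidden * 2, hidden, 4, stride=2, padding=1),
            nn.ReLU())                                                        # H/4 -> H/2
        self.out = nn.ConvTranspose2d(hidden * 2, ch, 4, stride=2, padding=1)  # H/2 -> H

    def forward(self, x):
        e1 = self.encode1(x)
        e2 = self.encode2(e1)
        d = self.decode(e2)
        d = torch.cat([d, e1], dim=1)   # skip connection from encode1
        return self.out(d)
```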
EMA-Xception.
The Xception network [35] demonstrates strong performance in visual tasks such as image classification and semantic segmentation. Derived from the Inception architecture [36], it employs depthwise separable convolutions and cross-channel activation, along with global average pooling for feature aggregation. By replacing the standard Inception convolutions with depthwise separable convolutions, Xception introduces a novel modular structure that is both more efficient and computationally powerful. This design significantly reduces the number of parameters and model complexity while enhancing accuracy and generalization capabilities in visual recognition tasks.
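For reference, the depthwise separable convolution at the heart of Xception factorizes a standard convolution into a per-channel spatial filter followed by a 1×1 pointwise projection, which is where the parameter savings come from. A minimal PyTorch version (omitting the batch normalization Xception interleaves between layers) looks like this:

```python
import torch
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depthwise separable convolution as used throughout Xception:
    a per-channel spatial (depthwise) conv, then a 1x1 pointwise conv."""

    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        # groups=in_ch makes each filter act on a single input channel
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```

For 32 input and 64 output channels with a 3×3 kernel, this uses 288 + 2,048 = 2,336 weights versus 18,432 for the equivalent full convolution, illustrating the efficiency gain the text describes.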
However, the feature dropout module compresses images through convolutional operations and subsequently reconstructs them using deconvolution, which removes watermark features. Due to the irreversible nature of convolutional compression, features critical for determining image authenticity are also lost. Therefore, it is necessary to improve the multi-scale feature extraction capacity of the Xception backbone, integrate channel and spatial information, and enhance attention mechanisms. Since our task involves detecting unknown watermarked images, generalization is of paramount importance. To meet this requirement, we incorporated the Efficient Multi-scale Attention (EMA) [37] module into the Xception architecture, thereby improving its applicability to this task.
Experiments
Experiments settings
Dataset.
To evaluate the effectiveness of our method, we utilized widely recognized datasets commonly employed in deepfake detection: UADFV, Celeb-DF-v1 and Celeb-DF-v2. Although these datasets are publicly available, they necessitate preprocessing steps, including frame extraction, face cropping, and alignment. As a result, we opted for the preprocessed dataset from DeepfakeBench [38], which randomly selects 32 frames per video, aligns faces using facial landmark detection, crops them to a resolution of 256×256, and provides well-organized labels. Using subsets of DeepfakeBench as our baseline, we created four watermarked datasets with the deep image watermarking pool: 50% or 100% of the images are watermarked with MBRS, and 50% or 100% with FaceSigns.
Baseline.
SRM [39] extracts both frequency and spatial features using SRM filters and integrates them through cross-attention mechanisms to enhance the model’s generalization capability in forgery detection.
UCF [40] disentangles image information into three components: features unrelated to forgery, method-specific forgery cues, and common forgery patterns. It reconstructs images based on these forgery-related features and applies contrastive regularization to distinguish between different types of forgeries.
CORE [41] generates augmented views of input images, extracts features using a shared encoder, enforces representation similarity using cosine distance, and performs classification for each view using supervised labels.
MINet [42] first adaptively extracts several non-overlapping local features and uses mutual information theory to ensure they do not share redundant information. It then keeps task-relevant information, removes irrelevant details, and fuses all features into a compact global representation to determine whether the image is real or fake.
Implementation details.
EMA-Xception. The Middle Flow of the Xception architecture incorporates depthwise separable convolutions and residual connections, with these combined units organized into multiple stacked layers. This design reduces the model’s parameters and computational load while maintaining effective feature extraction. Building upon Xception, we introduced the EMA attention mechanism solely in the Entry and Exit Flows. In the Middle Flow, we stacked the convolution-residual units eight times. The updated model architecture is illustrated in Fig 4.
A modified version of the Xception network with one EMA module integrated into both the Entry Flow and Exit Flow. The EMA modules are designed to capture multi-scale contextual features by aggregating information across different spatial resolutions, thereby enhancing the network’s capacity to model hierarchical visual patterns in the input (Entry Flow) and to refine global–local feature interactions in the output (Exit Flow).
Training. The model training process is conducted in two stages. In the first stage, the classifier is trained on the original dataset images, and the optimization is performed using cross-entropy loss. In the second stage, the trained weights are loaded into the classifier, its parameters are frozen, and adversarial training is applied. Each batch consists of two categories of images: one real image from the dataset and one image processed by the frozen HiDDeN encoder and feature dropout module. The classifier compares these images to optimize the feature dropout module. Throughout the training process, only cross-entropy loss is used to maintain computational simplicity, and early stopping is employed to prevent overfitting.
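The second training stage can be sketched as below, assuming the classifier, the Fd module, and a frozen HiDDeN-style watermark encoder are standard PyTorch modules. The function name and data-loader interface are illustrative, not the authors' exact code; only cross-entropy is used, as described above.

```python
import torch
import torch.nn as nn

def train_fd_stage(fd, classifier, watermark_encoder, loader, epochs=1, lr=1e-4):
    """Stage-2 sketch: freeze the trained classifier and use its cross-entropy
    loss on watermarked-then-cleaned images to supervise the Fd module.

    `loader` yields (image_batch, label_batch); `watermark_encoder` simulates
    watermarked inputs. All names here are illustrative placeholders.
    """
    for p in classifier.parameters():
        p.requires_grad_(False)          # freeze the trained EMA-Xception
    classifier.eval()
    opt = torch.optim.Adam(fd.parameters(), lr=lr)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            xw = watermark_encoder(x)            # simulated watermarked image
            logits = classifier(fd(xw))          # gradients flow only into Fd
            loss = ce(logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return fd
```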
Experiment results
Problem verification experiment.
In this section, we evaluate the impact of deep watermarking on detection models utilizing three commonly used backbone networks in deepfake detection: Xception, EfficientNet, and ResNet. All models were trained on the Celeb-DF-v2 dataset. For robustness testing, 50% and 100% of the original Celeb-DF-v2 images were replaced with watermarked versions. As shown in Fig 5, the performance of these backbone models declines when images with MBRS or FaceSigns watermarks are incorporated into the original dataset at varying ratios.
Left panel: AUC scores of Xception, ResNet, and EfficientNet on datasets with increasing proportions of MBRS-watermarked samples (0, 0.5, 1). Right panel: Corresponding AUC scores for datasets watermarked using FaceSigns. Solid lines denote the mean values over three independent trials, while shaded regions represent the standard deviation. Both panels illustrate a consistent decline in model performance as the proportion of watermarked images increases.
To investigate the factors contributing to the decline in model performance, we utilized Gradient-weighted Class Activation Mapping (Grad-CAM) to analyze images that were correctly classified by the Xception model on the original dataset but misclassified after the introduction of FaceSigns watermarks. Heatmaps generated from the final convolutional layer of Xception were used to visualize the model’s attention regions, as shown in Fig 6. The results reveal significant differences in the model’s focus between the original and watermarked images, which ultimately led to misclassification errors.
Red regions indicate higher attention. White boxes are manually annotated to highlight the most concentrated attention areas for illustrative comparison. The observed shift in attention suggests that watermark embedding alters the model’s attention distribution.
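A minimal hook-based Grad-CAM of the kind used to produce such heatmaps can be implemented as below. This is a generic sketch of the standard technique, not the paper's exact visualization pipeline: activations and gradients of a chosen layer are captured with hooks, the gradients are spatially averaged into channel weights, and the weighted activation sum is rectified and normalized.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, layer, x, class_idx):
    """Minimal Grad-CAM: weight the target layer's activations by the
    spatially averaged gradients of the chosen class score."""
    acts, grads = {}, {}
    h1 = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    try:
        model.zero_grad()
        score = model(x)[:, class_idx].sum()
        score.backward()
    finally:
        h1.remove()
        h2.remove()
    w = grads["g"].mean(dim=(2, 3), keepdim=True)   # GAP over spatial dims
    cam = F.relu((w * acts["a"]).sum(dim=1))        # weighted sum, then ReLU
    cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)
    return cam  # (N, H, W) heatmap normalized to [0, 1]
```

For the paper's setting, `layer` would be the final convolutional layer of Xception; the heatmap is then upsampled to the input resolution and overlaid on the image.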
Comparative experiment.
In this section, we conduct experiments on three mainstream datasets to fully assess the effectiveness of our proposed method for detecting images containing deep image watermarks. All models are trained using early stopping to prevent overfitting from affecting the results.
Results on UADFV. We conduct experiments on the UADFV dataset and compare the performance of our method against baseline models described in Section Baseline. As shown in Table 1, our model performs less effectively than the baselines on both the original dataset and the dataset embedded with MBRS watermarks. In contrast, our method outperforms all baseline models on the dataset containing FaceSigns watermarks. Furthermore, the consistent performance of our method across different datasets underscores its robustness to deep image watermark interference.
Results on Celeb-DF-V1. To assess the performance of our proposed method on the Celeb-DF-V1 dataset, we conduct experiments, with detection results presented in Table 2. The results show that our method surpasses other approaches in nearly all scenarios, clearly demonstrating its superior performance on Celeb-DF-V1.
Results on Celeb-DF-V2. Among the datasets utilized, Celeb-DF-V2 contains the largest volume of data, comprising 207,951 images. As shown in Table 3, the overall performance of our proposed method surpasses that of the comparative approaches. These results demonstrate that our method maintains strong detection performance even on large-scale datasets.
Ablation study.
To evaluate the effectiveness of the Fd and EMA modules under deep watermark interference, we train the model on Celeb-DF-V1 with all other experimental settings kept constant. We then conduct ablation studies using two deep watermark types (FaceSigns and MBRS) and two replacement ratios (50% and 100%), as shown in the Table 4.
The results show that using Fd or EMA alone can yield good performance in some settings, but their stability is limited. For example, the EMA module drops to an AUC of 0.64 when all data are embedded with FaceSigns deep watermarks. In contrast, the combined Fd + EMA (base) configuration remains consistently robust across all perturbations (AUC > 0.81) and achieves an average AUC of 0.9134 with no obvious weaknesses. This demonstrates that the two modules work synergistically to improve the model’s robustness and generalization against diverse deep watermarks.
A frequency-domain analysis further clarifies the role of Fd. Deep watermarks typically introduce redundant high-frequency patterns that can overwhelm subtle manipulative cues. The Fd module selectively suppresses these watermark-dominant components while largely preserving intrinsic forgery-related features. Quantitatively, averaged over the validation set, the low-, mid-, and high-frequency energies decrease from 41.31 to 32.33, 3.85 to 1.42, and 1.35 to 0.50, corresponding to suppression rates of 21.7%, 63.1%, and 62.8%, respectively. This asymmetric suppression explains why Fd effectively removes watermark redundancy without destroying the original forgery traces, supporting the observed improvement in detection robustness.
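A band-energy measurement of this kind can be reproduced with a radial split of the 2-D FFT magnitude spectrum. The band thresholds below are illustrative assumptions, since the exact band boundaries are not specified above.

```python
import numpy as np

def band_energies(img, low=0.1, high=0.5):
    """Return mean spectral magnitude in low/mid/high radial frequency bands
    of a 2-D grayscale image. The `low`/`high` radius thresholds (fractions
    of the Nyquist radius) are illustrative, not the paper's definition."""
    spectrum = np.fft.fftshift(np.fft.fft2(img))
    mag = np.abs(spectrum)
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # normalized distance from the spectrum center (DC component)
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    bands = [r < low, (r >= low) & (r < high), r >= high]
    return [float(mag[m].mean()) for m in bands]
```

Comparing the three values for an image before and after the Fd module gives per-band suppression rates analogous to the 21.7%/63.1%/62.8% figures reported above.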
Conclusion
Experimental results demonstrate that the addition of deep image watermarks to original images leads to a decline in detector performance. Existing deepfake detection models exhibit considerable performance degradation when processing images with unknown deep watermarks. To mitigate this issue, we propose a novel detection method that effectively counteracts watermark interference, without requiring watermarked data for training. Future research will focus on enhancing the model’s performance across diverse datasets while preserving its robustness.
References
- 1. Wang C, Zhang Q, Wang X, Zhou L, Li Q, Xia Z, et al. Light-field image multiple reversible robust watermarking against geometric attacks. IEEE Trans Dependable and Secure Comput. 2025;22(6):5861–75.
- 2. Li Q, Ma B, Wang X, Wang C, Gao S. Image steganography in color conversion. IEEE Trans Circuits Syst II. 2024;71(1):106–10.
- 3. Liu Y, Wang C, Lu M, Yang J, Gui J, Zhang S. From simple to complex scenes: learning robust feature representations for accurate human parsing. IEEE Trans Pattern Anal Mach Intell. 2024;46(8):5449–62. pmid:38363663
- 4. Gao S, Zhang Z, Li Q, Ding S, Iu HH-C, Cao Y, et al. Encrypt a story: a video segment encryption method based on the discrete sinusoidal memristive rulkov neuron. IEEE Trans Dependable and Secure Comput. 2025;22(6):8011–24.
- 5. Wu X, Liao X, Ou B, Liu Y, Qin Z. Are watermarks bugs for deepfake detectors? Rethinking proactive forensics. In: Larson K, editor. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24. International Joint Conferences on Artificial Intelligence Organization; 2024. p. 6089–97.
- 6. Thies J, Zollhofer M, Stamminger M, Theobalt C, Niessner M. Face2Face: real-time face capture and reenactment of RGB videos. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016. p. 2387–95.
- 7. Wang TC, Liu MY, Zhu JY, Tao A, Kautz J, Catanzaro B. High-resolution image synthesis and semantic manipulation with conditional GANs. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018. p. 8798–807.
- 8. Wiles O, Koepke AS, Zisserman A. X2Face: a network for controlling face generation using images, audio and pose codes. In: Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XIII, 2018. p. 690–706.
- 9. Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. p. 4396–405.
- 10. Yang S, Jiang L, Liu Z, Loy CC. StyleGANEX: StyleGAN-based manipulation beyond cropped aligned faces. In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV). 2023. p. 20943–53.
- 11. Karras T, Aila T, Laine S, Lehtinen J. Progressive growing of GANs for improved quality, stability, and variation. In: International Conference on Learning Representations; 2018.
- 12. Zhang K, Liang Y, Zhang J, Wang Z, Li X. No one can escape: a general approach to detect tampered and generated image. IEEE Access. 2019;7:129494–503.
- 13. Chen H, Lin Y, Li B, Tan S. Learning features of intra-consistency and inter-diversity: keys toward generalizable deepfake detection. IEEE Trans Circuits Syst Video Technol. 2023;33(3):1468–80.
- 14. Wang T, Cheng H, Chow KP, Nie L. Deep convolutional pooling transformer for deepfake detection. ACM Trans Multimedia Comput Commun Appl. 2023;19(6):1–20.
- 15. Liao X, Wang Y, Wang T, Hu J, Wu X. FAMM: facial muscle motions for detecting compressed deepfake videos over social networks. IEEE Trans Circuits Syst Video Technol. 2023;33(12):7236–51.
- 16. Shiohara K, Yamasaki T. Detecting deepfakes with self-blended images. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, LA, USA, 2022. p. 18699–708.
- 17. Liu Q, Xue Z, Liu H, Liu J. Enhancing deepfake detection with diversified self-blending images and residuals. IEEE Access. 2024;12:46109–17.
- 18. Li Y, Liao X, Wu X. Screen-shooting resistant watermarking with grayscale deviation simulation. IEEE Trans Multimedia. 2024;26:10908–23.
- 19. Fu L, Liao X, Guo J, Dong L, Qin Z. WaveRecovery: screen-shooting watermarking based on wavelet and recovery. IEEE Trans Circuits Syst Video Technol. 2025;35(4):3603–18.
- 20. Liao X, Yin J, Chen M, Qin Z. Adaptive payload distribution in multiple images steganography based on image texture features. IEEE Trans Dependable and Secure Comput. 2021:1.
- 21. Li Q, Wang X, Ma B, Wang X, Wang C, Gao S, et al. Concealed attack for robust watermarking based on generative model and perceptual loss. IEEE Trans Circuits Syst Video Technol. 2022;32(8):5695–706.
- 22. Zhu J, Kaplan R, Johnson J, Fei-Fei L. HiDDeN: hiding data with deep networks. In: Ferrari V, Hebert M, Sminchisescu C, Weiss Y, editors. Computer Vision – ECCV 2018. vol. 11219. Cham: Springer International Publishing; 2018. p. 682–97.
- 23. Jia Z, Fang H, Zhang W. MBRS: enhancing robustness of DNN-based watermarking by mini-batch of real and simulated JPEG compression. In: Proceedings of the 29th ACM International Conference on Multimedia. 2021. p. 41–9. http://dx.doi.org/10.1145/3474085.3475324
- 24. Fang H, Jia Z, Qiu Y, Zhang J, Zhang W, Chang E-C. De-END: decoder-driven watermarking network. IEEE Trans Multimedia. 2023;25:7571–81.
- 25. Beuve N, Hamidouche W, Déforges O. WaterLo: protect images from deepfakes using localized semi-fragile watermark. In: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Paris, France: IEEE; 2023. p. 393–402.
- 26. Zhao Y, Liu B, Ding M, Liu B, Zhu T, Yu X. Proactive deepfake defence via identity watermarking. In: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). 2023. p. 4591–600. https://doi.org/10.1109/wacv56688.2023.00458
- 27. Wu X, Liao X, Ou B. SepMark: deep separable watermarking for unified source tracing and deepfake detection. In: Proceedings of the 31st ACM International Conference on Multimedia. 2023. p. 1190–201. https://doi.org/10.1145/3581783.3612471
- 28. Neekhara P, Hussain S, Zhang X, Huang K, McAuley J, Koushanfar F. FaceSigns: semi-fragile watermarks for media authentication. ACM Trans Multimedia Comput Commun Appl. 2024;20(11):1–21.
- 29. Yeh CY, Chen HW, Shuai HH, Yang DN, Chen MS. Attack as the best defense: nullifying image-to-image translation GANs via limit-aware adversarial attack. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada. 2021. p. 16168–77.
- 30. He Z, Wang W, Guan W, Dong J, Tan T. Defeating DeepFakes via adversarial visual reconstruction. In: Proceedings of the 30th ACM International Conference on Multimedia. 2022. p. 2464–72. https://doi.org/10.1145/3503161.3547923
- 31. Huang H, Wang Y, Chen Z, Zhang Y, Li Y, Tang Z, et al. CMUA-watermark: a cross-model universal adversarial watermark for combating deepfakes. AAAI. 2022;36(1):989–97.
- 32. Zhang Y, Ye D, Xie C, Tang L, Liao X, Liu Z, et al. Dual defense: adversarial, traceable, and invisible robust watermarking against face swapping. IEEE Trans Inform Forensic Secur. 2024;19:4628–41.
- 33. Wu H, Zhou J, Tian J, Liu J, Qiao Y. Robust image forgery detection against transmission over online social networks. IEEE Trans Inform Forensic Secur. 2022;17:443–56.
- 34. Ke J, Wang L. DF-UDetector: an effective method towards robust deepfake detection via feature restoration. Neural Netw. 2023;160:216–26. pmid:36682271
- 35. Chollet F. Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017. p. 1800–7.
- 36. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016. p. 2818–26.
- 37. Ouyang D, He S, Zhang G, Luo M, Guo H, Zhan J, et al. Efficient multi-scale attention module with cross-spatial learning. In: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2023. p. 1–5. https://doi.org/10.1109/icassp49357.2023.10096516
- 38. Yan Z, Zhang Y, Yuan X, Lyu S, Wu B. DeepfakeBench: a comprehensive benchmark of deepfake detection. In: Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S, editors. Advances in Neural Information Processing Systems. vol. 36. Curran Associates, Inc.; 2023. p. 4534–65.
- 39. Luo Y, Zhang Y, Yan J, Liu W. Generalizing face forgery detection with high-frequency features. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, TN, USA: IEEE; 2021. p. 16312–21.
- 40. Yan Z, Zhang Y, Fan Y, Wu B. UCF: uncovering common features for generalizable deepfake detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV); 2023. p. 22412–23.
- 41. Ni Y, Meng D, Yu C, Quan C, Ren D, Zhao Y. CORE: consistent representation learning for face forgery detection. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). New Orleans, LA, USA: IEEE; 2022. p. 12–21.
- 42. Ba Z, Liu Q, Liu Z, Wu S, Lin F, Lu L, et al. Exposing the deception: uncovering more forgery clues for deepfake detection. AAAI. 2024;38(2):719–28.