Delving into Sequential Patches for Deepfake Detection

Paper Title

Delving into Sequential Patches for Deepfake Detection

Authors

Jiazhi Guan, Hang Zhou, Zhibin Hong, Errui Ding, Jingdong Wang, Chengbin Quan, Youjian Zhao

Abstract

Recent advances in face forgery techniques produce nearly visually untraceable deepfake videos, which could be leveraged with malicious intent. As a result, researchers have devoted themselves to deepfake detection. Previous studies have identified the importance of local low-level cues and temporal information for generalizing well across deepfake methods; however, they still suffer from robustness problems against post-processing. In this work, we propose the Local- & Temporal-aware Transformer-based Deepfake Detection (LTTD) framework, which adopts a local-to-global learning protocol with a particular focus on the valuable temporal information within local sequences. Specifically, we propose a Local Sequence Transformer (LST), which models the temporal consistency of sequences of restricted spatial regions, where low-level information is hierarchically enhanced with shallow layers of learned 3D filters. Based on the local temporal embeddings, we then achieve the final classification in a global contrastive way. Extensive experiments on popular datasets validate that our approach effectively spots local forgery cues and achieves state-of-the-art performance.
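
The "sequences of restricted spatial regions" mentioned in the abstract can be illustrated with a minimal sketch. It only shows one plausible way to regroup a video clip into per-location patch sequences. The function name `local_patch_sequences`, the non-overlapping grid, and the patch size are assumptions for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np


def local_patch_sequences(clip: np.ndarray, patch: int) -> np.ndarray:
    """Regroup a clip of shape (T, H, W, C) into per-location patch sequences.

    Returns an array of shape (N, T, patch, patch, C), where
    N = (H // patch) * (W // patch). Each entry follows one fixed
    spatial region across all T frames (hypothetical layout).
    """
    t, h, w, c = clip.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by the patch size")
    gh, gw = h // patch, w // patch
    # Split height and width into (grid, patch) pairs:
    # (T, gh, patch, gw, patch, C)
    x = clip.reshape(t, gh, patch, gw, patch, c)
    # Bring the grid axes to the front so that time becomes the
    # sequence axis of each local region: (gh, gw, T, patch, patch, C)
    x = x.transpose(1, 3, 0, 2, 4, 5)
    return x.reshape(gh * gw, t, patch, patch, c)


# Toy clip: 4 frames of 8x8 RGB, split into a 2x2 grid of 4x4 patches.
clip = np.arange(4 * 8 * 8 * 3, dtype=np.float32).reshape(4, 8, 8, 3)
seqs = local_patch_sequences(clip, patch=4)
print(seqs.shape)  # (4, 4, 4, 4, 3)
```

Each of the N sequences could then be passed to a temporal model independently, which is the sense in which the modeling is "local" before any global aggregation.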
