What is it about?
Deepfake technology aims to synthesize highly realistic facial images and videos, with broad application potential in entertainment, film production, and digital human modeling. Deep learning has driven major progress in generative modeling, from VAEs and GANs to diffusion models, whose superior generation quality has sparked a renewed wave of research. Alongside deepfake generation, detection technologies continue to evolve to counter potential misuse of deepfakes, such as privacy invasion and phishing attacks. This survey reviews the latest developments in deepfake generation and detection, summarizing and analyzing current state-of-the-art methods in this rapidly evolving field. First, we unify task definitions, introduce datasets and metrics, and summarize the underlying technologies. Then, we review the development of several related sub-fields and examine four representative deepfake research fields (face swapping, face reenactment, talking-face generation, and facial attribute editing) as well as forgery detection. Subsequently, we benchmark representative methods on widely adopted datasets to provide an up-to-date evaluation of the most influential published works. Finally, we discuss the key challenges and outline future research directions for the field.
Why is it important?
This survey on deepfake generation and detection is relevant to both academic research and industrial applications. It unifies task definitions, benchmarks state-of-the-art methods, and covers recent advances including diffusion models and NeRF-based techniques, filling gaps in prior reviews that overlooked emerging generative architectures. By comparing datasets, metrics, and representative approaches across face swapping, face reenactment, talking-face generation, facial attribute editing, and forgery detection, it gives researchers a roadmap for understanding the field’s evolution, identifying limitations, and pursuing future directions. Its novelty lies in integrating up-to-date diffusion-based frameworks, multimodal modeling, and 3D-aware methods while delivering fair benchmark comparisons. In practice, it supports the healthy development of AIGC, helps mitigate misuse risks such as privacy invasion and misinformation, strengthens media forensics and information security, and informs ethical governance and regulatory standardization for synthetic media.
Perspectives
As the author of the survey "Deepfake Generation and Detection: A Benchmark and Survey", I have systematically organized, categorized, and synthesized research in the deepfake field. First and foremost, the survey consolidates a rapidly evolving domain. When I started the research, I found that existing reviews covered only some of the tasks or relied on outdated technologies, while diffusion models, NeRF, and multimodal frameworks were rapidly reshaping the field. I therefore aimed to build a unified framework, spanning task definitions, technical evolution, dataset and metric comparison, method benchmarking, and future prospects, so that readers can grasp the full landscape of deepfake generation and detection. From a research perspective, deepfake technology is a double-edged sword. On the one hand, driven by GANs, diffusion models, and 3D representations such as NeRF, face swapping, face reenactment, talking-face generation, and facial attribute editing have achieved unprecedented realism, bringing great value to entertainment, film, digital humans, and content creation. On the other hand, the risk of malicious abuse is increasingly prominent: privacy invasion, identity impersonation, the spread of misinformation, and ethical crises have become urgent issues that cannot be ignored. This tension makes detection technology equally critical, and even more challenging.
In terms of innovation, I see three core contributions in this survey. First, it covers the latest technologies, especially diffusion-based and 3D-aware generation methods, filling a gap left by previous reviews. Second, it provides standardized benchmark comparisons of representative methods, using unified datasets and metrics to help researchers evaluate model performance objectively. Third, it systematically summarizes challenges and future directions, pointing out clear paths toward generalization, controllability, lightweight deployment, and cross-domain robustness.
As a researcher, writing this survey also deepened my thinking about the field: the future of deepfake technology should be responsible innovation. Generation technology needs to pursue higher fidelity and controllability, while detection technology must keep pace, achieving stronger generalization and robustness to interference. At the same time, technological progress must be combined with ethical norms and policy oversight to ensure that AIGC benefits society rather than harms it. In short, this survey is not only a review of the state of the art, but also a reflection on the past, present, and future of the deepfake field. I hope it can serve as a useful reference for peers, encourage more rigorous and innovative research, and support the healthy and sustainable development of synthetic media technology.
Dr. Gan Pei
East China Normal University
Read the Original
This page is a summary of: Deepfake Generation and Detection: A Benchmark and Survey, ACM Computing Surveys, March 2026, ACM (Association for Computing Machinery),
DOI: 10.1145/3801962.