What is it about?

In CapST we go beyond real/fake detection to pinpoint which AI model created a deepfake, a capability crucial for forensic tracing and defense. The CapST framework combines Capsule Networks, temporal attention, and a streamlined VGG19 feature extractor to attribute deepfakes accurately while remaining computationally efficient.
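To give a rough sense of the temporal-attention idea mentioned above, the sketch below pools per-frame features into a single video-level descriptor by scoring each frame and softmax-weighting over time. This is a minimal illustration, not the authors' implementation; the feature dimensions, the scoring vector `w`, and all function names here are hypothetical stand-ins.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def temporal_attention_pool(frame_feats, w):
    """Score each frame, normalize scores over time, return the
    attention-weighted average feature vector for the whole clip."""
    scores = frame_feats @ w        # one scalar score per frame, shape (T,)
    alpha = softmax(scores)         # attention weights over the T frames
    return alpha @ frame_feats      # weighted sum, shape (D,)

rng = np.random.default_rng(0)
T, D = 8, 16                        # 8 frames, 16-dim features (illustrative)
feats = rng.normal(size=(T, D))     # stand-in for per-frame CNN features
w = rng.normal(size=D)              # scoring vector (learned in practice)
video_vec = temporal_attention_pool(feats, w)
print(video_vec.shape)              # (16,)
```

In a full model, `w` would be a learned parameter and the per-frame features would come from the CNN backbone; frames the attention deems informative then dominate the clip-level representation used for attribution.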


Why is it important?

The CapST model addresses a critical gap in deepfake video forensics by focusing not just on detecting whether a video is fake, but also on identifying the specific generative model used to produce it. This matters for forensic investigations: knowing the source model helps trace a video's origin and understand the techniques used, enabling better legal accountability and defensive strategies. Additionally, CapST improves attribution accuracy while maintaining computational efficiency, which makes it practical for real-world deployment.

Perspectives

From a forward-looking perspective, CapST sets a foundation for more advanced, scalable, and generalizable deepfake attribution methods. Its modular and lightweight architecture opens the door for deployment on edge devices and integration into broader digital media authentication systems. Furthermore, as deepfake generation technologies evolve, future research can build on CapST to handle even more subtle and complex forgeries, extend beyond face-swapping, and support multimodal detection (e.g., combining visual and audio cues).

Dr Wasim Ahmad
Academia Sinica

Read the Original

This page is a summary of: CapST: Leveraging Capsule Networks and Temporal Attention for Accurate Model Attribution in Deep-fake Videos, ACM Transactions on Multimedia Computing, Communications, and Applications, April 2025, ACM (Association for Computing Machinery), DOI: 10.1145/3715138.
