What is it about?

A model compression framework for 3D facial animation. We compressed a large 3D facial animation model so that it runs on a user device in real time while retaining high quality. The large model has roughly one billion parameters and consumes ~5 GB of GPU memory to generate animation; our compressed on-device model has ~0.3 million parameters, uses 3.4 MB of CPU memory, and runs with a latency of ~81 ms.
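The compression technique named in the paper's title is hybrid knowledge distillation: a small student model is trained to match both the ground-truth animation and the large teacher's predictions. As a rough illustration only (the function, weights, and toy numbers below are assumptions, not the paper's actual formulation), the core training signal can be sketched as a blend of two losses:

```python
# Minimal sketch of a knowledge-distillation loss for model compression.
# Illustrative only: names, the MSE choice, and the blending weight are
# assumptions, not the paper's actual architecture or loss.

def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Blend a supervised loss (match ground truth) with a
    distillation loss (match the large teacher's output)."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return alpha * mse(student_out, target) + (1 - alpha) * mse(student_out, teacher_out)

# Toy example: per-vertex offsets predicted for one animation frame.
teacher = [0.10, -0.05, 0.20]
student = [0.12, -0.04, 0.18]
ground_truth = [0.09, -0.06, 0.21]
loss = distillation_loss(student, teacher, ground_truth)
```

The appeal of this setup is that the student never needs the teacher at inference time: once trained, only the tiny student ships on the device.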


Why is it important?

Recent state-of-the-art 3D facial animation models are extremely large because they rely on pre-trained speech encoders, so they cannot run on a user device or in real time. We are interested in next-gen game applications, such as real-time 3D avatars or animating user-specific dialogue on the user's device, that bring more personalized, unique, and immersive game experiences to players. This work lays the foundation for realizing that long-term vision.

Perspectives

We hope this research challenges the current trend of building ever-larger models, and instead encourages more efficient, lightweight alternatives. As researchers in a game company, our goal is to apply AI in practical, product-driven ways.

Zhen Han
Electronic Arts Inc

Read the Original

This page is a summary of: Tiny is not small enough: High quality, low-resource facial animation models through hybrid knowledge distillation, ACM Transactions on Graphics, July 2025, ACM (Association for Computing Machinery). DOI: 10.1145/3730929.

