What is it about?

Our review provides a detailed examination of recent advances in markerless motion capture systems and the prior research that made them possible. It focuses on methods capable of tracking multiple people in real time using multiple cameras, without the need for physical markers. The review systematically analyzes the performance of state-of-the-art approaches in terms of accuracy, latency, and computational efficiency. It explores key architectural designs such as top-down, bottom-up, and voxel-based pipelines, and evaluates how these influence those metrics. A comprehensive analysis of the performance of key methods on modern GPUs helps clarify how they scale with the number of people tracked in a scene and with the number of cameras needed to track motion accurately. By mapping the field’s evolution from early geometric methods to modern deep learning–based systems, the paper identifies both major breakthroughs and the remaining challenges towards fully markerless, real-time 3D motion tracking in multi-person scenarios.

Why is it important?

Real-time markerless motion capture creates a path toward more natural, accessible, and scalable motion analysis. By eliminating the need for physical markers, it enables seamless motion tracking in real-world settings, ranging from sports and physical rehabilitation to immersive virtual and augmented reality experiences. Real-time performance enables realistic interactions and feedback. Markerless motion capture stands at the intersection of computer vision, biomechanics, and machine learning, driving innovation in deep learning–based pose estimation, multi-view fusion, and real-time 3D reconstruction. Advancements in this field not only improve the accuracy and efficiency of motion analysis technologies but also strengthen our understanding of how to model and interpret complex human motion at scale.

Perspectives

This article was a deep dive into the computer vision methods applied to motion analysis. Understanding the scaling laws and the computing requirements of past and current methods helped to identify the most promising ones. Measuring and understanding the world in real time is a requirement for any system that interacts with humans.

Dr Pierre Nagorny
Artanim Foundation

Read the Original

This page is a summary of: A Comprehensive Review of Real-Time Multi-View Multi-Person Markerless Motion Capture, ACM Computing Surveys, September 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3757733.