What is it about?

This work proposes and releases an efficient baseline model for skeleton-based action recognition. The model consists of three main parts: multiple input branches, a residual GCN, and part-wise attention blocks.

Why is it important?

1. An early-fused multi-branch architecture takes inputs from three individual spatio-temporal feature sequences (Joint, Velocity and Bone) obtained from raw skeleton data, which enables the baseline model to extract sufficient structural features.
2. To further improve the model's efficiency, a residual bottleneck structure is introduced into the GCN: the residual links ease model training, and the bottleneck structure reduces the computational cost of parameter tuning and model inference.
3. A part-wise attention block computes attention weights for different human body parts, further improving the discriminative capability of the features. It also helps explain the classification results through visualization of the class activation maps.
4. Extensive experiments on two large-scale skeleton action datasets, NTU RGB+D 60 and NTU RGB+D 120, show that PA-ResGCN achieves state-of-the-art performance, and that ResGCN with the bottleneck structure obtains competitive performance with far fewer parameters.
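To make the input branches and the bottleneck saving more concrete, here is a minimal NumPy sketch. It is not the released implementation. The five-joint skeleton and its `PARENTS` list, the function names, the simple frame-difference velocity, the parent-relative bone vector, and the reduction ratio of 4 are all assumptions chosen for illustration. The raw features are concatenated directly as a stand-in for early fusion, whereas a real model would first pass each branch through its own layers.

```python
import numpy as np

# Hypothetical toy skeleton with 5 joints: PARENTS[i] is the joint that
# joint i attaches to. Joint 0 is the root and is its own parent,
# so its bone vector is zero.
PARENTS = np.array([0, 0, 1, 0, 3])


def make_branches(joints):
    """Derive Joint, Velocity and Bone sequences from raw coordinates.

    joints: array of shape (T, V, C) = (frames, joints, coordinates).
    """
    # Velocity (assumed form): difference between consecutive frames.
    velocity = np.zeros_like(joints)
    velocity[:-1] = joints[1:] - joints[:-1]  # last frame stays zero
    # Bone (assumed form): vector from each joint's parent to the joint.
    bone = joints - joints[:, PARENTS]
    return joints, velocity, bone


def early_fuse(branch_features):
    """Concatenate branch features along the channel (last) axis."""
    return np.concatenate(branch_features, axis=-1)


def conv_weights(c_in, c_out, kernel):
    """Weight count of a convolution layer, ignoring biases."""
    return c_in * c_out * kernel


def basic_block_weights(channels, kernel):
    return conv_weights(channels, channels, kernel)


def bottleneck_block_weights(channels, kernel, reduction=4):
    """Generic bottleneck: 1x1 reduce, kernel-sized conv, 1x1 expand."""
    inner = channels // reduction
    return (conv_weights(channels, inner, 1)
            + conv_weights(inner, inner, kernel)
            + conv_weights(inner, channels, 1))


rng = np.random.default_rng(0)
skeleton = rng.normal(size=(8, 5, 3))  # 8 frames, 5 joints, xyz
joint, velocity, bone = make_branches(skeleton)
fused = early_fuse([joint, velocity, bone])

basic = basic_block_weights(64, 9)
bottleneck = bottleneck_block_weights(64, 9)
print(fused.shape, basic, bottleneck)
```

With 64 channels and a kernel size of 9, the generic bottleneck layout above needs 4,352 weights against 36,864 for a single full-width convolution, which illustrates why a bottleneck can cut parameters sharply. The exact layer layout of the released model may differ.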

Perspectives

A highly cited paper.

Yi-Fan Song
University of Chinese Academy of Sciences

Read the Original

This page is a summary of: Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition, October 2020, ACM (Association for Computing Machinery),
DOI: 10.1145/3394171.3413802.
