What is it about?
An introduction to self-supervised learning for videos and a summary of the current research landscape. The main areas include: 1) pretext task learning, 2) generative learning, 3) contrastive learning, and 4) cross-modal agreement. In addition to vision-only self-supervised learning for video, we cover multimodal approaches that use additional modalities such as audio and text. More info can be found at our GitHub project link: https://bit.ly/3Oimc7Q
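Of the areas above, contrastive learning is the easiest to sketch concretely: embeddings of two augmented views (e.g., two clips) of the same video are pulled together, while the other videos in the batch serve as negatives. The snippet below is an illustrative NumPy sketch of a generic InfoNCE-style objective, not code from the survey; all names and the temperature value are assumptions.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """Illustrative InfoNCE contrastive loss: row i of z1 and z2 are
    embeddings of two views of the same video (the positive pair);
    all other rows in the batch act as negatives."""
    # L2-normalize so dot products are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature  # pairwise similarity matrix
    # log-softmax over each row; positives lie on the diagonal
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy usage: 4 "videos" with 8-dim embeddings
rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))
# Matched views (small perturbation) should give a lower loss
# than unrelated embeddings.
loss_matched = info_nce_loss(z, z + 0.01 * rng.normal(size=(4, 8)))
loss_random = info_nce_loss(z, rng.normal(size=(4, 8)))
```

In practice the same idea is implemented with learned video encoders and temporal augmentations (different clips, speeds, or crops of one video) rather than random vectors; the sketch only shows the loss geometry.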
Why is it important?
Self-supervised learning reduces the need for dense annotation during training and yields generalizable foundation models that can be adapted to downstream tasks or exhibit emergent behaviors.
Read the Original
This page is a summary of: Self-Supervised Learning for Videos: A Survey, ACM Computing Surveys, July 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3577925.