Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments

Abhinav Joshi; Naman Gupta; Jinang Shah; Binod Bhattarai; Ashutosh Modi; Danail Stoyanov

doi:10.1145/3536221.3556596

What is it about?

Real-world applications often involve interaction between different modalities (e.g., video, speech, text). Multimodal Representation Learning (MRL) has emerged as an active area of research for processing this multimodal information automatically and using it in an end application. In practice, however, the data acquired from different sources are typically noisy. In extreme cases, noise of large magnitude can completely alter the semantics of the data, leading to inconsistencies in the parallel multimodal data. In this paper, we propose a novel method for multimodal representation learning in a noisy environment via the generalized product-of-experts technique. In the proposed method, we train a separate network for each modality to assess the credibility of the information coming from that modality. The contribution of each modality is then varied dynamically while estimating the joint distribution.
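To make the idea of credibility-weighted fusion concrete, here is a minimal sketch of a generalized product of experts. It assumes each modality's expert is a Gaussian with diagonal covariance. The function name `generalized_poe` and the externally supplied `weights` are illustrative assumptions: in the paper the credibility scores come from a separate network per modality, and the exact formulation is not given in this summary.

```python
import numpy as np

def generalized_poe(means, variances, weights):
    """Fuse M Gaussian experts (diagonal covariance) with per-expert weights.

    Raising each Gaussian to the power of its weight and multiplying gives
    another Gaussian whose precision is the weighted sum of the expert
    precisions. All names and shapes here are illustrative assumptions.
    """
    means = np.asarray(means, dtype=float)          # shape (M, D)
    variances = np.asarray(variances, dtype=float)  # shape (M, D)
    weights = np.asarray(weights, dtype=float)      # shape (M,) or (M, D)
    if weights.ndim == 1:
        weights = weights[:, None]                  # broadcast over dimensions

    # Weighted precision of each expert: alpha_i / sigma_i^2
    precisions = weights / variances
    joint_precision = precisions.sum(axis=0)        # must be > 0

    joint_var = 1.0 / joint_precision
    joint_mean = joint_var * (precisions * means).sum(axis=0)
    return joint_mean, joint_var

# Two modalities that disagree; the first is treated as more credible.
mean, var = generalized_poe(
    means=[[0.0], [4.0]],
    variances=[[1.0], [1.0]],
    weights=[0.75, 0.25],
)
# mean is [1.0] and var is [1.0]: the fused estimate sits closer to modality 1.
```

A weight of zero removes a modality from the fused estimate entirely, which is how a noisy input can be down-weighted in this sketch. At least one weight must be positive, otherwise the joint precision is zero.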

Why is it important?

This work addresses an important problem in the AI domain. For an AI system to be seamless to use, it should be able to make use of information coming from different modalities. This work is a step towards that goal: we propose a way to reliably combine information coming from different modalities in real-world settings.

Perspectives

Working on this project was fun, as it addresses an important area. Hopefully, the method proposed in this paper will be useful to other researchers.
Dr. Ashutosh Modi
Indian Institute of Technology Kanpur

This page is a summary of: Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments, November 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3536221.3556596.
Resources

Open Access version
Full Paper
Paper with all details.
