Utilizing Prior Knowledge to Improve Automatic Speech Recognition in Human-Robot Interactive Scenarios

Pradip Pramanick; Chayan Sarkar

doi:10.1145/3568294.3580129

What is it about?

Spoken interaction is an important part of any system that aims to make instructing a robot seamless. The recent success of automatic speech recognition (ASR) systems has paved the way for realistic applications of ASR in human-robot interaction. However, the efficacy of an ASR system deployed in a robot depends on various factors, and its accuracy is predominantly affected by noise, the speaker's accent, and the speaker's distance from the robot. Research on ASR primarily focuses on improving transcription accuracy for general-purpose applications, and the existing publicly available ASR systems (commercial or otherwise) follow the same general approach to modeling and training. Previous attempts at improving speech recognition in robots primarily focus on either improving the quality of the received speech signal or exploiting controllable acoustic characteristics.


Why is it important?

Existing works that include ASR in robots do not use factual prior knowledge about the objects in the robot's environment. We propose a method to include such knowledge, represented as a knowledge graph (KG), during ASR inference. The method builds on an existing approach that biases the beam search decoding process of ASR, but we make significant modifications to that biasing method so that it is suitable for biasing with a KG.
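To make the idea of biasing beam search concrete, here is a minimal sketch. It is not the method from the paper: it is a toy word-level beam search in which any candidate word that appears in a small knowledge graph receives a fixed score bonus. The triples, the relation names, the `bonus` value, and the per-step word probabilities are all made-up assumptions chosen for illustration.

```python
import math

# Hypothetical knowledge graph as (subject, relation, object) triples.
# The relation names are illustrative, not taken from the paper.
KG_TRIPLES = [
    ("cup", "affordance", "pick"),
    ("cup", "attribute", "red"),
    ("cup", "co_occurs_with", "table"),
]


def kg_vocabulary(triples):
    """Collect every subject and object word mentioned in the graph."""
    vocab = set()
    for subject, _relation, obj in triples:
        vocab.add(subject)
        vocab.add(obj)
    return vocab


def biased_beam_search(steps, bias_words, bonus=1.0, beam_width=2):
    """Toy word-level beam search.

    `steps` is a list of dicts mapping candidate words to log-probabilities.
    Words found in `bias_words` get `bonus` added to their score.
    Returns the surviving beams as (word_list, score), best first.
    """
    beams = [([], 0.0)]
    for candidates in steps:
        expanded = []
        for words, score in beams:
            for word, log_prob in candidates.items():
                boost = bonus if word in bias_words else 0.0
                expanded.append((words + [word], score + log_prob + boost))
        # Keep only the highest-scoring hypotheses.
        expanded.sort(key=lambda beam: beam[1], reverse=True)
        beams = expanded[:beam_width]
    return beams


# Made-up acoustic scores: the recognizer slightly prefers "cap" over "cup".
STEPS = [
    {"pick": math.log(0.9), "pig": math.log(0.1)},
    {"the": math.log(1.0)},
    {"cap": math.log(0.6), "cup": math.log(0.4)},
]

vocab = kg_vocabulary(KG_TRIPLES)
unbiased = biased_beam_search(STEPS, vocab, bonus=0.0)
biased = biased_beam_search(STEPS, vocab, bonus=1.0)
print(" ".join(unbiased[0][0]))  # pick the cap
print(" ".join(biased[0][0]))    # pick the cup
```

With the bonus set to zero the acoustically preferred "cap" wins; with the bonus enabled, the object known to be in the environment ("cup") is ranked first. A real system would operate on subword tokens and tune the bias strength, which this sketch does not attempt.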

Perspectives

This work aims to improve ASR accuracy in settings where ASR is used to transcribe natural language instructions given to a robot. Specifically, we introduce the problem of incorporating domain-specific prior knowledge about objects in the environment while performing inference with a pre-trained ASR model. We consider three types of relational knowledge about objects: affordance, physical attributes, and co-occurrence relations (spatial attributes). Even though a particular instance of such knowledge is domain-specific, the knowledge types are common to any human-robot interaction that involves natural language.
Chayan Sarkar
TCS Research

This page is a summary of: Utilizing Prior Knowledge to Improve Automatic Speech Recognition in Human-Robot Interactive Scenarios, March 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3568294.3580129.