What is it about?
Spoken interaction is an important part of any system that aims to let people instruct a robot seamlessly. The recent success of automatic speech recognition (ASR) systems has paved the way for realistic applications of ASR in human-robot interaction. However, the efficacy of an ASR system deployed in a robot depends on various factors, and its accuracy is predominantly affected by noise and by the speaker's accent and distance. Research on ASR primarily focuses on improving transcription accuracy for general-purpose applications, and the publicly available ASR systems (commercial or otherwise) follow the same general approach to modeling and training. Previous attempts at improving speech recognition in robots primarily focus on either improving the quality of the received speech signal or exploiting controllable acoustic characteristics.
Why is it important?
Existing work that includes ASR in robots does not use factual knowledge about the objects in the robot's environment. We propose a method to include a knowledge graph (KG) during ASR inference, based on an approach that biases the beam search decoding process of ASR. However, we make significant modifications to the biasing method to make it suitable for biasing with a KG.
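To make the idea of biasing beam search decoding concrete, here is a minimal sketch in Python. It is not the method from the paper: it is a generic word-level illustration in which every hypothesis that uses a word from a bias vocabulary receives a fixed score bonus. The function name `biased_beam_search`, the parameters `bias_words` and `bias_bonus`, and the toy probabilities in `STEPS` are all assumptions made for this example.

```python
import math


def biased_beam_search(step_log_probs, bias_words, bias_bonus=1.0, beam_width=2):
    """Toy word-level beam search with a fixed bonus for biased words.

    step_log_probs: list of dicts, one per decoding step, mapping each
        candidate word to its log-probability (hypothetical ASR output).
    bias_words: set of words that should be favoured (assumed to come
        from prior knowledge; hypothetical in this sketch).
    """
    beams = [([], 0.0)]  # each beam is (words so far, accumulated score)
    for candidates in step_log_probs:
        expanded = []
        for words, score in beams:
            for word, log_prob in candidates.items():
                # Add the bonus only when the candidate is a biased word.
                bonus = bias_bonus if word in bias_words else 0.0
                expanded.append((words + [word], score + log_prob + bonus))
        # Keep only the highest-scoring hypotheses.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
    return beams[0][0]


# Hypothetical per-step scores: the acoustics slightly prefer "cub" over "cup".
STEPS = [
    {"pick": math.log(0.9), "kick": math.log(0.1)},
    {"the": math.log(1.0)},
    {"cub": math.log(0.6), "cup": math.log(0.4)},
]

print(biased_beam_search(STEPS, bias_words=set()))    # unbiased decoding
print(biased_beam_search(STEPS, bias_words={"cup"}))  # decoding biased toward "cup"
```

With these toy numbers, the unbiased search settles on the acoustically likelier "cub", while the bonus lets the known object name "cup" win. A real ASR decoder typically works on sub-word tokens rather than whole words, which is one reason a biasing scheme needs more care than this sketch suggests.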
Perspectives
This work aims to improve ASR accuracy, assuming the ASR system is used to transcribe natural language instructions given to a robot. Specifically, we introduce the problem of incorporating domain-specific prior knowledge about objects in the environment while performing inference with a pre-trained ASR model. We consider three types of relational knowledge about objects: affordance, physical attributes, and co-occurrence relations (spatial attributes). Even though a particular instance of such knowledge would be domain-specific, the knowledge types are common to human-robot interaction that involves natural language.
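One simple way to picture the three knowledge types is as a list of subject-relation-object triples. The sketch below is only an illustration under that assumption: the `Triple` type, the relation labels, the example facts in `KG`, and the helper `related_terms` are hypothetical and are not taken from the paper.

```python
from collections import namedtuple

# A single fact in the toy knowledge graph (hypothetical schema).
Triple = namedtuple("Triple", ["subject", "relation", "obj"])

# Hypothetical facts covering the three knowledge types named above.
KG = [
    Triple("cup", "affordance", "pick"),        # what can be done with the object
    Triple("cup", "attribute", "red"),          # a physical attribute
    Triple("cup", "co_occurs_with", "table"),   # a spatial co-occurrence relation
    Triple("door", "affordance", "open"),
]


def related_terms(kg, word):
    """Return every term linked to `word` by any relation, in either direction."""
    terms = set()
    for triple in kg:
        if triple.subject == word:
            terms.add(triple.obj)
        elif triple.obj == word:
            terms.add(triple.subject)
    return terms


print(sorted(related_terms(KG, "cup")))   # terms connected to the object "cup"
print(sorted(related_terms(KG, "pick")))  # objects that afford "pick"
```

In this toy setup, once a word such as "pick" has been recognized, the related object names could serve as candidates to favor in the rest of the utterance. Whether and how the paper uses its KG this way is not stated in this summary.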
Chayan Sarkar
TCS Research
Read the Original
This page is a summary of: Utilizing Prior Knowledge to Improve Automatic Speech Recognition in Human-Robot Interactive Scenarios, March 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3568294.3580129.