What is it about?
The speech signal of a word uttered by a speaker at two different times varies in energy, duration, and other characteristics, so locating occurrences of a query by matching similar speech signals is difficult. The task becomes even harder when the speech database is large and contains utterances from many speakers, with variations due to differences in gender, age, speaking style, etc. Spoken Term Detection (STD) refers to the process of locating the occurrences of spoken queries in a large speech database. For the STD task, all that is available is the speech signal of the query and the search database. This work describes how these speech signals are processed to find the query locations.
Why is it important?
Generally, two approaches have been adopted for STD: Automatic Speech Recognition (ASR) based label-sequence matching, and feature-based template matching. ASR-based techniques rely on phoneme models of a language, which require a considerable amount of labelled training data in that language. Such techniques are therefore language-dependent, and it is not feasible to develop an ASR system for every language. Feature-based template matching addresses the task in a language-independent manner, but it is computationally expensive. This work combines the strengths of both approaches by introducing a multistage architecture to address the task of STD for low-resourced languages.
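Feature-based template matching is commonly built on dynamic time warping (DTW), which aligns two feature sequences even when their durations differ, at a cost quadratic in the sequence lengths. The following is a minimal sketch of that idea, not the paper's actual matching stage; the function name and the toy signals are illustrative:

```python
import numpy as np

def dtw_distance(query, segment):
    """DTW distance between two feature sequences (e.g. MFCC frames).

    query, segment: arrays of shape (n_frames, n_dims).
    Returns the length-normalised cumulative distance along the best
    warping path, which tolerates differences in speaking rate/duration.
    """
    n, m = len(query), len(segment)
    # Pairwise Euclidean distances between all frame pairs
    cost = np.linalg.norm(query[:, None, :] - segment[None, :, :], axis=2)
    # Accumulated-cost matrix; the nested loop is what makes DTW
    # computationally heavy on large databases.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j],      # stretch the query
                acc[i, j - 1],      # stretch the segment
                acc[i - 1, j - 1],  # advance both
            )
    return acc[n, m] / (n + m)

# Toy demo: the "same word" spoken more slowly (a stretched sinusoid)
# matches the reference better than an unrelated signal does.
word = np.sin(2 * np.pi * 3 * np.linspace(0, 1, 50))[:, None]
word_slow = np.sin(2 * np.pi * 3 * np.linspace(0, 1, 80))[:, None]
other = np.random.default_rng(0).normal(size=(60, 1))
print(dtw_distance(word, word_slow) < dtw_distance(word, other))
```

Running such a comparison for every candidate position in a large database is what makes pure template matching expensive, which motivates a fast first stage that prunes candidates before a finer second-stage match.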
Read the Original
This page is a summary of: Two-stage spoken term detection system for under-resourced languages, IET Signal Processing, July 2020, the Institution of Engineering and Technology (the IET), DOI: 10.1049/iet-spr.2019.0131.