What is it about?

Modern AI systems need large amounts of training data to perform well on a wide range of tasks. In this article, we demonstrate that, beyond the quantity and quality of the data, the order in which training samples are presented to the model also matters. We compare several curriculum criteria, which rank samples by difficulty so that the model first sees the easy samples and then moves on to the more challenging ones.

Why is it important?

We demonstrate that choosing the right metric to sort the training data leads to significantly better speech recognition accuracy. Furthermore, using a training curriculum considerably speeds up convergence, even with smaller datasets. Lastly, we demonstrate that the optimal curriculum is often not monotonic: some challenging samples need to be mixed in with the easy ones at the beginning of training, otherwise the model can easily overfit.
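
To make the idea of a non-monotonic curriculum concrete, here is a minimal Python sketch. It is an illustration only, not the method from the paper: the function name `curriculum_order`, the `difficulty` scoring function, and the `hard_fraction` parameter are all hypothetical, and utterance length is used purely as a placeholder difficulty score.

```python
import random


def curriculum_order(samples, difficulty, hard_fraction=0.2, seed=0):
    """Order samples from easy to hard, scattering some hard ones early.

    `difficulty` is a hypothetical scoring function (lower = easier).
    `hard_fraction` is a hypothetical share of the hardest samples that is
    mixed into the ordering instead of being left at the very end.
    """
    if not 0.0 <= hard_fraction <= 1.0:
        raise ValueError("hard_fraction must be between 0 and 1")

    rng = random.Random(seed)

    # Monotonic curriculum: sort everything from easiest to hardest.
    ranked = sorted(samples, key=difficulty)

    # Split off the hardest samples.
    n_hard = int(len(ranked) * hard_fraction)
    split = len(ranked) - n_hard
    ordered = ranked[:split]
    hard = ranked[split:]

    # Re-insert each hard sample at a random position, so the early part
    # of training is not made up of easy samples only.
    for sample in hard:
        ordered.insert(rng.randint(0, len(ordered)), sample)
    return ordered


# Toy data: (utterance id, duration in seconds). Duration is only a
# stand-in for a real difficulty criterion.
utterances = [("a", 2.1), ("b", 9.8), ("c", 4.0), ("d", 1.2), ("e", 7.5)]
order = curriculum_order(utterances, difficulty=lambda u: u[1], hard_fraction=0.4)
print([utt_id for utt_id, _ in order])
```

Setting `hard_fraction=0.0` gives a plain easy-to-hard ordering, which makes it simple to compare a monotonic curriculum against a mixed one on the same data.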

Perspectives

While conducting the experiments, I learnt a lot about modern speech recognizers and how they acquire their knowledge. In my opinion, the community often overlooks the fact that AI systems also require structured learning, just like humans, and that by designing curricula we could enable faster training and higher performance. I feel that curriculum learning will be a major research area in the future, especially as training corpora grow to extreme scales.

Tamas Grosz
Aalto-yliopisto

Read the Original

This page is a summary of: Comparison and Analysis of New Curriculum Criteria for End-to-End ASR, September 2022, International Speech Communication Association,
DOI: 10.21437/interspeech.2022-10046.
