What is it about?

An important goal of Affective Computing is to develop machines that possess emotional intelligence: ideally, machines able to identify emotions the way a human does. This involves training Artificial Intelligence (AI) models on data from a variety of modalities, for instance text or speech. We do not yet know, however, how well machines perform in Speech Emotion Recognition (SER) compared with human beings. In part, this lack of knowledge stems from several fallacies prevalent in traditional Affective Computing. One is that promising SER performance typically results from assessing just a few basic emotions produced in clean, in-the-lab environments, whereas “real” emotions are often mixed and real environments are noisy. In addition, training AI models usually requires a large amount of data, a resource that is considerably limited in the context of SER, and the extent to which training data size affects SER performance is still unknown. These are examples of variables that play a role in SER performance but are often disregarded.

Why is it important?

This study is important because it contributes to a more adequate modelling of emotions encoded in speech by addressing some of the fallacies prevalent in traditional Affective Computing.


We believe that in order to understand the real potential of Speech Emotion Recognition (in particular) and of Affective Computing (in general), a more adequate modelling is needed. We hope this study serves as an inspiration to other researchers aiming to pursue this goal.

Emilia Parada-Cabaleiro
Johannes Kepler Universität Linz

Read the Original

This page is a summary of: Perception and classification of emotions in nonsense speech: Humans versus machines, PLoS ONE, January 2023, PLOS, DOI: 10.1371/journal.pone.0281079.