What is it about?

People show emotions not only on their faces or in their voices: their eyes, pupils, and skin also change in subtle ways. This paper introduces MuMTAffect, a computer model that looks at several of these signals at once (eye movements, pupil size, facial muscle activity, and small changes in skin conductance). By combining them, the model learns to recognize whether someone is likely feeling more positive or negative (valence) and more calm or energized (arousal).

The model also learns something stable about each person, their general personality profile, at the same time as it learns emotions. This "two tasks at once" approach helps the system personalize its estimates, because different people's bodies react to the same emotion in different ways. We also give the model a bit of context about the situation being viewed (for example, whether the stimulus is likely to be emotionally charged), which makes its decisions more robust.

We test MuMTAffect on a research dataset with synchronized eye, face, and skin recordings. The results show that skin conductance is especially helpful for arousal, while eye and pupil data are useful for valence. Learning personality alongside emotion also makes the emotion predictions more stable, particularly when some sensors are missing or noisy. The overall goal is practical: to build emotion-aware systems that adapt to the user and work reliably with real-world sensor data.
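
For readers who want a concrete picture of the "two tasks at once" design, the sketch below shows one way such a multimodal multitask model could be wired together in PyTorch. It is a simplified illustration under assumed details, not the authors' implementation: the layer sizes, the simple concatenation fusion, and names such as ModalityEncoder and MultimodalMultitaskSketch are invented for this example.

```python
# Illustrative sketch only: a small multimodal multitask network in the spirit
# of MuMTAffect. Layer sizes, concatenation-based fusion, and all names here
# are assumptions for this example, not the authors' implementation.
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Maps one signal (eye, pupil, face, or skin features) to a shared embedding."""

    def __init__(self, in_dim: int, emb_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim)
        )

    def forward(self, x):
        return self.net(x)


class MultimodalMultitaskSketch(nn.Module):
    """Fuses per-modality embeddings plus a context cue and predicts two tasks at once."""

    def __init__(self, modality_dims, context_dim: int = 4, emb_dim: int = 32):
        super().__init__()
        self.encoders = nn.ModuleList(
            [ModalityEncoder(d, emb_dim) for d in modality_dims]
        )
        fused_dim = emb_dim * len(modality_dims) + context_dim
        self.shared = nn.Sequential(nn.Linear(fused_dim, 64), nn.ReLU())
        self.emotion_head = nn.Linear(64, 2)      # valence and arousal
        self.personality_head = nn.Linear(64, 5)  # e.g. five broad trait scores

    def forward(self, modality_inputs, context):
        embeddings = [enc(x) for enc, x in zip(self.encoders, modality_inputs)]
        fused = torch.cat(embeddings + [context], dim=-1)
        shared = self.shared(fused)
        return self.emotion_head(shared), self.personality_head(shared)


# Toy usage: four modalities (eye movements, pupil size, facial activity, skin
# conductance) with made-up feature sizes, plus a small context vector.
dims = [10, 2, 16, 4]
model = MultimodalMultitaskSketch(modality_dims=dims)
signals = [torch.randn(8, d) for d in dims]
context = torch.randn(8, 4)
emotion, personality = model(signals, context)
print(emotion.shape, personality.shape)  # (8, 2) and (8, 5)
```

In training, one would typically minimize a weighted sum of an emotion loss and a personality loss, so that both tasks shape the shared representation; that shared learning is the mechanism the summary credits for personalizing the emotion estimates.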

Why is it important?

Emotion-aware technology is moving from the lab into everyday tools—wellbeing apps, remote collaboration, driver monitoring, and assistive devices. Yet most systems still rely on a single cue (like facial expressions) and ignore how differently people’s bodies respond. MuMTAffect addresses both limitations: it fuses multiple, affordable signals and uses personality learning as a principled way to personalize emotion recognition without hand-tuning for each user. The design follows contemporary affect science by separating low-level bodily changes from higher-level interpretation and by using simple context cues to reduce ambiguity. Practically, this means more reliable emotion estimates under common failure modes (occluded faces, blinking, dim lighting, or missing sensors). The framework is modular, so additional signals (e.g., heart rate) can be added without redesigning the system. At the same time, the work surfaces important guardrails—privacy, bias, consent—that are essential for responsible deployment. In short, this paper offers a path toward emotion-aware systems that are both scientifically grounded and usable in the messy conditions of real life.

Perspectives

I contributed to this paper to bridge two worlds I care about: rigorous affective science and practical, deployable systems. Rather than chase a single “best” signal, we embraced the reality that emotions are constructed from many small cues and that people differ meaningfully. Training the model to learn personality alongside emotion was a pragmatic way to encode those differences without overselling what the system can do. The results are encouraging, especially the gains from skin conductance for arousal and from eye/pupil dynamics for valence, but I remain cautious: generalizing to new users and contexts is still hard, and ethical safeguards are non-negotiable. My hope is that this framework becomes a solid, extensible baseline others can build on—adding modalities, improving adaptation, and keeping responsibility at the center.

Meisam Jamshidi Seikavandi
IT University of Copenhagen

Read the Original

This page is a summary of: MuMTAffect: A Multimodal Multitask Affective Framework for Personality and Emotion Recognition from Physiological Signals, October 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3746270.3760232.
You can read the full text via the DOI above.
