What is it about?

We present a new way to help researchers and the public explore large collections of recorded speech. Our system uses unsupervised learning to group related transcripts and automatically generates simple illustrations that visually represent what each recording contains. These AI-generated images make it easier to browse, spot themes, and choose interesting recordings without needing to listen to everything.


Why is it important?

Vast archives of interviews, oral histories and sound recordings are difficult to navigate. Our approach combines automatic topic clustering with quick, meaningful visuals to guide discovery. This can save time for historians, linguists and educators and open up collections for wider public use.

Perspectives

We were inspired by how much spoken history remains hidden in archives. Creating automatic illustrations and grouping topics helped us think about how people search and learn visually. We hope this work will encourage cultural institutions to experiment with new AI tools for making sound heritage more accessible.

Sirina Håland
Universitetet i Stavanger

Read the Original

This page is a summary of: Navigating Speech Recording Collections with AI-Generated Illustrations, July 2025, ACM (Association for Computing Machinery), DOI: 10.1145/3726302.3730136.
