What is it about?

When designers test digital products like websites, they often try to make them easy to use for everyone, including people with different backgrounds and ways of thinking. One method, called GenderMag, helps identify usability problems that affect some user groups more than others. However, this process is time-consuming and depends heavily on the people running the evaluation. In this paper, we explore whether large language models (LLMs) can support this process. We created AI “agents” that simulate different types of users and used them to evaluate several web interfaces. We then compared the issues these AI agents found with those identified by human experts in workshops. Our results show that AI and humans often find similar usability problems, but each also discovers issues the other misses. AI is particularly good at identifying structural and technical problems, while humans are better at recognizing emotional reactions and interaction-related difficulties. This means that combining both approaches can lead to a more complete understanding of usability issues. Overall, our work suggests that AI can be a useful tool to support inclusive usability testing, helping designers identify problems they might otherwise overlook while still benefiting from human expertise and judgment.

Why is it important?

Our work shows that AI can support inclusive usability testing by finding issues that human evaluators may overlook, especially at early design stages. At the same time, humans contribute important emotional and contextual insights that AI cannot capture. By combining both, designers can achieve a more complete and efficient evaluation process, leading to more inclusive and user-friendly technologies.

Perspectives

Working on this paper was a really great experience, especially because I got to collaborate with such experienced programmers and researchers. It was both inspiring and fun. I also really appreciate that our work has a broader impact of helping make technologies more gender-inclusive in practice, while also encouraging the research community to take these kinds of societal questions more seriously.

Joy Geuenich
Technische Universität Chemnitz

Read the Original

This page is a summary of: AI Can See What You Can’t See: How LLM-Agents Complement Human-Based Gender-Inclusive Usability Testing, ACM Transactions on Interactive Intelligent Systems, May 2026, ACM (Association for Computing Machinery),
DOI: 10.1145/3801978.