What is it about?
This paper explores how people decide whether data tables can be combined (table unionability), a common yet complex task in data science. We conducted an experimental study in which participants evaluated whether pairs of tables could be merged effectively, and we analyzed how their confidence, speed, and accuracy related to their decisions. We then developed machine learning models to see whether we could improve on those human judgments. Finally, we compared human performance with that of a large language model, Llama-3.3-70B-Instruct, and found that combining human insights with AI can lead to more accurate results than either achieves alone.
Why is it important?
Combining data from multiple sources is critical for uncovering key insights in data science, yet determining whether a pair of tables can be effectively merged remains challenging because table unionability can be interpreted in different ways. Our research shows that carefully integrating human judgment with LLMs leads to more accurate and reliable decisions. This combination can significantly improve information retrieval by enabling smarter data discovery systems.
Perspectives
This study was personally very interesting to me. It deepened my passion for behavioral and cognitive research and highlighted how human behavior directly influences data science tasks, an aspect often overlooked in technical research. It emphasizes that AI alone is not enough and that human judgment, despite its imperfections, can significantly boost the performance of automated systems.
Mr. Sreeram Marimuthu
Worcester Polytechnic Institute
Read the Original
This page is a summary of: Humans, Machine Learning, and Language Models in Union: A Cognitive Study on Table Unionability, June 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3736733.3736740.