Large language models pass a standard three-party Turing test

Cameron R. Jones; Benjamin K. Bergen

doi:10.1073/pnas.2524472123

What is it about?

For 75 years, the Turing test has been a benchmark for machine intelligence: can a computer convince someone they're talking to a human? We ran the original three-party version of the test, where an interrogator chats simultaneously with a real person and an AI system, then decides which is which. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time—significantly more often than the actual human participants it was compared against. This is the first robust evidence that an artificial system passes the standard version of Turing's test. You can try the test out yourself here: turingtest.live

Why is it important?

The Turing test is the most famous benchmark in artificial intelligence, proposed in 1950 before computers could do much of anything. The fact that machines now pass it marks a genuine milestone—but the implications are less about machine "intelligence" and more about society. Systems that are indistinguishable from people in short conversations could substitute for human interaction in jobs and relationships, and could be used for social engineering, fraud, and manipulation at scale. People interacting with AI online may not know it. Our results show this isn't hypothetical: it's the current state of widely deployed technology, and we need norms, disclosure standards, and detection research to keep pace.

Perspectives

Personally, I think the most striking thing about the result is that people actually chose models more often than they chose real humans. This suggests that the test is measuring something more than just indistinguishability or intelligence: something closer to persuasion. Models seem to excel at this kind of social manipulation, and I find that a concerning trend in how they are developing.
Cameron Jones
Stony Brook University

This page is a summary of: Large language models pass a standard three-party Turing test, Proceedings of the National Academy of Sciences, May 2026, Proceedings of the National Academy of Sciences, DOI: 10.1073/pnas.2524472123.

Contributors

The following have contributed to this page

Cameron Jones
Stony Brook University
