What is it about?
Red-teaming is a crucial part of the development of AI models, but it is virtually invisible to users. It is the process of testing your own model to see how, with the right prompting, you can make it break its own rules and deliver harmful, offensive, or dangerous responses. But this is not just a technical accomplishment. As we note in the paper, like the moderation of social media platforms, this is a consequential moment when values are being determined by the company for its users; it is a human infrastructure of labor being put into place in the AI industry; and it comes with the threat of real mental health challenges for the people who do this work.
Why is it important?
The paper is a call for more sociological attention to this issue, an element often overlooked by technical research communities. It also keeps an eye on the lessons of history, a commitment this publication and its readership seem to share.
Perspectives
It's important to say that, while the authors are all researchers at Microsoft, this paper does not profess a company stance or policy. It speaks to the research community of which we are a part, and to AI practitioners concerned with the social, labor, and ethical implications of the systems they build.
Tarleton Gillespie
Microsoft Corp
Read the Original
This page is a summary of: AI Red-Teaming Is a Sociotechnical Problem, Communications of the ACM, January 2026, ACM (Association for Computing Machinery). DOI: 10.1145/3731657.