What is it about?
This work surveys benchmarks used to evaluate LLMs on software engineering tasks (e.g., code generation, program repair). It explores how these datasets were created and how their quality is ensured, examines how they handle and mitigate data contamination, reviews the evaluation pipelines and metrics used to test LLMs on these tasks, and offers future directions for creating better benchmark datasets.
Why is it important?
This survey provides an evidence-based critique of LLM benchmark datasets in software engineering, exposing hidden flaws such as data contamination. It also helps researchers choose the most suitable benchmarks for their specific software engineering tasks and offers an actionable roadmap for creating better, real-world benchmarks.
Read the Original
This page is a summary of: Surveying the Benchmarking Landscape of Large Language Models in Code Intelligence, ACM Transactions on Software Engineering and Methodology, March 2026, ACM (Association for Computing Machinery).
DOI: 10.1145/3800957.