OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution

Lianghong Guo; Wei Tao; Runhan Jiang; Yanlin Wang; Jiachi Chen; Xilin Liu; Yuchi Ma; Mingzhi Mao; Hongyu Zhang; Zibin Zheng

doi:10.1145/3728871

What is it about?

OmniGIRL is a multilingual and multimodal benchmark for GitHub issue resolution. It contains 959 tasks spanning four programming languages: Python, Java, JavaScript, and TypeScript. Task inputs may include text, screenshots, rendered web pages, and other modalities.

Why is it important?

Key Features

- Convenient, Standardized Evaluation Environment: Pre-built Docker images are provided, which significantly simplifies environment setup and keeps evaluation runs consistent and reproducible.
- Extensive Programming Language Coverage: The benchmark supports Python, Java, JavaScript, and TypeScript, enabling evaluation across these four major programming language ecosystems.
- Rich Multimodal Input Data: Issues integrate diverse modalities (text, web content, and images), so evaluated models must understand and use information from all of these sources to resolve them.
- Automatic Environment Setup & Dataset Construction Tool: We introduce SWE-Factory, an automatic issue-resolution benchmark construction pipeline based on a multi-agent framework. For more information and the full source code, see the SWE-Factory project.
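To make the language and modality coverage concrete, here is a minimal Python sketch of how a single benchmark task might be represented and tallied. The `IssueTask` record, its field names, and the helper functions are hypothetical illustrations, not the actual OmniGIRL dataset schema or tooling; only the four languages and the three modalities (text, web content, images) come from the description above.

```python
from collections import Counter
from dataclasses import dataclass, field

# Values taken from the feature list; everything else here is an assumption.
LANGUAGES = {"Python", "Java", "JavaScript", "TypeScript"}
MODALITIES = {"text", "web", "image"}


@dataclass
class IssueTask:
    """Hypothetical task record; not the real OmniGIRL schema."""
    task_id: str
    language: str
    issue_text: str
    modalities: set = field(default_factory=lambda: {"text"})

    def __post_init__(self):
        # Reject values outside the coverage the benchmark describes.
        if self.language not in LANGUAGES:
            raise ValueError(f"unsupported language: {self.language}")
        unknown = self.modalities - MODALITIES
        if unknown:
            raise ValueError(f"unknown modalities: {sorted(unknown)}")


def summarize(tasks):
    """Count tasks per language and per modality."""
    by_language = Counter(t.language for t in tasks)
    by_modality = Counter(m for t in tasks for m in t.modalities)
    return by_language, by_modality


def needs_vision(task):
    """True if resolving the task requires reading an image."""
    return "image" in task.modalities


# Made-up sample tasks, for illustration only.
sample = [
    IssueTask("demo-1", "Python", "Crash when parsing empty config"),
    IssueTask("demo-2", "TypeScript", "Button misaligned, see screenshot",
              {"text", "image"}),
    IssueTask("demo-3", "JavaScript", "Bug reproduced at linked page",
              {"text", "web"}),
]
by_language, by_modality = summarize(sample)
```

A tally like this is one way to check that a model is evaluated on every language and modality rather than only on text-only Python issues.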

This page is a summary of: OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution, Proceedings of the ACM on Software Engineering, June 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3728871.

Resources

Leaderboard

GitHub Repository

Contributors

The following have contributed to this page

Wei Tao
Fudan University
