Multi-Parallel Corpus of North Levantine Arabic

Mateusz Krubiński; Hashem Sellat; Shadi Saleh; Adam Pospíšil; Petr Zemánek; Pavel Pecina

doi:10.18653/v1/2023.arabicnlp-1.34

What is it about?

Low-resource Machine Translation (MT) is characterized by the scarce availability of training data and/or standardized evaluation benchmarks. In the context of Dialectal Arabic, recent works introduced several evaluation benchmarks covering both Modern Standard Arabic (MSA) and dialects; these, however, map mostly to a single Indo-European language, English. In this work, we introduce a multilingual corpus consisting of 120,600 multi-parallel sentences in English, French, German, Greek, Spanish, and MSA selected from the OpenSubtitles corpus, which were manually translated into North Levantine Arabic. By conducting a series of training and fine-tuning experiments, we explore how this novel resource can contribute to research on Arabic MT.

Why is it important?

It adds a resource for a low-resource language: North Levantine (Syrian) Arabic.

This page is a summary of: Multi-Parallel Corpus of North Levantine Arabic, January 2023, Association for Computational Linguistics (ACL), DOI: 10.18653/v1/2023.arabicnlp-1.34.

Contributors

The following have contributed to this page

Petr Zemánek
Univerzita Karlova

Description of a parallel corpus of North Levantine Arabic with Modern Standard Arabic and English.
