What is it about?

Natural Language Processing (NLP) is a field of computer science that aims to enable machines to understand, interpret, and generate human language. Research on Arabic NLP has grown considerably since the 2010s. NLP for Arabic dialects, however, remains an active and challenging area because of the linguistic diversity of the Arab world and the dialectal variation between regions. This research focuses on sentiment analysis of texts written in the Moroccan dialect (Darija) using deep learning techniques.

Why is it important?

The study presents a detailed related-work section covering three key areas: Moroccan dialect corpora, sentiment analysis studies in Modern Standard Arabic and Arabic dialects, and pre-trained models for the Moroccan dialect. It also explains in detail the preprocessing and tokenization of texts written in Darija in both Arabic and Latin scripts. Additionally, the research describes the fine-tuning of several base BERT models, detailing each step with the relevant algorithms. The results highlight the effectiveness of pre-trained models on Moroccan Darija, with DarijaBERT achieving an accuracy of 82.5% and an F1 score of 0.80.
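To give a feel for what preprocessing Darija in two scripts can involve, here is a minimal Python sketch. It is not the pipeline used in the study: the cleaning rules (removing URLs and mentions, stripping Arabic diacritics and the tatweel character, shortening elongated letters) and the function names detect_script and preprocess are assumptions chosen for illustration, using only the standard library.

```python
import re

# Hypothetical cleaning rules for illustration; NOT the paper's actual pipeline.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0670]")  # Arabic short-vowel marks
TATWEEL_RE = re.compile(r"\u0640")                    # Arabic elongation character
REPEAT_RE = re.compile(r"(.)\1{2,}")                  # any character repeated 3+ times
ARABIC_RE = re.compile(r"[\u0600-\u06FF]")
LATIN_RE = re.compile(r"[A-Za-z]")


def detect_script(text):
    """Guess whether a comment is mostly Arabic-script or Latin-script."""
    arabic = len(ARABIC_RE.findall(text))
    latin = len(LATIN_RE.findall(text))
    if arabic == 0 and latin == 0:
        return "unknown"
    return "arabic" if arabic >= latin else "latin"


def preprocess(text):
    """Apply the assumed cleaning steps and return a normalized string."""
    text = URL_RE.sub(" ", text)          # drop links
    text = MENTION_RE.sub(" ", text)      # drop @user mentions
    text = DIACRITICS_RE.sub("", text)    # strip diacritics
    text = TATWEEL_RE.sub("", text)       # strip tatweel
    text = REPEAT_RE.sub(r"\1\1", text)   # "zwiiiin" -> "zwiin"
    if detect_script(text) == "latin":
        # Digits are kept: in Latin-script Darija they often stand for
        # Arabic letters (for example 3, 7, 9).
        text = text.lower()
    return " ".join(text.split())         # collapse whitespace


if __name__ == "__main__":
    print(preprocess("Hadchi zwiiiin bzaaaf https://example.com"))
    print(detect_script("مزيان بزاف"))
```

The cleaned text would then be passed to the tokenizer of whichever pre-trained model is being fine-tuned; the choice of rules above is a design assumption and should be checked against the original paper.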

Perspectives

We recommend a second study focusing on Aspect-Based Sentiment Analysis (ABSA). This approach would enable a fine-grained analysis of the different aspects of a text and their associated sentiments, providing a more nuanced understanding of the opinions and emotions expressed in Moroccan Darija comments.

Kaoutar ABOUKASS

Read the Original

This page is a summary of: Comparative study of Moroccan Dialect Sentiment analysis: Finetuning Deep learning transformers, April 2024, ACM (Association for Computing Machinery),
DOI: 10.1145/3659677.3659708.