What is it about?

In this study, we used a pre-trained deep learning model called BERT to understand the sentiment of social media posts in English and Luganda, a widely spoken language in Uganda. Just as humans learn from experience, BERT learns from large amounts of text, so it can better guess the feeling behind a new sentence. We fine-tuned BERT using posts from Twitter and Reddit. Our work shows that, with the right training, BERT can recognize sentiment both in a widely used language like English and in less-resourced languages like Luganda. This can help us build technology that understands people better worldwide.
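For readers curious what fine-tuning looks like in practice, here is a minimal sketch using the Hugging Face Transformers library. The checkpoint, toy data, and training settings are illustrative assumptions, not the exact configuration reported in the paper.

```python
# Minimal sketch of fine-tuning BERT for sentiment classification.
# Checkpoint, toy data, and hyperparameters are illustrative assumptions,
# not the exact setup reported in the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-multilingual-cased"  # assumed multilingual base model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy stand-ins for Twitter/Reddit posts; real data would be loaded from files.
texts = ["I love this!", "This is terrible."]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few passes over the batch, standing in for real epochs
    optimizer.zero_grad()
    out = model(**batch, labels=labels)  # forward pass computes the loss
    out.loss.backward()
    optimizer.step()

# Inference: pick the higher-scoring class for a new post.
model.eval()
with torch.no_grad():
    logits = model(**tokenizer(["What a great day"], return_tensors="pt")).logits
print("positive" if logits.argmax(-1).item() == 1 else "negative")
```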

Why is it important?

Our research pioneers sentiment analysis of social media content for both a global language, English, and a less commonly studied one, Luganda. It is unique in demonstrating that advanced machine learning models such as BERT can be trained to deliver accurate sentiment analysis even for languages that are underrepresented in technology. The work is timely given the rapid global expansion of social media, where understanding users' sentiment is increasingly important. It can inform the development of more inclusive technology that recognizes and understands diverse languages, improving communication and information dissemination across cultures and communities.

Perspectives

Embarking on this research was eye-opening, given the stark lack of resources for languages like Luganda. The absence of large datasets and of transformer-based pre-trained models for Luganda posed a significant challenge, prompting us to begin with established machine learning techniques. Our journey from foundational models such as Naive Bayes, Random Forest, Support Vector Machines, and Gradient Boosting to the more sophisticated BERT model was driven by curiosity and a commitment to inclusivity in language processing. Our findings underscore the potential of natural language processing (NLP) to revolutionize communication in healthcare and education through people's own languages. This experience has strengthened our resolve to develop a versatile large language model for African languages, aimed at contributing to economic growth and societal well-being.
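As a rough illustration of those baseline techniques, the sketch below wires TF-IDF features into a Naive Bayes classifier with scikit-learn. The data and feature choices are assumptions for demonstration; the other baselines named above slot into the same pipeline.

```python
# Illustrative baseline sentiment classifier; toy data and settings are
# assumptions, not the paper's actual experimental setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labeled posts; the study used real Twitter and Reddit data.
texts = ["I love this!", "Great news today", "This is awful", "So disappointing"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

# TF-IDF features feed a Naive Bayes classifier; RandomForestClassifier,
# SVC, or GradientBoostingClassifier can be swapped in the same way.
baseline = make_pipeline(TfidfVectorizer(), MultinomialNB())
baseline.fit(texts, labels)
print(baseline.predict(["what a lovely post", "this is bad"]))
```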

Richard Kimera
Handong Global University

Read the Original

This page is a summary of: Fine-Tuning BERT on Twitter and Reddit Data in Luganda and English, December 2023, ACM (Association for Computing Machinery). DOI: 10.1145/3639233.3639344.
You can read the full text via the DOI above.
