Filtering and Extended Vocabulary based Translation for Low-resource Language pair of Sanskrit-Hindi

Piyush Jha; Rashi Kumar; Vineet Sahula

doi:10.1145/3580495

What is it about?

Recent machine learning (ML) translation models require a large, clean corpus of parallel data, and even then they often handle rare words poorly. Because no large parallel corpus is available for Sanskrit, it is challenging to use ML models to translate it. We have nevertheless improved translation accuracy, even under zero-shot conditions, by exploiting morphological patterns (such as Dhatu, Vibhakti, and compound words) and by using improved filtering heuristics.
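
To give a feel for what filtering a parallel corpus can involve, here is a minimal Python sketch. It is not the method from the paper: the function names, the thresholds (`max_len_ratio`, `min_script_ratio`), and the three checks (duplicate removal, token-length ratio, and share of Devanagari characters) are illustrative assumptions of the kind commonly used to clean noisy sentence pairs.

```python
import unicodedata


def devanagari_ratio(text: str) -> float:
    """Share of letters and combining marks that lie in the Devanagari block."""
    chars = [ch for ch in text if unicodedata.category(ch)[0] in ("L", "M")]
    if not chars:
        return 0.0
    in_block = sum(1 for ch in chars if "\u0900" <= ch <= "\u097F")
    return in_block / len(chars)


def keep_pair(src: str, tgt: str,
              max_len_ratio: float = 2.5,
              min_script_ratio: float = 0.9) -> bool:
    """Return True if a (Sanskrit, Hindi) pair passes the illustrative checks.

    The thresholds are hypothetical defaults, not values from the paper.
    """
    src_tokens, tgt_tokens = src.split(), tgt.split()
    if not src_tokens or not tgt_tokens:
        return False  # drop pairs with an empty side
    longer = max(len(src_tokens), len(tgt_tokens))
    shorter = min(len(src_tokens), len(tgt_tokens))
    if longer / shorter > max_len_ratio:
        return False  # lengths too different to be a likely translation
    # Both languages are written in Devanagari, so both sides should be
    # dominated by characters from that block.
    return (devanagari_ratio(src) >= min_script_ratio
            and devanagari_ratio(tgt) >= min_script_ratio)


def filter_corpus(pairs):
    """Remove exact duplicates, then keep only pairs that pass keep_pair."""
    seen = set()
    kept = []
    for src, tgt in pairs:
        key = (src.strip(), tgt.strip())
        if key in seen:
            continue
        seen.add(key)
        if keep_pair(*key):
            kept.append(key)
    return kept
```

Because Sanskrit packs a lot of meaning into compound words, a Sanskrit sentence often has fewer tokens than its Hindi translation, which is why the assumed length-ratio threshold here is fairly generous.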

Why is it important?

Much work remains to be done in applying ML to the challenges of translating Sanskrit, one of the oldest and richest languages known to the world. The language is morphologically rich, and the multilingual parallel corpora available for it are limited.

Perspectives

Improving translation models is, in my opinion, one of the best ways to preserve an ancient and rich language like Sanskrit, which was once known to the majority of the Indian population but is now spoken by only a few thousand people.
Piyush Jha
University of Waterloo

This page is a summary of: Filtering and Extended Vocabulary based Translation for Low-resource Language pair of Sanskrit-Hindi, ACM Transactions on Asian and Low-Resource Language Information Processing, January 2023, ACM (Association for Computing Machinery), DOI: 10.1145/3580495.

Simple and creative ways to improve Sanskrit-Hindi translation