Cornell University
5 February 2026
A new study presents BhashaSetu, which features GETR (Graph-Enhanced Token Representation), a method for cross-lingual knowledge transfer to extremely low-resource languages that have only hundreds of labeled examples.
The research focuses on improving performance for both sentence-level and word-level NLP tasks in Indian languages. The GETR approach leverages graph neural networks to transfer linguistic knowledge from high-resource languages, outperforming existing multilingual methods.
The reported gains are 13 percentage points for POS tagging in truly low-resource languages such as Mizo and Khasi, and 20 and 27 percentage points in macro-F1 for sentiment classification and named entity recognition, respectively, in simulated low-resource languages (Marathi, Bangla, Malayalam).
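For readers unfamiliar with the metric, macro-F1 averages the per-class F1 scores with equal weight, so rare classes count as much as frequent ones, and a gain in "percentage points" is the plain difference between two scores multiplied by 100. The sketch below is a minimal illustration of that arithmetic only. The function names and the toy labels are made up for this example and are not taken from the study or its data.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def gain_in_points(baseline_score, new_score):
    """Difference between two scores in [0, 1], expressed in percentage points."""
    return (new_score - baseline_score) * 100


# Toy sentiment labels, invented purely for illustration.
gold = ["pos", "pos", "neg", "neu"]
baseline_pred = ["pos", "neg", "neg", "neg"]
improved_pred = ["pos", "neg", "neg", "neu"]

baseline = macro_f1(gold, baseline_pred)
improved = macro_f1(gold, improved_pred)
print(f"baseline macro-F1: {baseline:.3f}")
print(f"improved macro-F1: {improved:.3f}")
print(f"gain: {gain_in_points(baseline, improved):.1f} percentage points")
```

Because every class is weighted equally, a model that starts predicting a previously ignored class correctly can move macro-F1 substantially even when overall accuracy changes little.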
The study also analyzes the specific mechanisms that make cross-lingual knowledge transfer successful in this context, providing insights for future work on computational linguistics for under-represented languages.

