#low-resource-languages tag

Google Introduces TranslateGemma Open Models for Multilingual Translation

TranslateGemma is an open suite of 4B, 12B, and 27B translation models delivering efficient machine translation across 55 languages for diverse hardware.

#ai-translation

fromNieman Lab

3 months ago

Artificial intelligence

Studies on AI transcription and translation in journalism reveal "low-resource" language gap, new report finds

AI transcription and translation tools are accurate and accessible for dominant languages like English but perform poorly and create accessibility barriers for low-resource languages.

fromFortune Asia

7 months ago

Digital life

The world's best AI models operate in English. Other languages-even major ones like Cantonese-risk falling further behind

AI translation models struggle with languages that have limited online data, leading to mistranslations and inaccuracies.

fromNieman Lab

3 months ago

Artificial intelligence

Studies on AI transcription and translation in journalism reveal "low-resource" language gap, new report finds

fromFortune Asia

7 months ago

Digital life

The world's best AI models operate in English. Other languages-even major ones like Cantonese-risk falling further behind

more#ai-translation

fromInfoQ

5 months ago

Hugging Face Introduces mmBERT, a Multilingual Encoder for 1,800+ Languages

Hugging Face has released mmBERT, a new multilingual encoder trained on more than 3 trillion tokens across 1,833 languages. The model builds on the ModernBERT architecture and is the first to significantly improve upon XLM-R, a long-time baseline for multilingual understanding tasks. mmBERT uses a progressive training schedule instead of training on all languages at once. It starts with 60 high-resource languages, expands to 110, and finally includes all 1,833 languages.

Artificial intelligence

fromHackernoon

1 year ago

Training Tesseract for Low-Resource Languages | HackerNoon

Trained Tesseract OCR on 1233 Kurdish text lines from pre-1950 documents to advance digitization of Kurdish historical materials.

Scala

fromHackernoon

1 year ago

Why Lua Is the Ideal Benchmark for Testing Quantized Code Models | HackerNoon

Lua presents unique challenges for quantized model performance due to its low-resource status and unconventional programming paradigms.

#low-resource-languages#low-resource-languages

Google Introduces TranslateGemma Open Models for Multilingual Translation

Studies on AI transcription and translation in journalism reveal "low-resource" language gap, new report finds

The world's best AI models operate in English. Other languages-even major ones like Cantonese-risk falling further behind

Studies on AI transcription and translation in journalism reveal "low-resource" language gap, new report finds

The world's best AI models operate in English. Other languages-even major ones like Cantonese-risk falling further behind

Hugging Face Introduces mmBERT, a Multilingual Encoder for 1,800+ Languages

Training Tesseract for Low-Resource Languages | HackerNoon

Why Lua Is the Ideal Benchmark for Testing Quantized Code Models | HackerNoon

#low-resource-languages
#low-resource-languages