Artificial intelligence
fromInfoQ
4 days agoHugging Face Introduces mmBERT, a Multilingual Encoder for 1,800+ Languages
mmBERT is a ModernBERT-based multilingual encoder trained on over 3 trillion tokens across 1,833 languages using progressive language addition and adjusted sampling.