The Allen Institute for AI (Ai2) has introduced OLMo 2, a fully open-source language model family available in 7B and 13B parameter variants and trained on up to 5 trillion tokens.
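Because the weights are openly released, the models can be loaded with standard tooling. The sketch below assumes the checkpoints are hosted on the Hugging Face Hub under a repository ID such as "allenai/OLMo-2-1124-7B"; the exact ID should be verified against the official release.

```python
# Hypothetical usage sketch: loading an OLMo 2 checkpoint with Hugging Face
# transformers. The repository ID below is an assumption for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B"  # assumed Hub ID; check the official release
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Open-source language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```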
OLMo 2 adopts architectural refinements such as RMSNorm and rotary positional embeddings, which improve training stability and model robustness over its predecessor, OLMo-0424.
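To make the normalization change concrete, here is a minimal sketch of RMSNorm as it is commonly implemented in decoder-only transformers; the class, layer names, and dimensions are illustrative and not taken from the OLMo 2 codebase.

```python
# Minimal RMSNorm sketch (illustrative; not the OLMo 2 implementation).
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-channel scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale by the reciprocal root-mean-square of the activations;
        # unlike LayerNorm, no mean is subtracted and no bias is added.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

# Example: normalize a batch of hidden states with an assumed model width of 4096.
hidden = torch.randn(2, 8, 4096)
normed = RMSNorm(dim=4096)(hidden)
```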
Evaluations show that the OLMo 2 models match or outperform open-weight peers such as Llama 3.1 and Qwen 2.5 while using fewer training FLOPs, underscoring the efficiency of the new architecture.
The launch of OLMo 2 marks a significant step toward fully open-source AI, with improved training methodology and evaluation benchmarks setting new standards for openly developed models.