Cohere's nonprofit research lab has launched Aya Vision, a multimodal AI model that can caption images, answer questions about photos, and translate text across 23 languages. It is available for free via WhatsApp, part of an effort to make advanced AI resources accessible globally. The model comes in two variants, Aya Vision 32B and Aya Vision 8B, both reported to outperform larger existing models on certain benchmarks. Aya Vision was trained on a diverse dataset with AI-generated annotations and is distributed under a Creative Commons license, with limitations on commercial use.
While AI has made significant progress, there is still a big gap in how well models perform across different languages... Aya Vision aims to explicitly help close that gap.
Aya Vision 32B sets a new frontier, outperforming models 2x its size... on certain visual understanding benchmarks.
Both models are available from AI dev platform Hugging Face under a Creative Commons 4.0 license with Cohere's acceptable use addendum.
Cohere called Aya Vision a significant step towards making technical breakthroughs accessible to researchers worldwide.