Elon Musk's xAI previews Grok-1.5V, its first multimodal model
Briefly

Grok-1.5V has a context length of 128,000 tokens, which lets it keep more input in view at once. It can understand documents and science diagrams, and translate content into Python code.
The new RealWorldQA benchmark is designed to evaluate multimodal models' spatial understanding in real-world scenarios, such as determining a car's turn direction or identifying the largest object in an image.
According to xAI, advancing multimodal models is essential for developing beneficial AGI that comprehends the universe. The xAI team plans to enhance both understanding and generation capabilities over time.
Read at ReadWrite