OpenAI's New AI Models o3 and o4-mini Can Now 'Think With Images'
Briefly

OpenAI has unveiled two new AI models, o3 and o4-mini, with enhanced visual reasoning capabilities that let them manipulate images (cropping, zooming, rotating) as part of their reasoning process, much as a person would. The models blend visual and verbal reasoning for more accurate results, outperforming previous versions across key academic and AI benchmarks. Their gains are clearest in tasks such as STEM question-answering and visual search, marking a major advance in AI's ability to interpret and analyze complex visual information.
OpenAI o3 and o4‑mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought.
ChatGPT's enhanced visual intelligence helps you solve tougher problems by analyzing images more thoroughly, accurately, and reliably than ever before.
Our models set new state-of-the-art performance in STEM question-answering (MMMU, MathVista), chart reading and reasoning (CharXiv), perception primitives (VLMs are Blind), and visual search (V*).
On V*, our visual reasoning approach achieves 95.7% accuracy, largely solving the benchmark.
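For developers who want to try this visual reasoning themselves, here is a minimal sketch of sending an image to one of the new models through the OpenAI Python SDK. The model name "o3" and the example image URL are assumptions for illustration; availability and exact payloads depend on your account and SDK version.

```python
# Minimal sketch: ask a reasoning model a question about an image.
# The model can internally crop or zoom into regions of the image
# as part of its chain of thought before answering.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="o3",  # assumed model name; o4-mini works the same way
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": "What does the small sign in the background say?",
                },
                {
                    "type": "input_image",
                    # Hypothetical URL, for illustration only.
                    "image_url": "https://example.com/street-scene.jpg",
                },
            ],
        }
    ],
)

print(response.output_text)
```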
Read at TechRepublic