Microsoft unveils AI model that understands image content, solves visual puzzles
Briefly

On Monday, researchers from Microsoft introduced Kosmos-1, a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions. The researchers believe multimodal AI, which integrates different modes of input such as text, audio, images, and video, is a key step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human.
Read at Ars Technica