Microsoft unveils AI model that understands image content, solves visual puzzles

from Ars Technica 1 year ago

On Monday, researchers from Microsoft introduced Kosmos-1, a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions.The researchers believe multimodal AI-which integrates different modes of input such as text, audio, images, and video-is a key step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human.

Read at Ars Technica

#specifically #researchers #intelligence #current-state #methodology #natural-language-processing

[

]

[

...

]

Microsoft unveils AI model that understands image content, solves visual puzzlesMicrosoft unveils AI model that understands image content, solves visual puzzles Briefly

Microsoft unveils AI model that understands image content, solves visual puzzles
Microsoft unveils AI model that understands image content, solves visual puzzles
Briefly