SIMA 2 Uses Gemini and Self-Improvement to Generalize Across Unseen 3D and Photorealistic Worlds
Briefly

"Google DeepMind researchers introduced SIMA 2 (Scalable Instructable Multiworld Agent), a generalist agent built on the Gemini foundation model that can understand and act across multiple 3D virtual game environments. The agent marks a departure from its predecessor SIMA 1 by moving beyond simple command execution to 'reasoning about high-level goals, conversing with the user, and handling complex instructions given through language and images.' Where the first version required step-by-step direction, SIMA 2 can formulate multi-step plans and discuss strategy with users."
"The agent employs a self-improvement cycle where Gemini supplies an initial task along with an estimated reward for SIMA 2's actions. The system adds this information to a bank of self-generated experience, which it then uses for training in subsequent iterations. According to the researchers, this process allows the agent to 'improve on previously failed tasks entirely independently of human-generated demonstrations and intervention.'"
SIMA 2 is a Gemini-based generalist agent that understands and acts across multiple 3D virtual game environments. It reasons about high-level goals, converses with users, handles complex instructions given through language and images, and formulates multi-step plans rather than requiring step-by-step commands. The system retains Gemini's reasoning capabilities and can interface with advanced Gemini variants for extended functionality. In its self-improvement cycle, Gemini supplies an initial task and an estimated reward for the agent's actions; the resulting self-generated experience is stored and used in subsequent training, letting the agent improve on previously failed tasks without human demonstrations or intervention. Evaluations show the agent substantially closing the gap with human performance and generalizing robustly to unseen environments, with qualitative tests in The Gunk and Genie 3.
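The self-improvement cycle described above can be pictured with a toy loop. The sketch below is illustrative only and is not DeepMind's implementation: ToyAgent, propose_task, estimate_reward, self_improve, and the constants BASE_SKILL and SKILL_STEP are hypothetical names and values chosen for this example, and the "training" step is a simple counter rather than model fine-tuning. It shows only the shape of the loop: a task is proposed, the agent acts, a reward is estimated without human labels, the episode joins an experience bank, and the bank drives the next round of training.

```python
import random
from dataclasses import dataclass, field

BASE_SKILL = 0.1   # assumed starting success probability for any task
SKILL_STEP = 0.2   # assumed gain per successful episode stored in the bank


@dataclass
class Experience:
    task: str
    actions: list
    reward: float  # estimated by a model, not labeled by a human


@dataclass
class ToyAgent:
    """Hypothetical stand-in for the agent: one success probability per task."""
    skill: dict = field(default_factory=dict)

    def skill_for(self, task):
        return self.skill.get(task, BASE_SKILL)

    def attempt(self, task, rng):
        # Succeed with probability equal to the current skill on this task.
        return ["succeed" if rng.random() < self.skill_for(task) else "fail"]

    def train(self, bank):
        # Toy "training": recompute skill from the successes stored in the bank.
        wins = {}
        for exp in bank:
            if exp.reward > 0.5:
                wins[exp.task] = wins.get(exp.task, 0) + 1
        self.skill = {t: min(1.0, BASE_SKILL + SKILL_STEP * n)
                      for t, n in wins.items()}


def propose_task(tasks, rng):
    # Stand-in for the Gemini task setter described in the article.
    return rng.choice(tasks)


def estimate_reward(actions):
    # Stand-in for the Gemini-estimated reward: 1.0 on success, else 0.0.
    return 1.0 if "succeed" in actions else 0.0


def self_improve(agent, tasks, iterations, episodes_per_iter, seed=0):
    rng = random.Random(seed)
    bank = []  # self-generated experience, with no human demonstrations
    for _ in range(iterations):
        for _ in range(episodes_per_iter):
            task = propose_task(tasks, rng)
            actions = agent.attempt(task, rng)
            bank.append(Experience(task, actions, estimate_reward(actions)))
        agent.train(bank)  # the next iteration acts with the updated agent
    return bank


if __name__ == "__main__":
    agent = ToyAgent()
    bank = self_improve(agent, ["chop tree", "find water"], 10, 8)
    print(len(bank), agent.skill)
```

In this toy version a task the agent initially fails can still succeed by chance in a later episode, and once a success enters the bank the next training pass raises the agent's skill on that task. That loosely mirrors the article's claim of improving on previously failed tasks without human intervention, under the assumptions stated above.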
Read at InfoQ