LLaVA-CoT Shows How to Achieve Structured, Autonomous Reasoning in Vision Language Models
LLaVA-CoT enhances vision language models' reasoning abilities by adopting a structured, multistage approach, allowing it to outperform larger counterparts.
Google shows off new smaller generative AI tools and an AI agent on your phone
Google showcased updates to its generative AI tools, including Gemini 1.5 Flash, optimized for high-volume, high-frequency tasks such as summarization, chat applications, and image captioning.