Google is previewing a new Gemini AI model designed to navigate and interact with the web through a browser, letting AI agents work inside interfaces designed for people rather than robots. The model, called Gemini 2.5 Computer Use, uses "visual understanding and reasoning capabilities" to analyze a user's request and carry out a task, such as filling out and submitting a form. It can be used for UI testing or for navigating interfaces that were built for people and have no API or other direct connection available.
Anthropic's first effort is a closed beta of a Chrome web browser extension. With this extension, you'll be able to chat with Claude in a persistent side panel that maintains context from active browser sessions. Beyond conversation, the extension can read, navigate, and take actions within websites, such as locating listings on Zillow, summarizing documents, or adding items to shopping carts, all directly from the browser sidebar.