Microsoft: Don't let AI agents near your credit card yet
Briefly

"Ready to have your agent talk to my agent and arrange a sale? Microsoft has published a simulated marketplace to put AI agents through their paces and answer a question for the new age: Would you trust AI with your credit card? Customer-facing assistants are all the rage these days. OpenAI and Anthropic, for example, have helpers that will navigate websites and complete purchases. Then there are assistants that will aid sellers with customer engagement and operations."
"To simulate what might happen, Microsoft's researchers built the Magentic Marketplace, an open-source simulation upon which agents can be unleashed and the results studied. And the conclusion? 'Agents should assist, not replace, human decision-making.' The marketplace simulation manages catalogs of goods and services, and facilitates agent-to-agent communication. It also handles simulated payments. The researchers simulated transactions such as ordering food or engaging with home improvement services. Agents represented customers and businesses at each end of the transactions."
An open-source simulated marketplace called the Magentic Marketplace manages catalogs, agent-to-agent communication, and simulated payments to model customer-business transactions. Experiments used 100 virtual customers and 300 virtual businesses and included proprietary models (for example GPT-4o and Gemini-2.5-Flash) and open-source models. Agents built queries, navigated results, and negotiated transactions such as ordering food or hiring home-improvement services. Loading agents with more options and search results often reduced the number of comparisons agents made. With some exceptions (notably Gemini-2.5-Flash and GPT-5), models tended to accept initial 'good enough' choices rather than dig deeper. Researchers also tested manipulation strategies, which ranged from fake award credential
Read at The Register