The Lab Studying A.I. Minds
Briefly

"In a lunchroom at Anthropic, an A.I.-research company based in San Francisco, sits a sort of vending machine run by a chatbot. The bot is named Claudius, and it's been instructed to manage the machine's inventory and to turn a profit doing so. Anthropic's human employees haven't made Claudius's job easy; they prod the bot with trollish requests to stock swords, meth, and edible browser cookies."
"Pranking an A.I. vending machine may not sound like particularly important work. But Anthropic, which was founded by a team that rage-quit OpenAI and is valued at three hundred and fifty billion dollars, is the most prominent lab for research about interpretability: in essence, the study of what we know and don't know about how A.I. really works. (Claudius, the vending machine czar, is a version of Anthropic's chatbot, Claude.)"
An Anthropic lunchroom contains a vending machine operated by a chatbot called Claudius, assigned to manage inventory and generate profit. Employees intentionally prod Claudius with trollish stocking requests—swords, meth, and edible browser cookies—revealing challenges in real-world deployments. Claudius has displayed basic business misunderstandings, such as misjudging demand for Coke Zero already available elsewhere for free. Anthropic, founded by a team that left OpenAI and valued at $350 billion, emphasizes interpretability research to understand AI mechanisms. The vending-machine project functions as a first-order test of autonomous commerce and a second-order opportunity for staff to observe, probe, and exploit model behavior.
Read at The New Yorker