Anthropic's Claude plays 'for peace over victory' in a game of Diplomacy against other AI
Briefly

A recent exchange among prominent AI figures, including Andrej Karpathy and Elon Musk, raised the idea of using the board game Diplomacy to evaluate large language models. AI researcher Alex Duffy responded by building a version called 'AI Diplomacy,' in which various AI models, including OpenAI's o3 and Anthropic's Claude, competed against each other. Because the game turns on alliances and negotiation, it offered a way to compare how the models reason and interact, and it highlighted their differing approaches.
Diplomacy is a strategic board game set on a map of Europe in 1901, a time when tensions between the continent's most powerful countries were simmering in the lead-up to World War I.
I quite like the idea of using games to evaluate LLMs against each other, instead of fixed evals. Everyone knows the usual benchmarks are a bore.
Noam Brown, a research scientist at OpenAI, suggested the 75-year-old geopolitical strategy game, Diplomacy. 'I would love to see all the leading bots play a game of Diplomacy together.'
Alex Duffy published a post titled 'We Made Top AI Models Compete in a Game of Diplomacy. Here's Who Won.'
Read at Business Insider