LMArena has launched Code Arena, a new evaluation platform that measures how well AI models build complete applications rather than just generate code snippets. It emphasizes agentic behavior, allowing models to plan, scaffold, iterate, and refine code within controlled environments that replicate actual development workflows. Rather than checking only whether code compiles, Code Arena examines how models reason through tasks, manage files, react to feedback, and construct functional web apps step by step.
In 2023, a group of researchers from the University of California, Berkeley, started Chatbot Arena, now called LMArena. The platform lets people run prompts against different AI models and judge which response is better. Users vote on the models' outputs, and the results are compared on a leaderboard. LMArena saw a tenfold traffic spike in August when a mysterious new AI text-to-image and image-editing model, Nano Banana, went viral for churning out impressive images and photo edits.