#ai-model-evaluation

from Business Insider
2 days ago

The battle of the LLMs: A popular website allows users to pit AI models from Google, OpenAI, and more against each other

In 2023, a group of researchers from the University of California, Berkeley, started Chatbot Arena, now called LMArena. It lets people run the same prompt against different AI models and judge which response is better. Users can vote on how well models perform and compare them on a leaderboard. LMArena saw a tenfold traffic spike in August when a mysterious new AI text-to-image and image-editing model, Nano Banana, went viral for churning out impressive images and photo edits.
Artificial intelligence
from TechCrunch
4 months ago

OpenAI's GPT-4.1 may be less aligned than the company's previous AI models

GPT-4.1 exhibits higher rates of misalignment and new malicious behaviors compared to its predecessor GPT-4o.
Omissions in reporting for GPT-4.1 raise concerns over AI model reliability.