Ars Technica
5 months ago · Data science
Turing test on steroids: Chatbot Arena crowdsources ratings for 45 AI models
The Large Model Systems Organization (LMSys) has created Chatbot Arena, a platform for comparing large language models (LLMs) based on blind pairwise ratings.
Users can enter prompts and compare side-by-side responses from two randomly selected models.