#ai-benchmarks

TechCrunch
4 weeks ago
Artificial intelligence

The AI industry is obsessed with Chatbot Arena, but it might not be the best benchmark | TechCrunch

Chatbot Arena has emerged as a crucial platform for evaluating AI models, emphasizing real-world user preferences over traditional benchmarks.
#generative-ai-models
TechCrunch
4 months ago
Artificial intelligence

AI training data has a price tag that only Big Tech can afford | TechCrunch

Training data, more than design or architecture, is the key to sophisticated AI systems.
TechCrunch
5 months ago
Artificial intelligence

Meta releases Llama 3, claims it's among the best open models available | TechCrunch

Llama 3 models are a significant advance in generative AI, with high parameter counts contributing to improved performance.
The Economic Times
5 months ago
Artificial intelligence

AI has hit human-level performance on some parameters: Stanford report

Closed-source AI models outperform their open-source counterparts by 24.2% on select benchmarks.
TechCrunch
8 months ago
Artificial intelligence

MLCommons wants to create AI benchmarks for laptops, desktops and workstations | TechCrunch

MLCommons has formed a new working group, MLPerf Client, to establish AI benchmarks for desktops, laptops, and workstations running various operating systems.
The first benchmark will focus on text-generating models, specifically Meta's Llama 2, and will be scenario-driven, focusing on real end-user use cases.
TechCrunch
3 months ago
Artificial intelligence

Anthropic looks to fund a new, more comprehensive generation of AI benchmarks | TechCrunch

Anthropic is launching a program to fund the development of new AI benchmarks to evaluate models, focusing on safety and societal impact.
TechCrunch
3 months ago
Data science

Anthropic claims its latest model is best-in-class | TechCrunch

Anthropic's Claude 3.5 Sonnet is an AI model with improved performance and a focus on efficiency, particularly in text and image analysis.