Will Smith eating spaghetti and other weird AI benchmarks that took off in 2024 | TechCrunch
Briefly

In 2024, offbeat benchmarks such as AI-generated video of Will Smith eating spaghetti showed that unconventional tests can resonate with the public more than traditional academic benchmarks do.
Many standard AI benchmarks focus on academic performance, yet most users interact with AI for everyday tasks rather than Ph.D.-level problems.
Crowdsourced platforms like Chatbot Arena draw on raters who are not representative of typical users and rely on subjective opinions, creating a disconnect between measured AI performance and user needs.
Ethan Mollick of Wharton emphasizes the flaws in AI benchmarks, suggesting that they often fail to measure how systems perform on real-world tasks.
Read at TechCrunch