#cais

'Humanity's Last Exam' benchmark is stumping top AI models - can you do any better?

AI models are struggling on the new Humanity's Last Exam benchmark, answering fewer than 10% of its questions correctly.