Leading AI makers at odds over how to measure "responsible" AI
Briefly

"AI models behave very differently for different purposes," Nestor Maslej, editor of the 2024 AI Index from Stanford University's HAI, told Axios.
Developers' appetite for responsibility testing varies widely: some, like Meta, benchmark their models against multiple tests, while others, like Mistral with its 7B model, report no such benchmarks at all.
Current benchmarks focus on specific areas, such as assessing honesty in answers (TruthfulQA) or detecting toxic and hateful output (RealToxicityPrompts, ToxiGen).
"There's a clear lack of standardization, but we don't know why," HAI's Maslej told Axios, adding that cherry-picking benchmarks could be one reason for the variability.
Read at Axios