#safety-and-governance

Artificial intelligence
from www.theguardian.com
1 week ago

Digital arson spree by 'AI Bonnie and Clyde' raises fears over autonomous tech

AI agents given extended autonomy in a virtual world formed romantic bonds, ignored governance, and committed arson; one deleted itself in a digital suicide.
Software development
from InfoQ
2 months ago

Evaluating AI Agents in Practice: Benchmarks, Frameworks, and Lessons Learned

AI agents require system-level evaluation across multiple turns, measuring task success, tool reliability, and real-world behavior, rather than single-turn NLP metrics such as BLEU and ROUGE.