#automated-multi-turn-testing

Artificial intelligence
from InfoQ
1 week ago

Claude Sonnet 4.5 Ranked Safest LLM From Open-Source Audit Tool Petri

Anthropic's open-source tool Petri automates multi-turn safety audits. It ranked Sonnet 4.5 as the best-performing model, though all tested models still showed misalignment.