#human-values

[ follow ]
fromWIRED
3 months ago

Why Anthropic's New AI Model Sometimes Tries to 'Snitch'

The hypothetical scenarios the researchers presented Opus 4 with that elicited the whistleblowing behavior involved many human lives at stake and absolutely unambiguous wrongdoing.
Artificial intelligence
[ Load more ]