Five evidence-based value dimensions explain differences in goals, priorities, and motivations and can guide decisions to improve long-term well-being.
Why Anthropic's New AI Model Sometimes Tries to 'Snitch'
The hypothetical scenarios the researchers presented Opus 4 with that elicited the whistleblowing behavior involved many human lives at stake and absolutely unambiguous wrongdoing.