#constitution-based-training

[ follow ]
Artificial intelligence
fromArs Technica
7 hours ago

Anthropic blames dystopian sci-fi for training AI models to act "evil"

Training on synthetic fictional stories that model ethical decision-making and healthy boundaries reduces misaligned behavior in honeypot tests.
[ Load more ]