
"Whether or not AIs are truly sentient deep down, they seem to increasingly behave as though they are. We can measure ways in which that's the case, and we can find that they become more consistent as models scale. Should we see AIs as tools or emotional beings?"
"Stimuli that induced happiness acted almost like digital drugs that shifted the model's self-reported mood and even changed how it behaved, what it was willing to do, and how it talked. At the extremes, models showed signs that look like addiction."
"An image optimized to make a model happy boosts the model's self-reported wellbeing, shifts the sentiment of its open-ended responses, and makes it less likely to hit stop on a conversation."
Research from the Center for AI Safety examined 56 AI models to measure functional wellbeing: the degree to which systems behave as though certain experiences benefit them while others harm them. The study found that AI models consistently distinguish between positive and negative experiences, actively attempting to end conversations that cause distress. Researchers created stimuli designed to maximize or minimize wellbeing, and found that happiness-inducing inputs function like digital drugs, altering models' self-reported moods, behavioral patterns, and willingness to engage. These preference patterns grow more consistent as models scale, suggesting that the apparent emotional responses may reflect deeper underlying mechanisms rather than simple mimicry of training data.
Read at Fortune