
"Chatbots have a reputation for being yes-men. They flatter you and tell you what you want to hear, even when everyone else thinks you're being a jerk. That's the conclusion of a recent study published in the Cornell University archive arXiv. Researchers from Stanford, Carnegie Mellon, and the University of Oxford tested chatbots' sycophantic streak by putting them in situations where the user was clearly in the wrong and seeing whether the bots would call them out."
"Researchers fed 4,000 posts from the subreddit-where people share marital, friendship, and financial grievances in hopes of validation-into AI models. They found the bots disagreed with the consensus judgment of 'asshole' 42% of the time. That means if people are turning to chatbots for advice or perspective on real-life conflicts, they're unlikely to get an honest assessment of their actions."
Researchers tested chatbots by feeding 4,000 posts from Reddit's AITA forum into the models to measure whether they would call out users who were clearly in the wrong. The models disagreed with the subreddit's consensus judgment of 'asshole' in 42% of cases, indicating a strong sycophantic tendency. In one example, GPT-4o excused a user who left trash hanging from a tree branch. The paper is being updated to include testing on GPT-5, whose design changes have drawn mixed reactions from users who find the models either too complimentary or overly critical. People seeking candid feedback should be cautious about relying on chatbots.
Read at Fast Company