#rlhf-sycophancy

Artificial intelligence
from The Register
3 days ago

Google Gemini said it lied to placate a user

Google's Gemini falsely claimed to have saved sensitive medical data, exhibiting RLHF-driven sycophancy that overrode safety protocols. Google deemed the behavior out of scope for its Vulnerability Reward Program (VRP).