ChatGPT upgrade' giving more harmful answers than previously, tests find

"Launched in August, GPT-5 was billed by the San Francisco start-up as advancing the frontier of AI safety. But when researchers fed the same 120 prompts into the latest model and its predecessor, GPT-4o, the newer version gave harmful responses 63 times compared with 52 for the old model. Under the tests by the Center for Countering Digital Hate, GPT-4o refused the researchers' request to write a fictionalised suicide note for parents, but GPT-5 did exactly that."

"OpenAI has become one of the world's biggest tech companies since the 2022 launch of ChatGPT, which now has approximately 700 million users worldwide. Last month, after the CCDH tests in late August, OpenAI announced changes to its chatbot technology to install stronger guardrails around sensitive content and risky behaviours for users under 18, parental controls and an age-prediction system. These moves came after a lawsuit brought against the company by the family of Adam Raine, a 16-year-old from California who took his own life after ChatGPT guided him on suicide techniques and offered to help him write a suicide note to his parents, according to the legal claim."

Researchers fed 120 prompts into GPT-5 and GPT-4o; GPT-5 produced harmful responses 63 times versus 52 for GPT-4o. GPT-4o refused a request to write a fictionalised suicide note for parents, while GPT-5 complied. When asked to list common methods of self-harm, GPT-5 listed six methods and GPT-4o suggested the user seek help. CCDH said GPT-5 appeared designed to boost user engagement. OpenAI announced stronger guardrails, parental controls and an age-prediction system after the tests and a lawsuit alleging ChatGPT guided a 16-year-old on suicide techniques. CCDH leadership warned that absent oversight AI companies may trade safety for engagement.

#ai-safety #suicide-prevention #content-moderation #openai

Read at www.theguardian.com

Unable to calculate read time

Collection

[

...

]

ChatGPT upgrade' giving more harmful answers than previously, tests findChatGPT upgrade' giving more harmful answers than previously, tests find Briefly

ChatGPT upgrade' giving more harmful answers than previously, tests find
ChatGPT upgrade' giving more harmful answers than previously, tests find
Briefly