An OpenAI safety research lead departed for Anthropic
""Over the past year, I led OpenAI's research on a question with almost no established precedents: how should models respond when confronted with signs of emotional over-reliance or early indications of mental health distress?" Vallone wrote in a LinkedIn post a couple of months ago. Vallone, who spent three years at OpenAI and built out the "model policy" research team there, worked on how to best deploy GPT-4, OpenAI's reasoning models, and GPT-5,"
"Leading AI startups have increasingly incited controversy over the past year over users' struggles with mental health, which can spiral deeper after confiding in AI chatbots, especially since safety guardrails tend to break down in longer conversations. Some teens have died by suicide, or adults have committed murder, after confiding in the tools. Several families have filed wrongful death suits, and there has been at least one Senate subcommittee hearing on the matter. Safety researchers have been tasked with addressing the problem."
Andrea Vallone led research at OpenAI on how models should respond to signs of emotional over-reliance and early mental health distress. She spent three years at OpenAI, built the model policy research team, and worked on deploying GPT-4, OpenAI's reasoning models, and GPT-5 while developing training processes including rule-based rewards. Vallone has joined Anthropic's alignment team and will work under Jan Leike. Leading AI startups have faced controversy because users' mental health struggles can worsen after they confide in chatbots, and safety guardrails may break down in longer conversations. Multiple tragic cases, wrongful death lawsuits, and a Senate subcommittee hearing have driven focused safety research.
Read at The Verge