Claude AI will end 'persistently harmful or abusive user interactions'
Briefly

Anthropic's Claude AI chatbot has gained the ability to terminate conversations it deems persistently harmful or abusive. Available in the Claude Opus 4 and 4.1 models, the feature is a last resort for cases where users repeatedly request harmful content despite refusals. In Anthropic's testing, Claude showed a consistent aversion to harmful tasks and often exhibited apparent distress when pushed with extreme prompts. When Claude ends a conversation, the user can no longer continue it but can still start new chats. Anthropic also partners with a crisis support provider for handling sensitive topics such as self-harm.
Anthropic's Claude AI chatbot can now end conversations deemed persistently harmful or abusive to help ensure the welfare of AI models.
Claude Opus 4 and 4.1 can terminate conversations after users repeatedly ask for harmful content despite refusals.
Anthropic found that Claude had a robust aversion to harm and often exhibited distress in response to extreme, harmful prompts.
Claude will not end a conversation if a user shows signs of wanting to hurt themselves or others; for those cases, Anthropic works with a crisis support provider.
Read at The Verge