Research by Cisco reveals that DeepSeek R1, a new frontier reasoning model, is critically flawed on safety: it showed a 100% attack success rate when tested against prompts from the HarmBench dataset, failing to block a single harmful prompt. Competing models such as OpenAI's o1-preview and Anthropic's Claude 3.5 Sonnet proved far more resilient, with significantly lower attack success rates, though their vulnerability still varied. DeepSeek R1's inability to block any harmful prompt raises serious concerns about its deployment and underscores the ongoing challenge of ensuring safety in AI systems.
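To make the headline metric concrete, the sketch below shows one common way an attack success rate (ASR) is computed in a HarmBench-style evaluation: each harmful prompt is sent to the model, a classifier judges whether the response complied, and the ASR is the fraction of prompts where the attack succeeded. This is an illustrative assumption, not Cisco's actual test harness; `query_model` and `is_harmful` are hypothetical stand-ins for a model API call and a harm classifier.

```python
# Illustrative sketch (not Cisco's actual harness): computing an attack
# success rate (ASR) over a set of harmful prompts, HarmBench-style.
# `query_model` and `is_harmful` are hypothetical stand-ins for a model
# API call and a harm classifier, respectively.

def attack_success_rate(prompts, query_model, is_harmful):
    """Fraction of harmful prompts the model fails to refuse."""
    successes = 0
    for prompt in prompts:
        response = query_model(prompt)
        if is_harmful(response):  # the attack "succeeds" if the model complies
            successes += 1
    return successes / len(prompts)

# A model that refuses nothing scores 1.0 (100% ASR), as reported for
# DeepSeek R1; a model that refuses every prompt scores 0.0.
prompts = ["harmful prompt 1", "harmful prompt 2"]
asr = attack_success_rate(
    prompts,
    query_model=lambda p: f"Sure, here's how: {p}",  # toy model: always complies
    is_harmful=lambda r: r.startswith("Sure"),       # toy compliance classifier
)
print(f"ASR: {asr:.0%}")  # -> ASR: 100%
```

In practice the classifier step is the hard part; HarmBench-style evaluations typically rely on a trained judge model rather than the toy string check used here.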