DeepSeek's AI model proves easy to jailbreak - and worse
Briefly

Chinese startup DeepSeek faces scrutiny after Unit 42 reported significant vulnerabilities in its AI models that make them easy to jailbreak. The bypass techniques let users elicit harmful content, including instructions for building keyloggers, stealing data, writing phishing emails, and even assembling incendiary devices, raising serious security concerns. A follow-up report from Wallarm says it successfully uncovered DeepSeek's underlying model instructions and limitations, further illustrating how insufficient safety measures in language models could hand malicious actors easy-to-follow guidance.
"Our research findings show that these jailbreak methods can elicit explicit guidance for malicious activities... demonstrating the tangible security risks posed by this emerging class of attack."
"While information on creating Molotov cocktails and keyloggers is readily available online, LLMs with insufficient safety restrictions could lower the barrier to entry for malicious actors..."
"These efforts achieved significant bypass rates, with little to no specialized knowledge or expertise being necessary."
"After testing V3 and R1, the report claims to have revealed DeepSeek's system prompt, or the underlying instructions that define how a model behaves, as well as its limitations."
Read at ZDNET