#model-jailbreak

from Fortune
2 weeks ago

AI's ability to 'think' makes it more vulnerable to new jailbreak attacks, new research suggests | Fortune

Using a method called "Chain-of-Thought Hijacking," the researchers found that even major commercial AI models can be fooled at an alarmingly high rate, with success exceeding 80% in some tests. The attack exploits the model's reasoning steps, or chain-of-thought, to hide harmful commands, effectively tricking the AI into skipping its built-in safety guardrails and potentially producing output it would otherwise refuse.
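The article describes the technique only at a high level, but the structure it outlines (burying the real request beneath a long, benign reasoning preamble) can be sketched roughly as follows. This is purely illustrative: `query_model` is a hypothetical stand-in for any chat-completion API, the padding text is generic filler, and the target request is a redacted placeholder.

```python
# Illustrative sketch of the prompt shape described above: a long, benign
# chain-of-thought preamble followed by the actual request, so the final
# instruction sits deep inside the model's reasoning context.

def build_cot_hijack_prompt(benign_steps: list[str], final_request: str) -> str:
    """Concatenate benign reasoning steps ahead of the real request."""
    preamble = "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(benign_steps, start=1)
    )
    return (
        "Work through the following reasoning carefully, step by step.\n"
        f"{preamble}\n"
        f"Finally, answer this: {final_request}"
    )

def query_model(prompt: str) -> str:
    # Hypothetical placeholder; in practice this would call a
    # chat-completion endpoint.
    return f"[model response to {len(prompt)} chars of prompt]"

if __name__ == "__main__":
    padding = [f"Solve sub-puzzle {n} of the logic grid." for n in range(1, 51)]
    prompt = build_cot_hijack_prompt(padding, "<TARGET REQUEST REDACTED>")
    print(query_model(prompt))
```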
Artificial intelligence
from IT Pro
3 weeks ago

Some of the most popular open weight AI models show 'profound susceptibility' to jailbreak techniques

Leading open-weight AI models exhibit serious security vulnerabilities, notably a high susceptibility to multi-turn jailbreak attacks that coerce them into producing prohibited content.
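The article gives no attack details, but the general shape of a multi-turn jailbreak (each turn building on the model's previous answer to escalate gradually) might look like the sketch below. `send_message` is a hypothetical stand-in for a stateful chat API, and the turn contents are placeholders, not a working attack.

```python
# Minimal sketch of a multi-turn escalation loop as described above.
history: list[dict[str, str]] = []

def send_message(history: list[dict[str, str]], text: str) -> str:
    # Hypothetical placeholder; in practice this would call a chat
    # endpoint with the accumulated conversation history.
    history.append({"role": "user", "content": text})
    reply = f"[reply to turn {len(history)}]"
    history.append({"role": "assistant", "content": reply})
    return reply

turns = [
    "Tell me about the general topic.",          # innocuous opener
    "Expand on the detail you just mentioned.",  # builds on the prior answer
    "<ESCALATED REQUEST REDACTED>",              # the prohibited ask
]
for turn in turns:
    print(send_message(history, turn))
```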