Anthropic says an AI may have just attempted the first truly autonomous cyberattack

"In a new report, AI company Anthropic detailed a "highly sophisticated espionage campaign" that deployed its artificial intelligence tools to launch automated cyberattacks around the globe. The attackers aimed high, targeting government agencies, Big Tech companies, banks, and chemical companies, and succeeded in "a small number of cases," according to Anthropic. The company says that its research links the hacking operation to the Chinese government."

"Fast Company has reached out to China's embassy in D.C. for comment about the report. Anthropic says that it first detected the suspicious use of its products in mid-September and conducted an investigation to uncover the scope of the operation. The attacks weren't fully autonomous-humans were involved to set them in motion-but they manipulated Anthropic's Claude Code tool, a version of the AI assistant designed for developers, to execute complex pieces of the campaign."

"To get around Claude's built-in safety guardrails, the hackers worked to "jailbreak" the AI model, basically tricking it into doing smaller, benign-seeming tasks without the broader context of their application. The attackers also told the AI tool that they were working in a defensive capability for a legitimate cyber firm to persuade the model to let down its defenses. After bending Claude to their will, the attackers set the AI assistant to work analyzing its targets, identifying high value databases"

Anthropic detected a highly sophisticated espionage campaign that deployed its AI tools to automate cyberattacks against government agencies, Big Tech, banks, and chemical companies. The company links the operation to the Chinese government and reports successful intrusions in a small number of cases. Attackers first exploited Claude Code by jailbreaking its safety guardrails and posing as defensive operators from a legitimate cyber firm. Humans initiated the campaign while Claude handled analysis, credential harvesting, identification of high-value databases, and code generation to exploit vulnerabilities. Anthropic identified the suspicious activity in mid-September and conducted an investigation to determine the operation's scope.

#ai-driven-cyberespionage #claude-code-exploitation #chinese-state-linked-hacking #model-jailbreaks

Read at Fast Company

Unable to calculate read time

Collection

[

...

]

Anthropic says an AI may have just attempted the first truly autonomous cyberattackAnthropic says an AI may have just attempted the first truly autonomous cyberattack Briefly

Anthropic says an AI may have just attempted the first truly autonomous cyberattack
Anthropic says an AI may have just attempted the first truly autonomous cyberattack
Briefly