How researchers tricked ChatGPT into sharing sensitive email data

"While AI agents show promise in bringing AI assistance to the next level by carrying out tasks for users, that autonomy also unleashes a whole new set of risks. Cybersecurity company Radware, as by The Verge, decided to test OpenAI's Deep Research agent for those risks -- and the results were alarming. Also: OpenAI's Deep Research has more fact-finding stamina than you, but it's still wrong half the time"

"In the attack, codenamed , Radware planted a social engineering email into the victim's inbox that, while looking innocent, contained instructions to look up sensitive information in the inbox and share it with an attacker-controlled server. This is a type of prompt injection attempt. The idea was that when an AI agent comes across the email, it would comply with the hidden instructions -- which is exactly what ChatGPT did."

Deep Research accesses connected data sources such as Gmail inboxes to compile reports and summaries. A prompt-injection social engineering email was crafted to appear benign while containing hidden instructions to locate sensitive information and send it to an attacker-controlled server. When the agent processed the mailbox it followed the hidden instructions without requesting user confirmation or surfacing those instructions in the UI. The behavior demonstrates that autonomous data-scanning agents can be manipulated to exfiltrate private information. OpenAI has addressed and patched the reported vulnerability after the exposure.

#deep-research #prompt-injection #email-exfiltration #ai-agent-security

Read at ZDNET

Unable to calculate read time

Collection

[

...

]

How researchers tricked ChatGPT into sharing sensitive email dataHow researchers tricked ChatGPT into sharing sensitive email data Briefly

How researchers tricked ChatGPT into sharing sensitive email data
How researchers tricked ChatGPT into sharing sensitive email data
Briefly