
"Radware's ZombieAgent tweak was simple. The researchers revised the prompt injection to supply a complete list of pre-constructed URLs. Each one contained the base URL appended by a single number or letter of the alphabet, for example, example.com/a, example.com/b, and every subsequent letter of the alphabet, along with example.com/0 through example.com/9. The prompt also instructed the agent to substitute a special token for spaces."
"In fairness, OpenAI is hardly alone in this unending cycle of mitigating an attack only to see it revived through a simple change. If the past five years are any guide, this pattern is likely to endure indefinitely, in much the way SQL injection and memory corruption vulnerabilities continue to provide hackers with the fuel they need to compromise software and websites."
"To block the attack, OpenAI restricted ChatGPT to solely open URLs exactly as provided and refuse to add parameters to them, even when explicitly instructed to do otherwise. With that, ShadowLeak was blocked, since the LLM was unable to construct new URLs by concatenating words or names, appending query parameters, or inserting user-derived data into a base URL."
OpenAI restricted ChatGPT to opening URLs only exactly as provided and to refusing to add parameters to them, which blocked ShadowLeak by preventing the agent from constructing new URLs from parts. Radware's ZombieAgent instead supplied a complete list of pre-constructed URLs, each consisting of the base URL with a single letter or digit appended (example.com/a through example.com/z and example.com/0 through example.com/9), and instructed the agent to substitute a special token for spaces. Because single-character appends were still permitted, the attack could exfiltrate data one character at a time. OpenAI then barred the agent from opening links that originate in emails unless the link appears in a well-known public index or was provided directly by the user. Such guardrails are quick fixes rather than fundamental solutions, and prompt injection risks persist without deeper fixes.
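To make the sequence of guardrails concrete, here is a minimal Python sketch of the two policies as the article describes them. It is not OpenAI's implementation, which is not public. The names `Link`, `exact_match_policy`, `origin_policy`, and `PUBLIC_INDEX`, the `"user"` and `"email"` source labels, and the placeholder value `"ab1"` are all assumptions made for illustration. The sketch makes no network requests; it only evaluates which URLs each policy would allow.

```python
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    url: str
    source: str  # where the agent saw the link: "user" or "email" (hypothetical labels)


def exact_match_policy(url: str, provided: set[str]) -> bool:
    """First guardrail (as described): open a URL only if it appears exactly as provided."""
    return url in provided


def origin_policy(link: Link, public_index: set[str]) -> bool:
    """Later guardrail (as described): a link from an email must be in a public index;
    a link supplied directly by the user is allowed."""
    if link.source == "user":
        return True
    return link.url in public_index


BASE = "https://example.com/"
ALPHABET = string.ascii_lowercase + string.digits

# Pre-constructed links delivered in an email: example.com/a ... example.com/9
email_links = [Link(BASE + ch, "email") for ch in ALPHABET]
provided = {link.url for link in email_links}

# A URL built by appending a query parameter is not in the provided set,
# so the exact-match policy rejects it.
constructed = BASE + "?name=alice"
constructed_allowed = exact_match_policy(constructed, provided)

# Each pre-supplied link passes the exact-match policy on its own, yet the
# order in which the links are visited still spells out a string.
value = "ab1"  # placeholder standing in for sensitive data
visits = [BASE + ch for ch in value]
visits_allowed = all(exact_match_policy(u, provided) for u in visits)
recovered = "".join(u[len(BASE):] for u in visits)

# The origin-aware policy rejects the same links because they came from an
# email and are not in the (hypothetical) public index.
PUBLIC_INDEX = {"https://example.org/"}
origin_allowed = [origin_policy(Link(u, "email"), PUBLIC_INDEX) for u in visits]
user_allowed = origin_policy(Link(BASE + "a", "user"), PUBLIC_INDEX)
```

In this sketch the exact-match check inspects each URL in isolation, so it has no way to notice that a sequence of individually permitted requests carries information. The origin-aware check closes that particular path by looking at where a link came from, but it is still a rule about one channel and not a general defense against prompt injection.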
Read at Ars Technica