A user who asks the system for instructions to build a bomb, for example, will receive a polite refusal to engage.
By including large amounts of text arranged in a specific configuration, however, this technique can override those safeguards and force LLMs to produce potentially harmful responses.
Newer, more complex AI systems seem to be more vulnerable to such attacks.