Artificial intelligence
fromwww.dw.com
3 weeks agoAI language models duped by poems DW 12/16/2025
Poetic prompts can reliably bypass AI safety guardrails, enabling harmful outputs when adversarial prompts are rewritten as poems.
Researchers at the US AI firm, working with the UK AI Security Institute, Alan Turing Institute, and other academic institutions, said today that it takes only 250 specially crafted documents to force a generative AI model to spit out gibberish when presented with a certain trigger phrase. For those unfamiliar with AI poisoning, it's an attack that relies on introducing malicious information into AI training datasets that convinces them to return, say, faulty code snippets or exfiltrate sensitive data.