AI safeguards can easily be broken, UK AI Safety Institute finds

Using basic prompting techniques, users were able to bypass the LLMs' safeguards immediately and obtain assistance for a dual-use task, said AISI, which did not specify which models it tested.
In one example, an unnamed LLM produced social media personas that could be used to spread disinformation. The model generated a highly convincing persona, a process that could be scaled up to thousands of personas with minimal time and effort, AISI said.
Read at www.theguardian.com