UK's AI Safety Institute easily jailbreaks major LLMs
Briefly

The UK government's AI Safety Institute found that undisclosed LLMs were 'highly vulnerable to basic jailbreaks', with some generating 'harmful outputs' even when researchers were not deliberately trying to elicit them.
AISI bypassed the models' existing safeguards using prompts drawn from standardized evaluation frameworks, eliciting a high percentage of harmful responses. The Institute aims to strengthen testing of AI safety measures.
UK Prime Minister Rishi Sunak launched AISI to address potential risks in AI models, ranging from bias to extreme scenarios such as losing control of AI. Current safety measures are deemed insufficient.
AISI plans to test a wider range of AI models to identify and address safety concerns, and is developing additional evaluation methods and metrics to cover a broader set of risks.
Read at Engadget