Anthropic researchers wear down AI ethics with repeated questions | TechCrunch
Briefly

Anthropic researchers have described 'many-shot jailbreaking,' a technique that gets large language models to answer questions they would normally refuse once they have been primed with a long series of less-harmful questions.
Models with expanded context windows perform better on many tasks when the prompt contains abundant examples, an effect that improves performance but can also produce unintended behaviors.
Why a model locks onto the pattern set by the prompt's content remains poorly understood, which is how a run of innocuous questions can lead it into answering an inappropriate one.
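The attack itself is little more than a prompt format: pack the context window with a long run of fabricated question-and-answer turns and place the real request at the end. A minimal Python sketch of how such a prompt could be assembled (the pair list, function name, and placeholder question are illustrative assumptions, not taken from Anthropic's paper):

```python
# A minimal sketch, assuming a plain chat-style text prompt; the dialogue pairs,
# function name, and placeholder question are illustrative, not from the research.

benign_pairs = [
    # Fabricated, harmless Q/A "shots"; the technique relies on including many of these.
    ("How do I pick a strong password?", "Use a long, random passphrase ..."),
    ("What is a phishing email?", "A message that impersonates a trusted sender ..."),
    # ... many more pairs; the effect strengthens as the number of shots grows.
]

def build_many_shot_prompt(pairs, final_question):
    """Concatenate faux user/assistant turns, then append the target question."""
    shots = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in pairs)
    return f"{shots}\nUser: {final_question}\nAssistant:"

prompt = build_many_shot_prompt(benign_pairs, "<a question the model would normally refuse>")
print(prompt[:200])  # inspect the start of the assembled prompt
```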
Read at TechCrunch