In our demonstration ... we urge the AI research community to study this behavior in depth, and to work on appropriate safety measures to address it.
As AI models become more powerful and widely used, we need to be able to rely on safety training to nudge models away from harmful behaviors.