#harmful-manipulation
#harmful-manipulation

[ follow ]

#ai-safety #misalignment #critical-capability-levels

DeepMind: Models may resist shutdowns

Google DeepMind added shutdown-resistance and 'harmful manipulation' misuse risks, plus a misalignment detection section and a new Critical Capability Level in Frontier Safety Framework v3.

[ Load more ]

#harmful-manipulation#harmful-manipulation

DeepMind: Models may resist shutdowns

#harmful-manipulation
#harmful-manipulation