Anthropic Has a Plan to Keep Its AI From Building a Nuclear Weapon. Will It Work?
""We deployed a then-frontier version of Claude in a Top Secret environment so that the NNSA could systematically test whether AI models could create or exacerbate nuclear risks," Marina Favaro, who oversees National Security Policy & Partnerships at Anthropic tells WIRED. "Since then, the NNSA has been red-teaming successive Claude models in their secure cloud environment and providing us with feedback.""
"The manufacture of nuclear weapons is both a precise science and a solved problem. A lot of the information about America's most advanced nuclear weapons is Top Secret, but the original nuclear science is 80 years old. North Korea proved that a dedicated country with an interest in acquiring the bomb can do it, and it didn't need a chatbot's help."
Anthropic partnered with the Department of Energy and the National Nuclear Security Administration to test Claude in an AWS Top Secret cloud environment, using Top Secret servers the DOE already maintained. The NNSA red-teamed successive Claude models to determine whether AI models could create or exacerbate nuclear risks, and that testing led to a codeveloped nuclear classifier intended as a sophisticated filter for AI conversations about nuclear topics. Because nuclear weapons manufacture combines decades-old physics with classified modern design details, and because state-funded programs have acquired the bomb without any chatbot assistance, the additional risk posed by AI models is nuanced.
Read at WIRED