#red-teaming

from Fortune
5 days ago

Inside Anthropic's 'Red Team' - ensuring Claude is safe, and that Anthropic is heard in the corridors of power

Last month, at the 33rd annual DEF CON, the world's largest hacker convention in Las Vegas, Anthropic researcher Keane Lucas took the stage. A former U.S. Air Force captain with a Ph.D. in electrical and computer engineering from Carnegie Mellon, Lucas wasn't there to unveil flashy cybersecurity exploits. Instead, he showed how Claude, Anthropic's family of large language models, has quietly outperformed many human competitors in hacking contests - the kind used to train and test cybersecurity skills in a safe, legal environment.
Artificial intelligence
from WIRED
1 month ago

Inside the Biden Administration's Unpublished Report on AI Safety

Researchers identified 139 novel methods to cause AI systems to misbehave, including generating misinformation and leaking personal data, during a red teaming exercise.
US politics
#ai-safety
from TechCrunch
4 months ago
Artificial intelligence

OpenAI partner says it had relatively little time to test the company's newest AI models
