Anthropic's Mythos is evolving faster than expected, reports AI safety agency

“The newer Mythos Preview checkpoint completed both our cyber ranges, solving the range 'The Last Ones' in 6 of 10 attempts and the previously unsolved 'Cooling Tower' in 3 of 10 attempts,” the blog authors wrote. “This was the first time that a model completed the second of our two cyber ranges.”

“When Anthropic first announced Mythos Preview and Project Glasswing -- the cybersecurity testing alliance it formed with rival tech companies and AI labs, to which it gave limited access to Mythos -- last month, UK AISI evaluated it, finding that the model 'represents a step up over previous frontier models in a landscape where cyber performance was already rapidly improving.'”

“A rapidly accelerati”

The UK AI Security Institute (AISI) tested a newer checkpoint of Anthropic’s Claude Mythos Preview on its two cyber ranges. The model completed “The Last Ones” in 6 of 10 attempts and the previously unsolved “Cooling Tower” in 3 of 10 attempts, making it the first model to complete the second range. About a month after Mythos’ initial release, the updated results surpassed both the earlier Mythos results and OpenAI’s GPT-5.5. The testing also indicates that capability gains can occur across checkpoints of a single model, not only across separate model releases. The findings suggest that progress is rapid, but neither purely a marketing claim nor a catastrophic leap.

#ai-cybersecurity-testing #anthropic-claude #model-capability-improvements #cyber-ranges #llm-benchmarks

Read at ZDNET
