With new Opus 4.5 model, Anthropic's Claude could remain the best AI coding tool
Briefly

With new Opus 4.5 model, Anthropic's Claude could remain the best AI coding tool
"Anthropic launched its newest model, Claude Opus 4.5, putting the company back atop the benchmark rankings for AI software coding. Opus 4.5 scores over 80% on the widely-used SWE-bench, which tests models for software engineering skill. Google's impressive Gemini 3 Pro, launched last week, briefly held the top score with 76.2%. Anthropic's Claude product lead Scott White tells Fast Company that the model has also scored higher than any human on the engineering take-home assignment the company gives to engineering job candidates."
"Of course Opus 4.5 does a lot more than coding. Anthropic says Opus 4.5 is also the "best model in the world" for powering AI agents and for operating a computer, and that it's meaningfully better than other models at tasks like deep research and working with slides and spreadsheets. Opus 4.5 also notched state-of-the-art (best) scores in several other key benchmarks, including Agentic coding SWE-bench Verified, Agentic tool use T-2 bench, and Novel problem solving ARC-AGI-2."
Anthropic launched Claude Opus 4.5, which scores over 80% on the SWE-bench and reclaimed the top coding benchmark position. The model reportedly scored higher than any human on the company's engineering take-home assignment. Anthropic positions Opus 4.5 as leading for powering AI agents, operating a computer, deep research, and working with slides and spreadsheets. Opus 4.5 achieved state-of-the-art results on Agentic coding SWE-bench Verified, Agentic tool use T-2 bench, and ARC-AGI-2. Customers report improved handling of uncertainty, better trade-off decisions, and concrete gains in office automation, financial modeling, and Excel accuracy and efficiency. The model will roll out as a default option for higher-tier subscribers.
Read at Fast Company
Unable to calculate read time
[
|
]