Anthropic releases Opus 4.8 with new 'dynamic workflow' tool | TechCrunch
Briefly

Opus 4.8 is the newest version of Anthropic’s most advanced publicly available model and is broadly available at the same standard pricing as the prior Opus release. The update arrives 41 days after Opus 4.7, a faster cadence than typical upgrade cycles, amid competitive pressure from OpenAI Codex and Google Gemini Flash. Opus 4.8 posts strong benchmark results and handles uncertain or bad data better: early testers report that it flags uncertainty more often and makes fewer unsupported claims. Bridgewater Associates reports that Opus 4.8 proactively flags issues in the inputs and outputs of an analysis that other models often miss. Anthropic is also introducing Dynamic Workflows, in research preview, to coordinate complex tasks across hundreds of parallel subagents, enabling codebase-scale migrations from kickoff to merge with the existing test suite as the acceptance bar. Anthropic continues to delay its Mythos model over cybersecurity concerns, while indicating that the preview period may soon end.
"The new release comes with the expected best-in-class benchmark results, but there's also particular attention to how the model manages bad or uncertain data. In the launch post, Anthropic early testers found Opus 4.8 is "more likely to flag uncertainties about its work and less likely to make unsupported claims.""
"Echoing this point, a testimonial from Bridgewater associates said the biggest difference in the upgrade was "Opus 4.8's tendency to proactively flag issues with the inputs and outputs of an analysis, something other models routinely missed and left to the users to catch.""
"Together with the new model, Anthropic launched a new feature called Dynamic Workflows, which will be available in research preview. The system is designed to help larger models like Opus manage complex tasks across hundreds of parallel subagents. "Claude Code alongside Opus 4.8 can now carry out codebase-scale migrations across hundreds of thousands of lines of code from kickoff to merge, with the existing test suite as its bar," the post explains."
Read at TechCrunch