Anthropic's Claude Haiku 4.5 matches May's frontier model at fraction of cost
Briefly

"And speaking of cost, Haiku 4.5 is included for subscribers of Claude web and app plans. Through the API (for developers), the small model is priced at $1-per-million input tokens and $5-per-million output tokens. That compares to Sonnet 4.5 at $3-per-million input and $15-per-million output tokens, and Opus 4.1 at $15-per-million input and a whopping $75-per-million output tokens. The model serves as a cheaper drop-in replacement for two older models, Haiku 3.5 and Sonnet 4."
""Users who rely on AI for real-time, low-latency tasks like chat assistants, customer service agents, or pair programming will appreciate Haiku 4.5's combination of high intelligence and remarkable speed," Anthropic writes. On SWE-bench Verified, a test that measures performance on coding tasks, Haiku 4.5 scored 73.3 percent compared to Sonnet 4's similar performance level (72.7 percent). The model also reportedly surpasses Sonnet 4 at certain tasks like using computers, according to Anthropic's benchmarks."
"Still, making a small, capable coding model may have unexpected advantages for agentic coding setups like Claude Code. Anthropic designed Haiku 4.5 to work alongside Sonnet 4.5 in multi-model workflows. In such a configuration, Anthropic says, Sonnet 4.5 could break down complex problems into multi-step plans, then coordinate multiple Haiku 4.5 instances to complete subtasks in parallel, like spinning off workers to get things done faster."
Haiku 4.5 is a small, low-latency model included with Claude web and app plans and available via API at $1 per million input and $5 per million output tokens. Pricing undercuts Sonnet 4.5 and Opus 4.1 while positioning Haiku 4.5 as a drop-in replacement for Haiku 3.5 and Sonnet 4. The model scored 73.3 percent on SWE-bench Verified versus Sonnet 4's 72.7 percent and reportedly outperforms Sonnet 4 on some tasks. Haiku 4.5 is intended for real-time assistants and to operate alongside Sonnet 4.5 in multi-model workflows to parallelize subtasks. Benchmark results are self-reported and warrant cautious interpretation.
Read at Ars Technica