Anthropic’s Claude Haiku 4.5: Sonnet-4 Coding Power at One-Third the Cost and Twice the Speed
What Claude Haiku 4.5 is
Anthropic has released Claude Haiku 4.5, a latency-optimized small model designed for interactive, cost-sensitive workflows. It aims to deliver coding performance comparable to Claude Sonnet 4 while running more than twice as fast and costing about one-third as much. The model is available immediately via Anthropic’s API and through partner catalogs on Amazon Bedrock and Google Cloud Vertex AI.
Target workloads and positioning
Haiku 4.5 is built for real-time assistants, customer-support automation, pair-programming, and other scenarios where latency and throughput are critical. Anthropic positions Haiku 4.5 as a drop-in replacement for Haiku 3.5, and for many use cases that previously called for Sonnet 4 but where cost or latency was a limiting factor. Sonnet 4.5 remains the top-tier model for complex multi-step planning and the highest-fidelity coding tasks, while Haiku 4.5 offers near-frontier coding performance at much lower cost.
How Anthropic recommends using Haiku 4.5
A recommended architecture is to use a frontier model such as Sonnet 4.5 for planning and orchestration, and to parallelize execution with multiple Haiku 4.5 workers. That planner–executor split lets teams keep heavy reasoning and orchestration on the higher-capability model while benefiting from Haiku 4.5’s lower latency and operating cost for routine or parallelizable jobs.
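That split can be sketched in a few lines. The planner and worker functions below are hypothetical stubs standing in for real Sonnet 4.5 and Haiku 4.5 calls through the Anthropic SDK; the point is the shape of the pattern, not a production implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def plan_subtasks(task: str) -> list[str]:
    # Hypothetical stub: in practice, a claude-sonnet-4-5 call that
    # decomposes the task into independent, parallelizable subtasks.
    return [f"{task}: subtask {i}" for i in range(4)]

def execute_subtask(subtask: str) -> str:
    # Hypothetical stub: in practice, a claude-haiku-4-5 call that
    # handles one routine subtask at low latency and cost.
    return f"done: {subtask}"

def run(task: str, workers: int = 4) -> list[str]:
    # Plan once on the higher-capability model, then fan out
    # execution across a pool of Haiku 4.5 workers.
    subtasks = plan_subtasks(task)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(execute_subtask, subtasks))
```

The design choice to keep planning serial and execution parallel mirrors Anthropic's recommendation: heavy reasoning stays on the frontier model, while throughput-bound work fans out to cheaper workers.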
Availability and pricing
Developers can call the model identifier claude-haiku-4-5 on Anthropic’s API. Anthropic also lists Haiku 4.5 in the model catalogs on Amazon Bedrock and Google Cloud Vertex AI; cloud catalog IDs and regional coverage may vary over time. Pricing at launch is $1 per million input tokens and $5 per million output tokens. Prompt-caching prices are listed at $1.25/MTok for writes and $0.10/MTok for reads.
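As a back-of-the-envelope helper (not an official billing formula), the launch prices above can be encoded to estimate per-request cost from token counts:

```python
# Launch prices quoted above, in USD per million tokens.
PRICES_PER_MTOK = {
    "input": 1.00,
    "output": 5.00,
    "cache_write": 1.25,
    "cache_read": 0.10,
}

def estimate_cost_usd(tokens: dict[str, int]) -> float:
    """Estimate request cost in USD from per-category token counts."""
    return sum(PRICES_PER_MTOK[kind] * count / 1_000_000
               for kind, count in tokens.items())
```

For example, a workload of one million input tokens and 200,000 output tokens would come to about $2.00 at these rates.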
Benchmarks and methodology
Anthropic published benchmark summaries across several standard and agentic suites and documented methodology details to help qualify the numbers. Representative items include:
- SWE-bench Verified: a scaffold with two tools (bash, file edits), 73.3% averaged over 50 trials with a 128K thinking budget and default sampling.
- Terminal-Bench: experiments using the Terminus-2 agent averaged over multiple runs with and without additional thinking budgets.
- OSWorld-Verified: runs with a 128K total thinking budget and per-step configurations.
- AIME / MMMLU: averaged results over multiple runs with default sampling and 128K thinking budgets.
Anthropic emphasizes coding parity with Sonnet 4 on these scaffolds and reports gains on computer-use tasks such as GUI and browser manipulation. It also notes that users should replicate tests with their own orchestration, tool stacks, and thinking budgets before generalizing performance claims.
Key takeaways for developers and teams
- Haiku 4.5 delivers coding performance comparable to Sonnet 4 while reducing cost and improving latency.
- It shows particular strength on computer-use tasks, benefiting applications like browser automation and multi-agent code flows.
- Recommended pattern: use Sonnet 4.5 for multi-step planning and a pool of Haiku 4.5 workers for parallel execution.
- Available now via the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI; released under Anthropic's AI Safety Level 2 (ASL-2) standard, with lower misalignment rates than Sonnet 4.5 and Opus 4.1 reported in Anthropic's tests.
For the original launch post and technical details, see Anthropic’s announcement at https://www.anthropic.com/news/claude-haiku-4-5.