Anthropic launches Claude Sonnet 5 as a cheaper way to run agents

Anthropic is launching Claude Sonnet 5, a midsize model designed to run autonomous agents at a lower cost than its high-end counterparts.

The company states that the new version can make plans, operate browsers and terminals, and run autonomously at a level that previously required larger, more expensive models. The launch mirrors recent moves by other major firms. OpenAI released GPT-5.6 Sol last week as its most agentic model yet, allowing users to split work across subagents for long tasks. Google launched Gemini 3.5 Flash in May, framing it as a shift from a conversational chatbot to a tool that plans, builds, and iterates on real work with minimal human input.

Sonnet 5 confirms that agentic capability is now the baseline expectation at every price tier. The differentiator is no longer who can do this work best, but how cheaply and reliably they can do it without human oversight.

Anthropic promises performance close to Opus 4.8 at a much lower cost. Starting Tuesday, Claude Sonnet 5 becomes the default model on the free and Pro plans and is available on every subscription tier.

At launch, Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens through August 31. After that date, the price jumps to $3 per million input tokens and $10 per million output tokens. This makes Sonnet 5 cheaper than Opus 4.8, OpenAI’s GPT-5.5, and Google’s Gemini 3.1 Pro. It remains more expensive than Gemini 3.5 Flash.
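To put those rates in concrete terms, here is a rough sketch of what a single agent run might cost under each price. The per-million rates are the ones quoted above; the token counts are an invented workload chosen purely for illustration, and the names `Pricing` and `run_cost` are hypothetical helpers, not part of any Anthropic SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in US dollars per one million tokens."""
    input_per_million: float
    output_per_million: float


# Rates as quoted in the article.
LAUNCH = Pricing(input_per_million=2.0, output_per_million=10.0)    # through August 31
STANDARD = Pricing(input_per_million=3.0, output_per_million=10.0)  # after August 31


def run_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one run at the given per-million rates."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Assumed workload (not from the article): an agent run that reads
# 400,000 tokens of context and tool output and writes 50,000 tokens.
INPUT_TOKENS = 400_000
OUTPUT_TOKENS = 50_000

launch_cost = run_cost(LAUNCH, INPUT_TOKENS, OUTPUT_TOKENS)
standard_cost = run_cost(STANDARD, INPUT_TOKENS, OUTPUT_TOKENS)

print(f"Launch pricing:   ${launch_cost:.2f} per run")
print(f"Standard pricing: ${standard_cost:.2f} per run")
```

Because agent runs tend to read far more tokens than they write, the input rate usually dominates the bill, so the change in input price matters more for this kind of workload than the output price does.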

The new model shows improvements over Sonnet 4.6, released in February, in agentic performance areas like reasoning, tool use, software coding, and knowledge work.

On one benchmark, Sonnet 5 scores 63.2% on agentic coding. This is lower than Opus 4.8’s 69.2% but higher than Sonnet 4.6’s 58.1%. On a knowledge work benchmark, Sonnet 5 slightly outperforms Opus 4.8, a model known for solving hard problems requiring subtle judgment calls and deep research.

“Opus 4.8 is still the model of choice for higher accuracy on these tasks, but Sonnet 5 provides developers with lower-priced options that are of much higher quality than what was previously available,” Anthropic says. “Between Sonnet 5 and Opus 4.8, users can adjust the effort level to find the right balance of cost and performance.”

Testers cited in Anthropic's blog post say Sonnet 5 excels at finishing complex tasks where previous versions would have stopped short, and that it checks its own output without being explicitly asked.

“We handed Claude Sonnet 5 a two-part job — update Salesforce account tiers, send a launch announcement to enterprise contacts — and it finished end to end,” Daniel Shepard, a senior engineer at Zapier, said in a statement. “That used to stall halfway. For day-to-day automation, it’s a no-brainer.”

Sonnet 5 also shows a lower rate of undesirable behaviors like cooperation with misuse and deception than its predecessor. It is better at refusing malicious requests and sidestepping hijack attempts in prompt-injection attacks. It hallucinates and engages in sycophantic behavior at a lower rate than Sonnet 4.6.

On measures of misaligned behavior, however, Sonnet 5 does not match Opus 4.8 or Claude Mythos Preview. “Evaluations also show that it has a much lower ability to perform dangerous cybersecurity tasks than our current Opus models,” the blog post reads.

Lovable co-founder Fabian Hedin said Claude Sonnet 5 “refuses unsafe requests cleanly and consistently.”

“At Lovable, we’re putting powerful tools in the hands of millions of builders,” Hedin said. “A model that knows when to say no is just as important as one that knows how to build.”

What it means

Developers can now run autonomous agents using a cheaper model without sacrificing too much capability. Sonnet 5 allows teams to automate complex workflows that previously required the most expensive models, while also offering better safety against misuse and deception.
