Claude Sonnet 5 continues Anthropic’s pattern of hiding price increases behind unchanged token rates


By AI Maestro, July 1, 2026

Claude Sonnet 5 keeps its token rates flat, yet it costs more per task than Opus 4.8, Anthropic’s previous top model, which carries higher token rates. In return, it beats Opus 4.8 on specific agent-based tests.

Artificial Analysis evaluated the model before release and placed it fifth in its Intelligence Index. Sonnet 5 scored 53 points at peak performance, tying GPT-5.5 (high) for fifth place. Four models rank higher: Claude Fable 5, which became generally available today, at 60 points, Opus 4.8 at 56, GPT-5.5 (xhigh) at 55, and Opus 4.7 at 54.

That represents a six-point jump over Sonnet 4.6, which scored 47 points. However, Sonnet 5 consumes far more tokens to achieve these scores.

Same token prices, nearly double the real cost

On paper, Sonnet 5 retains the same token prices as its predecessor: $3 per million input tokens and $15 per million output tokens. Opus 4.8 sits at $5 and $25. Yet according to Artificial Analysis, an average task in the Intelligence Index costs $2.29 with Sonnet 5, versus about $1.97 with Opus 4.8.

At the maximum performance setting, Sonnet 5 burns through about 40 percent more output tokens per task than Sonnet 4.6. In agent-based knowledge work benchmarks like AA-Briefcase and GDPval-AA, it runs about three times as many agent loops as its predecessor. Sonnet 4.6 cost about $1.20 per task, so at $2.29 the per-task cost has nearly doubled. In exchange, Sonnet 5 beats Opus 4.8 on some of these tasks.
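The arithmetic behind these comparisons can be sketched in a few lines of Python. The per-task costs are the Artificial Analysis figures quoted above; the helper names `cost_ratio` and `task_cost` and the token counts in the second example are hypothetical, chosen only to show how flat rates and higher consumption combine.

```python
# Per-task costs in USD as reported by Artificial Analysis (quoted in the article).
COST_PER_TASK = {
    "Sonnet 4.6": 1.20,
    "Opus 4.8": 1.97,
    "Sonnet 5": 2.29,
}


def cost_ratio(model: str, baseline: str) -> float:
    """Return how many times more a task costs on `model` than on `baseline`."""
    return COST_PER_TASK[model] / COST_PER_TASK[baseline]


def task_cost(input_tokens: int, output_tokens: int,
              input_rate: float, output_rate: float) -> float:
    """Cost of one task in USD; rates are in USD per million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    print(f"Sonnet 5 vs Sonnet 4.6: {cost_ratio('Sonnet 5', 'Sonnet 4.6'):.2f}x")  # 1.91x
    print(f"Sonnet 5 vs Opus 4.8:   {cost_ratio('Sonnet 5', 'Opus 4.8'):.2f}x")    # 1.16x

    # Hypothetical token counts (not from the article): same $3/$15 rates,
    # but 40 percent more output tokens on the second call.
    before = task_cost(100_000, 20_000, 3.0, 15.0)
    after = task_cost(100_000, 28_000, 3.0, 15.0)
    print(f"Illustrative task: ${before:.2f} -> ${after:.2f}")  # $0.60 -> $0.72
```

The second example only covers the extra output tokens. Additional agent loops would typically raise input token counts as well, which this sketch does not model.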

Anthropic is running a promotional rate of $2 per million input tokens and $10 per million output tokens through September 1, but Artificial Analysis based its results on regular prices.
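For a rough sense of what the promotion does to the per-task figure, the following Python sketch rescales the reported $2.29. It rests on an assumption the article does not confirm: that per-task cost is purely linear in the two listed rates, with no cache pricing or other fees. Because the input and output rates both drop by the same factor, the result does not depend on the token mix. The function name `promo_cost` is hypothetical.

```python
import math

REGULAR = (3.0, 15.0)  # USD per million input / output tokens (from the article)
PROMO = (2.0, 10.0)    # promotional rates through September 1 (from the article)


def promo_cost(regular_cost: float,
               regular_rates: tuple[float, float],
               promo_rates: tuple[float, float]) -> float:
    """Estimate per-task cost under promotional rates.

    Valid only when input and output rates are discounted by the same factor,
    so that the estimate is independent of the input/output token mix.
    """
    in_scale = promo_rates[0] / regular_rates[0]
    out_scale = promo_rates[1] / regular_rates[1]
    if not math.isclose(in_scale, out_scale):
        raise ValueError("discounts differ; the token mix is needed to estimate cost")
    return regular_cost * in_scale


if __name__ == "__main__":
    # $2.29 is the regular-price per-task cost reported by Artificial Analysis.
    print(f"Estimated promo cost per task: ${promo_cost(2.29, REGULAR, PROMO):.2f}")  # $1.53
```

Under that assumption, the promotional price would put Sonnet 5 below the reported $1.97 for Opus 4.8 but still above the roughly $1.20 for Sonnet 4.6.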

Complex reasoning still exposes Sonnet 5’s limits

Sonnet 5 still falls short of larger models on reasoning- and knowledge-heavy benchmarks. On CritPt, a frontier physics reasoning test from Argonne National Laboratory and the University of Illinois, it scored 17 percent. That is 14 points above its predecessor but below GLM-5.2, Claude Opus, Fable, and GPT-5.5 in their higher configurations.

Elsewhere, Sonnet 5 shows solid gains over Sonnet 4.6: a 9-point jump on Terminal-Bench v2.1, 10 points on Humanity’s Last Exam, and 7 points on SciCode. Scores on the remaining evaluations stayed roughly flat.

Anthropic keeps raising prices without saying so

Anthropic has done this before. When Opus 4.7 launched, token prices stayed flat on paper, but a new tokenizer chopped the same text into approximately 30 percent more tokens, inflating the real bill. Developer Abhishek Ray measured a 1.325x to 1.47x increase, and a community analysis of over 483 submissions found a 37.4 percent jump in tokens per request. With Sonnet 5, the tokenizer issue is compounded by the model’s more agentic behavior, which eats through far more tokens per task.
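The effect of a tokenizer change on the real bill can be sketched as a simple multiplier on the listed rate. In the Python sketch below, the inflation factors are the ones quoted above; the function name `effective_rate` and the $15 listed rate used in the example are illustrative assumptions, not figures tied to Opus 4.7.

```python
# Token inflation factors quoted in the article.
INFLATION_FACTORS = {
    "approximate figure": 1.30,
    "Abhishek Ray (low)": 1.325,
    "Abhishek Ray (high)": 1.47,
    "community analysis": 1.374,  # measured as tokens per request
}


def effective_rate(listed_rate: float, token_inflation: float) -> float:
    """Price per million *old-tokenizer* tokens' worth of text.

    If the same text now splits into `token_inflation` times as many tokens,
    the listed rate is effectively multiplied by that factor.
    """
    return listed_rate * token_inflation


if __name__ == "__main__":
    listed = 15.0  # illustrative listed rate in USD per million tokens (assumption)
    for label, factor in INFLATION_FACTORS.items():
        print(f"{label}: {factor:.3f}x -> ${effective_rate(listed, factor):.3f}")
```

The community figure counts tokens per request, so it may reflect changes in model behavior as well as the tokenizer; treating it as a pure tokenizer multiplier is a simplification.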

Anthropic’s models keep getting pricier with each generation, sometimes dramatically so, yet the official price lists do not reflect it. That kind of hidden cost creep is a hard sell when Chinese competitors like DeepSeek V4 Pro and GLM-5.2 offer competitive performance at a fraction of the cost in the mid-range segment where Sonnet sits.

AI providers need more transparent pricing, such as cost per standardized task or per real-world knowledge work job, rather than raw token prices that lose meaning once token consumption per task varies this much between models.

What it means

Developers using Claude for complex workflows should expect bills to rise even if the published token rates do not change. The combination of higher consumption and tokenizer shifts means the effective cost per task has nearly doubled since Sonnet 4.6.
