For makers and artists relying on generative AI, the latest release from Anthropic, Claude Fable 5, signals a sobering reality: paying double for a marginal upgrade. The new flagship model has claimed the top spot in the Artificial Analysis Intelligence Index, edging out rivals including GPT-5.5. However, the overall performance leap compared to its predecessor, Opus 4.8, is a modest 5.7 percent across many benchmarks. That gain comes at a steep cost: token prices have doubled, with a full benchmark run now approaching $10,000, twice the price of Opus 4.8.
A narrow lead with a heavy price tag
Claude Fable 5 scores 64.9 points in the Artificial Analysis Intelligence Index, securing first place. The gap to the best non-Anthropic model, GPT-5.5, is about five points. Anthropic now holds the top two spots on the leaderboard.
That crown comes at a cost. Fable 5 runs at $10 and $50 per million input and output tokens, double Opus 4.8’s $5 and $25. A full index run hits $9,940, versus $4,970 for Opus 4.8 at max reasoning. That premium buys a 5.7 percent performance gain. Opus 4.8 and 4.7 already followed the same pattern versus Opus 4.6, with steep price bumps for small gains. Anthropic itself called 4.8’s improvement over 4.7 “modest but tangible.”
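The price arithmetic above can be checked with a short Python sketch. The prices, index-run costs, and the 5.7 percent gain are the figures quoted in this article; the workload of two million input tokens and 500,000 output tokens is a hypothetical example, and the variable and function names are illustrative only.

```python
# Figures quoted in the article; all names here are illustrative.
PRICES_PER_MILLION = {
    "Fable 5": {"input": 10.0, "output": 50.0},
    "Opus 4.8": {"input": 5.0, "output": 25.0},
}
INDEX_RUN_COST = {"Fable 5": 9940.0, "Opus 4.8": 4970.0}
REPORTED_GAIN_PERCENT = 5.7


def token_cost(model, input_tokens, output_tokens):
    """Dollar cost of a workload at the list prices above."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption, not from the source).
fable_cost = token_cost("Fable 5", 2_000_000, 500_000)
opus_cost = token_cost("Opus 4.8", 2_000_000, 500_000)

# How much more a full index run costs, and what each percent of gain costs.
cost_ratio = INDEX_RUN_COST["Fable 5"] / INDEX_RUN_COST["Opus 4.8"]
extra_dollars = INDEX_RUN_COST["Fable 5"] - INDEX_RUN_COST["Opus 4.8"]
extra_per_percent = extra_dollars / REPORTED_GAIN_PERCENT

print(f"Workload cost: Fable 5 ${fable_cost:.2f}, Opus 4.8 ${opus_cost:.2f}")
print(f"Index run cost ratio: {cost_ratio:.1f}x")
print(f"Extra spend per percent of gain: ${extra_per_percent:,.0f}")
```

Because both input and output prices double, any workload costs exactly twice as much regardless of its input-to-output mix. On the index run, the extra $4,970 works out to roughly $872 for each percent of reported improvement.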
Companies need to weigh carefully which use cases actually justify paying double for about five percent more performance. Benchmark skeptics will note that no test suite fully captures real-world ability. The AA Index at least aggregates ten evaluations, giving it a broader base than any single benchmark.
Depending on the region, the monthly bill for heavy enterprise use could rival what it costs to employ experienced developers. The Artificial Analysis data makes clear that economics is becoming a key factor in model choice.
Top scores across most benchmarks
The raw benchmark numbers are nonetheless notable. Fable 5 sets records in five of the ten Intelligence Index benchmarks. On AA-Omniscience, the knowledge and hallucination benchmark, the model hits 40 points, seven more than the previous leader Gemini 3.1 Pro Preview. That lead comes mainly from higher accuracy, not a lower hallucination rate. On hallucinations, the model lands squarely in the middle of the pack.
Artificial Analysis notes a strong link between AA-Omniscience accuracy and model size among open-weight models. That hints Fable 5 may be larger than any previous public Anthropic model.
On agentic tasks, Fable 5 widens Anthropic’s lead. On GDPval-AA, a real-world knowledge work benchmark, it reaches an Elo of 1,932, up 2.2 percent from Opus 4.8 at 1,890. It also tops Terminal-Bench Hard for agentic coding and Tau2-bench Telecom for tool use.
On Humanity’s Last Exam, the model scores 53 percent, over seven points ahead of Opus 4.8. A single HLE run with fallback costs about $2,200, the most expensive of any model Artificial Analysis has tested. Previous Opus models topped out at $1,974.
Safety filters drive costs up even more
Fable 5 uses the same base model as Claude Mythos 5 according to Anthropic, plus extra safeguards for queries touching cybersecurity, biology, chemistry, and model distillation. When a filter trips, a fallback mechanism reroutes the request to Opus 4.8. Those rerouted requests still count toward billing, pushing total costs higher.
Anthropic says fewer than five percent of sessions are affected. But Artificial Analysis measured fallback routing in about eight percent of tasks during its Intelligence Index evaluation, mainly on science questions from GPQA, AA-Omniscience, and Humanity’s Last Exam. On the HLE test alone, the fallback rate hit nine percent.
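To put the two sets of fallback figures side by side, here is a minimal Python sketch. It rests on several assumptions: the 1,000-task batch is hypothetical, Anthropic's "fewer than five percent" is treated as a ceiling of five, and the comparison glosses over the fact that Anthropic counts sessions while Artificial Analysis counts tasks.

```python
# Fallback rates in percent, as reported in the article.
ANTHROPIC_CEILING_PERCENT = 5  # "fewer than five percent of sessions"
AA_INDEX_PERCENT = 8           # about eight percent of Intelligence Index tasks
AA_HLE_PERCENT = 9             # Humanity's Last Exam alone

HYPOTHETICAL_TASKS = 1_000  # assumed batch size, not from the source


def expected_rerouted(tasks, percent):
    """Expected number of tasks rerouted at a given fallback rate."""
    return tasks * percent / 100


rerouted = {
    "Anthropic ceiling": expected_rerouted(HYPOTHETICAL_TASKS, ANTHROPIC_CEILING_PERCENT),
    "AA Intelligence Index": expected_rerouted(HYPOTHETICAL_TASKS, AA_INDEX_PERCENT),
    "AA HLE only": expected_rerouted(HYPOTHETICAL_TASKS, AA_HLE_PERCENT),
}

# How far the measured index rate sits above the stated ceiling.
index_vs_ceiling = AA_INDEX_PERCENT / ANTHROPIC_CEILING_PERCENT

for label, count in rerouted.items():
    print(f"{label}: {count:.0f} of {HYPOTHETICAL_TASKS} tasks rerouted")
print(f"Measured index rate is {index_vs_ceiling:.1f}x the stated ceiling")
```

Under these assumptions, the measured rate on the index is at least 1.6 times what Anthropic's figure would suggest. Science-heavy workloads like the benchmarks named above may see more fallbacks than a typical session.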
Access comes with an expiration date
Fable 5 keeps the same one-million-token context window as Opus 4.8. Pro, Max, Team, and Enterprise subscribers can use it through June 22, with usage counting at double the Opus rate. After that, it moves to credit-based billing. That makes it even pricier than token rates suggest. Anthropic says it’ll bring back subscription access once capacity allows.
My colleague Max recently analyzed Fable 5’s strengths and weaknesses and found that the safety filters block large numbers of harmless requests, from medical physics questions to basic security reviews. Anthropic’s system card also revealed invisible throttling that degrades Fable’s performance when users try to build competing frontier models, though Anthropic has since walked that back.
Key takeaways
- Claude Fable 5 secures the top spot in the Artificial Analysis Intelligence Index with a 64.9 score, edging out GPT-5.5 by roughly five points.
- The performance improvement over Opus 4.8 is only 5.7 percent, yet the cost for running benchmarks has doubled, hitting nearly $10,000 for a full index run.
- Actual usage costs climb further because requests that safety filters reroute to Opus 4.8 still count toward billing, and because subscription usage counts at double the Opus rate through June 22 before access moves to credit-based billing.