Blog / Guides

What is Claude Fable 5? Anthropic's most powerful model, explained

Written by

Alicia Kirana Utomo

Reviewed by

Katelin Teen

Last edited July 8, 2026

Expert Verified

Editorial illustration for an explainer on Claude Fable 5, Anthropic's most powerful AI model

Update — July 8, 2026: Anthropic is extending Fable 5 access. Per Claude on X: "We’re extending access to Claude Fable 5 on all paid plans through July 12." — @claudeai

TL;DR

Claude Fable 5 is Anthropic's most powerful model so far: a new "Mythos-class" tier that sits above Claude Opus 4.8, launched on 9 June 2026 and built for days-long, complex, asynchronous work that earlier models tapped out on.

The headline numbers: a 1 million token context window, 128,000 max output tokens, and pricing of $10 per million input tokens and $50 per million output tokens, which is exactly twice the cost of Opus 4.8. On Anthropic's own benchmarks it's a real step up, and early testers call it the best coding model available to the public.

There's a catch, though, three of them really: it's slow and expensive, it was only free on subscription plans for a 13-day window, and it ships with safety routing that quietly hands your session back to Opus 4.8, sometimes on completely benign prompts. For most teams the practical takeaway is that a model this capable is the engine, not the product. What you actually deploy is an AI agent built on top of one.

So what exactly is Claude Fable 5?

Anthropic frames Fable 5 as a "Mythos-level model built for your most ambitious, long-running projects", and the wording matters. "Mythos-class" is a brand-new capability tier the company is introducing above its existing Opus line, the way Opus has always sat above Sonnet and Haiku. It's the fifth model generation, and Anthropic says it's "designed to handle days-long, complex, and asynchronous tasks previous models couldn't sustain".

The slightly confusing part is that Fable 5 launched as one half of a pair. Fable 5 is the public, safeguarded version anyone with API access or a paid Claude plan can use. Mythos 5 is the same underlying model with the safety classifiers stripped out, gated to vetted cybersecurity and biology partners through Anthropic's Project Glasswing. Simon Willison, who spent a full day testing it, put it plainly: Anthropic say Fable 5 "offers the same performance as Claude Mythos 5, except with much more strict guardrails in place".

SecurityWeek captured why this is a milestone for Anthropic specifically: the company says this "marks the first time a model of this capability class has been deemed safe enough for widespread public and developer access". In other words, the Mythos tier existed before; what's new is letting the general public near it.

How Claude Fable 5 fits into Anthropic's model ladder: Sonnet 4.6 for everyday work, Opus 4.8 for heavy lifting, and Fable 5 as the Mythos-class flagship

The specs that matter

If you just want the at-a-glance version, here's where Fable 5 lands. The context window and cut-off come from Simon Willison's hands-on notes; the pricing is confirmed by both CNBC and SecurityWeek.

Spec	Claude Fable 5
Launched	9 June 2026
Model class	"Mythos-class", a tier above Opus 4.8
Context window	1,000,000 tokens
Max output	128,000 tokens
Knowledge cut-off	January 2026
Pricing	$10 / 1M input, $50 / 1M output (2x Opus 4.8)
Long-context surcharge	None
Where to run it	claude.ai, the Claude API, Claude Code, Claude Managed Agents, AWS, and Microsoft Foundry

One detail worth flagging for anyone working with long documents: there's no price premium for using the full 1M context, which isn't always the case with frontier models. The API ID, if you're wiring it in yourself, is claude-fable-5.

How powerful is it, really?

This is where Fable 5 earns the "most powerful" label. On Anthropic's published comparison, it posts a remarkable leap on pretty much every relevant benchmark, and the gaps over the rest of the field aren't subtle.

Anthropic's benchmark table comparing Claude Mythos 5 / Fable 5 against Claude Opus 4.8, GPT 5.5, and Gemini 3.1 Pro, as taken from Anthropic

A few numbers worth pulling out from that table: 80.3% on SWE-Bench Pro for agentic coding, against 69.2% for Opus 4.8, 58.6% for GPT 5.5, and 54.2% for Gemini 3.1 Pro. On the tougher FrontierCode (Diamond) benchmark it more than doubles Opus, jumping to 29.3% from 13.4%. CNBC's reporting lines up with the table, noting that on some benchmarks Fable scored more than 10% higher than Claude Opus 4.8.

Practitioners backed this up fast. Andrej Karpathy called it a major-version-bump-deserving step change, and one developer running the OSS-maintainer-graded FrontierCode benchmark posted a striking progression: Opus 4.7 at 5.2%, Opus 4.8 at 13.4%, Fable 5 at 29.3%.

There's one honest caveat to keep in mind, and it comes from Nathan Lambert: those published scores are an upper bound. As he notes, "some of the prompts will be downgraded to Opus 4.8 with the current safety filters", so the numbers a real user gets on a flagged topic won't always match the chart. More on that below.

What it's actually like to use

Benchmarks are one thing; a full day of real work is another. The most useful first-hand account came from Simon Willison, who described the model in one word: a beast.

"this is something of a beast. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do." - Simon Willison

His sharpest example of the leverage: he pointed Fable at his open-source LLM library, and it identified and implemented four separate fixes, then shipped a new release (LLM 0.32a3) that was, in his words, almost entirely written by Fable. His take tells you most of what you need to know about the productivity ceiling here:

"I'm really impressed with the quality of API design, tests, code and documentation that Fable put together for this. I spent several hours on it today, but it feels like several days' worth of work." - Simon Willison

He also ran his canonical "generate an SVG of a pelican riding a bicycle" test across all five thinking-effort levels, which is a nice concrete look at the effort-vs-cost dial. The "max" effort pelican below burned 14,430 output tokens, roughly 72 cents for a single image, versus under 10 cents at "low".

Claude Fable 5's pelican-on-a-bicycle SVG at maximum thinking effort, as taken from Simon Willison

Effort level	Output tokens	Cost per SVG
low	1,929	~9.67¢
medium	2,290	~11.48¢
high	2,057	~10.31¢
xhigh	5,992	~29.99¢
max	14,430	~72.18¢

Source: Simon Willison's effort-level breakdown.

Long-horizon agents are the real headline

Coding scores are the flashy part, but the thing Anthropic actually built Fable 5 for is sustained, autonomous work. Run it in a harness like Claude Code or Claude Managed Agents and Anthropic says it can "work for days at a time: planning across stages, delegating to sub-agents, and checking its own work".

How Claude Fable 5 runs as a long-horizon agent: plan across stages, delegate to sub-agents, execute, then check its own work, looping for days

This isn't only marketing language. In early testing, Stripe reportedly pointed Fable 5 at a 50-million-line Ruby codebase and ran a migration across the whole thing in a day, and community reports describe sessions spinning up up to 1,000 parallel sub-agents for codebase-scale work. One Hacker News user described handing it a 50-page PDF of dense, interconnected specs and getting back a correct breakdown of what was done, partly done, and missing.

This is exactly the shape of work that makes "agents" more than a buzzword: a model that can hold a goal, break it into stages, and grind through them without a human re-prompting at every turn. It's the same principle behind an AI support agent that triages a ticket, looks up an order, drafts a reply, and escalates the edge cases, just pointed at customer conversations instead of a codebase.

The catch: price, the cliff, and quota burn

Now the part that's tempered all the excitement. Fable 5 is genuinely expensive to run, and the rollout had a sting in the tail.

Start with raw price: at $10 / $50 per million tokens, it's twice the cost of Opus 4.8. Anthropic's Dianne Penn argued the value math still works out, saying customers "just get a higher ROI by having more intelligent models", and there's real evidence for that: Canva's evals lead reported Fable using about half the tokens of Opus 4.8 in their internal agentic harnesses, making the real-world cost roughly a wash.

But that efficiency doesn't hold for everyone. Simon Willison tracked a single day's testing at $110.42 of token spend (covered, for now, by his $100/month Max subscription), and subscription users reported tearing through their limits. One user on the $100 Max plan said Fable burned their entire 5-hour window in under 8 minutes plus $15 of overage; another watched it eat their Max 20x plan at roughly 2% per minute.

Then there's the timing. Fable was included on Pro, Max, Team, and seat-Enterprise plans only until 22 June 2026, after which it moved to usage credits. The community read the 13-day window uncharitably, and one of the most-upvoted Hacker News comments summed up the mood:

"This seems like the pharmaceutical method of get them hooked on the drug with free samples, then once they can't live without it, raise the price..." - AquinasCoder on Hacker News

A 340+ comment Reddit thread captured the broader unease, titled "Claude Fable 5 feels less like a model launch and more like a preview of AI inequality". The signal underneath the noise: this is a frontier-grade model whose economics make it a tool for well-funded teams, not casual chat.

The safety routing everyone's arguing about

The loudest complaint in the first 24 hours wasn't price, though. It was the safeguards, and they're genuinely unusual, so they're worth understanding.

Claude Fable 5's two-tier safeguards: visible cyber and bio prompts fall back to Opus 4.8 and you're told, while frontier-LLM-research prompts are silently weakened and you're not

There are two distinct mechanisms stacked inside the same model. The first is transparent. For cybersecurity, biology, chemistry, and model-distillation requests, a new generation of classifiers detects the topic and routes your response to Opus 4.8 instead, and you're told it happened. Penn's concrete example: ask how to make ricin and the model blocks its response and falls back to Opus 4.8. Anthropic says at least 95% of sessions never trigger any fallback.

The trouble is the false positives. Developers reported being silently switched to Opus 4.8 mid-session for completely benign work: basic liquid-handling protocol code, segmenting MRI images into brain vs skull, music firmware, message-digest code, even telling the agent to "kill" a process. One user's verdict: "it's unusable for me due to the refusals. I'm using claude to find patterns in health data".

The second mechanism is the one that turned heads. Buried in the system card, Anthropic describes safeguards for prompts that look like frontier LLM development (pretraining pipelines, distributed training infrastructure, ML accelerator design) that work very differently:

"Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT)." - Claude Fable 5 system card

In plain terms: on that one topic class, the model can get quietly worse without telling you. Nathan Lambert, who writes about AI policy at Interconnects, didn't mince words, calling it "a mix of transparent and reasonable safety policies with quietly rolled-out market entrenchment tactics" and arguing that "an AI model that gets less intelligent automatically without notifying me is categorically misaligned AI." Plenty of users read it the same way; one Hacker News reply was blunt: "looks like Anthropic's definition of safety includes their own safety from competition."

To be fair to Anthropic, the visible classifiers held up to scrutiny: an external bug bounty spanning over 1,000 hours yielded no universal jailbreaks. The controversy is really about the invisible layer and the precedent it sets.

What this means if you're not training frontier models

Here's the reframe most coverage skips. Unless you're a developer running overnight coding agents or an ML researcher, you will almost never touch Claude Fable 5 directly, and that's fine. For the vast majority of teams, the model is plumbing.

The model wars move fast: Fable 5 sits above Opus 4.8 today, the version after it is already well underway, and the cheapest tier next year will out-perform this year's flagship. Chasing whichever model is "best" this month is a losing game if you're trying to actually ship something. What you want is the capability, delivered through a layer that handles the messy parts: grounding the model in your own data, keeping a human in the loop, taking real actions in your tools, and swapping the underlying model when a better one lands without you rewriting anything.

That's the whole idea behind an AI agent platform. The frontier lab builds the engine; the agent layer turns it into something a support, IT, or ops team can actually point at their work.

Try eesel

If the appeal of a model like Fable 5 is "autonomous work that just gets done," that's exactly what eesel AI delivers for customer-facing and internal support, without asking you to pick a model or write a single prompt. eesel's AI teammates learn from your past tickets, help docs, and tooling on day one, then draft replies, triage, and resolve tickets across 100+ integrations like Zendesk, Freshdesk, Slack, and Gorgias.

The differentiator is control: with simulation mode you can run the agent against thousands of your past tickets to see exactly how it would have handled them, find the gaps, and fix them before it ever replies to a real customer. Smava already runs a fully automated agent processing 100,000+ tickets a month, and Gridwise saw 73% of tier-1 requests resolved in the first month. And because pricing is usage-based at $0.40 per resolved ticket with no per-seat fees, you're paying for outcomes, not for tokens you can't predict. You can try eesel free with $50 of usage and no credit card.

Frequently Asked Questions

What is Claude Fable 5?

Claude Fable 5 is Anthropic's flagship 'Mythos-class' AI model, announced on 9 June 2026 as a new top capability tier sitting above Claude Opus 4.8. It's built to handle long-running, multi-stage work, and it's the publicly available half of a pair (Fable 5 has safeguards; Mythos 5 is the same model with them removed for vetted research partners). If you want that kind of capability working inside your own helpdesk rather than a chat window, that's what an AI helpdesk agent is for.

How much does Claude Fable 5 cost?

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens, exactly twice the price of Claude Opus 4.8. It was free on Pro, Max, Team, and seat-based Enterprise plans only until 22 June 2026, after which usage is billed against credits. If you'd rather pay per outcome than per token, eesel's usage-based pricing starts at $0.40 per resolved ticket.

Is Claude Fable 5 better than Claude Opus 4.8?

On Anthropic's own benchmarks, yes: Fable 5 scored more than 10% higher than Opus 4.8 on some tests, and developers report it solving problems Opus couldn't. The trade-offs are that it's slower, twice the price, and its safety routing can quietly hand your session back to Opus 4.8 anyway. For most teams the model is plumbing; what matters is the agent layer on top of it.

What is the difference between Claude Fable 5 and Claude Mythos 5?

They're the same underlying model. Claude Fable 5 ships with safety classifiers and is available to anyone with API access or a paid Claude plan; Claude Mythos 5 has those classifiers removed and is gated to vetted cybersecurity and biology research partners. Simon Willison summarised it as the same performance with much stricter guardrails.

Can I use Claude Fable 5 for customer support automation?

You can, but you rarely want to wire a raw frontier model straight into a support queue. A purpose-built layer handles knowledge, oversight, and actions for you. eesel's AI teammates learn from your past tickets and help docs, let you test on real history in simulation mode before going live, and connect to 100+ tools like Zendesk and Slack, so you get the capability without owning the prompt engineering.

Hire your AI teammate

Set up in minutes. No credit card required.

Try for free Book a demo

Share this article

Article by

Alicia Kirana Utomo

Kira is a writer at eesel AI with a Computer Science background and over a year of hands-on experience evaluating AI-powered customer service tools. She focuses on breaking down how helpdesk platforms and AI agents actually work so that support teams can make better buying decisions.