What Happens When Anthropic's Mythos Class Models Go Public?

Dario Amodei, CEO of Anthropic | Credit: Anthropic
Claude Mythos 5, touted as the world’s most powerful AI cybersecurity model, launches via Project Glasswing as Fable 5 goes public

Anthropic withheld Claude Mythos Preview from public release because of its superior ability to identify vulnerabilities in software infrastructure.

Despite this, the company is now releasing Claude Fable 5, a Mythos-class model that Anthropic says has been made safe for general use.

"Fable is a Mythos-class model. The most capable class of systems we've built and the first one we've made generally available," says Mike Krieger, Chief Product Officer (CPO) at Anthropic.

"It's state of the art on nearly every benchmark (SWE-bench Pro went from 69.2 with Opus 4.8 to 80.3), with a lead that grows as tasks get longer and harder.

Mike Krieger, Chief Product Officer at Anthropic | Photo: Viva Tech

"With earlier models, you broke a project into model-sized tasks and stitched the results together. Fable holds the whole project. It plans, runs for hours or days, checks its own work and comes back when it's done." 

Safeguards redirect certain requests

Anthropic does not shy away from acknowledging the risks of the model.

In fact, it has built safeguards that redirect certain kinds of requests to Claude Opus 4.8 – its next most capable model.

These safeguards are intended to prevent the model from being used for dangerous purposes, such as offensive cyber operations or bioweapon-related queries.

The trade-off is that some harmless requests may occasionally be caught by the safeguards. According to Anthropic, this happens in less than 5% of sessions, with the safeguards designed to balance capability against security concerns.

Mythos 5 targets defenders

Claude Mythos 5 will be initially released through the industry coalition Project Glasswing in collaboration with the US Government.

Eclipsing the capabilities of Claude Mythos Preview, Anthropic says that Mythos 5 has "the strongest cybersecurity capabilities of any model in the world".

Mythos 5 shares the same base model as Fable 5, with safeguards lifted in some areas.

Anthropic has indicated that Mythos 5 may be rolled out through a broader trusted access programme in future.

Extended autonomy and applications

Mythos 5 and Fable 5 can work autonomously for longer periods than previous Claude models.

Applications are numerous and span scientific research, including drug discovery and hypothesis synthesis in molecular biology.

Comparison of capabilities | Credit: Anthropic

The models could also conduct their own research in genomics.

With Fable, heavy-duty software engineering tasks that would normally take months can be compressed into a matter of days.

Fable 5 and Mythos 5 are priced at US$10 per million input tokens and US$50 per million output tokens. 
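
To put those rates in context, the short Python sketch below estimates what a single job would cost. Only the two per-million-token prices come from the figures quoted above. The function name and the token counts are hypothetical, chosen purely for illustration, and the sketch assumes a flat rate with no discounts or other charges.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION_USD = 10.0
OUTPUT_PRICE_PER_MILLION_USD = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one job at the quoted flat rates.

    Assumption: billing is strictly linear in token counts, with no
    discounts, minimums or extra charges.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION_USD
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical long-running task: 2 million tokens in, 500,000 tokens out.
    cost = estimate_cost_usd(2_000_000, 500_000)
    print(f"Estimated cost: US${cost:.2f}")
```

For this hypothetical job the input costs US$20 and the output US$25, so output tokens account for more than half the bill despite being a quarter of the input volume.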

Agentic hacking capabilities tested

Mythos-class models can perform agentic hacking across multiple stages, including reconnaissance, vulnerability discovery, exploit chain creation and lateral movement.

This is why Anthropic uses classifiers, specially trained to spot suspicious activity patterns, to identify potential adversarial uses such as jailbreaking.

Model capabilities in terms of agentic coding | Credit: Anthropic

According to Anthropic, the company red-teamed its new models extensively, and no jailbreaks succeeded in more than 1,000 hours of testing.

Biological research restrictions expanded

Anthropic previously used classifiers that blocked a narrow range of queries related to bioweapons. The company now says this may no longer be sufficient.

According to Anthropic, there is reason to believe that "well-resourced malicious actors" were attempting to use its models for what it describes as "highly risky biological research".

In one successful test, the model was deployed to solve a challenging step in designing adeno-associated viruses (AAVs). These are used in gene therapy but, in the wrong hands, could be used to create dangerous viruses.

"Our priority was to safely release Fable as soon as we could, even at the cost of overly broad safeguards," says Anthropic.

"Therefore, for the time being we have arranged for Fable to fall back to Opus 4.8 on most requests related to biology and chemistry."

Attempts to distil the model are also blocked by classifiers. 

Industry responses vary widely

Reactions to the model release ranged from approval of the capabilities to concerns about safety measures. Some industry figures question whether the safeguards adequately address the risks.

Andrew Rubin, Founder and CEO of Illumio | Credit: Illumio

"The introduction of guardrails isn't evidence that the problem is solved – it's an admission that even the companies building these models don't fully trust where the capability leads," notes Andrew Rubin, Founder and CEO of Illumio.

"Constraints at the interface don't change the underlying math. Attackers won't operate at that layer.

"They'll go straight after the capability itself. And as these tools become more broadly available, the speed and scale of attacks will only increase.

"The real question is whether defenders are prepared to operate at the same speed." 
