In April, Anthropic told the United States government that a category of AI model was dangerous enough to keep away from the public. In June, it released that category of model to the public. Three days later, the government ordered it offline.


This report is about the two months in between, and the three days after.


What Anthropic said in April


Project Glasswing began as a restricted preview of Claude Mythos, a model Anthropic described at the time as so powerful it had to be tightly controlled. Access was limited to twelve partners, later expanded to roughly sixty research and infrastructure organizations, under a credit program built specifically for this purpose.


The reason given was specific. Mythos Preview could autonomously discover and weaponize software vulnerabilities. Anthropic privately briefed senior US officials that releasing this capability without restriction could make large-scale cyberattacks significantly more likely within the year. The restriction was framed as a deliberate, narrow window, a chance to build industry norms before other labs reached the same capability, potentially without the same safeguards.


This was not a vague gesture toward caution. It was a named risk, tied to a named capability, communicated directly to the government that would later issue the directive this report is about.

What Anthropic did in June


On June 9, Anthropic released Claude Fable 5, described as a Mythos-class model made safe for general use. The announcement was direct about the tension. Releasing a model this capable comes with risks, Anthropic wrote. Without safeguards, Fable 5’s capabilities in areas like cybersecurity could be misused to cause serious damage.


The safeguards were real and specific. Classifiers screen queries on cybersecurity, biology and chemistry, and distillation, routing flagged requests to the less capable Claude Opus 4.8 instead. Anthropic said fallback triggers in under five percent of sessions. The company also described thousands of hours of red teaming conducted with the US government, the UK AI Safety Institute, and external partners, aimed specifically at jailbreak resistance.


Alongside Fable 5, Anthropic released Mythos 5 itself, the same underlying model with safeguards lifted in some areas, distributed through Project Glasswing to a small group of cyberdefenders and infrastructure providers. Anthropic described Mythos 5 as having the strongest cybersecurity capabilities of any model in the world, and said it intended to expand access through a broader trusted program soon.


In April, this capability category was the reason for restriction. In June, it was launched, with the restriction reframed as a safeguard layer on top of general availability rather than a gate in front of it.

Three days


On June 12, at 5:21pm Eastern time, Anthropic received an export control directive from the US government. Citing national security authorities, the directive ordered the suspension of all access to Fable 5 and Mythos 5 by any foreign national, anywhere, including Anthropic’s own foreign national employees.


Anthropic said it could not filter foreign nationals from other users on the timeline required. The company disabled both models for every customer, the first time a leading AI company has taken a publicly deployed model offline due to direct intervention from the federal government.

The directive’s stated basis, as Anthropic described it, was a jailbreak method involving requests to analyze code and identify flaws. This is the same capability domain, vulnerability discovery, that Anthropic spent two months telling officials and partners was the central reason Mythos-class models required restriction in the first place.


Three days passed between general release and government-ordered suspension. As of this writing, Anthropic has not announced a timeline for restoration or the conditions under which access might resume.


The same risk, twice


Anthropic’s public response to the suspension argues that a narrow potential jailbreak should not be cause to recall a model used by hundreds of millions of people, and that the capability in question is routinely used by cybersecurity professionals and already available through other AI systems. The company said it believes the action stems from a misunderstanding and is working to restore access.


These characterizations are not necessarily false. A narrow jailbreak and a general category of dual-use risk are different claims, and Anthropic’s own framing distinguishes between them. But the practical sequence is this. The same institution that spent two months building a documented case, to officials, to partners, to the public, that vulnerability discovery by Mythos-class models was a risk serious enough to justify a twelve-partner restricted preview, a dedicated credit program, and private briefings to senior government officials, is now describing an instance of that same risk, materializing within the timeframe Anthropic itself had warned about, as narrow and disproportionate to the response it triggered.


Both assessments were made by Anthropic. Both were about the same capability. The distance between them is one decision, made in the time between April and June 9.


What is not in dispute, and what is


This report does not take a position on whether the export control directive was lawful, proportionate, or appropriately specific. That question sits inside an active legal dispute between Anthropic and the federal government that predates this incident. In February, the Department of Defense designated Anthropic a supply chain risk, a label without precedent for a US company, after the administration and Anthropic clashed over how Anthropic’s models could be used by the Pentagon. Anthropic sued. A federal judge ruled the designation could not be enforced while the case proceeds. That litigation is ongoing, and last week’s directive arrived inside that same relationship, whatever its formal independence from it.


This report also does not take a position on whether Fable 5’s actual safeguards are adequate. Anthropic’s red teaming process was substantial and was conducted in part with the same government that issued the suspension. Whether that process should have surfaced this jailbreak before launch, and whether it could have, are questions this report cannot answer from the outside.


What this report observes is narrower, and does not depend on resolving either of the above. Anthropic held two different thresholds for the same risk, separated by two months and one launch decision. The threshold used to justify Glasswing’s restrictions in April was crossed by Fable 5’s general release in June. The threshold being argued for now, in response to the suspension, is one Anthropic did not appear to be operating under when it made that release decision.


Anthropic is positioned to address this directly. It has not yet done so, at least not in a way that engages with Glasswing’s own stated rationale for restriction in the first place.


Closing


Anthropic spent two months telling governments and partners that this category of risk was serious enough to restrict. It spent three days finding out what happens when it isn’t restricted. This report does not resolve which assessment was correct. It notes that both were made by the same institution, about the same risk, separated by one decision.
