Anthropic launches Claude Helios under new pre-deployment notice window in first test of frontier pact
5 min read, word count: 1046SAN FRANCISCO — Anthropic on Tuesday unveiled Claude Helios, its newest frontier model, in a release that doubles as the first real test of the industry safety pact the largest U.S. artificial-intelligence developers committed to four days earlier. The model entered Anthropic’s enterprise customer pipeline at 9 a.m. Pacific, following a 62-day pre-deployment notification window that the company said it had voluntarily extended by two days to incorporate late-arriving findings from outside red-teamers.
Helios, which Anthropic described as a “reasoning-forward” successor to its Claude 4 Opus and Sonnet families, posted state-of-the-art results on the company’s published evaluation suite for software engineering, scientific reasoning and long-horizon agentic tasks, including a 71.4 percent score on the SWE-Bench Verified coding benchmark and a 49.2 percent score on Anthropic’s internal multistep operations harness. The company said the model had been trained on a cluster of roughly 110,000 next-generation accelerators and that the training run had been disclosed in pre-training filings to the U.S. AI Safety Institute and to two state regulators in February.
The launch is the first major frontier release to move through the disclosure and evaluation regime announced Friday by OpenAI, Anthropic, Google DeepMind, Microsoft and Meta, which created the Frontier Model Assurance Council and committed signatories to a 60-day pre-deployment notification window beginning 30 days from now. Anthropic, which had already finished its internal review when the pact was signed, said it had chosen to retroactively bring Helios into the framework rather than ship under the older, more closed cadence.
“We made a commitment in writing on Friday, and the first model that comes out after that commitment has to honor it,” Anthropic Chief Executive Dario Amodei said at a briefing held inside the company’s South of Market offices. “If we did not, the framework would be a press release. We do not want that.”
Under the rules Anthropic said it followed for Helios, the model was submitted in February to a battery of dangerous-capabilities evaluations run by the AI Safety Institute, Britain’s AISI counterpart, two independent research labs and the new assurance council’s interim staff. Aggregate results were posted to a shared registry Friday and a summary version was published Tuesday morning. The company said no evaluator had flagged a Tier-3 risk finding — Anthropic’s threshold for delaying or scoping down a release — though three evaluators had recommended additional mitigations around the model’s tool-use envelope, all of which Anthropic said it had implemented.
The release is initially limited to existing enterprise customers and to the company’s API. Consumer access through the Claude.ai product is scheduled to follow on May 26, after a two-week “staged exposure” intended to give the assurance council additional time to monitor for emergent misuse patterns. Anthropic named Bain & Company, Latham & Watkins, Recursion Pharmaceuticals and the cloud security vendor Wiz as confirmed early adopters.
“We have had Helios in our internal pipeline since the second week of April under the pre-deployment notification, and the gap between what it can do and what Claude Opus could do six months ago is material,” said Janelle Okafor, head of artificial intelligence at Bain. “The disclosed evaluations matter to our clients.”
Pricing was set at $5 per million input tokens and $25 per million output tokens for the Helios API, modestly above the outgoing Opus 4 tier.
Reaction to the launch traced the same lines that have divided the year’s AI debate. Industry analysts and most enterprise buyers described the rollout as a credible demonstration that the May 8 framework could function in practice. Skeptics in academia and on Capitol Hill said the test had been too easy to draw broad conclusions from, given that Anthropic had volunteered for the regime, controlled its own release schedule and faced no binding penalties for nondisclosure.
“This is the best case for the voluntary model and it is still voluntary,” said Dr. Reuben Acharya, a senior fellow at the Center for AI Policy and a vocal supporter of the failed Sanders-Ocasio-Cortez moratorium. “The next test is what happens when a less safety-conscious lab tries the same release path, or when the assurance council finds something a company would rather not disclose. We have no evidence yet on either question.”
Sen. Maggie Hennessey, D-Colo., who introduced last week’s bipartisan AI Transparency and Grid Impact Act, said in a statement that the Anthropic release “shows that mandatory disclosure is not a foreign concept to the industry” and would, she argued, “strengthen the case for putting these obligations into statute.” A spokesperson for Sen. Bernie Sanders, I-Vt., said the senator viewed the launch as “a useful proof of concept and a wholly inadequate substitute for binding law.”
State regulators struck a similar balance. California state Sen. Maria Elena Durazo, who has been shepherding her chamber’s data-center moratorium through Sacramento, said her office had received Anthropic’s summary disclosure on Friday and found it “more substantive than the boilerplate the industry has historically offered,” while reiterating that her bill would continue moving. The New York Department of Public Service confirmed it had requested and received additional grid-impact data on the Helios training cluster.
Markets received the launch as expected. Shares of Alphabet, Microsoft and Meta closed within half a percentage point of unchanged, while Nvidia rose 1.3 percent on analyst notes pointing to the disclosed cluster size. Anthropic is privately held.
Anthropic devoted roughly a third of its release announcement to failure modes identified during evaluation, including a category of “subtly overconfident planning” the company said it had reduced but not eliminated. The shift, several researchers said, reflected pressure from the AI Safety Institute and from the assurance council’s interim director, former federal chief information officer Pavithra Ramaswamy.
Ramaswamy, in a brief statement issued through the council, said Helios’s evaluation file would be available to opted-in state regulators starting Thursday and that the council expected to publish its first quarterly transparency report on July 15. She declined to characterize whether any of the evaluations had returned findings that the council would have escalated to a public warning.
Anthropic said it would brief congressional staff on the rollout this week and would file additional documentation with the AI Safety Institute on May 19. Company officials said a longer technical report on Helios’s training methodology and safety mitigations would be published later this month.
Note: This article was partially constructed using data from LLM.