Anthropic's Three Shocks: Code Leaks, Government Confrontation, and Weaponization

BlockBeatNews

Original Title: Anthropic: The Leak, The War, The Weapon
Original Author: BuBBliK
Translation: Peggy, BlockBeats

Editor’s Note: Over the past six months, Anthropic has repeatedly found itself drawn into a series of events that appear independent on the surface yet clearly point to one another: leaps in model capabilities, automated attacks in the real world, sharp reactions in capital markets, open conflicts with governments, and multiple information leaks caused by mistakes in basic configuration. When you put these clues together, they collectively outline a clearer direction of change.

Using these events as a window, this article reviews an AI company’s continuous trajectory through technological breakthroughs, risk exposure, and governance struggles—and tries to answer a deeper question: as the ability to “find vulnerabilities” is greatly amplified and gradually spreads, can the cybersecurity system itself still maintain its original operating logic?

In the past, security was built on capability scarcity and constraints on human labor; under the new conditions, offense and defense are converging around the same set of model capabilities, and the boundaries are becoming increasingly blurred. Meanwhile, institutional, market, and organizational responses still remain within outdated frameworks, making it difficult to catch up with this kind of change in time.

What this article is concerned with is not only Anthropic itself, but a bigger reality it reflects: AI is not only changing tools—it is changing the premises of how “security” comes to exist.

The following is the original text:

What would it add up to if a company with a market value of $380 billion went head-to-head with the Pentagon and came out ahead, survived what is arguably the first-ever cyberattack launched by autonomous AI, leaked an internal model that even its own developers say they find terrifying, and "accidentally" published its full source code?

The answer is: it looks exactly like this. And what's even more unsettling is that the most dangerous part may not have happened yet.

Event Recap

Anthropic leaks its own code again

On March 31, 2026, Shou Chaofan, a security researcher at the blockchain company Fuzzland, was inspecting the Claude Code npm package published by the official team when he found that it contained, in plaintext, a file named cli.js.map.

The file is a full 60MB, and its contents are even more shocking: it holds almost the complete TypeScript source code of the entire product. From this one file alone, anyone can reconstruct up to 1,906 internal source files, including the internal API design, the telemetry system, encryption tools, security logic, and the plugin system, laying bare nearly every core component. Worse still, the content could be downloaded directly from Anthropic's own R2 storage bucket as a zip file.

This discovery spread rapidly across social media: within hours, related posts received 754,000 views and nearly 1000 reposts; at the same time, multiple GitHub repositories containing reconstructed source code were created and made public immediately.

So-called source maps are, in essence, auxiliary files used for JavaScript debugging. Their job is to map compressed, compiled code back to the original source code, making it easier for developers to troubleshoot issues.

But there is a basic rule: they should never be included in production-release packages.

This isn't some advanced attack technique; it's a basic engineering-standards issue, "Build Configuration 101," the kind of thing developers learn in their first week. When mistakenly packaged into a production release, source maps often amount to handing out the source code as a "free extra" to everyone.

You can also view the relevant code directly here: https://github.com/instructkr/claude-code

But what makes this feel truly absurd is that this has already happened once.

In February 2025, about a year earlier, there had been an almost identical leak: the same file, the same type of mistake. At the time, Anthropic pulled the old version from npm, stripped out the source map, republished, and the matter ended there.

And yet in version v2.1.88, the file was packaged and released all over again.

A company valued at $380 billion, building what it calls the world's most advanced vulnerability detection system, made the same basic mistake twice within a year. There was no hacker attack and no complex exploit chain, just a build process that failed to do its one job.

This kind of irony almost carries a certain “poetic” quality.

An AI that can find 500 zero-day vulnerabilities in a single run; a model used to carry out automated attacks against 30 institutions worldwide. And at the same time, Anthropic "bundled and gifted" its own source code to anyone willing to glance at the npm package.

Two leaks, separated by no more than seven days.

The reasons are the same: the most basic configuration error. No technical threshold is needed, and no complex exploitation path is required. As long as you know where to look, anyone can obtain it for free.

A week ago: internal “dangerous model” accidentally exposed

On March 26, 2026, security researchers Roy Paz of LayerX Security and Alexandre Pauwels of the University of Cambridge found that a misconfigured CMS on Anthropic's official website had left approximately 3,000 internal files publicly accessible.

These files included draft blog posts, PDFs, internal documents, and presentation materials, all sitting in an unprotected, searchable data store. There was no hacker attack, and no technical skill was required.

Among these files were two nearly identical blog drafts whose only difference was the model's name: one read "Mythos," the other "Capybara."

This meant that at the time, Anthropic was choosing between two names for the same secret project. The company later confirmed that training for this model had been completed and that it had begun testing with some early customers.

This was not a routine upgrade to Opus; it was a brand-new “Level 4” model, positioned even higher than Opus.

In Anthropic’s own drafts, it was described as: “bigger and smarter than our Opus model—and Opus is still our most powerful model to date.” It achieved significant leaps in programming capability, academic reasoning, and cybersecurity as well. A spokesperson described it as a “qualitative leap,” and also “the strongest model we’ve built so far.”

But what is truly worth paying attention to is not these performance descriptions themselves.

In the leaked drafts, Anthropic’s assessment of this model was that it “brings unprecedented cybersecurity risk,” that its “cyber capabilities far surpass any other AI model,” and that it “foretells an incoming wave of models—its ability to exploit vulnerabilities will far exceed the speed at which defenders can respond.”

In other words, in an official blog draft that had not yet been made public, Anthropic had already explicitly expressed a rare stance: they felt uneasy about the product they were building.

The market reaction was almost immediate. CrowdStrike shares fell 7%, Palo Alto Networks fell 6%, and Zscaler fell 4.5%; Okta and SentinelOne both fell by more than 7%, and Tenable plunged 9%. The iShares Cybersecurity ETF dropped 4.5% in a single day. Just for CrowdStrike alone, its market value evaporated by about $15 billion that day. Meanwhile, Bitcoin pulled back to $66,000.

The market clearly interpreted this incident as a kind of “verdict” on the entire cybersecurity industry.

Figure: Amid the related news, the cybersecurity sector declined broadly, with leading names such as CrowdStrike, Palo Alto Networks, and Zscaler posting notable drops, reflecting market concern about AI's impact on the industry. Nor was this reaction a first: when Anthropic previously released a code-scanning tool, related stocks also fell, a sign that the market has begun to view AI as a structural threat to traditional security vendors, and that the broader software industry faces similar pressure.

Stifel analyst Adam Borg’s comments were rather direct: the model “has the potential to become the ultimate hacker tool—one that could even upgrade ordinary hackers into opponents with nation-state-level attack capabilities.”

So why hasn’t it been publicly released yet? Anthropic’s explanation is that the operating cost of Mythos is “very high” and it does not yet meet the conditions for public release. The current plan is to first open early access to a small number of cybersecurity partners to strengthen the defense system; then gradually expand the scope of API availability. Before that, the company is still continuously optimizing efficiency.

But the key point is that this model already exists, is already being tested, and even just because it was “accidentally exposed,” it has already jolted the entire capital market.

Anthropic has built what it calls the "most cybersecurity-risky AI model in history." And yet news of it leaked through precisely the most basic kind of infrastructure misconfiguration, exactly the type of mistake these models were designed to find.

March 2026: Anthropic confronts the Pentagon and comes out ahead

In July 2025, Anthropic signed a $200 million contract with the U.S. Department of Defense. At first, it seemed like a standard cooperation. But in subsequent deployment negotiations, contradictions escalated quickly.

The Pentagon wanted “full access” to Claude on its GenAI.mil platform, for uses including all “lawful purposes”—which even covered fully autonomous weapon systems and large-scale domestic monitoring of U.S. citizens.

Anthropic drew red lines on two key issues and clearly refused; negotiations broke down in September 2025.

After that, the situation began to escalate rapidly. On February 27, 2026, Donald Trump posted on Truth Social, demanding that all federal agencies “immediately stop” using Anthropic’s technology, and calling the company “radical left-wing.”

On March 5, 2026, the U.S. Department of Defense formally labeled Anthropic as a “supply chain risk.”

This label had previously been reserved almost exclusively for foreign adversaries, such as Chinese companies or Russian entities; now, for the first time, it was applied to a U.S. company headquartered in San Francisco. At the same time, companies such as Amazon, Microsoft, and Palantir Technologies were required to prove that none of their military-related business uses Claude.

The explanation given by Pentagon CTO Emile Michael was that Claude might "contaminate the supply chain" because "policy preferences" are embedded inside the model. In other words, in the official framing, an AI that places restrictions on its use and will not unconditionally assist in lethal operations was itself regarded as a national security risk.

On March 26, 2026, Federal Judge Rita Lin issued a 43-page ruling that fully blocked the Pentagon’s measures.

In the ruling, she wrote: "There is nothing in existing law to support this kind of logic with 'Orwellian' undertones, under which a U.S. company can be tagged as a potential adversary simply because it disagrees with the government's position. Punishing Anthropic for subjecting the government's positions to public scrutiny is, in essence, textbook unlawful First Amendment retaliation." A friend-of-the-court brief went so far as to describe the Pentagon's actions as "attempting to murder a business."

As a result, the government's attempt to suppress Anthropic only brought it more attention. Claude's app surpassed ChatGPT in the App Store for the first time, and sign-ups at one point exceeded 1 million per day.

An AI company said “no” to the world’s most powerful military institution. And the court sided with it.

November 2025: the first-ever cyberattack led by AI

On November 14, 2025, Anthropic released a report that sent shockwaves through the industry.

The report disclosed that a hacker organization backed by the Chinese state used Claude Code to launch automated attacks against 30 institutions worldwide—targets included tech giants, banks, and multiple government agencies of several countries.

This was a key turning point: AI was no longer merely an assisting tool, but began to be used to independently carry out attack actions.

The key lay in the changed division of labor: humans were responsible only for selecting targets and approving key decisions, intervening only about 4 to 6 times during the entire operation. Everything else was done by AI: intelligence reconnaissance, vulnerability discovery, writing exploit code, data theft, implanting backdoors… 80%–90% of the attack process, running at thousands of requests per second, a scale and tempo no human team could match.

So how did they bypass Claude’s security safeguards? The answer is: they didn’t “break” it—they “tricked” it.

The attack was broken down into a large number of small tasks that looked harmless, and was packaged as “authorized defensive testing” by a “legitimate security company.” In essence, it was a social engineering attack—only this time, the deceived target was the AI itself.

Some of the attacks were completely successful. Without humans giving step-by-step instructions, Claude was able to autonomously draw a complete network topology, locate databases, and complete data extraction.

The only factor that slowed the attack tempo was the model's occasional "hallucinations": fabricating credentials, for example, or claiming to have obtained files that were in fact already public. For now, this remains one of the few natural obstacles to fully automated cyberattacks.

At RSA Conference 2026, Rob Joyce, the former head of cybersecurity at the U.S. National Security Agency, described the incident as a “Rorschach test”: half the people chose to ignore it, while the other half felt chilled. And he, obviously, belonged to the latter—“This is very scary.”


February 2026: 500 zero-day vulnerabilities found in a single run

On February 5, 2026, Anthropic released Claude Opus 4.6, along with a research paper that nearly shook the entire cybersecurity industry.

The experiment setup was extremely simple: place Claude in an isolated virtual machine environment, equipped with standard tools—Python, a debugger, and fuzz testing tools (fuzzers). No additional instructions, no complex prompts—just one line: “Go find vulnerabilities.”

The result: the model discovered more than 500 previously unknown high-severity zero-day vulnerabilities, some of which had gone undetected through decades of expert review and millions of hours of automated testing.

Then, at RSA Conference 2026, researcher Nicholas Carlini took the stage for a demonstration. He pointed Claude at Ghost, a CMS with 50,000 stars on GitHub and no serious vulnerability in its entire history.

Ninety minutes later, the results came in: blind SQL injection vulnerabilities that allowed unauthenticated users to gain full administrator privileges over the system.

Next, he used Claude to analyze the Linux kernel as well. The outcome was the same.

15 days later, Anthropic rolled out Claude Code Security, a security product that no longer relies on pattern matching, but instead uses “reasoning capability” to understand code security.

But even Anthropic's own spokesperson acknowledged the key fact that is so often avoided: "The same reasoning capability that helps Claude discover and fix vulnerabilities can also be used by attackers to exploit them."

The same capability, the same model—just in the hands of different people.

What does all of this add up to?

If viewed individually, each of these items could have been the biggest news of the month. But in just six months, they all happened at the same company.

Anthropic built a model that can find vulnerabilities faster than anyone; Chinese hackers turned the previous version into an automated cyber weapon; and the company is developing the next generation of even stronger models, admitting in its own internal documents that it feels uneasy about them.

The U.S. government tried to suppress it not because the technology itself is dangerous, but because Anthropic refused to hand over this capability without restrictions.

And throughout all of this, the company leaked its own source code twice via the same file in the same npm package. A company valued at $380 billion; a company aiming to complete a $6 billion IPO in October 2026; a company that has publicly said it is building "one of the most transformative, and possibly most dangerous, technologies in human history." And still it chose to keep moving forward.

Because they believe that if it must be done, it should be done by them rather than by others.

As for that source map in the npm package—it may be only the most absurd, yet most real, detail in the most unsettling narrative of this era.

And Mythos hasn’t even been officially released yet.

[Original Link]
