April 7, 2026
Anthropic Leaked Twice in One Week — And It Says More About the Industry Than Either Leak
Anthropic exposed internal Mythos files, then accidentally published Claude Code. The bigger story is how thin AI labs' safety operations still are.
On March 26, 2026, roughly 3,000 internal Anthropic files appeared publicly via a CMS misconfiguration. Among them was a draft document describing a model called Claude Mythos, characterized internally as "by far the most powerful AI model we've ever developed": a full tier above Opus, with dramatic improvements in coding, academic reasoning, and cybersecurity. The same document noted that Mythos carries "unprecedented cybersecurity risks." Anthropic confirmed the model's existence but has given no release date.
Five days later, on March 31, Anthropic accidentally published 512,000 lines of TypeScript in an npm package update. The package was @anthropic-ai/claude-code v2.1.88. The source included the full client-side CLI agent harness, permission enforcement logic, sandboxing architecture, and 44 hidden feature flags.
Two leaks. One week.
What the leaks actually reveal
The Mythos leak is alarming for a specific reason: it's a safety-first company accidentally publishing a safety warning about its own unreleased product. Anthropic exists because its founders believed they were building something potentially dangerous and wanted to be the ones doing it carefully. A leaked document warning of "unprecedented cybersecurity risks" attached to an unreleased model is not a reassuring signal about how careful the process is.
Context worth noting: Opus 4.6, their current flagship, can already autonomously find zero-day software vulnerabilities. In February 2026, 16 Opus agents collaborated to write a C compiler in Rust capable of compiling the Linux kernel. The cost: roughly $20,000. Mythos is described as dramatically more capable than Opus.
The Claude Code leak is a different kind of problem. The source of an agentic coding tool, now public, gives researchers and adversaries a detailed map of how the system enforces permissions, what it sandboxes, and what the 44 feature flags control. Security through obscurity was never a real strategy, but losing it early is still a cost.
The pattern underneath both leaks
Both leaks share a root cause: infrastructure and process didn't keep pace with the speed of development. A CMS misconfiguration. An npm publish without a pre-flight check. These aren't exotic attacks. They're the kind of mistakes any engineering team makes -- the difference is the material being exposed.
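A guard against the npm side of this failure mode is cheap to build. Here is a minimal sketch of a publish pre-flight check in TypeScript, wired in as an npm "prepublishOnly" script. The allowlist patterns are hypothetical stand-ins, not Anthropic's actual layout, and nothing here is claimed about how their pipeline works.

```typescript
// prepublish-check.ts -- abort `npm publish` if the tarball would
// include anything outside an explicit allowlist.
// The ALLOWED patterns below are illustrative assumptions.
import { execSync } from "node:child_process";

// Only these paths should ever ship in the published package.
const ALLOWED = [/^package\.json$/, /^README\.md$/, /^LICENSE$/, /^dist\//];

// `npm pack --dry-run --json` reports exactly which files
// `npm publish` would upload, without writing a tarball.
const report = JSON.parse(
  execSync("npm pack --dry-run --json", { encoding: "utf8" })
);

const files: { path: string }[] = report[0].files;
const unexpected = files.filter(
  (f) => !ALLOWED.some((pattern) => pattern.test(f.path))
);

if (unexpected.length > 0) {
  console.error("Refusing to publish; unexpected files in tarball:");
  for (const f of unexpected) console.error(`  ${f.path}`);
  process.exit(1);
}
console.log(`Tarball clean: ${files.length} files, all allowlisted.`);
```

The even simpler first line of defense is npm's own "files" allowlist in package.json, which limits what gets packed in the first place; a script like the one above just makes a mistake loud instead of silent.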
OpenClaw users in particular should take note: the Claude Code source confirms that the agentic harness has a meaningful permission enforcement layer. That's not theoretical -- the code is now public. If you're running Claude-backed agents with broad permissions, understanding that layer is now possible and worth doing.
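The mechanics of that audit are straightforward. Below is a minimal sketch in TypeScript that pulls the published tarball for the version named above and reports which files mention permission-related terms. It assumes the version is still fetchable from the registry (packages are often unpublished after incidents like this), and the search terms are guesses about naming, not confirmed identifiers from the source.

```typescript
// inspect-claude-code.ts -- fetch a specific published version of the
// package and report which files mention permission-related terms.
// SPEC is the version named in the reporting; TERMS are assumptions.
import { execSync } from "node:child_process";
import { readFileSync, readdirSync, statSync } from "node:fs";
import { join } from "node:path";

const SPEC = "@anthropic-ai/claude-code@2.1.88";
const TERMS = ["permission", "sandbox", "featureFlag"]; // guessed names

// Download the exact published tarball; --json reports its filename.
const [{ filename }] = JSON.parse(
  execSync(`npm pack ${SPEC} --json`, { encoding: "utf8" })
);
execSync(`mkdir -p pkg && tar -xzf ${filename} -C pkg`);

// Walk the extracted tree, yielding JS/TS source files.
function* walk(dir: string): Generator<string> {
  for (const name of readdirSync(dir)) {
    const p = join(dir, name);
    if (statSync(p).isDirectory()) yield* walk(p);
    else if (/\.(ts|js|mjs|cjs)$/.test(name)) yield p;
  }
}

for (const file of walk("pkg")) {
  const text = readFileSync(file, "utf8");
  for (const term of TERMS) {
    if (text.includes(term)) console.log(`${term}: ${file}`);
  }
}
```

None of this requires anything beyond npm and tar. The point is that the barrier to studying the permission layer is now effectively zero, for defenders and attackers alike.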
The honest summary
Anthropic is building faster than its operational security practices are maturing. That's not a unique criticism of Anthropic -- it's a systemic description of the current AI industry. But Anthropic's specific positioning as the safety-conscious lab makes the gap more visible.
The leaks don't tell us Mythos is dangerous. They tell us the system that's supposed to manage its risk had two significant failures in one week.
That's worth watching.