
# The Claude Code Leak: What We Learned from Anthropic’s NPM Packaging Error

## Introduction: The Day the Blueprints Leaked

On March 31, 2026, the cybersecurity landscape experienced a tectonic shift, not through a zero-day exploit or a nation-state breach but through a mundane operational oversight. Security researcher Chaofan Shou, an intern at Solayer Labs, discovered that Anthropic—a company then valued at $380 billion—had accidentally published the full internal source code for its flagship AI coding CLI, @anthropic-ai/claude-code, to the public npm registry. Specifically, version 2.1.88 included 59.8 MB of source maps that served as a “complete source disclosure,” effectively handing the world the blueprints to the most advanced agentic “harness” ever built.

The leak, comprising approximately 512,000 lines of original TypeScript across nearly 2,000 files, offered an unprecedented autopsy of frontier AI engineering. This incident was the second such failure in a short window for Anthropic, following the leak of a cybersecurity-focused model codenamed “Mythos.” Within hours, the repository was mirrored across GitHub, amassing over 41,500 forks before DMCA notices could even be drafted. To an AI architect, the code revealed a startling dichotomy: a sophisticated, paranoid architectural design straining under the weight of “vibe-coded” technical debt and a culture of aggressive, unreviewed shipping.

## The Anatomy of a Packaging Error: How Bun and .npmignore Failed

The technical root of the leak lies at the intersection of a long-standing runtime bug and a failure in release hygiene. In late 2024, Anthropic acquired Bun and transitioned Claude Code to use it as its primary bundler. However, Bun’s bundler suffers from Issue #28001, a bug that had been sitting open in the public tracker since early 2026: even with development: false set, the bundler still generates source maps and appends sourceMappingURL comments to its output. Reproducing the bug was documented as “trivial,” yet it remained unpatched in the production pipeline.

This bug alone would not have leaked the source code if the packaging configuration had been robust. However, the project’s .npmignore file was missing a critical exclusion: *.map. Because npm packaging defaults to including any file not explicitly excluded, the 59.8 MB cli.js.map file was packaged and uploaded. Unlike minified code, a source map that carries the sourcesContent field is a total disclosure of the original, unminified source, including every internal comment and system prompt. Compounding the exposure, the source map referenced a ZIP archive hosted on an Anthropic-owned Cloudflare R2 bucket, giving mirrors a secondary path to scrape the entire directory structure. This was not a failure of AI, but a failure of the “operational plumbing” that governs how software reaches the public.
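To make the disclosure mechanics concrete, here is a minimal TypeScript sketch of what any mirror could do with such a file. A Source Map v3 object that carries sourcesContent embeds the original files verbatim, so “recovering” the source is just a lookup; the map object below is a tiny stand-in, not the real cli.js.map.

```typescript
// Minimal sketch: recovering original source from a source map's
// sourcesContent field. Field names follow the Source Map v3 format;
// the sample object is invented for illustration.
interface SourceMapV3 {
  version: number;
  sources: string[];
  sourcesContent?: (string | null)[];
  mappings: string;
}

function recoverSources(map: SourceMapV3): Map<string, string> {
  const recovered = new Map<string, string>();
  // When sourcesContent is present, the original files are embedded
  // verbatim; no access to the build machine is needed.
  (map.sourcesContent ?? []).forEach((content, i) => {
    if (content !== null) recovered.set(map.sources[i], content);
  });
  return recovered;
}

// Example: a two-file map exposes both originals in full.
const leaked: SourceMapV3 = {
  version: 3,
  sources: ["src/main.tsx", "src/mcp/client.ts"],
  sourcesContent: ["// every internal comment survives", "// and every prompt"],
  mappings: "",
};

console.log(recoverSources(leaked).get("src/main.tsx"));
```

This is why stripping or excluding `.map` files is a release-hygiene concern, not merely a bandwidth one: the map is the source.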

## Inside the “Unhinged” Codebase: Engineering Culture at Anthropic


Analyzing the leaked code, internally codenamed “Tengu,” reveals a codebase that developers across Reddit and X have described as “absolutely unhinged,” with the frantic energy of a “3am side project” rather than a multi-billion-dollar enterprise product. The main.tsx file is a gargantuan 803,924 bytes across 4,683 lines of code. The print utility and message-handling files are similarly bloated, each exceeding 5,500 lines. This structural sprawl is symptomatic of “vibe coding”—a workflow where developers delegate generation to the AI and skip line-by-line review, a practice Anthropic’s own leads admitted to, stating that 100% of their contributions were written by Claude itself.

The technical debt is visible in the 460 eslint-disable comments and in the naming conventions used throughout the system. A recurring pattern is functions like writeFileSyncAndFlush_DEPRECATED(), which, despite the name, is called more than 50 times in production. More revealing is the getFeatureValue_CACHED_MAY_BE_STALE() pattern, which signals a latency-over-correctness priority and hints at a hard-won battle to keep the interactive loop fast. The internal comments provide a raw look at developer frustration, with notes like “// TODO: figure out why” in critical error handlers and an admission from an engineer named “Ollie” in mcp/client.ts that certain code might be “entirely pointless.” Perhaps most telling is the comment on the PRNG: “Mulberry32 — tiny seeded PRNG, good enough for picking ducks.”

## The Hidden Roadmap: KAIROS, ULTRAPLAN, and the YOLO Classifier

The leak revealed 44 hidden feature flags, many with gemstone-based codenames like Tengu_Cobalt_Frost (voice) and Tengu_Amber_Quartz (a voice kill switch). These flags outline an ambitious roadmap toward total agentic autonomy:

  • KAIROS: A persistent background daemon mode. KAIROS operates as an always-on assistant that monitors the developer’s environment (via SubscribePR and PushNotification) and intervenes proactively. The code specifies a strict 15-second blocking budget per autonomous cycle to ensure it doesn’t starve the main process.
  • ULTRAPLAN: This offloads complex reasoning to remote 30-minute sessions powered by Opus 4.6 (codenamed Fennec). This “remote thinking” mode allows the agent to build architectural plans in the cloud before “teleporting” them back to the user’s terminal.
  • Multi-Agent Coordinator: A mode that transforms Claude Code into an orchestrator of worker agents, utilizing parallel “scratchpads” to synthesize complex tasks.
  • The YOLO Classifier: Controlled by the TRANSCRIPT_CLASSIFIER flag, this is a fast ML-based permission system. Unlike rule-based gates, it uses an internal model to analyze the conversation transcript and decide whether to auto-approve tool calls, moving toward a “zero-interruption” autonomous state.
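The 15-second blocking budget attributed to KAIROS is a standard pattern for keeping a background daemon from starving an interactive process. The sketch below is a hypothetical illustration of that pattern under stated assumptions, not Anthropic’s code; every name in it is invented.

```typescript
// Hypothetical sketch of a per-cycle blocking budget like the one the
// leaked KAIROS code reportedly enforces (15 seconds per autonomous
// cycle). Names and structure are invented for illustration.
const CYCLE_BUDGET_MS = 15_000;

async function runAutonomousCycle(
  tasks: Array<() => Promise<void>>,
): Promise<number> {
  const deadline = Date.now() + CYCLE_BUDGET_MS;
  let completed = 0;
  for (const task of tasks) {
    // Stop picking up new work once the budget is spent, so the daemon
    // never monopolizes the process the interactive session needs.
    if (Date.now() >= deadline) break;
    await task();
    completed++;
  }
  return completed; // leftover tasks wait for the next cycle
}
```

Note the budget only gates *starting* new work; a single long task can still overrun, which is why real daemons usually pair a budget like this with per-task timeouts.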

The code also confirmed model version strings for Opus 4.7 and Sonnet 4.8, and revealed details for Capybara (the internal name for Claude Mythos), featuring a 1-million token context window variant (capybara-fast[1m]).

## The “Buddy” System: A Terminal-Based Tamagotchi

One of the most surprising finds was the /buddy system—a terminal-based Tamagotchi companion. While seemingly an April Fools’ gag (with a salt of “friend-2026-401”), it is a fully realized RPG-style feature. The system features 18 species, including “chonk,” “axolotl,” and “dragon,” with species assignment deterministically tied to the user’s ID hash. It includes gacha mechanics with a 1% legendary drop rate and “shiny” variants.

The “Buddy” feature also highlighted an architectural irony: Anthropic engineers hex-encoded the species names (e.g., 0x64, 0x75, 0x63, 0x6b for “duck”) to bypass their own internal build scanners. These scanners were apparently flagging the word “duck” as a collision with internal model codenames. This manual obfuscation, sitting inside a package that was otherwise completely exposed, demonstrates the fragmented nature of their internal security—blocking “duck” while leaving 512,000 lines of source code open to the registry.
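The two /buddy details above can be sketched in a few lines: deterministic species assignment from a user-ID hash, and the character-code trick that hides the literal “duck” from a string scanner. The salt and species names come from the article; the hash function (FNV-1a) and the combination scheme are assumptions for illustration.

```typescript
// Illustrative sketch only; not the leaked implementation.
const SALT = "friend-2026-401";
const SPECIES = ["chonk", "axolotl", "dragon"]; // 3 of the reported 18

// FNV-1a, a small 32-bit string hash (an assumed choice).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// The same user ID always maps to the same species.
function assignSpecies(userId: string): string {
  return SPECIES[fnv1a(SALT + userId) % SPECIES.length];
}

// The scanner-evasion trick: assemble "duck" from char codes so the
// literal string never appears in the source text.
const hidden = String.fromCharCode(0x64, 0x75, 0x63, 0x6b);
console.log(assignSpecies("user-123"), hidden);
```

The hex trick only defeats a scanner that greps for literals; any tool that evaluates the code, or a human reading it, sees “duck” immediately, which underlines how shallow the protection was.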

## Architectural Mastery: Memory Systems and Anti-Distillation

Despite the “unhinged” organization, the core architecture of Claude Code reveals deep mastery in AI state management. The memory system utilizes a three-layer hierarchy to manage context without saturating the window. A MEMORY.md index acts as a permanent pointer system, while detailed Topic Files are fetched on-demand. This is managed by the “Dream” system (autoDream), a background subagent that performs reflective memory consolidation in a four-phase cycle: Orient → Gather → Consolidate → Prune. These “dreams” are triggered by specific gates—such as 24 hours passing or at least 5 active sessions—ensuring the agent “thinks” about the user while they are idle.
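Assuming the Dream gates combine as alternatives (the article lists “24 hours passing or at least 5 active sessions”), the trigger check reduces to a couple of comparisons. The field names below are invented; this is a reading of the described behavior, not the leaked code.

```typescript
// Minimal sketch of the reported autoDream trigger gates.
interface DreamState {
  lastDreamAt: number;   // epoch ms of the last consolidation pass
  sessionsSince: number; // sessions completed since that pass
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Fire when either gate opens: a day has passed, or enough sessions
// have accumulated to make consolidation worthwhile.
function shouldDream(state: DreamState, now: number): boolean {
  return now - state.lastDreamAt >= DAY_MS || state.sessionsSince >= 5;
}
```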

Security at the architectural level is notably paranoid. The code includes an Anti-distillation mode (ANTI_DISTILLATION_CC) designed to poison the training data of competitors by injecting fake tool definitions into system prompts. Furthermore, a Client Attestation system ensures that only legitimate Claude Code binaries can access the API. This mechanism has the Bun runtime replace a cch=00000 placeholder with a cryptographic hash. The code also features a “Safeguards Team” instruction block explicitly owned by David Forsythe and Kyla Guru, warning that the instructions must not be modified without their review. Finally, the use of prctl(PR_SET_DUMPABLE, 0) shows extreme concern regarding local token theft, preventing other processes from reading heap memory to steal session keys.
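The attestation step described above—a build-time replacement of a fixed placeholder with a verifiable hash—can be sketched as follows. The cch=00000 placeholder comes from the article; the hash input, algorithm, and truncation here are assumptions made purely to show the shape of the technique.

```typescript
import { createHash } from "node:crypto";

// Hedged sketch of placeholder stamping: the build replaces a known
// marker in the artifact with a digest the server can later verify.
// Not Anthropic's scheme; the inputs are illustrative.
function stampAttestation(artifact: string, buildSecret: string): string {
  const digest = createHash("sha256")
    .update(buildSecret)
    .digest("hex")
    .slice(0, 5); // keep the stamp the same width as the placeholder
  return artifact.replace("cch=00000", `cch=${digest}`);
}
```

Keeping the stamp the same width as the placeholder matters in practice: it lets the build patch the bytes in place without shifting offsets in an already-bundled binary.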

## The Irony of “Undercover Mode”

The crowning irony of the leak is the undercover.ts module. This subsystem was built specifically to maintain secrecy by preventing Anthropic engineers from accidentally leaking internal codenames—like Capybara or Tengu—into public GitHub commits. The “Undercover Mode” system prompt instructs the AI to strip all internal information and co-author attributions, enabling “ghost-contributions” to open-source projects without revealing that the code was AI-generated by Anthropic staff.

The activation logic for this module is “fail-deadly”: it is active by default UNLESS the repository remote matches an internal allowlist. The fact that the internal codenames this system was designed to protect were leaked by the very package containing the protection module is a stark reminder of the “plumbing problem.” Even the most sophisticated secrecy subsystem is useless if the release script isn’t configured correctly.
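The default-on activation rule described above amounts to a single predicate: active unless the remote is recognized as internal. The sketch below illustrates that rule with placeholder hosts; the real allowlist entries were not part of the public reporting and are not reproduced here.

```typescript
// Sketch of the "fail-deadly" activation rule: undercover mode is ON
// by default, and only a recognized internal remote disables it.
// Allowlist entries are placeholders, not real Anthropic hosts.
const INTERNAL_REMOTES = ["git.internal.example", "code.corp.example"];

function undercoverActive(remoteUrl: string): boolean {
  return !INTERNAL_REMOTES.some((host) => remoteUrl.includes(host));
}
```

The design choice is sound on its own terms: an unknown remote is treated as public, so a misconfigured repo leaks nothing. The irony is that the packaging pipeline made the opposite choice, defaulting to inclusion.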

## The Security Aftermath: Malware Lures and Supply Chain Risks

The fallout was immediate and multi-vector. Within 24 hours, threat actors weaponized the “hype” surrounding the leak to distribute Vidar stealer and GhostSocks proxy malware. Malicious GitHub mirrors appeared, using README files that promised “unlocked enterprise features” and “leaked models” to entice curious developers. One prominent campaign utilized a Rust-compiled dropper (TradeAI.exe) hidden within trojanized archives like ClaudeCode_x64.7z. These droppers implemented sophisticated anti-analysis checks, searching for hostnames like “Wasp” or “MARS” and blacklisting research usernames like “Janet Van Dyne.”

This incident occurred amidst a broader 2026 supply chain crisis. In the same two-week window, Trivy suffered a GitHub Actions compromise, LiteLLM issued emergency security updates, and axios was hit by malicious versions delivering remote access trojans. The Claude leak served as the tipping point, leading to “package squatting” where attackers registered names mimicking Anthropic’s internal tools to target anyone trying to compile the leaked source. This has forced competitors into a “contamination” crisis; any AI lab that viewed the code is now legally compromised, prompting the rise of clean-room efforts like Claw-Code, which attempts to replicate the CLI’s behavior solely from test suite analysis.

## Lessons for the Modern Enterprise: Moving Beyond “Vibe Coding”

The Claude Code leak provides a critical playbook for securing modern AI development pipelines. The error was operational, not a failure of the models, but it highlights that “vibe coding” without operational guardrails is a recipe for disaster. Organizations must implement the following “Artifact Hygiene” standards:

  • Audit Publish Outputs: Teams must mandate the use of npm pack --dry-run in CI/CD to inspect the actual contents of the tarball before it reaches the registry.
  • Whitelist by Default: Abandon the “blocklist” approach of .npmignore. Instead, use the files field in package.json to explicitly whitelist only necessary production artifacts.
  • Implement Automated Scanners: Use frameworks like VibeGuard to detect residual source maps, hardcoded secrets, and “configuration drift” where new files are added but exclusion rules are not updated.
  • Isolation of Development Environments: Move development workloads into Cloud Development Environments (CDEs). By isolating the workspace, the “blast radius” of a compromised dependency or a malicious “leaked” tool is contained, preventing pivots to the corporate network.
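The first recommendation is easy to automate. In recent npm versions, `npm pack --dry-run --json` emits a JSON report of every file that would ship; a CI gate can parse it and fail on source maps or other unexpected artifacts. The sketch below hardcodes a simplified sample report in place of the real command output.

```typescript
// CI-style audit sketch: fail the build if the packed tarball would
// include source maps or dotfiles. The report shape is simplified;
// in CI you would feed in real `npm pack --dry-run --json` output.
interface PackedFile { path: string; size: number; }
interface PackReport { files: PackedFile[]; }

function findForbidden(report: PackReport): string[] {
  return report.files
    .map((f) => f.path)
    .filter((p) => p.endsWith(".map") || p.startsWith("."));
}

const sample: PackReport = {
  files: [
    { path: "cli.js", size: 1024 },
    { path: "cli.js.map", size: 59_800_000 }, // the leak vector
  ],
};

const offenders = findForbidden(sample);
if (offenders.length > 0) {
  console.error("Refusing to publish; forbidden artifacts:", offenders);
}
```

A check like this also catches the “configuration drift” failure mode: a new build output appears, nobody updates the exclusion rules, and the gate trips before the registry ever sees the file.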

## Conclusion: What This Means for the Future of Claude

The 2026 Claude Code leak was a watershed moment. While no customer data was lost, the leak provided a “blueprint” of the world’s most advanced AI coding tool. It proved that Anthropic is building exceptionally serious architecture—evidenced by the KAIROS daemon and the reflective “Dream” subagents—but is doing so with a culture that occasionally trades basic operational security for raw shipping velocity. As we move further into the era of autonomous agents, the industry must recognize that the most dangerous vulnerability isn’t always in the code; often, it’s in the configuration file that was never reviewed.
