Understanding the Source Code of Anthropic's Claude Code in One Article: Why Is It Just Better Than Others?

Yuker|Published on: April 9, 2026

On March 31, 2026, security researcher Chaofan Shou discovered that the source map files in the Claude Code package published by Anthropic to npm were not stripped.

This means: The complete TypeScript source code of Claude Code, roughly 512,000 lines across 1,903 files, was exposed on the public internet just like that.

Of course, I couldn't possibly read that much code in just a few hours. Therefore, I approached reading this source code with three questions:

What is the fundamental difference between Claude Code and other AI programming tools?

Why does writing code with it just feel better than with other tools?

What exactly is hidden within the 510,000 lines of code?

After reading it, my first reaction was: This is not an AI programming assistant, this is an operating system.

Part One: First, a story: If you were to hire a remote programmer

Imagine you hire a remote programmer and give them remote access to your computer.

What would you do?

If you follow Cursor's approach: You have them sit next to you, and every time they want to type a command, you take a look and click "Allow." Simple and crude, but you have to keep watching.

If you follow GitHub Copilot Agent's approach: You give them a brand new virtual machine and let them mess around in it freely. When they're done, they submit the code, and you review and merge it. Safe, but they can't see your local environment.

If you follow Claude Code's approach:

You let them use your computer directly, but you equip them with an extremely precise security inspection system. It decides what they can do, what they can't do, which operations require your nod, and which they can do on their own. Even if they want to run rm -rf, the command has to pass 9 layers of review before it executes.

These are three completely different security philosophies.

Why did Anthropic choose the hardest path?

Because only in this way can the AI use your terminal, your environment, your configuration to work—this is "truly helping you write code," not "writing a piece of code in a clean room and then copying it over."

But what is the cost? They wrote 510,000 lines of code for this.

Part Two: The Claude Code you imagine vs. the actual Claude Code

Most people think AI programming tools are like this:

plaintext
User input → Call LLM API → Return result → Display to user

Claude Code is actually like this:

plaintext
User input
  → Dynamically assemble 7-layer system prompts
  → Inject Git status, project conventions, historical memory
  → 42 tools each come with a user manual
  → LLM decides which tool to use
  → 9-layer security review (AST parsing, ML classifier, sandbox check...)
  → Permission conflict resolution (local keyboard / IDE / Hook / AI classifier competing simultaneously)
  → 200ms anti-misclick delay
  → Execute tool
  → Result streamed back
  → Context approaching limit? → Three-layer compression (micro-compression → auto-compression → full compression)
  → Need parallelism? → Generate sub-Agent swarm
  → Loop until task completed

You are probably wondering what all of that means. Don't worry, let's break it down one by one.

Part Three: The first secret: Prompts are not written, they are "assembled"

Open src/constants/prompts.ts, and you'll see this function:

typescript
export async function getSystemPrompt(
  tools: Tools,
  model: string,
  additionalWorkingDirectories?: string[],
  mcpClients?: MCPServerConnection[],
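): Promise<string[]> {
  // (Setup of outputStyleConfig, enabledTools, and resolvedDynamicSections is omitted in this excerpt.)
  return [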
    // --- Static content (cacheable) ---
    getSimpleIntroSection(outputStyleConfig),
    getSimpleSystemSection(),
    getSimpleDoingTasksSection(),
    getActionsSection(),
    getUsingYourToolsSection(enabledTools),
    getSimpleToneAndStyleSection(),
    getOutputEfficiencySection(),

    // === Cache boundary ===
    ...(shouldUseGlobalCacheScope() ? [SYSTEM_PROMPT_DYNAMIC_BOUNDARY] : []),

    // --- Dynamic content (different each time) ---
    ...resolvedDynamicSections,
  ].filter(s => s !== null)
}

Notice that SYSTEM_PROMPT_DYNAMIC_BOUNDARY?

This is a cache demarcation line. The content above the line is static; the Claude API can cache them to save token costs. The content below the line is dynamic—your current Git branch, your CLAUDE.md project configuration, the preference memory you told it before... each conversation is different.

What does this mean?

Anthropic treats prompts like compiler output and optimizes them accordingly. The static part is the "compiled binary," the dynamic part is the "runtime parameters." The benefits of doing this are:

Saves money: The static part uses caching, not billed repeatedly.

Fast: Cache hits skip processing these tokens directly.

Flexible: The dynamic part allows each conversation to perceive the current environment.
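To make the static/dynamic split concrete, here is a minimal sketch of the idea. It is not the code from the source: `assemblePrompt`, `splitAtBoundary`, and the marker string are hypothetical names chosen for illustration. Only the idea of a boundary element separating cacheable sections from per-conversation sections comes from the excerpt above.

```typescript
// Hypothetical marker; the real value of SYSTEM_PROMPT_DYNAMIC_BOUNDARY is not shown in the excerpt.
const BOUNDARY = "__DYNAMIC_BOUNDARY__";

type Section = string | null;

// Static sections first, then the boundary, then per-conversation sections.
// Null sections (disabled features) are dropped, as in the excerpt's filter call.
function assemblePrompt(staticParts: Section[], dynamicParts: Section[]): string[] {
  return [...staticParts, BOUNDARY, ...dynamicParts].filter(
    (s): s is string => s !== null,
  );
}

// Everything before the boundary is identical across conversations, so it can be cached.
function splitAtBoundary(sections: string[]): { cacheable: string[]; dynamic: string[] } {
  const i = sections.indexOf(BOUNDARY);
  if (i === -1) return { cacheable: [], dynamic: sections };
  return { cacheable: sections.slice(0, i), dynamic: sections.slice(i + 1) };
}

const mainBranch = splitAtBoundary(
  assemblePrompt(["intro", "tool rules"], ["branch: main", null]),
);
const featureBranch = splitAtBoundary(
  assemblePrompt(["intro", "tool rules"], ["branch: feature-x"]),
);
// mainBranch.cacheable and featureBranch.cacheable are both ["intro", "tool rules"];
// only the dynamic parts differ between the two conversations.
```

The point of the sketch is that two conversations on different branches still share an identical leading segment, which is what makes caching that segment worthwhile.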

⛏️Each tool has an independent "user manual"

What shocked me even more: Each tool directory has a prompt.ts file—this is a user manual written specifically for the LLM to read.

Look at BashTool's (src/tools/BashTool/prompt.ts, about 370 lines):

plaintext
Git Safety Protocol:
- NEVER update the git config
- NEVER run destructive git commands (push --force, reset --hard,
  checkout .) unless the user explicitly requests
- NEVER skip hooks (--no-verify) unless the user explicitly requests
- CRITICAL: Always create NEW commits rather than amending

This is not documentation written for humans; these are behavioral rules written for the AI. Every time Claude Code starts, these rules are injected into the system prompt.

This is why Claude Code doesn't run git push --force on its own, while some other tools do: it's not that the model is smarter, it's that the rules have been spelled out in the prompt.

And the internal Anthropic version is different from the one you use

Code like this appears frequently:

typescript
const minimalUniquenessHint =
  process.env.USER_TYPE === 'ant'
    ? '\n- Use the smallest old_string that\'s clearly unique'
    : ''

ant refers to Anthropic internal employees. Their version has more detailed code style guidance ("don't write comments unless the WHY is not obvious"), more aggressive output strategies ("inverted pyramid writing"), and some experimental features still in A/B testing (Verification Agent, Explore & Plan Agent).

This shows that Anthropic itself is Claude Code's biggest user. They are using their own product to develop their own product.

Part Four: The second secret: 42 tools, but you only see the tip of the iceberg

Open src/tools.ts, and you'll see the tool registry:

typescript
export function getAllBaseTools(): Tools {
  return [
    AgentTool,
    BashTool,
    FileReadTool, FileEditTool, FileWriteTool,
    GlobTool, GrepTool,
    WebFetchTool, WebSearchTool,
    TodoWriteTool, NotebookEditTool,
    // ... A large number of conditionally loaded tools ...
    ...(isToolSearchEnabledOptimistic() ? [ToolSearchTool] : []),
  ]
}

There are 42 tools, but you have never seen most of them directly, because many are lazily loaded: they are injected on demand via the ToolSearchTool only when the LLM needs them.

Why do this?

Because for each additional tool, the system prompt needs another paragraph of description, costing more tokens. If you just want Claude Code to help you change one line of code, it doesn't need to load the "scheduled task scheduler" and "team collaboration manager."
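A rough sketch of what on-demand loading could look like follows. The `ToolRegistry` class, the `deferred` flag, the keyword matching, and the `SchedulerTool` name are all assumptions made for illustration; the article only tells us that deferred tools are injected through a search tool when needed.

```typescript
interface ToolSpec {
  name: string;
  description: string; // the text that would cost tokens in the system prompt
  deferred: boolean;   // hypothetical flag: true means "not loaded until searched for"
}

class ToolRegistry {
  private loaded = new Map<string, ToolSpec>();
  private deferredTools: ToolSpec[] = [];

  constructor(tools: ToolSpec[]) {
    for (const t of tools) {
      if (t.deferred) this.deferredTools.push(t);
      else this.loaded.set(t.name, t);
    }
  }

  // Only loaded tools contribute descriptions to the prompt.
  promptToolNames(): string[] {
    return Array.from(this.loaded.keys());
  }

  // Stand-in for a tool-search call: naive keyword match, then load the hits.
  search(keyword: string): string[] {
    const needle = keyword.toLowerCase();
    const hits = this.deferredTools.filter((t) =>
      t.description.toLowerCase().includes(needle),
    );
    for (const t of hits) this.loaded.set(t.name, t);
    this.deferredTools = this.deferredTools.filter((t) => !hits.includes(t));
    return hits.map((t) => t.name);
  }
}

const registry = new ToolRegistry([
  { name: "Bash", description: "Run shell commands", deferred: false },
  { name: "SchedulerTool", description: "Schedule a recurring task", deferred: true },
]);
const namesBefore = registry.promptToolNames(); // ["Bash"]
const searchHits = registry.search("schedule"); // ["SchedulerTool"]
const namesAfter = registry.promptToolNames();  // ["Bash", "SchedulerTool"]
```

Until the search runs, the deferred tool's description never reaches the prompt, which is where the token saving comes from.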

There's an even smarter design:

typescript
if (isEnvTruthy(process.env.CLAUDE_CODE_SIMPLE)) {
  const simpleTools: Tool[] = [BashTool, FileReadTool, FileEditTool]
  return filterToolsByDenyRules(simpleTools, permissionContext)
}

Set CLAUDE_CODE_SIMPLE=true, and Claude Code is left with only three tools: Bash, read file, edit file. This is a backdoor for minimalists.

1️⃣All tools come from the same factory

typescript
const TOOL_DEFAULTS = {
  isEnabled: () => true,
  isConcurrencySafe: (_input?) => false,    // Default: unsafe
  isReadOnly: (_input?) => false,            // Default: will write
  isDestructive: (_input?) => false,
}

export function buildTool<D extends AnyToolDef>(def: D): BuiltTool<D> {
  return { ...TOOL_DEFAULTS, userFacingName: () => def.name, ...def }
}

Note those default values: isConcurrencySafe defaults to false, isReadOnly defaults to false.

This is a fail-closed design—if a tool author forgets to declare safety properties, the system assumes it is "unsafe, will write." Better to be overly conservative than to miss a risk.
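Here is a small sketch of how such defaults play out for a caller. `buildSimpleTool` and `canRunInParallel` are simplified, hypothetical stand-ins, not the real `buildTool` or any scheduler from the source; they only illustrate the effect of conservative defaults.

```typescript
interface ToolDef {
  name: string;
  isConcurrencySafe?: () => boolean;
  isReadOnly?: () => boolean;
}

type BuiltTool = Required<ToolDef>;

// Conservative defaults, mirroring the excerpt: assume a tool writes and cannot run concurrently.
const DEFAULTS = {
  isConcurrencySafe: () => false,
  isReadOnly: () => false,
};

// Defaults first, then the author's definition, so explicit declarations win.
function buildSimpleTool(def: ToolDef): BuiltTool {
  return { ...DEFAULTS, ...def };
}

// A caller that batches only tools that explicitly opted in to concurrency.
function canRunInParallel(tools: BuiltTool[]): boolean {
  return tools.every((t) => t.isConcurrencySafe());
}

const grep = buildSimpleTool({
  name: "Grep",
  isConcurrencySafe: () => true,
  isReadOnly: () => true,
});
const forgetful = buildSimpleTool({ name: "NewTool" }); // author declared nothing

const parallelOk = canRunInParallel([grep, grep]);         // true
const parallelMixed = canRunInParallel([grep, forgetful]); // false: undeclared is treated as unsafe
```

A tool author who forgets the flags loses some parallelism, but never gains a permission they did not ask for.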

2️⃣The iron rule of "read before edit"

typescript
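function getPreReadInstruction(): string {
  // Added to the edit tool's instructions so the model knows an edit requires a prior read.
  // The string is concatenated because a single-quoted literal cannot span multiple lines.
  return (
    '\n- You must use your `Read` tool at least once in the ' +
    'conversation before editing. This tool will error if you attempt ' +
    'an edit without reading the file.'
  )
}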

The FileEditTool checks if you have already read this file using the FileReadTool. If not, it directly reports an error and won't allow editing.

This is why Claude Code won't "write a piece of code out of thin air to overwrite your file" like some other tools do—it is forced to understand first, then modify.
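The enforcement side can be sketched like this. The `FileSession` class and its in-memory file map are hypothetical; the article does not show how the real tool tracks which files have been read, so treat this as one possible shape under that assumption.

```typescript
// Hypothetical per-conversation tracker; the source's actual bookkeeping is not shown in the article.
class FileSession {
  private readPaths = new Set<string>();
  private files: Map<string, string>;

  constructor(initial: Record<string, string>) {
    this.files = new Map(Object.entries(initial));
  }

  read(path: string): string {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`No such file: ${path}`);
    this.readPaths.add(path); // remember that the model has seen this file
    return content;
  }

  edit(path: string, oldString: string, newString: string): void {
    if (!this.readPaths.has(path)) {
      throw new Error(`Refusing to edit ${path}: read it first`);
    }
    const content = this.files.get(path) ?? "";
    if (!content.includes(oldString)) throw new Error("old_string not found");
    // Replacer function avoids special "$" patterns in newString; replaces the first match only.
    this.files.set(path, content.replace(oldString, () => newString));
  }
}

const session = new FileSession({ "app.ts": "const port = 3000" });
let blocked = false;
try {
  session.edit("app.ts", "3000", "8080");
} catch {
  blocked = true; // edit before read is rejected
}
session.read("app.ts");
session.edit("app.ts", "3000", "8080");
const updated = session.read("app.ts"); // "const port = 8080"
```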

Part Five: The third secret: The memory system—why it can "remember you"

People who have used Claude Code have a feeling: It seems to really know you.

You tell it "don't mock the database in tests," and in the next conversation, it won't mock again. You tell it "I'm a backend engineer, a React newbie," and when explaining frontend code, it will use backend analogies.

Behind this is a complete memory system.

1️⃣Using AI to retrieve memories

typescript
const SELECT_MEMORIES_SYSTEM_PROMPT =
  `You are selecting memories that will be useful to Claude Code.
   Return a list of filenames for the memories that will clearly
   be useful (up to 5).
   - If you are unsure if a memory will be useful, do not include it.
   - If a list of recently-used tools is provided, do not select
     memories that are usage reference for those tools. DO still
     select memories containing warnings, gotchas, or known issues.`

Claude Code uses another AI (Claude Sonnet) to decide "which memories are relevant to the current conversation."

Not keyword matching, not vector search—it's letting a smaller model quickly scan the titles and descriptions of all memory files, select up to 5 most relevant ones, and then inject their full content into the context of the current conversation.

The strategy is "precision over recall"—better to miss a potentially useful memory than to stuff in an irrelevant memory that pollutes the context.
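On the receiving end, a precision-first policy might be backed by a guard like the one below. This is an assumption on my part: `sanitizeSelection` and `MemoryHeader` are hypothetical names, and the article does not show how the selector's reply is post-processed. The sketch only illustrates capping the list at 5 and discarding anything that does not match a known memory file.

```typescript
interface MemoryHeader {
  filename: string;
  description: string;
}

const MAX_SELECTED = 5; // "up to 5", per the selector prompt above

// Keeps only filenames that really exist, drops duplicates, and caps the list at 5.
function sanitizeSelection(selectorOutput: string[], known: MemoryHeader[]): string[] {
  const valid = new Set(known.map((m) => m.filename));
  const picked: string[] = [];
  for (const name of selectorOutput) {
    if (valid.has(name) && !picked.includes(name)) picked.push(name);
    if (picked.length === MAX_SELECTED) break;
  }
  return picked;
}

const knownMemories: MemoryHeader[] = ["m1", "m2", "m3", "m4", "m5", "m6"].map((n) => ({
  filename: `${n}.md`,
  description: `notes for ${n}`,
}));

const selected = sanitizeSelection(
  ["m2.md", "ghost.md", "m2.md", "m1.md", "m3.md", "m4.md", "m5.md", "m6.md"],
  knownMemories,
);
// selected is ["m2.md", "m1.md", "m3.md", "m4.md", "m5.md"]:
// the unknown name and the duplicate are dropped, and the sixth valid file is cut by the cap.
```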

⏰KAIROS mode: "Dreaming" at night

This is the part that feels most sci-fi to me.

There is a feature flag in the code called KAIROS. In this mode, memories from long sessions are not stored in structured files but in append-only logs by date. Then, a /dream skill runs during "nighttime" (low activity periods) to distill these raw logs into structured topic files.

plaintext
logs/2026/03/2026-03-30.md  ← Today's raw log
        ↓ /dream distillation
memory/user_preferences.md  ← Structured user preference file
memory/project_context.md   ← Structured project context file

The AI organizes memories while "sleeping." This is no longer just engineering; this is bionics.
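The data flow can be sketched in a few lines. Both functions below are hypothetical: `dailyLogPath` only reproduces the path layout shown in the diagram, and `distill` is a deliberately dumb stand-in that groups entries by topic, whereas the article says the real distillation is done by a /dream skill.

```typescript
// Builds a path like logs/2026/03/2026-03-30.md from a date (interpreted as UTC).
function dailyLogPath(date: Date): string {
  const iso = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  const [year, month] = iso.split("-");
  return `logs/${year}/${month}/${iso}.md`;
}

interface LogEntry {
  topic: string; // hypothetical label, e.g. "user_preferences"
  note: string;
}

// Stand-in for distillation: collect raw, append-only entries into per-topic lists.
function distill(entries: LogEntry[]): Record<string, string[]> {
  const byTopic: Record<string, string[]> = {};
  for (const e of entries) {
    const list = byTopic[e.topic] ?? [];
    list.push(e.note);
    byTopic[e.topic] = list;
  }
  return byTopic;
}

const logPath = dailyLogPath(new Date(Date.UTC(2026, 2, 30)));
// logPath is "logs/2026/03/2026-03-30.md"

const topics = distill([
  { topic: "user_preferences", note: "Do not mock the database in tests" },
  { topic: "project_context", note: "Backend is the user's home turf" },
  { topic: "user_preferences", note: "Prefers backend analogies for React" },
]);
// topics.user_preferences has two notes, topics.project_context has one.
```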

Part Six: The fifth secret: It's not one Agent, it's a swarm

When you ask Claude Code to do a complex task, it might quietly do this:

typescript
// AgentTool's input schema
z.object({
  description: z.string().describe('A short (3-5 word) description'),
  prompt: z.string().describe('The task for the agent to perform'),
  subagent_type: z.string().optional(),
  model: z.enum(['sonnet', 'opus', 'haiku']).optional(),
  run_in_background: z.boolean().optional(),
})

It spawns a sub-Agent.

Moreover, the sub-Agent has strict "self-awareness" injected to prevent it from recursively generating more sub-Agents:

typescript
export function buildChildMessage(directive: string): string {
  return `STOP. READ THIS FIRST.

You are a forked worker process. You are NOT the main agent.

RULES (non-negotiable):
1. Your system prompt says "default to forking." IGNORE IT —
   that's for the parent. You ARE the fork.
   Do NOT spawn sub-agents; execute directly.
2. Do NOT converse, ask questions, or suggest next steps
3. USE your tools directly: Bash, Read, Write, etc.
4. Keep your report under 500 words.
5. Your response MUST begin with "Scope:". No preamble.`
}

This code is saying: "You are a worker, not a manager. Don't think about hiring more people, do the work yourself."

👤Coordinator mode: Manager mode

In coordinator mode, Claude Code becomes a pure task orchestrator, not doing the work itself, only allocating:

plaintext
Phase 1: Research    → 3 workers parallel search the codebase
Phase 2: Synthesis   → Main Agent synthesizes understanding of all findings
Phase 3: Implementation → 2 workers modify different files respectively
Phase 4: Verification   → 1 worker runs tests
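A coordinator following a plan like this has to decide which tasks may share a phase. The sketch below assumes a simple rule: read-only tasks can all run together, while write tasks are queued per target file so that two workers never touch the same file at once. `planTasks` and its types are hypothetical, and letting queues for different files run side by side is my assumption rather than something the article states.

```typescript
interface Task {
  id: string;
  readOnly: boolean;
  file?: string; // target file for write tasks
}

interface Plan {
  parallel: string[];                     // task ids that may all run at once
  serialByFile: Record<string, string[]>; // per-file queues; each queue runs in order
}

function planTasks(tasks: Task[]): Plan {
  const plan: Plan = { parallel: [], serialByFile: {} };
  for (const t of tasks) {
    if (t.readOnly) {
      plan.parallel.push(t.id);
    } else {
      const key = t.file ?? "(unknown file)";
      const queue = plan.serialByFile[key] ?? [];
      queue.push(t.id);
      plan.serialByFile[key] = queue;
    }
  }
  return plan;
}

const taskPlan = planTasks([
  { id: "search-auth", readOnly: true },
  { id: "search-tests", readOnly: true },
  { id: "edit-1", readOnly: false, file: "auth.ts" },
  { id: "edit-2", readOnly: false, file: "auth.ts" },
  { id: "edit-3", readOnly: false, file: "db.ts" },
]);
// taskPlan.parallel     is ["search-auth", "search-tests"]
// taskPlan.serialByFile is { "auth.ts": ["edit-1", "edit-2"], "db.ts": ["edit-3"] }
```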

The core principle is written in the code comments:

"Parallelism is your superpower"

Read-only research tasks: Run in parallel.

File writing tasks: Run serially, grouped by file (to avoid conflicts).

🗣️Extreme optimization of Prompt Cache

To maximize the cache hit rate for sub-Agents, all forked sub-agent tool results use the same placeholder text:

plaintext
'Fork started — processing in background'

Why? Because Claude API's prompt cache is based on byte-level prefix matching. If the prefix bytes of 10 sub-Agents are completely identical, then only the first one needs a "cold start," and the next 9 directly hit the cache.

This is an optimization that saves a few cents per call, but at large scale, it can save significant costs.
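The effect of a shared placeholder is easy to see in a toy model. The sketch below compares message lists element by element, which is a simplification of the byte-level matching described above; `sharedPrefixLength`, `forkPrefix`, and the placeholder constant are hypothetical names for illustration.

```typescript
// Length of the shared leading run of two message lists.
function sharedPrefixLength(a: string[], b: string[]): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

// Stand-in for the fixed placeholder text; the exact wording does not matter here,
// only that every fork sees the same one.
const PLACEHOLDER = "<fork placeholder>";

// What one fork sees: the shared history plus the tool result announcing the fork.
function forkPrefix(sharedHistory: string[], toolResult: string): string[] {
  return [...sharedHistory, toolResult];
}

const parentHistory = ["system prompt", "user: refactor auth"];

// Distinct per-fork tool results: the prefixes diverge at the last message.
const distinctLen = sharedPrefixLength(
  forkPrefix(parentHistory, "fork 1 started"),
  forkPrefix(parentHistory, "fork 2 started"),
); // 2

// One shared placeholder: the whole prefix is identical.
const sharedLen = sharedPrefixLength(
  forkPrefix(parentHistory, PLACEHOLDER),
  forkPrefix(parentHistory, PLACEHOLDER),
); // 3
```

With per-fork text, the cacheable run stops one message short. With a shared placeholder, it covers everything the forks have in common.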

Part Seven: The sixth secret: Three-layer compression, making conversations "never exceed limits"

All LLMs have context window limits. The longer the conversation, the more historical messages, and eventually, it will definitely exceed the limit.

Claude Code designed three layers of compression for this:

1️⃣First layer: Micro-compression—minimum cost

typescript
export async function microcompactMessages(messages, toolUseContext, querySource) {
  // Time-triggered: If the last interaction was a long time ago, server cache is cold
  const timeBasedResult = maybeTimeBasedMicrocompact(messages, querySource)
  if (timeBasedResult) return timeBasedResult

  // Cache edit path: Directly delete old content via API's cache edit feature
  if (feature('CACHED_MICROCOMPACT')) {
    return await cachedMicrocompactPath(messages, querySource)
  }
}

Micro-compression only touches old tool call results—replacing "the content of that 500-line file read 10 minutes ago" with [Old tool result content cleared].

Prompts and the main conversation thread are completely preserved.
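As a rough illustration of the replacement step, the sketch below blanks older tool results while keeping the most recent ones. The count-based `keepRecent` cutoff is an assumption made to keep the example small; the excerpt above shows time-based and cache-edit paths, whose internals the article does not include.

```typescript
interface Message {
  role: "user" | "assistant" | "tool_result";
  text: string;
}

const CLEARED = "[Old tool result content cleared]";

// Keeps the most recent `keepRecent` tool results and blanks the older ones.
// User and assistant messages are returned untouched.
function microcompact(messages: Message[], keepRecent: number): Message[] {
  const total = messages.filter((m) => m.role === "tool_result").length;
  let toClear = Math.max(0, total - keepRecent);
  return messages.map((m): Message => {
    if (m.role === "tool_result" && toClear > 0) {
      toClear--;
      return { role: "tool_result", text: CLEARED };
    }
    return m;
  });
}

const compacted = microcompact(
  [
    { role: "user", text: "Fix the login bug" },
    { role: "tool_result", text: "<500 lines of auth.ts>" },
    { role: "assistant", text: "I see the issue" },
    { role: "tool_result", text: "<test output>" },
    { role: "tool_result", text: "<git diff>" },
  ],
  1,
);
// The first two tool results become the cleared placeholder;
// the latest one ("<git diff>") and all conversation text survive.
```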

2️⃣Second layer: Auto-compression—active shrinking

Triggered automatically when token consumption approaches the context window size minus a 13,000-token buffer (which works out to 87% of a 100,000-token window; the percentage is higher for larger windows). There is a circuit breaker: it stops trying after 3 consecutive compression failures to avoid infinite loops.
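The trigger and the circuit breaker can be sketched together. The 13,000-token buffer and the limit of 3 consecutive failures come from the text above; the `AutoCompactor` class, its method names, and the choice to reset the counter on success are assumptions for illustration.

```typescript
const BUFFER_TOKENS = 13_000;
const MAX_CONSECUTIVE_FAILURES = 3;

// Threshold is "window size minus buffer": 87,000 for a 100,000-token window.
function autoCompactThreshold(contextWindow: number): number {
  return contextWindow - BUFFER_TOKENS;
}

class AutoCompactor {
  private failures = 0;
  private contextWindow: number;
  private compact: () => boolean; // returns true when compaction succeeded

  constructor(contextWindow: number, compact: () => boolean) {
    this.contextWindow = contextWindow;
    this.compact = compact;
  }

  get breakerOpen(): boolean {
    return this.failures >= MAX_CONSECUTIVE_FAILURES;
  }

  // Returns true if a compaction ran and succeeded.
  maybeCompact(tokensUsed: number): boolean {
    if (tokensUsed < autoCompactThreshold(this.contextWindow)) return false;
    if (this.breakerOpen) return false; // stop retrying after repeated failures
    const ok = this.compact();
    this.failures = ok ? 0 : this.failures + 1; // assumption: success resets the counter
    return ok;
  }
}

let attempts = 0;
const compactor = new AutoCompactor(100_000, () => {
  attempts++;
  return false; // simulate a compaction that always fails
});
for (let i = 0; i < 5; i++) compactor.maybeCompact(90_000);
// attempts is 3: the breaker blocks the fourth and fifth tries.
```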

3️⃣Third layer: Full compression—AI summarization

Let the AI generate a summary of the entire conversation, then replace all historical messages with the summary. When generating the summary, there is a strict pre-instruction:

typescript
const NO_TOOLS_PREAMBLE = `CRITICAL: Respond with TEXT ONLY.
Do NOT call any tools.
- Do NOT use Read, Bash, Grep, Glob, Edit, Write, or ANY other tool.
- Tool calls will be REJECTED and will waste your only turn.`

Why so strict? Because if the AI calls tools again during summarization, it consumes even more tokens, which defeats the purpose. This prompt is saying: "Your task is to summarize, don't do anything else."

Token budget after compression:

File recovery: 50,000 tokens

Per file limit: 5,000 tokens

Skill content: 25,000 tokens

These numbers are not arbitrary—they are a balance point between "retaining enough context to continue working" and "freeing up enough space to receive new messages."
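To see how the two file limits interact, here is a small sketch. The 50,000 and 5,000 figures come from the list above; `restoreFiles`, the truncate-instead-of-skip behavior, and the first-come ordering are assumptions, since the article does not describe how files are prioritized.

```typescript
const TOTAL_FILE_BUDGET = 50_000; // tokens available for file recovery
const PER_FILE_LIMIT = 5_000;     // cap for any single file

interface FileCandidate {
  path: string;
  tokens: number;
}

interface Restored {
  path: string;
  tokens: number;
  truncated: boolean;
}

// Restores files in the given order, capping each at PER_FILE_LIMIT
// and stopping once the total budget is spent.
function restoreFiles(files: FileCandidate[]): Restored[] {
  const out: Restored[] = [];
  let left = TOTAL_FILE_BUDGET;
  for (const f of files) {
    if (left <= 0) break;
    const granted = Math.min(f.tokens, PER_FILE_LIMIT, left);
    out.push({ path: f.path, tokens: granted, truncated: granted < f.tokens });
    left -= granted;
  }
  return out;
}

// Twelve files of 8,000 tokens each: every file is cut to 5,000,
// and the budget runs out after the tenth.
const candidates: FileCandidate[] = [];
for (let i = 1; i <= 12; i++) candidates.push({ path: `src/file${i}.ts`, tokens: 8_000 });
const restored = restoreFiles(candidates);
const restoredTotal = restored.reduce((sum, r) => sum + r.tokens, 0);
// restored.length is 10 and restoredTotal is 50,000.
```

Under these assumptions the two limits together imply that at least ten files can be restored before the budget is exhausted.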

Part Eight: What I learned after reading this source code

1️⃣90% of AI Agent's workload is outside the "AI"

Out of 510,000 lines of code, the part that actually calls the LLM API might be less than 5%. What is the remaining 95%?

Security checks (18 files just for one BashTool)

Permission system (allow/deny/ask/passthrough four-state decision-making)

Context management (three-layer compression + AI memory retrieval)

Error recovery (circuit breakers, exponential backoff, Transcript persistence)

Multi-Agent coordination (swarm orchestration + mailbox communication)

UI interaction (140 React components + IDE Bridge)

Performance optimization (prompt cache stability + parallel prefetching at startup)

If you are working on an AI Agent product, these are the real problems you need to solve. It's not about whether the model is smart enough, it's about whether your scaffolding is sturdy enough.

2️⃣Good prompt engineering is systems engineering

It's not just about writing a nice prompt and being done. Claude Code's prompts are:

7 layers dynamically assembled

Each tool comes with an independent user manual

Cache boundaries precisely divided

Internal and external versions have different instruction sets

Tool order fixed to maintain cache stability

This is engineered prompt management, not craftsmanship.

3️⃣Designed for failure

Every external dependency has a corresponding failure strategy.

4️⃣Anthropic is building Claude Code as an operating system

42 tools = System calls

Permission system = User permission management

Skill system = App store

MCP protocol = Device drivers

Agent swarm = Process management

Context compression = Memory management

Transcript persistence = File system

This is not a "chatbot plus a few tools," this is an operating system with the LLM as its kernel.

Summary

510,000 lines of code. 1903 files. 18 security files just for one Bash tool.

9 layers of review just to let the AI safely help you type one command.

This is Anthropic's answer: To make AI truly useful, you can't lock it in a cage, nor can you let it run naked. You have to build a complete trust system for it.

And the cost of this trust system is 510,000 lines of code.