
Understanding the Source Code of Anthropic's Claude Code in One Article: Why Is It Just Better Than Others?
On March 31, 2026, security researcher Chaofan Shou discovered that the source map files in the Claude Code package published by Anthropic to npm were not stripped.
This means the complete TypeScript source code of Claude Code, about 512,000 lines across 1,903 files, was exposed on the public internet.
Of course, I couldn't possibly read that much code in just a few hours. Therefore, I approached reading this source code with three questions:
After reading it, my first reaction was: This is not an AI programming assistant, this is an operating system.
Part One: First, a story: If you were to hire a remote programmer
Imagine you hire a remote programmer and give them remote access to your computer.
What would you do?
If you follow Cursor's approach: You have them sit next to you, and every time they want to type a command, you take a look and click "Allow." Simple and crude, but you have to keep watching.
If you follow GitHub Copilot Agent's approach: You give them a brand new virtual machine and let them mess around in it freely. When they're done, they submit the code, and you review and merge it. Safe, but they can't see your local environment.
If you follow Claude Code's approach:
You let them use your computer directly, but you equip them with an extremely precise security inspection system. It defines what they can and cannot do, which operations need your approval, and which they can perform on their own. Even rm -rf has to pass 9 layers of review before it executes.
These are three completely different security philosophies:

Why did Anthropic choose the hardest path?
Because only in this way can the AI use your terminal, your environment, your configuration to work—this is "truly helping you write code," not "writing a piece of code in a clean room and then copying it over."
But what is the cost? They wrote 510,000 lines of code for this.
Part Two: The Claude Code you imagine vs. the actual Claude Code
Most people think AI programming tools are like this:
```plaintext
User input → Call LLM API → Return result → Display to user
```
Claude Code is actually like this:
```plaintext
User input
→ Dynamically assemble 7-layer system prompts
→ Inject Git status, project conventions, historical memory
→ 42 tools each come with a user manual
→ LLM decides which tool to use
→ 9-layer security review (AST parsing, ML classifier, sandbox check...)
→ Permission conflict resolution (local keyboard / IDE / Hook / AI classifier competing simultaneously)
→ 200ms anti-misclick delay
→ Execute tool
→ Result streamed back
→ Context approaching limit? → Three-layer compression (micro-compression → auto-compression → full compression)
→ Need parallelism? → Generate sub-Agent swarm
→ Loop until task completed
```
You are probably wondering what all of that means. Don't worry, let's break it down one by one.
Part Three: The first secret: Prompts are not written, they are "assembled"
Open src/constants/prompts.ts, and you'll see this function:
```typescript
export async function getSystemPrompt(
  tools: Tools,
  model: string,
  additionalWorkingDirectories?: string[],
  mcpClients?: MCPServerConnection[],
): Promise<string[]> {
  return [
    // --- Static content (cacheable) ---
    getSimpleIntroSection(outputStyleConfig),
    getSimpleSystemSection(),
    getSimpleDoingTasksSection(),
    getActionsSection(),
    getUsingYourToolsSection(enabledTools),
    getSimpleToneAndStyleSection(),
    getOutputEfficiencySection(),
    // === Cache boundary ===
    ...(shouldUseGlobalCacheScope() ? [SYSTEM_PROMPT_DYNAMIC_BOUNDARY] : []),
    // --- Dynamic content (different each time) ---
    ...resolvedDynamicSections,
  ].filter(s => s !== null)
}
```
Notice that SYSTEM_PROMPT_DYNAMIC_BOUNDARY?
This is a cache demarcation line. The content above the line is static, so the Claude API can cache it to save token costs. The content below the line is dynamic: your current Git branch, your CLAUDE.md project configuration, the preferences you told it earlier. It differs in every conversation.
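As a rough illustration of the boundary idea, here is a minimal sketch that splits an assembled prompt into a cacheable prefix and a dynamic suffix. The marker value, the `splitAtBoundary` helper, and the "no marker means nothing is cacheable" rule are all assumptions made for this example, not code from the leaked source.

```typescript
// Hypothetical marker value; the article does not show the real constant.
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__DYNAMIC_BOUNDARY__";

interface SplitPrompt {
  cacheablePrefix: string[]; // identical across conversations
  dynamicSuffix: string[]; // changes from one conversation to the next
}

// Split an assembled prompt at the boundary marker.
// Assumption: without a marker, everything is treated as dynamic.
function splitAtBoundary(sections: string[]): SplitPrompt {
  const index = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (index === -1) {
    return { cacheablePrefix: [], dynamicSuffix: [...sections] };
  }
  return {
    cacheablePrefix: sections.slice(0, index),
    dynamicSuffix: sections.slice(index + 1),
  };
}

const assembledPrompt = [
  "You are a coding assistant.",
  "Tool usage rules...",
  SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
  "Current branch: main",
];
const splitPrompt = splitAtBoundary(assembledPrompt);
console.log(splitPrompt.cacheablePrefix.length, splitPrompt.dynamicSuffix);
```

Only the prefix needs to stay byte-for-byte stable for caching to pay off; anything that varies per session belongs after the marker.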
What does this mean?
Anthropic optimizes prompts as compiler output. The static part is the "compiled binary," the dynamic part is the "runtime parameters." The benefits of doing this are:
⛏️Each tool has an independent "user manual"
What shocked me even more: Each tool directory has a prompt.ts file—this is a user manual written specifically for the LLM to read.
Look at BashTool's (src/tools/BashTool/prompt.ts, about 370 lines):
```plaintext
Git Safety Protocol:
- NEVER update the git config
- NEVER run destructive git commands (push --force, reset --hard, checkout .) unless the user explicitly requests
- NEVER skip hooks (--no-verify) unless the user explicitly requests
- CRITICAL: Always create NEW commits rather than amending
```
This is not documentation written for humans; these are behavioral rules written for the AI. Every time Claude Code starts, these rules are injected into the system prompt.
This is why Claude Code avoids running git push --force on its own, while some other tools do not: it's not that the model is smarter, it's that the rules have been spelled out in the prompt.
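To make the "one manual per tool" idea concrete, here is a minimal sketch of how per-tool manuals could be gathered into a single prompt section. The `ToolWithManual` shape, the `buildToolManualSection` helper, and the heading format are hypothetical; the article only tells us that each tool directory ships a prompt.ts whose text ends up in the system prompt.

```typescript
// Hypothetical shape: each tool exposes the manual text written for the model.
interface ToolWithManual {
  toolName: string;
  prompt: () => string;
}

// Concatenate every enabled tool's manual into one prompt section.
function buildToolManualSection(tools: ToolWithManual[]): string {
  return tools
    .map((tool) => `## ${tool.toolName}\n${tool.prompt()}`)
    .join("\n\n");
}

const bashLike: ToolWithManual = {
  toolName: "Bash",
  prompt: () =>
    "- NEVER run destructive git commands unless the user explicitly requests",
};
const readLike: ToolWithManual = {
  toolName: "Read",
  prompt: () => "- Reads a file from disk",
};

const manualSection = buildToolManualSection([bashLike, readLike]);
console.log(manualSection);
```

Keeping the manual next to the tool means that disabling a tool also removes its rules from the prompt, so the model never reads instructions for a tool it cannot call.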
And the internal Anthropic version is different from the one you use
Code like this appears frequently:
```typescript
const minimalUniquenessHint =
  process.env.USER_TYPE === 'ant'
    ? '\n- Use the smallest old_string that\'s clearly unique'
    : ''
```
ant refers to Anthropic internal employees. Their version has more detailed code style guidance ("don't write comments unless the WHY is not obvious"), more aggressive output strategies ("inverted pyramid writing"), and some experimental features still in A/B testing (Verification Agent, Explore & Plan Agent).
This shows that Anthropic itself is Claude Code's biggest user. They are using their own product to develop their own product.
Part Four: The second secret: 42 tools, but you only see the tip of the iceberg
Open src/tools.ts, and you'll see the tool registry:
```typescript
export function getAllBaseTools(): Tools {
  return [
    AgentTool,
    BashTool,
    FileReadTool,
    FileEditTool,
    FileWriteTool,
    GlobTool,
    GrepTool,
    WebFetchTool,
    WebSearchTool,
    TodoWriteTool,
    NotebookEditTool,
    // ... A large number of conditionally loaded tools ...
    ...(isToolSearchEnabledOptimistic() ? [ToolSearchTool] : []),
  ]
}
```
There are 42 tools, but you have never seen most of them directly, because many are lazily loaded: they are injected on demand via the ToolSearchTool only when the LLM needs them.
Why do this?
Because for each additional tool, the system prompt needs another paragraph of description, costing more tokens. If you just want Claude Code to help you change one line of code, it doesn't need to load the "scheduled task scheduler" and "team collaboration manager."
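The token-saving argument can be sketched with a small catalog in which deferred tools stay out of the prompt until a search finds them. Everything here is hypothetical (the `deferred` flag, the `ToolCatalog` class, the `ScheduleTask` tool name, and the naive substring search); the real ToolSearchTool is not shown in the article.

```typescript
interface ToolSpec {
  toolName: string;
  description: string;
  deferred: boolean; // hypothetical flag: true means "load only on demand"
}

class ToolCatalog {
  private specs: ToolSpec[];
  private loaded: Record<string, boolean> = {};

  constructor(specs: ToolSpec[]) {
    this.specs = specs;
    for (const spec of specs) {
      if (!spec.deferred) this.loaded[spec.toolName] = true;
    }
  }

  // Tools whose descriptions currently cost tokens in the prompt.
  activeTools(): ToolSpec[] {
    return this.specs.filter((spec) => this.loaded[spec.toolName] === true);
  }

  // Naive substring search over deferred tools; every match gets loaded.
  search(query: string): string[] {
    const needle = query.toLowerCase();
    const hits = this.specs.filter(
      (spec) =>
        spec.deferred &&
        `${spec.toolName} ${spec.description}`.toLowerCase().indexOf(needle) !== -1,
    );
    for (const hit of hits) this.loaded[hit.toolName] = true;
    return hits.map((hit) => hit.toolName);
  }
}

const catalog = new ToolCatalog([
  { toolName: "Bash", description: "Run shell commands", deferred: false },
  { toolName: "ScheduleTask", description: "Schedule a recurring task", deferred: true },
]);
console.log(catalog.activeTools().map((spec) => spec.toolName));
```

A one-line edit never pays for the scheduler's description; the cost is incurred only after a search actually matches it.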
There's an even smarter design:
```typescript
if (isEnvTruthy(process.env.CLAUDE_CODE_SIMPLE)) {
  const simpleTools: Tool[] = [BashTool, FileReadTool, FileEditTool]
  return filterToolsByDenyRules(simpleTools, permissionContext)
}
```
Set CLAUDE_CODE_SIMPLE=true, and Claude Code is left with only three tools: Bash, file read, and file edit. This is an escape hatch for minimalists.
1️⃣All tools come from the same factory
```typescript
const TOOL_DEFAULTS = {
  isEnabled: () => true,
  isConcurrencySafe: (_input?) => false, // Default: unsafe
  isReadOnly: (_input?) => false, // Default: will write
  isDestructive: (_input?) => false,
}

export function buildTool<D extends AnyToolDef>(def: D): BuiltTool<D> {
  return { ...TOOL_DEFAULTS, userFacingName: () => def.name, ...def }
}
```
Note those default values: isConcurrencySafe defaults to false, isReadOnly defaults to false.
This is a fail-closed design—if a tool author forgets to declare safety properties, the system assumes it is "unsafe, will write." Better to be overly conservative than to miss a risk.
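A self-contained sketch of the same pattern shows what happens when a tool author forgets to declare anything. The `ToolDef` type, `SAFE_DEFAULTS`, and `buildToolSketch` are simplified stand-ins written for this example, not the real `AnyToolDef` and `BuiltTool` types.

```typescript
// Simplified stand-in for a tool definition; every safety flag is optional.
interface ToolDef {
  toolName: string;
  isEnabled?: () => boolean;
  isConcurrencySafe?: () => boolean;
  isReadOnly?: () => boolean;
  isDestructive?: () => boolean;
}

const SAFE_DEFAULTS = {
  isEnabled: () => true,
  isConcurrencySafe: () => false, // assume it cannot run in parallel
  isReadOnly: () => false, // assume it may write
  isDestructive: () => false,
};

type BuiltToolSketch = ToolDef & typeof SAFE_DEFAULTS;

// Defaults first, definition second: declared flags override the defaults.
function buildToolSketch(def: ToolDef): BuiltToolSketch {
  return { ...SAFE_DEFAULTS, ...def };
}

// The author declared nothing, so the conservative defaults apply.
const forgetfulTool = buildToolSketch({ toolName: "MysteryTool" });

// The author opted in explicitly to the relaxed behavior.
const grepLikeTool = buildToolSketch({
  toolName: "GrepLike",
  isReadOnly: () => true,
  isConcurrencySafe: () => true,
});

console.log(forgetfulTool.isReadOnly(), grepLikeTool.isReadOnly());
```

The spread order is the whole trick: a tool can only become "safe" by saying so explicitly, never by omission.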
2️⃣The iron rule of "read before edit"
```typescript
function getPreReadInstruction(): string {
  return '\n- You must use your `Read` tool at least once in the conversation before editing. This tool will error if you attempt an edit without reading the file.'
}
```
The FileEditTool checks whether the file has already been read with the FileReadTool. If not, it reports an error and refuses the edit.
This is why Claude Code won't "write a piece of code out of thin air to overwrite your file" like some other tools do—it is forced to understand first, then modify.
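One plausible way to enforce such a rule is a per-conversation record of which paths have been read. The sketch below assumes that approach; the `ReadBeforeEditGuard` class and its method names are invented for illustration, and the real check inside FileEditTool is not shown in the article.

```typescript
class ReadBeforeEditGuard {
  private readPaths: Record<string, boolean> = {};

  // Called by the read tool after a successful read.
  recordRead(path: string): void {
    this.readPaths[path] = true;
  }

  // Called by the edit tool; throws unless the file was read earlier.
  assertCanEdit(path: string): void {
    if (this.readPaths[path] !== true) {
      throw new Error(`Refusing to edit ${path}: read the file first.`);
    }
  }
}

const editGuard = new ReadBeforeEditGuard();

let rejectedBlindEdit = false;
try {
  editGuard.assertCanEdit("src/app.ts"); // no read has happened yet
} catch (_err) {
  rejectedBlindEdit = true;
}

editGuard.recordRead("src/app.ts");
editGuard.assertCanEdit("src/app.ts"); // allowed after the read
console.log(rejectedBlindEdit);
```

The prompt instruction and the runtime check reinforce each other: the model is told about the rule, and the tool rejects the call if the model ignores it.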
Part Five: The third secret: The memory system—why it can "remember you"
People who have used Claude Code have a feeling: It seems to really know you.
You tell it "don't mock the database in tests," and in the next conversation, it won't mock again. You tell it "I'm a backend engineer, a React newbie," and when explaining frontend code, it will use backend analogies.
Behind this is a complete memory system.
1️⃣Using AI to retrieve memories
```typescript
const SELECT_MEMORIES_SYSTEM_PROMPT = `You are selecting memories that will be useful to Claude Code.
Return a list of filenames for the memories that will clearly be useful (up to 5).
- If you are unsure if a memory will be useful, do not include it.
- If a list of recently-used tools is provided, do not select memories that are usage reference for those tools. DO still select memories containing warnings, gotchas, or known issues.`
```
Claude Code uses another AI (Claude Sonnet) to decide "which memories are relevant to the current conversation."
This is neither keyword matching nor vector search. A smaller model quickly scans the titles and descriptions of all memory files, selects up to 5 of the most relevant ones, and their full content is then injected into the context of the current conversation.
The strategy is "precision over recall"—better to miss a potentially useful memory than to stuff in an irrelevant memory that pollutes the context.
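"Precision over recall" also suggests being strict with whatever the selector model returns. The following sketch shows one way to post-process its answer; the `sanitizeSelection` helper and the rule of dropping unknown or duplicate filenames are assumptions for illustration, and only the cap of 5 comes from the prompt quoted above.

```typescript
const MAX_SELECTED_MEMORIES = 5; // "up to 5" in the selector prompt

// Keep only filenames that really exist, drop duplicates, and cap the list.
function sanitizeSelection(modelOutput: string[], knownFiles: string[]): string[] {
  const selected: string[] = [];
  for (const filename of modelOutput) {
    if (selected.length === MAX_SELECTED_MEMORIES) break;
    const isKnown = knownFiles.indexOf(filename) !== -1;
    const isDuplicate = selected.indexOf(filename) !== -1;
    if (isKnown && !isDuplicate) selected.push(filename);
  }
  return selected;
}

const memoryFiles = ["a.md", "b.md", "c.md", "d.md", "e.md", "f.md"];

// A hallucinated filename and a repeat are both dropped.
const cleanedSelection = sanitizeSelection(
  ["a.md", "ghost.md", "a.md", "b.md"],
  memoryFiles,
);
console.log(cleanedSelection);
```

Dropping a doubtful entry costs at most one missed memory, while letting it through would put irrelevant text in front of the main model.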
⏰KAIROS mode: "Dreaming" at night
This is the part that feels most sci-fi to me.
There is a feature flag in the code called KAIROS. In this mode, memories from long sessions are not stored in structured files but in append-only logs by date. Then, a /dream skill runs during "nighttime" (low activity periods) to distill these raw logs into structured topic files.
```plaintext
logs/2026/03/2026-03-30.md      ← Today's raw log
        ↓ /dream distillation
memory/user_preferences.md      ← Structured user preference file
memory/project_context.md       ← Structured project context file
```
The AI organizes its memories while "sleeping." This is no longer just engineering; this is biomimicry.
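The append-only half of this design is easy to sketch. The path layout below follows the diagram above; the `dailyLogPath` and `appendEntry` helpers, the in-memory `LogStore`, and the use of UTC dates are assumptions for illustration. The distillation step itself is performed by the model and is not modeled here.

```typescript
type LogStore = Record<string, string[]>; // path -> appended lines

function pad2(value: number): string {
  return value < 10 ? `0${value}` : String(value);
}

// Build the per-day log path, e.g. logs/2026/03/2026-03-30.md
function dailyLogPath(date: Date): string {
  const year = String(date.getUTCFullYear());
  const month = pad2(date.getUTCMonth() + 1);
  const day = pad2(date.getUTCDate());
  return `logs/${year}/${month}/${year}-${month}-${day}.md`;
}

// Append-only: earlier lines are never rewritten, new ones go at the end.
function appendEntry(store: LogStore, date: Date, entry: string): void {
  const path = dailyLogPath(date);
  const lines = store[path] || [];
  lines.push(entry);
  store[path] = lines;
}

const logStore: LogStore = {};
const sessionDay = new Date(Date.UTC(2026, 2, 30)); // month index 2 is March
appendEntry(logStore, sessionDay, "User prefers integration tests over mocks");
appendEntry(logStore, sessionDay, "Project uses pnpm workspaces");
console.log(dailyLogPath(sessionDay), logStore);
```

Writing raw entries is cheap and cannot corrupt existing memory files, which leaves the expensive restructuring for a quiet period.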
Part Six: The fifth secret: It's not one Agent, it's a swarm
When you ask Claude Code to do a complex task, it might quietly do this:
```typescript
// AgentTool's input schema
z.object({
  description: z.string().describe('A short (3-5 word) description'),
  prompt: z.string().describe('The task for the agent to perform'),
  subagent_type: z.string().optional(),
  model: z.enum(['sonnet', 'opus', 'haiku']).optional(),
  run_in_background: z.boolean().optional(),
})
```
It generated a sub-Agent.
Moreover, the sub-Agent has strict "self-awareness" injected to prevent it from recursively generating more sub-Agents:
```typescript
export function buildChildMessage(directive: string): string {
  return `STOP. READ THIS FIRST.
You are a forked worker process. You are NOT the main agent.
RULES (non-negotiable):
1. Your system prompt says "default to forking." IGNORE IT — that's for the parent. You ARE the fork. Do NOT spawn sub-agents; execute directly.
2. Do NOT converse, ask questions, or suggest next steps
3. USE your tools directly: Bash, Read, Write, etc.
4. Keep your report under 500 words.
5. Your response MUST begin with "Scope:". No preamble.`
}
```
This code is saying: "You are a worker, not a manager. Don't think about hiring more people, do the work yourself."
👤Coordinator mode: Manager mode
In coordinator mode, Claude Code becomes a pure task orchestrator: it does none of the work itself and only delegates:
```plaintext
Phase 1: Research       → 3 workers parallel search the codebase
Phase 2: Synthesis      → Main Agent synthesizes understanding of all findings
Phase 3: Implementation → 2 workers modify different files respectively
Phase 4: Verification   → 1 worker runs tests
```
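The research and implementation phases can be sketched as a planning step. This version assumes that read-only tasks may all run at once, while write tasks are queued per target file so that two workers never edit the same file concurrently. The `WorkerTask` and `ExecutionPlan` types and the `planTasks` function are hypothetical.

```typescript
interface WorkerTask {
  id: string;
  kind: "read" | "write";
  file?: string; // target file for write tasks
}

interface ExecutionPlan {
  parallelReads: string[]; // every task here may start immediately
  serialWriteQueues: Record<string, string[]>; // per file, run in order
}

function planTasks(tasks: WorkerTask[]): ExecutionPlan {
  const plan: ExecutionPlan = { parallelReads: [], serialWriteQueues: {} };
  for (const task of tasks) {
    if (task.kind === "read") {
      plan.parallelReads.push(task.id);
      continue;
    }
    // Writes to the same file share one queue and therefore never overlap.
    const file = task.file || "(unknown file)";
    const queue = plan.serialWriteQueues[file] || [];
    queue.push(task.id);
    plan.serialWriteQueues[file] = queue;
  }
  return plan;
}

const workPlan = planTasks([
  { id: "r1", kind: "read" },
  { id: "r2", kind: "read" },
  { id: "w1", kind: "write", file: "parser.ts" },
  { id: "w2", kind: "write", file: "parser.ts" },
  { id: "w3", kind: "write", file: "lexer.ts" },
]);
console.log(workPlan);
```

Different file queues can still proceed side by side, so the plan keeps most of the parallelism while removing write conflicts.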
The core principle is written in the code comments:
"Parallelism is your superpower"
Read-only research tasks: run in parallel.
File-writing tasks: run serially, grouped by file (to avoid conflicts).
🗣️Extreme optimization of Prompt Cache
To maximize the cache hit rate for sub-Agents, all forked sub-agent tool results use the same placeholder text:
```plaintext
'Fork started — processing in background'
```
Why? Because Claude API's prompt cache is based on byte-level prefix matching. If the prefix bytes of 10 sub-Agents are completely identical, then only the first one needs a "cold start," and the next 9 directly hit the cache.
This is an optimization that saves a few cents per call, but at large scale, it can save significant costs.
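A toy model makes the effect visible. It compares the prefix shared by two forks when the tool result is an identical placeholder with the case where each fork gets its own text. This is a simplification: it compares whole messages rather than bytes, and the `PLACEHOLDER` value and helper names are stand-ins invented for the example.

```typescript
// Number of leading messages that two histories have in common.
function sharedPrefixLength(a: string[], b: string[]): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

const PLACEHOLDER = "<identical fork placeholder>"; // stand-in text
const parentHistory = ["system prompt", "user: refactor the parser"];

// Every fork sees the same placeholder as its tool result.
function forkWithPlaceholder(task: string): string[] {
  return [...parentHistory, PLACEHOLDER, `task: ${task}`];
}

// Counter-example: the tool result mentions the task, so forks diverge early.
function forkWithUniqueResult(task: string): string[] {
  return [...parentHistory, `started fork for ${task}`, `task: ${task}`];
}

const sharedWithPlaceholder = sharedPrefixLength(
  forkWithPlaceholder("lexer"),
  forkWithPlaceholder("tests"),
);
const sharedWithUniqueText = sharedPrefixLength(
  forkWithUniqueResult("lexer"),
  forkWithUniqueResult("tests"),
);
console.log(sharedWithPlaceholder, sharedWithUniqueText);
```

With the placeholder, the forks diverge only at the task message itself, so everything before it can be served from the cache.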
Part Seven: The sixth secret: Three-layer compression, making conversations "never exceed limits"
All LLMs have context window limits. The longer the conversation, the more historical messages, and eventually, it will definitely exceed the limit.
Claude Code designed three layers of compression for this:
1️⃣First layer: Micro-compression—minimum cost
```typescript
export async function microcompactMessages(messages, toolUseContext, querySource) {
  // Time-triggered: If the last interaction was a long time ago, server cache is cold
  const timeBasedResult = maybeTimeBasedMicrocompact(messages, querySource)
  if (timeBasedResult) return timeBasedResult

  // Cache edit path: Directly delete old content via API's cache edit feature
  if (feature('CACHED_MICROCOMPACT')) {
    return await cachedMicrocompactPath(messages, querySource)
  }
}
```
Micro-compression only touches old tool call results—replacing "the content of that 500-line file read 10 minutes ago" with [Old tool result content cleared].
Prompts and the main conversation thread are completely preserved.
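The replacement step can be sketched as a pure function over the message list. The `ChatMessage` shape and the "keep the most recent N tool results" policy are assumptions made for this example; per the excerpt above, the real triggers are time-based or go through a cache-edit path.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool_result";
  content: string;
}

const CLEARED_PLACEHOLDER = "[Old tool result content cleared]";

// Replace all but the newest `keepRecent` tool results with a placeholder.
// Non-tool messages are passed through untouched; the input is not mutated.
function clearOldToolResults(messages: ChatMessage[], keepRecent: number): ChatMessage[] {
  let remainingToKeep = keepRecent;
  const newestFirst: ChatMessage[] = [];
  for (const message of messages.slice().reverse()) {
    if (message.role !== "tool_result") {
      newestFirst.push(message);
    } else if (remainingToKeep > 0) {
      remainingToKeep -= 1;
      newestFirst.push(message);
    } else {
      newestFirst.push({ role: "tool_result", content: CLEARED_PLACEHOLDER });
    }
  }
  return newestFirst.reverse();
}

const chatHistory: ChatMessage[] = [
  { role: "user", content: "read config.ts" },
  { role: "tool_result", content: "...500 lines of config.ts..." },
  { role: "assistant", content: "Done reading." },
  { role: "tool_result", content: "test output: 12 passed" },
];
const compacted = clearOldToolResults(chatHistory, 1);
console.log(compacted.map((message) => message.content));
```

The user and assistant turns survive intact, so the thread of the conversation is preserved while the bulky payloads are dropped.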
2️⃣Second layer: Auto-compression—active shrinking
Triggered automatically when token usage approaches the context window size minus a 13,000-token buffer (about 87% of a 100,000-token window). A circuit breaker stops retrying after 3 consecutive compression failures to avoid an infinite loop.
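The threshold arithmetic and the circuit breaker fit in a few lines. The sketch below uses the two numbers cited above (a 13,000-token buffer and 3 consecutive failures); the `CompactionBreaker` class and its method names are hypothetical. Note that the percentage depends on the window: 87,000 of 100,000 tokens is 87%, while 187,000 of 200,000 is 93.5%.

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13000; // buffer cited in the article
const MAX_CONSECUTIVE_FAILURES = 3; // breaker limit cited in the article

// Token count at which auto-compression should kick in.
function autoCompactThreshold(contextWindowTokens: number): number {
  return contextWindowTokens - AUTOCOMPACT_BUFFER_TOKENS;
}

class CompactionBreaker {
  private consecutiveFailures = 0;

  shouldAttempt(usedTokens: number, contextWindowTokens: number): boolean {
    // Breaker open: stop retrying after too many failures in a row.
    if (this.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES) return false;
    return usedTokens >= autoCompactThreshold(contextWindowTokens);
  }

  // A success resets the counter; a failure increments it.
  recordResult(succeeded: boolean): void {
    this.consecutiveFailures = succeeded ? 0 : this.consecutiveFailures + 1;
  }
}

const breaker = new CompactionBreaker();
const attemptBeforeFailures = breaker.shouldAttempt(90000, 100000);
breaker.recordResult(false);
breaker.recordResult(false);
breaker.recordResult(false);
const attemptAfterFailures = breaker.shouldAttempt(90000, 100000);
console.log(attemptBeforeFailures, attemptAfterFailures);
```

Resetting the counter on success means that only an unbroken run of failures trips the breaker; occasional failures do not disable compression.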
3️⃣Third layer: Full compression—AI summarization
Let the AI generate a summary of the entire conversation, then replace all historical messages with the summary. When generating the summary, there is a strict pre-instruction:
```typescript
const NO_TOOLS_PREAMBLE = `CRITICAL: Respond with TEXT ONLY. Do NOT call any tools.
- Do NOT use Read, Bash, Grep, Glob, Edit, Write, or ANY other tool.
- Tool calls will be REJECTED and will waste your only turn.`
```
Why so strict? Because if the AI calls tools during summarization, it consumes even more tokens, which defeats the purpose. This prompt is saying: "Your task is to summarize, don't do anything else."
Token budget after compression:
These numbers are not arbitrary—they are a balance point between "retaining enough context to continue working" and "freeing up enough space to receive new messages."
Part Eight: What I learned after reading this source code
1️⃣90% of AI Agent's workload is outside the "AI"
Out of 510,000 lines of code, the part that actually calls the LLM API might be less than 5%. What is the remaining 95%?
If you are working on an AI Agent product, these are the real problems you need to solve. It's not about whether the model is smart enough, it's about whether your scaffolding is sturdy enough.
2️⃣Good prompt engineering is systems engineering
It's not just about writing a nice prompt and being done. Claude Code's prompts are:
This is engineered prompt management, not craftsmanship.
3️⃣Designed for failure
Every external dependency has a corresponding failure strategy:

4️⃣Anthropic is building Claude Code as an operating system
42 tools = System calls
Permission system = User permission management
Skill system = App store
MCP protocol = Device drivers
Agent swarm = Process management
Context compression = Memory management
Transcript persistence = File system
This is not a "chatbot plus a few tools," this is an operating system with the LLM as its kernel.
Summary
510,000 lines of code. 1903 files. 18 security files just for one Bash tool.
9 layers of review just to let the AI safely help you type one command.
This is Anthropic's answer: To make AI truly useful, you can't lock it in a cage, nor can you let it run naked. You have to build a complete trust system for it.
And the cost of this trust system is 510,000 lines of code.