Five-Layer Platform Architecture

Claude Code's architecture is organized into five layers, each with distinct responsibilities and trust boundaries. This layering explains several otherwise-puzzling design choices: why the 46K-line QueryEngine monolith exists, why safety is its own layer rather than being distributed across the others, and why tools are registered in a specific order.

The Layers

Layer	Responsibility	Key Components
1. Core Agent Loop	Message processing, API calls, tool execution	query.ts (~30 lines of core loop), tt() (3,167-line orchestrator)
2. Tool Layer	Interface between LLM intentions and real world	40+ tools, buildTool factory, StreamingToolExecutor
3. Safety Layer	Permission checks, risk classification, deny rules	Permission pipeline (4-layer stack), auto-mode classifier, bash security validator (2,592 lines)
4. Platform Services	Background daemons, memory, MCP, sessions	handleStopHooks, auto-dream, extract memories, session persistence, MCP client
5. Interface Layer	Terminal UI, IDE bridge, voice	React+Ink renderer, yoga-layout, bridge system (v2 WebSocket), voice system

Why This Matters

The agent loop is tiny. The core message loop in query.ts is ~30 lines: an async generator that yields tokens, processes tool calls, and checks budgets. The 46K-line QueryEngine monolith exists because the interesting complexity (retries, cache management, rate limits, streaming state) has to be co-located so that the interactions between those concerns can be reasoned about in one place.
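The shape of such a loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual query.ts: the names agentLoop, ModelClient, ToolRunner, and the turn budget are hypothetical, the loop yields whole replies rather than individual tokens, and the model is stubbed so the example runs standalone.

```typescript
// Hypothetical types for illustration only; none of these names come from query.ts.
type ToolCall = { name: string; input: string };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };

type ModelClient = (history: Message[]) => Promise<ModelTurn>;
type ToolRunner = (call: ToolCall) => Promise<string>;

// An async generator: yield model output, run requested tools, enforce a turn budget.
async function* agentLoop(
  history: Message[],
  callModel: ModelClient,
  runTool: ToolRunner,
  maxTurns: number,
): AsyncGenerator<string, void, undefined> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(history);
    history.push({ role: "assistant", content: reply.text });
    yield reply.text;

    // No tool calls means the model considers the task finished.
    if (reply.toolCalls.length === 0) return;

    for (const call of reply.toolCalls) {
      const result = await runTool(call);
      history.push({ role: "tool", content: result });
    }
  }
  yield "[turn budget exhausted]";
}

// Stub model: asks for one tool call, then answers using the tool result.
const stubModel: ModelClient = async (history) => {
  const last = history[history.length - 1];
  if (last !== undefined && last.role === "tool") {
    return { text: `done: ${last.content}`, toolCalls: [] };
  }
  return { text: "reading file", toolCalls: [{ name: "read", input: "a.txt" }] };
};
const stubTool: ToolRunner = async (call) => `${call.name}(${call.input}) ok`;

// Drain the generator into an array so the output is easy to inspect.
async function collect(): Promise<string[]> {
  const out: string[] = [];
  const history: Message[] = [{ role: "user", content: "summarize a.txt" }];
  for await (const chunk of agentLoop(history, stubModel, stubTool, 5)) {
    out.push(chunk);
  }
  return out;
}
```

Everything the loop itself does not handle here (retries, rate limits, cache management) is the kind of complexity the text attributes to the surrounding QueryEngine.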

Safety is a testable, bypassable layer. The permission pipeline, auto-mode classifier, and bash security validator form a distinct layer that can be tested independently, bypassed for trusted contexts (--dangerously-skip-permissions), and audited separately. This is deliberate: safety checks scattered across the other layers would be harder to verify.
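One way to picture a safety layer that is both testable and bypassable is as a single pure function over an ordered list of checks. The sketch below is an assumption-laden illustration, not the real permission pipeline: the Check type, the two example rules, the first-opinion-wins ordering, and the skipPermissions option (standing in for the --dangerously-skip-permissions flag) are all hypothetical.

```typescript
// Hypothetical sketch: rule shapes, names, and ordering are assumptions, not the real pipeline.
type Decision = "allow" | "deny" | "ask";
type ToolRequest = { tool: string; command: string };
type Check = (req: ToolRequest) => Decision | undefined; // undefined means "no opinion"

// Example deny rule: refuse a recursive delete of the filesystem root.
const denyRules: Check = (req) =>
  /\brm\s+-rf\s+\/(\s|$)/.test(req.command) ? "deny" : undefined;

// Example allow rule: read-only tools never need a prompt.
const readOnlyAllow: Check = (req) =>
  req.tool === "read" || req.tool === "grep" ? "allow" : undefined;

// The whole layer is one pure function, so it can be unit-tested without the agent loop.
function checkPermission(
  req: ToolRequest,
  checks: Check[],
  opts: { skipPermissions: boolean },
): Decision {
  if (opts.skipPermissions) return "allow"; // trusted-context bypass
  for (const check of checks) {
    const decision = check(req);
    if (decision !== undefined) return decision; // first opinion wins
  }
  return "ask"; // default: defer to the user
}

const pipeline: Check[] = [denyRules, readOnlyAllow];

const denied = checkPermission({ tool: "bash", command: "rm -rf /" }, pipeline, { skipPermissions: false });
const allowed = checkPermission({ tool: "read", command: "" }, pipeline, { skipPermissions: false });
const asked = checkPermission({ tool: "bash", command: "npm test" }, pipeline, { skipPermissions: false });
const bypassed = checkPermission({ tool: "bash", command: "npm test" }, pipeline, { skipPermissions: true });
```

Because every decision flows through one entry point, the bypass is a single branch and an audit only has to read one function plus its list of checks.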

Background services share a single scheduling point. All nine background services funnel through handleStopHooks, executing during the gap between user messages. This coupling is the cost of cache preservation — background agents must share the parent's prompt cache prefix.
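A single scheduling point of this kind might look roughly like the following. This is a sketch only: runStopHooks is a hypothetical stand-in for handleStopHooks, the StopContext shape and the two registered services are invented for illustration, and running the services concurrently is an assumption rather than something the source states.

```typescript
// Hypothetical sketch: the StopContext shape and service bodies are assumptions.
type StopContext = { cachePrefix: string; transcript: string[] };
type BackgroundService = {
  name: string;
  run: (ctx: StopContext) => Promise<string>;
};

const services: BackgroundService[] = [];

function registerService(service: BackgroundService): void {
  services.push(service);
}

// Single scheduling point: invoked once when the agent stops and waits for the user.
// Every service receives the same context, and therefore the same cache prefix.
async function runStopHooks(ctx: StopContext): Promise<string[]> {
  return Promise.all(services.map((s) => s.run(ctx)));
}

registerService({
  name: "extract-memories",
  run: async (ctx) =>
    `extract-memories saw ${ctx.transcript.length} messages under ${ctx.cachePrefix}`,
});
registerService({
  name: "session-persistence",
  run: async (ctx) =>
    `session-persistence saved ${ctx.transcript.length} messages under ${ctx.cachePrefix}`,
});
```

The coupling described above is visible in the signature: a service cannot run on its own schedule or with its own prompt prefix, because the only way in is through the shared context.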

Architectural Decisions That Follow

Model as planner — no external DAG, routing, or planning layer. The LLM drives the loop directly.
Grep over RAG — agentic search (read files, grep for patterns) over vector retrieval. More accurate, more secure, no staleness.
Cache as load-bearing infrastructure — the five-layer architecture is shaped by cache economics. Tool registration order, background scheduling, and agent forking are all cache-preservation mechanisms.
Dead-code elimination via compilation — Bun builds eliminate unused code paths, keeping the shipped binary focused.
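The link between tool registration order and caching can be illustrated with a toy model of prefix caching, where a cached prompt is reusable only up to the first character that differs. The serialization format, tool names, and helper functions below are hypothetical; the sketch shows only why a stable order matters, not how Claude Code actually builds its prompt.

```typescript
// Hypothetical sketch: the serialization format and tool names are assumptions.
type ToolDef = { name: string; description: string };

// Serialize tools in registration order; this text forms the start of the prompt.
function buildPrefix(tools: ToolDef[]): string {
  return tools.map((t) => `${t.name}: ${t.description}`).join("\n");
}

// Length of the leading text two prompts share, i.e. the part a prefix cache could reuse.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

const base: ToolDef[] = [
  { name: "read", description: "Read a file" },
  { name: "grep", description: "Search file contents" },
];
const extra: ToolDef = { name: "mcp_lookup", description: "Query an MCP server" };

const before = buildPrefix(base);
const appended = buildPrefix([...base, extra]); // new tool registered last
const prepended = buildPrefix([extra, ...base]); // new tool registered first

// Appending keeps the entire old prefix reusable; prepending invalidates all of it.
const reusedWhenAppended = sharedPrefixLength(before, appended);
const reusedWhenPrepended = sharedPrefixLength(before, prepended);
```

In this toy model, registering the new tool last leaves the whole earlier prefix intact, while registering it first leaves nothing to reuse, which is the sense in which registration order acts as a cache-preservation mechanism.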

Related Entities

five-layer-architecture — the entity page
queryengine-ts — the 46K-line monolith
tt-function — the 3,167-line orchestrator
permission-pipeline — the safety layer
handleStopHooks — the platform services scheduler
streaming-tool-executor — the tool layer executor
terminal-renderer — the interface layer