Forked Agent Pattern

The forked agent pattern is the shared architectural primitive underlying every background operation in claude-code. Session memory extraction, auto-memory writing, auto-dream consolidation, context compaction, and agent summaries all use the same implementation from src/utils/forkedAgent.ts.

How It Works

A fork constructs a child agent using CacheSafeParams:

// SystemPrompt, ToolUseContext, and Message are referenced here but not shown.
type CacheSafeParams = {
  systemPrompt: SystemPrompt;             // byte-identical to parent
  userContext: { [k: string]: string };
  systemContext: { [k: string]: string };
  toolUseContext: ToolUseContext;         // contains tools, model, options
  forkContextMessages: Message[];         // parent's full conversation as cache prefix
};

What's Shared vs. Cloned

Shared (for cache hits)          | Cloned (for isolation)
---------------------------------|-----------------------------------
System prompt (byte-identical)   | readFileState LRU cache
Prompt cache prefix              | AbortController (linked to parent)
Tool pool                        | denialTracking state
Model selection                  | Mutable conversation state

The fork shares the prompt cache prefix because it keeps identical cache-critical parameters. But it gets cloned mutable state to prevent cross-contamination between parent and child.
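The shared-versus-cloned split can be sketched as below. This is a minimal illustration, not the code in src/utils/forkedAgent.ts: the names SharedParams, MutableState, AgentContext, and forkContext are hypothetical, and a plain Map stands in for the readFileState LRU cache.

```typescript
// Hypothetical sketch; all type and function names here are illustrative.
type Message = { role: "user" | "assistant"; content: string };

// Cache-critical values: passed to the child by reference, never copied or edited.
interface SharedParams {
  readonly systemPrompt: string;
  readonly model: string;
  readonly forkContextMessages: readonly Message[];
}

// Mutable per-agent state: the child gets its own copies.
interface MutableState {
  readFileState: Map<string, string>; // stand-in for the LRU cache
  denialTracking: { denials: number };
  abortController: AbortController;
}

interface AgentContext {
  shared: SharedParams;
  state: MutableState;
}

function forkContext(parent: AgentContext): AgentContext {
  // The child has its own controller, linked so that a parent abort cascades down.
  const childAbort = new AbortController();
  const parentSignal = parent.state.abortController.signal;
  if (parentSignal.aborted) {
    childAbort.abort();
  } else {
    parentSignal.addEventListener("abort", () => childAbort.abort(), { once: true });
  }

  return {
    shared: parent.shared, // same object, so the request prefix stays identical
    state: {
      readFileState: new Map(parent.state.readFileState), // copy, not alias
      denialTracking: { ...parent.state.denialTracking },
      abortController: childAbort,
    },
  };
}
```

In this sketch the link is one-directional: aborting the parent aborts the child, but aborting the child leaves the parent running.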

Cache Economics

To maximize cache sharing across concurrent forks, every fork child produces byte-identical API request prefixes: the full parent assistant message (all tool_use blocks, thinking, and text), plus a single user message containing identical placeholder results for every tool_use block. Only a per-child directive text block differs between them.
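The request layout described above can be sketched as follows. The block shapes loosely follow a messages-style API, and Block, ApiMessage, PLACEHOLDER_RESULT, buildForkMessages, and sharedPrefixLength are hypothetical names chosen for illustration, not identifiers from the codebase.

```typescript
// Hypothetical sketch of the per-child request layout; names are illustrative.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

type ApiMessage = { role: "user" | "assistant"; content: Block[] };

// Assumed placeholder text; the real value is not given in this document.
const PLACEHOLDER_RESULT = "[placeholder]";

function buildForkMessages(
  parentAssistant: ApiMessage,
  directive: string,
): [ApiMessage, ApiMessage] {
  // One identical placeholder result per tool_use block in the parent message.
  const placeholders = parentAssistant.content
    .filter((b): b is Extract<Block, { type: "tool_use" }> => b.type === "tool_use")
    .map((b): Block => ({
      type: "tool_result",
      tool_use_id: b.id,
      content: PLACEHOLDER_RESULT,
    }));

  // The directive goes last, so everything before it is identical across children.
  const user: ApiMessage = {
    role: "user",
    content: [...placeholders, { type: "text", text: directive }],
  };
  return [parentAssistant, user];
}

// Length of the common leading run of two serialized requests.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

const parentAssistant: ApiMessage = {
  role: "assistant",
  content: [
    { type: "text", text: "Reading two files." },
    { type: "tool_use", id: "t1", name: "Read" },
    { type: "tool_use", id: "t2", name: "Grep" },
  ],
};
const forkA = buildForkMessages(parentAssistant, "Summarize the session.");
const forkB = buildForkMessages(parentAssistant, "Extract durable memories.");
const serializedA = JSON.stringify(forkA);
const serializedB = JSON.stringify(forkB);
console.log(sharedPrefixLength(serializedA, serializedB), serializedA.length);
```

Placing the directive in the final block is the design choice that matters here: any difference earlier in the request would end the shared prefix at that point.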

The economic consequence is 92% overall prefix reuse, with measured cost savings of $4.85 (an 81% reduction) on a single representative task. Spawning five parallel background agents costs nearly the same as spawning one.
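As a quick sanity check on those figures, the sketch below derives the task costs they imply. It assumes the 81% is measured against the uncached cost of the same task; the helper name impliedCosts is hypothetical, and the derived dollar amounts are arithmetic consequences of the two quoted numbers rather than separately reported measurements.

```typescript
// Assumption: reductionFraction is savings divided by the uncached baseline cost.
function impliedCosts(
  savings: number,
  reductionFraction: number,
): { baseline: number; withCache: number } {
  const baseline = savings / reductionFraction;
  return { baseline, withCache: baseline - savings };
}

const implied = impliedCosts(4.85, 0.81);
console.log(implied.baseline.toFixed(2));  // prints "5.99"
console.log(implied.withCache.toFixed(2)); // prints "1.14"
```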

This is why cache-economics is described as "load-bearing infrastructure": the forked agent pattern's entire cost model depends on cache hits.

Anti-Recursion

Fork children can see the AgentTool in their tool pool, but any attempt to fork recursively is rejected. Enforcement works by checking the conversation history for a <fork_boilerplate_tag> marker: if the marker is present, the fork attempt is blocked.
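A minimal sketch of a marker-based guard, assuming the marker appears as plain text in message content. The marker string is copied from the description above and may not match the literal used in the codebase; HistoryMessage, isForkChild, and tryFork are hypothetical names.

```typescript
// Marker string taken from the text above; its real literal is an assumption.
const FORK_BOILERPLATE_TAG = "<fork_boilerplate_tag>";

type HistoryMessage = { role: "user" | "assistant"; content: string };

// A conversation that already contains the marker belongs to a fork child.
function isForkChild(history: readonly HistoryMessage[]): boolean {
  return history.some((m) => m.content.includes(FORK_BOILERPLATE_TAG));
}

// Returns the child's history, or null when the caller is itself a fork child.
function tryFork(
  history: readonly HistoryMessage[],
  directive: string,
): HistoryMessage[] | null {
  if (isForkChild(history)) return null; // recursive fork blocked
  // Stamping the marker into the child's history is what makes the check work later.
  return [
    ...history,
    { role: "user", content: `${FORK_BOILERPLATE_TAG}\n${directive}` },
  ];
}
```

Because the check reads only the conversation history, the tool can stay in the child's pool, which keeps the tool list identical to the parent's.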

Who Uses It

System             | Fork Configuration
-------------------|-------------------------------------------------
auto-memory        | querySource: 'session_memory', FileEdit only
auto-dream         | Read-only on code, memory write access
Context compaction | Summarization API call
Agent summaries    | Read-only exploration
teammate-tool      | Full tool access with permission scoping

The omitClaudeMd optimization on read-only agents (Explore, Plan) saves an estimated 5-15 GTok/week across Anthropic's fleet by avoiding loading CLAUDE.md for agents that only read code.

Key Claims

Sources