Forked Agent Pattern

The forked agent is the shared architectural primitive underlying every background operation in claude-code. Session memory extraction, auto-memory writing, auto-dream consolidation, context compaction, and agent summaries all use the same implementation in src/utils/forkedAgent.ts.

How It Works

A fork constructs a child agent using CacheSafeParams:

type CacheSafeParams = {
  systemPrompt: SystemPrompt;             // byte-identical to parent
  userContext: { [k: string]: string };
  systemContext: { [k: string]: string };
  toolUseContext: ToolUseContext;         // contains tools, model, options
  forkContextMessages: Message[];         // parent's full conversation as cache prefix
};

What's Shared vs. Cloned

Shared (for cache hits)	Cloned (for isolation)
System prompt (byte-identical)	`readFileState` LRU cache
Prompt cache prefix	`AbortController` (linked to parent)
Tool pool	`denialTracking` state
Model selection	Mutable conversation state

The fork hits the parent's prompt cache prefix because every cache-critical parameter stays identical. Its mutable state is cloned so that parent and child cannot contaminate each other.
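The shared/cloned split can be sketched as follows. This is a minimal illustration, not the code in src/utils/forkedAgent.ts: the `AgentState` shape, the `forkState` helper, and the use of a plain `Map` in place of the LRU cache are all assumptions made for the example.

```typescript
// Hypothetical, simplified shapes; the real types are richer.
interface Message { role: "user" | "assistant"; content: string }

interface AgentState {
  systemPrompt: string;                 // shared: must stay byte-identical
  tools: readonly string[];             // shared: tool pool
  model: string;                        // shared: model selection
  messages: Message[];                  // conversation (cloned so appends stay local)
  readFileState: Map<string, string>;   // cloned: stand-in for the LRU cache
  denialTracking: { denials: number };  // cloned
  abortController: AbortController;     // cloned, but linked to the parent
}

function forkState(parent: AgentState): AgentState {
  const childAbort = new AbortController();
  // Link one way: aborting the parent aborts the child, never the reverse.
  if (parent.abortController.signal.aborted) {
    childAbort.abort();
  } else {
    parent.abortController.signal.addEventListener(
      "abort",
      () => childAbort.abort(),
      { once: true },
    );
  }
  return {
    // Shared by reference: these feed the cache key and are never mutated.
    systemPrompt: parent.systemPrompt,
    tools: parent.tools,
    model: parent.model,
    // Copied: same contents (so the prefix matches), independent containers.
    messages: [...parent.messages],
    readFileState: new Map(parent.readFileState),
    denialTracking: { ...parent.denialTracking },
    abortController: childAbort,
  };
}
```

In this sketch a child can record file reads, accumulate denials, or append messages without the parent observing any of it, while a parent abort still cancels the child.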

Cache Economics

To maximize cache sharing across concurrent forks, every fork child produces byte-identical API request prefixes: the full parent assistant message (all tool_use blocks, thinking, and text), plus a single user message containing identical placeholder results for every tool_use block. Only a per-child directive text block differs between them.
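The request layout described above can be sketched like this. The block shapes are loosely modeled on a messages-style API, and `buildForkMessages`, `sharedPrefixLength`, and the placeholder text are hypothetical names chosen for the illustration; the real placeholder content is not given in the source.

```typescript
// Hypothetical content-block shapes for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Msg { role: "user" | "assistant"; content: Block[] }

// Assumed placeholder; every child must use the exact same string.
const PLACEHOLDER_RESULT = "[fork placeholder]";

function buildForkMessages(parentAssistant: Msg, directive: string): Msg[] {
  // One identical placeholder result per tool_use block in the parent message.
  const placeholders = parentAssistant.content
    .filter((b): b is Extract<Block, { type: "tool_use" }> => b.type === "tool_use")
    .map((b): Block => ({
      type: "tool_result",
      tool_use_id: b.id,
      content: PLACEHOLDER_RESULT,
    }));
  return [
    parentAssistant, // full assistant message, unchanged
    // The per-child directive goes last so everything before it is shared.
    { role: "user", content: [...placeholders, { type: "text", text: directive }] },
  ];
}

// Length of the common leading run of two serialized requests.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}
```

Placing the directive as the final block is the point of the design: two children built from the same parent message diverge only at the start of the directive text, so the cacheable prefix covers everything else.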

The economic consequence: 92% overall prefix reuse, producing measured cost savings of $4.85 (81% reduction) over a single representative task. Spawning five parallel background agents costs nearly the same as spawning one.
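As a quick sanity check on those figures, the implied baseline can be backed out arithmetically. This assumes the 81% is measured against the uncached cost of the same task, which the source does not state explicitly; the derived values are implied, not measured.

```typescript
// Figures quoted in the text above.
const savedUsd = 4.85;
const reduction = 0.81;

// Assumption: reduction = savedUsd / baselineUsd (uncached cost of the same task).
const baselineUsd = savedUsd / reduction;   // ≈ 5.99
const cachedUsd = baselineUsd - savedUsd;   // ≈ 1.14

console.log(`baseline ≈ $${baselineUsd.toFixed(2)}, with cache ≈ $${cachedUsd.toFixed(2)}`);
```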

This is why cache-economics is described as "load-bearing infrastructure": the forked agent pattern's entire cost model depends on cache hits.

Anti-Recursion

Fork children can see the AgentTool in their tool pool but reject any attempt to fork recursively. Enforcement works by checking the conversation history for a <fork_boilerplate_tag> marker: if the marker is present, the fork attempt is blocked.
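A minimal sketch of such a marker check is shown below. The marker string is copied from the text above; the `Message` shape, the helper names, and the choice to throw an error are assumptions for illustration, and how the real marker is embedded in messages is not described in the source.

```typescript
// Hypothetical message shape for illustration.
interface Message { role: "user" | "assistant"; content: string }

// Marker literal as written in the text; its real encoding is an assumption.
const FORK_MARKER = "<fork_boilerplate_tag>";

// True when any message in the history carries the fork marker,
// meaning this conversation already belongs to a fork child.
function isForkChild(history: readonly Message[]): boolean {
  return history.some((m) => m.content.includes(FORK_MARKER));
}

// Guard to call before spawning a fork.
function assertCanFork(history: readonly Message[]): void {
  if (isForkChild(history)) {
    throw new Error("Recursive fork blocked: history already contains the fork marker");
  }
}
```

Checking the history rather than hiding the tool keeps the child's tool pool identical to the parent's, which is consistent with the byte-identical prefix requirement described earlier.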

Who Uses It

System	Fork Configuration
auto-memory	`querySource: 'session_memory'`, FileEdit only
auto-dream	Read-only on code, memory write access
Context compaction	Summarization API call
Agent summaries	Read-only exploration
teammate-tool	Full tool access with permission scoping

The omitClaudeMd optimization on read-only agents (Explore, Plan) saves an estimated 5-15 GTok/week across Anthropic's fleet by skipping the CLAUDE.md load for agents that only read code.
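One way such a flag could be applied is sketched below. Only the name omitClaudeMd comes from the source; the `claudeMd` context key, the `buildUserContext` helper, and the idea that CLAUDE.md content travels in userContext are assumptions made for the example.

```typescript
// Hypothetical helper: drops the CLAUDE.md entry for read-only agents.
// The "claudeMd" key name is an assumption, not a documented field.
function buildUserContext(
  base: { [k: string]: string },
  opts: { omitClaudeMd: boolean },
): { [k: string]: string } {
  const ctx = { ...base };          // never mutate the caller's context
  if (opts.omitClaudeMd) {
    delete ctx["claudeMd"];         // skip the project instructions entirely
  }
  return ctx;
}
```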

Key Claims

clm-20260409-77a12ba7f99b: 92% prompt cache reuse, $4.85 (81%) cost savings per task

Sources

src-20260409-a14e9e98c3cd — Internals: Auto-Memory, Auto-Dream, and Agent Teams