Forked Agent Pattern
The forked agent pattern is the shared architectural primitive underlying every background operation in claude-code. Session memory extraction, auto-memory writing, auto-dream consolidation, context compaction, and agent summaries all use the same implementation in src/utils/forkedAgent.ts.
How It Works
A fork constructs a child agent using CacheSafeParams:
    type CacheSafeParams = {
      systemPrompt: SystemPrompt,             // byte-identical to parent
      userContext: { [k: string]: string },
      systemContext: { [k: string]: string },
      toolUseContext: ToolUseContext,         // contains tools, model, options
      forkContextMessages: Message[],         // parent's full conversation as cache prefix
    }
What's Shared vs. Cloned
| Shared (for cache hits) | Cloned (for isolation) |
|---|---|
| System prompt (byte-identical) | readFileState LRU cache |
| Prompt cache prefix | AbortController (linked to parent) |
| Tool pool | denialTracking state |
| Model selection | Mutable conversation state |
The fork hits the parent's prompt cache prefix because its cache-critical parameters are identical to the parent's. Its mutable state, by contrast, is cloned to prevent cross-contamination between parent and child.
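The shared/cloned split in the table can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in src/utils/forkedAgent.ts: `ForkState`, `forkState`, and all field shapes are hypothetical, and the LRU cache is simplified to a plain record.

```typescript
// Hypothetical shapes for illustration only; the real types in
// src/utils/forkedAgent.ts are not shown in this document.
interface ForkState {
  systemPrompt: string;                  // shared: must stay byte-identical
  tools: readonly string[];              // shared: same tool pool
  model: string;                         // shared: same model selection
  readFileState: Record<string, string>; // cloned (stand-in for the LRU cache)
  denialTracking: { denials: number };   // cloned
  messages: string[];                    // cloned mutable conversation state
  abortController: AbortController;      // cloned, but linked to the parent
}

function forkState(parent: ForkState): ForkState {
  // The child gets its own controller that aborts when the parent aborts;
  // aborting the child leaves the parent untouched.
  const childAbort = new AbortController();
  if (parent.abortController.signal.aborted) {
    childAbort.abort();
  } else {
    parent.abortController.signal.addEventListener(
      "abort",
      () => childAbort.abort(),
      { once: true },
    );
  }

  return {
    // Passed through unchanged: these determine the cache prefix.
    systemPrompt: parent.systemPrompt,
    tools: parent.tools,
    model: parent.model,
    // Copied: child mutations must not leak back into the parent.
    readFileState: { ...parent.readFileState },
    denialTracking: { ...parent.denialTracking },
    messages: [...parent.messages],
    abortController: childAbort,
  };
}
```

The one-way abort link is the design point worth noting: cancelling the parent tears down its background forks, but a fork that fails or is cancelled cannot interrupt the user-facing session.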
Cache Economics
To maximize cache sharing across concurrent forks, every fork child produces byte-identical API request prefixes: the full parent assistant message (all tool_use blocks, thinking, and text), plus a single user message containing identical placeholder results for every tool_use block. Only a per-child directive text block differs between them.
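The prefix construction described above can be sketched as follows. The block and message shapes, the placeholder string, and the helper names are assumptions for illustration (thinking blocks are left out for brevity); only the overall layout, parent assistant message followed by identical placeholder results and then a per-child directive, comes from the text.

```typescript
// Hypothetical message shapes, loosely modeled on tool_use / tool_result
// content blocks; the field names are assumptions.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface ForkMessage {
  role: "user" | "assistant";
  content: Block[];
}

// Assumed placeholder text; the source only says it is identical for every block.
const PLACEHOLDER_RESULT = "[forked: result not available]";

function buildForkMessages(parentAssistant: ForkMessage, directive: string): ForkMessage[] {
  // One identical placeholder result per tool_use block in the parent message.
  const placeholders: Block[] = [];
  for (const block of parentAssistant.content) {
    if (block.type === "tool_use") {
      placeholders.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: PLACEHOLDER_RESULT,
      });
    }
  }
  // The directive goes last so everything before it is common to all children.
  return [
    parentAssistant,
    { role: "user", content: [...placeholders, { type: "text", text: directive }] },
  ];
}

// Length of the shared leading run of two serialized requests.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}
```

Serializing two children built from the same parent message and comparing them with `sharedPrefixLength` shows the requests diverging only once the directive text begins, which is what lets concurrent forks reuse the same cached prefix.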
The economic consequence: 92% overall prefix reuse, producing measured cost savings of $4.85 (81% reduction) over a single representative task. Spawning five parallel background agents costs nearly the same as spawning one.
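As a rough sanity check on these figures, the implied baseline can be back-computed. The sketch below assumes the 81% reduction is measured against the cost of the same task without cache reuse; the derived numbers are not stated in the source, and `impliedCosts` is a hypothetical helper.

```typescript
// Assumption: savings = reduction * baseline, where baseline is the cost
// of the same task with no prompt-cache reuse.
function impliedCosts(savingsUsd: number, reduction: number) {
  const baselineUsd = savingsUsd / reduction; // cost without cache reuse
  const cachedUsd = baselineUsd - savingsUsd; // cost actually paid
  return { baselineUsd, cachedUsd };
}

const { baselineUsd, cachedUsd } = impliedCosts(4.85, 0.81);
console.log(baselineUsd.toFixed(2), cachedUsd.toFixed(2)); // prints: 5.99 1.14
```

Under that assumption, the representative task would have cost roughly $5.99 uncached and about $1.14 with cache reuse.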
This is why cache-economics is described as "load-bearing infrastructure": the forked agent pattern's entire cost model depends on cache hits.
Anti-Recursion
Fork children can see the AgentTool in their tool pool but reject any attempt to fork recursively. Enforcement works by checking the conversation history for a <fork_boilerplate_tag> marker: if the marker is present, the fork attempt is blocked.
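A marker-based guard of this kind might look like the sketch below. The tag string is copied from the text above and may not be the literal marker the implementation uses; `HistoryMessage`, `isInsideFork`, and `tryFork` are hypothetical names.

```typescript
// Tag name taken from the text; the literal marker in the real
// implementation may differ.
const FORK_MARKER = "<fork_boilerplate_tag>";

interface HistoryMessage {
  role: "user" | "assistant";
  text: string;
}

// A conversation belongs to a fork child if any message carries the marker.
function isInsideFork(history: HistoryMessage[]): boolean {
  return history.some((m) => m.text.includes(FORK_MARKER));
}

function tryFork(history: HistoryMessage[], directive: string): HistoryMessage[] {
  if (isInsideFork(history)) {
    throw new Error("Recursive fork rejected: already running inside a fork");
  }
  // Stamp the child's history so that any later fork attempt is detected.
  return [...history, { role: "user", text: `${FORK_MARKER}\n${directive}` }];
}
```

Because the check reads the conversation history rather than the tool pool, the child can keep the same tool list as its parent, which fits the goal of leaving cache-critical parameters unchanged.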
Who Uses It
| System | Fork Configuration |
|---|---|
| auto-memory | querySource: 'session_memory', FileEdit only |
| auto-dream | Read-only on code, memory write access |
| Context compaction | Summarization API call |
| Agent summaries | Read-only exploration |
| teammate-tool | Full tool access with permission scoping |
The omitClaudeMd optimization on read-only agents (Explore, Plan) saves an estimated 5-15 GTok/week across Anthropic's fleet by avoiding loading CLAUDE.md for agents that only read code.
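The effect of the flag can be pictured with a small sketch. Only the omitClaudeMd name comes from the text; the `claudeMd` context key, the `AgentDefinition` shape, and `buildUserContext` are assumptions made for illustration.

```typescript
// Hypothetical names: the document only names the omitClaudeMd flag; the
// "claudeMd" context key and these types are assumptions.
const CLAUDE_MD_KEY = "claudeMd";

interface AgentDefinition {
  agentType: string;
  omitClaudeMd?: boolean;
}

function buildUserContext(
  agent: AgentDefinition,
  baseContext: { [k: string]: string },
): { [k: string]: string } {
  // Copy first so the caller's context is never mutated.
  const context = { ...baseContext };
  if (agent.omitClaudeMd) {
    // Agents flagged this way skip the CLAUDE.md payload and its token cost.
    delete context[CLAUDE_MD_KEY];
  }
  return context;
}
```

In this sketch the saving is simply the size of the omitted entry, paid once per agent launch, which is how a small per-request change can add up at fleet scale.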
Key Claims
clm-20260409-77a12ba7f99b: 92% prompt cache reuse, $4.85 (81%) cost savings per task
Sources
src-20260409-a14e9e98c3cd: Internals: Auto-Memory, Auto-Dream, and Agent Teams