Cache-Aware Tool Registration

Entity ID: ent-20260410-4c255c3a4a95
Type: mechanism
Scope: shared
Status: active

Description

Claude Code assembles its tool list for every API call using a partition-then-sort strategy that keeps the prompt cache stable when MCP servers connect, disconnect, or change their tool sets. The core invariant: built-in tools form a contiguous, alphabetically sorted prefix, and MCP tools form a separate, alphabetically sorted suffix. The two groups never interleave.

This matters because Anthropic's server-side cache policy (claude_code_system_cache_policy) places a cache breakpoint after the last prefix-matched built-in tool. If a naive flat sort were used, an MCP tool whose name sorted between two built-in tools would interleave into the built-in block, shifting downstream positions and invalidating the entire cached prefix. With partition-sort, MCP tools can only appear after all built-ins, so adding or removing an MCP tool only affects the suffix -- the expensive built-in prefix stays byte-identical and cache-hot.
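
The difference between the two orderings can be sketched in a few lines. This is a minimal illustration, not the project's code: the Tool shape, the isMcp flag, and the tool names are assumptions made for the example, and the exact position an MCP tool takes in a flat sort depends on the runtime's collation.

```typescript
// Hypothetical minimal tool shape; the real tool type carries more fields.
interface Tool {
  name: string;
  isMcp: boolean;
}

const byName = (a: Tool, b: Tool): number => a.name.localeCompare(b.name);

// Naive approach: one flat sort over every tool, built-in or MCP.
function flatSort(tools: Tool[]): string[] {
  return [...tools].sort(byName).map((t) => t.name);
}

// Partition-sort: sorted built-ins first, sorted MCP tools strictly after.
function partitionSort(tools: Tool[]): string[] {
  const builtIn = tools.filter((t) => !t.isMcp);
  const mcp = tools.filter((t) => t.isMcp);
  return [...builtIn.sort(byName), ...mcp.sort(byName)].map((t) => t.name);
}

// Illustrative names only.
const builtIns: Tool[] = [
  { name: "Read", isMcp: false },
  { name: "Bash", isMcp: false },
  { name: "Edit", isMcp: false },
];
const mcpTool: Tool = { name: "mcp__github__search", isMcp: true };

// With a locale-aware comparator, flatSort may place the MCP tool between
// two built-ins, which shifts every later position in the list.
const flat = flatSort([...builtIns, mcpTool]);

// partitionSort keeps the built-in prefix identical with or without the MCP tool.
const withoutMcp = partitionSort(builtIns);
const withMcp = partitionSort([...builtIns, mcpTool]);
```

Comparing withoutMcp against the first three entries of withMcp shows the property the cache relies on: the built-in prefix is unchanged, and only the suffix grows.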

The mechanism is implemented in two places that must stay in sync:

assembleToolPool() in src/tools.ts -- the primary assembly function used by both the REPL and runAgent. It calls getTools() for built-ins and filterToolsByDenyRules() for MCP tools, then applies the partition-sort via [...builtInTools].sort(byName).concat(allowedMcpTools.sort(byName)). Deduplication (lodash uniqBy('name')) preserves insertion order, so built-ins win on name conflicts.
mergeAndFilterTools() in src/utils/toolPool.ts -- used by the useMergedTools React hook in the REPL path. It re-partitions the merged pool using lodash partition with the isMcpTool predicate, then applies the same [...builtIn.sort(byName), ...mcp.sort(byName)] pattern. The comment explicitly says "Partition-sort for prompt-cache stability (same as assembleToolPool)".

Both functions use (a, b) => a.name.localeCompare(b.name) as the comparator and avoid Array.toSorted() for Node 18 compatibility.
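
A rough sketch of the assembly step, assuming a simplified tool shape. The names PoolTool, uniqByName, and assemblePool are hypothetical; uniqByName stands in for lodash uniqBy so the example has no dependencies, and the deny-rule filtering is assumed to have already happened.

```typescript
// Hypothetical tool shape; `source` exists only to show which copy survives.
interface PoolTool {
  name: string;
  isMcp: boolean;
  source: string;
}

// Same comparator shape the text describes.
const byToolName = (a: PoolTool, b: PoolTool): number =>
  a.name.localeCompare(b.name);

// Stand-in for lodash uniqBy(tools, 'name'): keeps the first occurrence
// of each name and preserves the order of the input.
function uniqByName(tools: PoolTool[]): PoolTool[] {
  const seen = new Set<string>();
  return tools.filter((tool) => {
    if (seen.has(tool.name)) return false;
    seen.add(tool.name);
    return true;
  });
}

// Sorted built-ins, then sorted MCP tools, then dedupe. Because built-ins
// come first, a built-in wins any name conflict with an MCP tool.
function assemblePool(
  builtInTools: PoolTool[],
  allowedMcpTools: PoolTool[],
): PoolTool[] {
  // Copy before sorting: Array.prototype.sort mutates in place, and
  // Array.prototype.toSorted is unavailable on Node 18.
  const ordered = [...builtInTools]
    .sort(byToolName)
    .concat([...allowedMcpTools].sort(byToolName));
  return uniqByName(ordered);
}
```

The order of operations matters in this sketch: deduplicating after concatenation is what lets insertion order decide the winner, so no explicit priority rule is needed.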

Supporting mechanisms

Three additional systems reinforce the cache stability that partition-sort enables:

Tool schema cache (src/utils/toolSchemaCache.ts): A session-scoped Map<string, CachedSchema> that memoizes the rendered API schema (name, description, input_schema, strict, eager_input_streaming) for each tool at first render. This prevents mid-session GrowthBook feature flag flips (tengu_tool_pear, tengu_fgts) or dynamic tool.prompt() drift from changing the serialized bytes of tools that haven't structurally changed. The cache is cleared only on logout or auth change.
Deferred tool loading (defer_loading): When tool search is enabled, MCP tools that haven't been explicitly requested by the model are sent with defer_loading: true, which means the server doesn't include their full schema in the cached prompt. This further reduces the surface area for MCP-induced cache churn -- a deferred tool changing its schema does not invalidate the cached prompt.
Global cache strategy downgrade: When non-deferred MCP tools are present in the tool list, the system sets skipGlobalCacheForSystemPrompt: true, downgrading the system prompt from global-scope caching to org-scope caching. MCP tools are per-user and can't be globally cached, so the system avoids a cache strategy mismatch by shifting the cache boundary to the tool block instead.
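
Two of these mechanisms are small enough to sketch. The following is an illustration under stated assumptions, not the contents of toolSchemaCache.ts or claude.ts: the cache is assumed to be keyed by tool name, the strict and eager_input_streaming fields are omitted, and SchemaTool, renderSchema, clearSchemaCache, and shouldSkipGlobalCache are hypothetical names.

```typescript
// Subset of the fields the text lists for a rendered schema.
interface CachedSchema {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Hypothetical tool shape for this sketch.
interface SchemaTool {
  name: string;
  isMcp: boolean;
  deferLoading?: boolean;
  prompt: () => string; // may return different text over a session
  inputSchema: Record<string, unknown>;
}

// Session-scoped memo. Keying by tool name is an assumption.
const schemaCache = new Map<string, CachedSchema>();

// Render once per tool; later calls return the first rendering, so drift
// in prompt() cannot change the serialized bytes mid-session.
function renderSchema(tool: SchemaTool): CachedSchema {
  const cached = schemaCache.get(tool.name);
  if (cached) return cached;
  const rendered: CachedSchema = {
    name: tool.name,
    description: tool.prompt(),
    input_schema: tool.inputSchema,
  };
  schemaCache.set(tool.name, rendered);
  return rendered;
}

// Per the text, the cache is cleared only on logout or auth change.
function clearSchemaCache(): void {
  schemaCache.clear();
}

// The downgrade applies when any MCP tool is sent with its full schema,
// that is, when it is not deferred.
function shouldSkipGlobalCache(tools: SchemaTool[]): boolean {
  return tools.some((tool) => tool.isMcp && !tool.deferLoading);
}
```

In this sketch a caller would pass the result of shouldSkipGlobalCache as the skipGlobalCacheForSystemPrompt flag when building the request.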

Prompt cache break detection

src/services/api/promptCacheBreakDetection.ts monitors for cache breaks across API calls by hashing all request components (system prompt, tool schemas, model, betas, cache_control, etc.) and comparing them from one turn to the next. A cache break is flagged when cache read tokens drop by more than 5% and by more than 2000 tokens. When that happens, it logs a detailed tengu_prompt_cache_break event with field-level attribution, including per-tool schema hashes that identify exactly which tool's description changed. This telemetry is what drives architectural decisions about cache stability.
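
The detection logic might look roughly like the sketch below. It rests on several assumptions: the threshold is read as a drop that exceeds both 5% of the previous turn's cache read tokens and 2000 tokens in absolute terms, the snapshot covers only three of the request components, and RequestSnapshot, hashString, isCacheBreak, and changedFields are hypothetical names. The FNV-1a hash is a dependency-free stand-in, since the source does not say which hash is used.

```typescript
// Reduced request snapshot; the real detector hashes more components.
interface RequestSnapshot {
  systemPrompt: string;
  model: string;
  toolSchemas: Record<string, string>; // tool name -> serialized schema
}

// 32-bit FNV-1a, used here only as a simple stand-in hash.
function hashString(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// A break requires both a relative and an absolute drop in cache reads.
function isCacheBreak(prevReadTokens: number, currReadTokens: number): boolean {
  const drop = prevReadTokens - currReadTokens;
  return drop > 2000 && drop > prevReadTokens * 0.05;
}

// Field-level attribution: list which components differ between turns,
// with one entry per tool whose schema was added, removed, or changed.
function changedFields(prev: RequestSnapshot, curr: RequestSnapshot): string[] {
  const changed: string[] = [];
  if (hashString(prev.systemPrompt) !== hashString(curr.systemPrompt)) {
    changed.push("systemPrompt");
  }
  if (prev.model !== curr.model) changed.push("model");
  const names = new Set<string>([
    ...Object.keys(prev.toolSchemas),
    ...Object.keys(curr.toolSchemas),
  ]);
  for (const name of Array.from(names).sort()) {
    const before = prev.toolSchemas[name];
    const after = curr.toolSchemas[name];
    if (
      before === undefined ||
      after === undefined ||
      hashString(before) !== hashString(after)
    ) {
      changed.push("tool:" + name);
    }
  }
  return changed;
}
```

Requiring both conditions keeps small sessions and ordinary fluctuation from being reported: a 3000-token drop out of 100000 is under 5% and would not count under this reading.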

Key claims

clm-20260410-a001: Built-in tools are always sorted alphabetically as a contiguous prefix; MCP tools are always sorted alphabetically as a contiguous suffix. The two groups never interleave.
clm-20260410-a002: The same partition-sort pattern is implemented in both assembleToolPool() (tools.ts) and mergeAndFilterTools() (toolPool.ts), kept in sync by convention and an explicit comment.
clm-20260410-a003: uniqBy('name') deduplication preserves insertion order, so when a built-in and an MCP tool share a name, the built-in wins.
clm-20260410-a004: Tool schemas are memoized per-session in toolSchemaCache.ts to prevent feature-flag flips or dynamic prompt drift from breaking the cache mid-session.
clm-20260410-a005: When non-deferred MCP tools are present, the global cache strategy downgrades to org-scope to avoid caching per-user tool content at global scope.
clm-20260410-a006: Deferred MCP tools (defer_loading: true) reduce cache churn by keeping their schemas out of the cached prefix until the model requests them via ToolSearch.

Relations

Serves cache-economics -- cache-aware tool registration is one of the design decisions made in favor of prompt cache preservation
Feeds into mcptool -- MCP tools are the dynamic element that partition-sort is designed to isolate from the cached prefix
Uses mcp-subsystem -- MCP tool discovery and lifecycle management produces the tool list that this mechanism sorts

Sources

src/tools.ts lines 345-367 -- assembleToolPool() implementation with partition-sort comment
src/utils/toolPool.ts lines 55-79 -- mergeAndFilterTools() implementation with matching partition-sort
src/utils/toolSchemaCache.ts -- session-scoped tool schema memoization
src/utils/api.ts lines 130-230 -- toolToAPISchema() with schema cache integration
src/services/api/claude.ts lines 1207-1229 -- global cache strategy downgrade when MCP tools are present
src/services/api/promptCacheBreakDetection.ts -- cache break detection with per-tool hash tracking
src/services/mcp/utils.ts lines 245-246 -- isMcpTool() predicate used for partitioning