Speculation Engine

The speculation engine is a speculative execution system hidden in the claude-code binary, discovered by "Zero to Pete" through binary patching. When Claude finishes responding and generates a suggestion (the autocomplete feature), the engine silently forks a background API call and begins executing the predicted prompt before the user presses Enter.

How It Works

  1. Claude generates a suggestion (the text shown as autocomplete)
  2. Before the user accepts, a background API call starts executing that suggestion
  3. Execution runs in a sandboxed filesystem overlay:
       ~/.claude/speculation/<pid>/<speculation_id>/
     a. On first write to any file, the original is copied to the overlay
     b. All subsequent reads and writes use the overlay copy
     c. The actual codebase is never modified during speculation
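
The copy-on-write behavior described above can be illustrated with a minimal sketch. This is not the code in the binary: the Overlay class, its method names, and the way paths are mirrored are all assumptions made for illustration. It only shows the general technique of copying a file into a shadow directory on first write and serving later reads and writes from that copy.

```python
import shutil
from pathlib import Path


class Overlay:
    """Copy-on-write view of a project tree (hypothetical, for illustration only)."""

    def __init__(self, project_root: Path, overlay_root: Path) -> None:
        self.project_root = project_root.resolve()
        self.overlay_root = overlay_root.resolve()

    def _overlay_path(self, path: Path) -> Path:
        # Mirror the file's project-relative location inside the overlay.
        relative = path.resolve().relative_to(self.project_root)
        return self.overlay_root / relative

    def read(self, path: Path) -> str:
        # Prefer the overlay copy once one exists; otherwise read the original.
        shadow = self._overlay_path(path)
        source = shadow if shadow.exists() else path
        return source.read_text()

    def write(self, path: Path, content: str) -> None:
        shadow = self._overlay_path(path)
        if not shadow.exists():
            shadow.parent.mkdir(parents=True, exist_ok=True)
            if path.exists():
                # First write: copy the original into the overlay.
                shutil.copy2(path, shadow)
        # The write always lands in the overlay, never in the project tree.
        shadow.write_text(content)
```

With this shape, discarding a speculation amounts to deleting the overlay directory, since the project tree was never touched.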

On User Decision

Permission Tiers

Tool                                Behavior During Speculation
Read, Glob, Grep, LSP, ToolSearch   Runs freely
Edit, Write, NotebookEdit           Redirected to overlay (stops if not in acceptEdits mode)
Bash                                Runs only if already auto-approved
Everything else                     Denied; speculation halts
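
The tiers above amount to a small decision function. The sketch below is one way to express them. The names Decision and classify, and the two boolean parameters, are hypothetical. It also assumes that a Bash call which is not auto-approved halts speculation, which the table implies but does not state outright.

```python
from enum import Enum


class Decision(Enum):
    RUN = "run"
    OVERLAY = "redirect to overlay"
    HALT = "halt speculation"


# Tool names are taken from the table; the grouping constants are illustrative.
READ_ONLY_TOOLS = {"Read", "Glob", "Grep", "LSP", "ToolSearch"}
WRITE_TOOLS = {"Edit", "Write", "NotebookEdit"}


def classify(tool: str, accept_edits: bool, bash_auto_approved: bool) -> Decision:
    """Map a tool call to what happens to it during speculation."""
    if tool in READ_ONLY_TOOLS:
        return Decision.RUN
    if tool in WRITE_TOOLS:
        # Writes go to the overlay only when acceptEdits mode is on.
        return Decision.OVERLAY if accept_edits else Decision.HALT
    if tool == "Bash":
        # Assumption: a Bash call that is not auto-approved stops the speculation.
        return Decision.RUN if bash_auto_approved else Decision.HALT
    # Everything else is denied.
    return Decision.HALT
```

For example, classify("Edit", accept_edits=False, bash_auto_approved=True) returns Decision.HALT, matching the "stops if not in acceptEdits mode" note in the table.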

Recursive Pipelining

The most striking detail: when a speculation completes successfully, it immediately generates the next suggestion and starts speculatively executing that too, this time with isPipelined = true. The system tries to stay multiple steps ahead of the user: predict, execute, predict, execute.
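
The predict, execute, predict, execute cycle can be sketched as a simple loop. Everything here is an assumption for illustration: the Speculation record, the suggest and execute callables, and the max_depth bound (added only so the sketch terminates) are hypothetical, and the real engine runs in the background rather than sequentially.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Speculation:
    prompt: str
    is_pipelined: bool
    succeeded: bool


def run_pipeline(
    suggest: Callable[[Optional[str]], Optional[str]],
    execute: Callable[[str], bool],
    max_depth: int,
) -> List[Speculation]:
    """Chain speculations: each success triggers the next suggestion."""
    history: List[Speculation] = []
    prompt = suggest(None)  # First suggestion follows the real response.
    while prompt is not None and len(history) < max_depth:
        succeeded = execute(prompt)
        # Only speculations started by an earlier speculation count as pipelined.
        history.append(Speculation(prompt, len(history) > 0, succeeded))
        if not succeeded:
            break  # A failed or halted speculation ends the chain.
        prompt = suggest(prompt)
    return history
```

In this sketch the first speculation has is_pipelined set to False and every later one has it set to True, mirroring the isPipelined flag described above.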

Safety Limits

Telemetry

Every speculation emits an event to tengu_speculation recording the outcome, duration, tools executed, time saved, and is_pipelined. Anthropic is measuring whether this delivers meaningful speedups before wider rollout.
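
As a rough picture of what such an event might carry, the sketch below models the listed fields as a record. Only the event name tengu_speculation and the is_pipelined field name come from the text above. The other key names, the millisecond units, and the JSON encoding are guesses, not the actual wire format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class SpeculationEvent:
    outcome: str               # e.g. accepted or discarded (hypothetical values)
    duration_ms: int           # hypothetical unit
    tools_executed: List[str]
    time_saved_ms: int         # hypothetical unit
    is_pipelined: bool


def to_payload(event: SpeculationEvent) -> str:
    """Serialize one event; the JSON shape is an assumption for illustration."""
    return json.dumps({"event": "tengu_speculation", **asdict(event)}, sort_keys=True)
```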

Feature Gate

Gated behind the tengu_speculation flag in GrowthBook, with no environment variable override. Not yet available to general users.

Key Claims

Sources