Sleep-Time Compute

Description

The UC Berkeley paper "Sleep-time Compute: Beyond Inference Scaling at Test-time" (arXiv:2504.13171, April 2025) demonstrated that LLMs can pre-compute inferences about their context during idle time, reducing the test-time compute needed to reach a given accuracy by roughly 5x. The core insight is that a model can do useful work between user interactions, amortizing expensive reasoning across periods when no one is waiting on a response.

Claude Code's auto-dream system implements sleep-time compute but inverts the direction: where the paper's approach pre-computes answers to predicted future queries, auto-dream consolidates past memory -- merging, pruning, and reorganizing the accumulated knowledge from prior sessions so that future sessions boot faster and with higher-quality context. The paper looks forward; auto-dream looks backward.
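The backward-looking consolidation described above can be sketched as a merge-and-prune pass over accumulated memory. This is a minimal illustration, not Claude Code's actual implementation; the names (`MemoryEntry`, `consolidate`, `staleAfter`) are assumptions.

```typescript
// Hypothetical sketch of auto-dream-style consolidation: merge entries
// that share a topic and prune entries not touched recently, so the
// next session boots with a smaller, higher-signal memory set.
// All names here are illustrative, not Claude Code's API.

interface MemoryEntry {
  topic: string;
  content: string;
  lastAccessed: number; // session index of last access
}

function consolidate(
  entries: MemoryEntry[],
  currentSession: number,
  staleAfter = 10, // assumed pruning horizon, in sessions
): MemoryEntry[] {
  const byTopic = new Map<string, MemoryEntry>();
  for (const e of entries) {
    // Prune: drop entries unused for more than `staleAfter` sessions.
    if (currentSession - e.lastAccessed > staleAfter) continue;
    const existing = byTopic.get(e.topic);
    if (existing) {
      // Merge: fold same-topic entries together, keeping the most
      // recent access time.
      existing.content += "\n" + e.content;
      existing.lastAccessed = Math.max(existing.lastAccessed, e.lastAccessed);
    } else {
      byTopic.set(e.topic, { ...e });
    }
  }
  return [...byTopic.values()];
}
```

A real system would also reorganize and summarize the merged content with the model itself; this sketch captures only the structural merge/prune step.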

The paper's key finding that consolidation requires sufficient accumulated data maps directly to auto-dream's minSessions: 5 threshold -- the system does not trigger consolidation until at least 5 sessions have accumulated since the last dream, ensuring there is enough signal to justify the compute cost.
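The gating logic implied by that threshold can be sketched in a few lines. The `minSessions` value comes from the text above; the surrounding names (`DreamState`, `shouldDream`) are assumptions for illustration.

```typescript
// Illustrative trigger gate for a dream/consolidation pass.
// Only `minSessions: 5` is documented; other names are hypothetical.

interface DreamState {
  sessionsSinceLastDream: number;
  minSessions: number; // 5 in auto-dream's configuration
}

function shouldDream(state: DreamState): boolean {
  // Spend idle-time compute only once enough sessions have accumulated
  // to provide signal worth the cost of consolidating.
  return state.sessionsSinceLastDream >= state.minSessions;
}
```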

Key claims

Relations

Sources