Four Memory Systems

Description

Claude Code persists knowledge through four interconnected memory systems, each operating at a different timescale and under a different authorship model. CLAUDE.md is the human-written layer: static instruction files loaded from a four-tier hierarchy (managed, user, project, local) and reloaded on every API turn. Auto-Memory is the agent-written layer: a background extraction subagent (extractMemories) forks from the main conversation after each query loop, analyzes recent messages, and writes structured notes to ~/.claude/projects/<slug>/memory/. These notes use a four-type taxonomy (user, feedback, project, reference) with frontmatter metadata, and are organized into topic files indexed by a 200-line MEMORY.md entrypoint.
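As a rough illustration of the four-tier ordering (spelled out in key claim 4), the sketch below sorts discovered instruction files so that higher-priority tiers land later in the prompt. The type names and the orderForPrompt function are hypothetical and are not taken from src/utils/claudemd.ts; only the tier names and their relative order come from this note.

```typescript
// Sketch only: Tier, InstructionFile, and orderForPrompt are hypothetical names.
type Tier = "managed" | "user" | "project" | "local";

interface InstructionFile {
  tier: Tier;
  path: string;
  content: string;
}

// Ascending priority: later tiers end up later in the prompt.
const TIER_ORDER: readonly Tier[] = ["managed", "user", "project", "local"];

function orderForPrompt(files: InstructionFile[]): InstructionFile[] {
  // Copy before sorting so the caller's array is untouched.
  // Array.prototype.sort is stable (ES2019+), so files within the same
  // tier keep their discovery order.
  return [...files].sort(
    (a, b) => TIER_ORDER.indexOf(a.tier) - TIER_ORDER.indexOf(b.tier),
  );
}
```

Relying on a stable sort keeps the within-tier order (for example, the directory walk up to the project root) independent of the cross-tier priority.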

Session Memory is a within-session summarization system that maintains a 10-section markdown file (title, current state, task spec, files/functions, workflow, errors, codebase docs, learnings, key results, worklog). It runs as a post-sampling hook, first triggering once the context window reaches 10,000 tokens and then re-extracting after at least 5,000 tokens of further growth, provided that 3 tool calls have accumulated or the conversation has reached a natural break. Its primary consumer is the compaction pipeline: when the context window fills, session memory replaces the traditional LLM-summarize-the-conversation approach with a pre-built summary, eliminating the expensive compaction API call. Auto-Dream is the periodic consolidation layer: a forked subagent that fires when at least 24 hours have passed and at least 5 sessions have accumulated since the last consolidation. It reads transcripts, merges redundant memories, resolves contradictions, converts relative dates to absolute, prunes stale entries, and rebuilds the MEMORY.md index.
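The two triggers in this paragraph can be sketched as pure predicates, following the fuller descriptions in key claims 2 and 3. This is a minimal sketch under stated assumptions, not the code in sessionMemory.ts or autoDream.ts: apart from minimumTokensBetweenUpdate, every type, field, and function name below is hypothetical, and the lock and transcript scan are passed in as callbacks so the gate order stays visible.

```typescript
// Sketch only. Names are hypothetical unless marked otherwise.
interface SessionMemoryConfig {
  minimumTokensToInit: number;        // hypothetical name; 10,000 per the text
  minimumTokensBetweenUpdate: number; // name from key claim 2; default 5,000
  toolCallsBetweenUpdates: number;    // hypothetical name; default 3
}

const SESSION_DEFAULTS: SessionMemoryConfig = {
  minimumTokensToInit: 10000,
  minimumTokensBetweenUpdate: 5000,
  toolCallsBetweenUpdates: 3,
};

interface SessionState {
  initialized: boolean;
  contextTokens: number;
  tokensAtLastExtraction: number;
  toolCallsSinceLastExtraction: number;
  lastAssistantTurnHadToolCalls: boolean;
}

function shouldExtractSessionMemory(
  s: SessionState,
  c: SessionMemoryConfig = SESSION_DEFAULTS,
): boolean {
  // First extraction waits for the initial context size.
  if (!s.initialized) return s.contextTokens >= c.minimumTokensToInit;
  // Token growth is always required, regardless of tool-call count.
  const grown = s.contextTokens - s.tokensAtLastExtraction;
  if (grown < c.minimumTokensBetweenUpdate) return false;
  const enoughToolCalls =
    s.toolCallsSinceLastExtraction >= c.toolCallsBetweenUpdates;
  const naturalBreak = !s.lastAssistantTurnHadToolCalls;
  return enoughToolCalls || naturalBreak;
}

interface DreamGates {
  now: number;                // ms since epoch
  lastConsolidatedAt: number; // ms; e.g. read from the lock file's mtime
  lastScanAt: number;         // ms; when transcripts were last scanned
  countSessionsSince: (sinceMs: number) => number; // costly transcript scan
  tryAcquireLock: () => boolean;                   // filesystem lock attempt
}

const HOUR_MS = 60 * 60 * 1000;
const SCAN_THROTTLE_MS = 10 * 60 * 1000;

function shouldConsolidate(
  g: DreamGates,
  minHours = 24,
  minSessions = 5,
): boolean {
  // Cheapest checks first; each early return avoids the costlier work below.
  if (g.now - g.lastConsolidatedAt < minHours * HOUR_MS) return false;
  if (g.now - g.lastScanAt < SCAN_THROTTLE_MS) return false;
  if (g.countSessionsSince(g.lastConsolidatedAt) < minSessions) return false;
  return g.tryAcquireLock();
}
```

Passing the scan and the lock as callbacks makes the cost ordering testable: when the time gate fails, neither callback is invoked.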

The four systems form a pipeline with increasing time horizons: CLAUDE.md provides stable project rules (edited manually, rarely), Session Memory captures the current conversation (minutes to hours), Auto-Memory extracts durable observations from each session (hours to days), and Auto-Dream consolidates accumulated memories across sessions (days to weeks). Data flows upward: session transcripts feed Auto-Memory extraction, which feeds Auto-Dream consolidation, which produces the clean topic files and index that inform future sessions. The systems are designed for mutual exclusion where needed -- if the main agent writes memories directly during a turn, the extractMemories subagent skips that turn entirely (hasMemoryWritesSince check). Similarly, KAIROS (assistant/perpetual mode) uses append-only daily logs instead of direct MEMORY.md edits, with Auto-Dream serving as the nightly distillation process.
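The per-turn mutual exclusion can be illustrated with a small sketch. It assumes a simplified message shape that records the file paths written during each message; the real check is hasMemoryWritesSince() in extractMemories.ts, whose actual signature is not given in this note, so wroteToMemoryDirSince and maybeExtract below are hypothetical stand-ins.

```typescript
// Sketch only: the message shape and both function names are hypothetical.
interface Message {
  writtenPaths: string[]; // files written by tool calls in this message
}

function wroteToMemoryDirSince(
  messages: Message[],
  cursor: number,
  memoryDir: string,
): boolean {
  // Normalize so "/x/memory" does not match "/x/memory-other/...".
  const prefix = memoryDir.endsWith("/") ? memoryDir : memoryDir + "/";
  return messages
    .slice(cursor)
    .some((m) => m.writtenPaths.some((p) => p.startsWith(prefix)));
}

interface ExtractionResult {
  ran: boolean;
  cursor: number; // index of the first message not yet considered
}

function maybeExtract(
  messages: Message[],
  cursor: number,
  memoryDir: string,
  extract: (recent: Message[]) => void,
): ExtractionResult {
  if (wroteToMemoryDirSince(messages, cursor, memoryDir)) {
    // The main agent already saved memories this turn: skip the forked
    // extraction but still advance the cursor past those messages.
    return { ran: false, cursor: messages.length };
  }
  extract(messages.slice(cursor));
  return { ran: true, cursor: messages.length };
}
```

Advancing the cursor in both branches is what keeps a later extraction from revisiting messages the main agent has already handled.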

Key claims

  1. Direct memory writes by the main agent and extractMemories runs are mutually exclusive per turn. When the main agent writes to the memory directory during a conversation turn, hasMemoryWritesSince() in extractMemories.ts detects the write, and the forked extraction is skipped entirely, with the cursor advanced past those messages. This prevents the background agent from creating duplicate or contradictory entries for information the main agent already saved. (src/services/extractMemories/extractMemories.ts, lines 121-148)

  2. Session Memory's extraction trigger is a two-part gate: token growth AND (tool calls OR a natural break). Extraction fires only when the context window has grown by at least minimumTokensBetweenUpdate tokens (default 5,000) since the last extraction and, in addition, either the tool-call threshold (default 3 calls) has been reached or the conversation is at a natural break (no tool calls in the last assistant turn). The token threshold is always required -- even if tool calls exceed the threshold, extraction waits for sufficient context growth. (src/services/SessionMemory/sessionMemory.ts, lines 134-181)

  3. Auto-Dream uses a four-gate trigger ordered by computational cost. The gates are checked cheapest-first: (1) time gate -- hours since last consolidation >= minHours (default 24), via a single stat on the lock file; (2) scan throttle -- at least 10 minutes since the last session scan; (3) session gate -- at least minSessions (default 5) transcripts modified since last consolidation, excluding the current session; (4) lock acquisition -- PID-based filesystem lock with stale-holder detection (1-hour timeout) and race-condition protection via write-then-verify. (src/services/autoDream/autoDream.ts, lines 56-190; src/services/autoDream/consolidationLock.ts)

  4. CLAUDE.md files are loaded from four priority tiers, reloaded every turn. The hierarchy is: (1) Managed (/etc/claude-code/CLAUDE.md) for org-wide policy, (2) User (~/.claude/CLAUDE.md) for personal global instructions, (3) Project (CLAUDE.md, .claude/CLAUDE.md, .claude/rules/*.md in each directory up to project root) for checked-in project rules, (4) Local (CLAUDE.local.md) for private per-project overrides. Files are loaded in ascending priority order so the model pays more attention to later (higher-priority) entries. The @include directive allows files to reference other files, with circular-reference prevention. (src/utils/claudemd.ts, lines 1-26)

  5. Memory recall uses LLM-based relevance selection, not vector search. When the user's query arrives, findRelevantMemories() scans all .md files in the memory directory, reads their frontmatter (description, type, filename), formats them as a manifest, and sends the manifest plus the user's query to Sonnet via sideQuery. Sonnet selects up to 5 relevant files based on description matching, returning filenames in a JSON schema response. Selected files are then injected into the conversation with staleness caveats for files older than 1 day. (src/memdir/findRelevantMemories.ts; src/memdir/memoryAge.ts)
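The deterministic steps around the recall side-query in claim 5 can be sketched as follows: building a manifest from frontmatter, constraining the model's answer to at most 5 known files, and attaching a staleness caveat. The model call itself is left out. The manifest line format, the caveat wording, and every name below are assumptions made for illustration and are not taken from findRelevantMemories.ts or memoryAge.ts.

```typescript
// Sketch only: all names, the manifest format, and the caveat text are hypothetical.
type MemoryType = "user" | "feedback" | "project" | "reference";

interface MemoryHeader {
  filename: string;
  type: MemoryType;
  description: string;
  mtimeMs: number; // last-modified time, ms since epoch
}

const MAX_SELECTED = 5;
const DAY_MS = 24 * 60 * 60 * 1000;

// One line per memory file; this text would accompany the user's query.
function formatManifest(headers: MemoryHeader[]): string {
  return headers
    .map((h) => `- ${h.filename} [${h.type}]: ${h.description}`)
    .join("\n");
}

// Keep only filenames that exist, drop duplicates, and cap the result.
function pickSelected(
  modelFilenames: string[],
  headers: MemoryHeader[],
): MemoryHeader[] {
  const byName = new Map<string, MemoryHeader>();
  for (const h of headers) byName.set(h.filename, h);
  const seen = new Set<string>();
  const picked: MemoryHeader[] = [];
  for (const name of modelFilenames) {
    if (picked.length === MAX_SELECTED) break;
    const header = byName.get(name);
    if (header !== undefined && !seen.has(name)) {
      seen.add(name);
      picked.push(header);
    }
  }
  return picked;
}

// Returns a caveat for files older than one day, otherwise null.
function stalenessCaveat(h: MemoryHeader, nowMs: number): string | null {
  const ageMs = nowMs - h.mtimeMs;
  if (ageMs <= DAY_MS) return null;
  const days = Math.floor(ageMs / DAY_MS);
  return `${h.filename} was last updated ${days} day(s) ago and may be out of date.`;
}
```

Validating the returned filenames against the scanned headers guards against the model naming a file that does not exist, which a schema-constrained response alone would not rule out.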

Relations

Sources