Memory Design
How claude-code manages persistent knowledge across sessions, agents, and time.
Five Systems, Three Scopes
memory-hierarchy documents five distinct memory systems:
| System | Writer | Scope | Persistence |
|---|---|---|---|
| CLAUDE.md | User | Global + Project | Permanent |
| auto-memory | Claude | Project | Consolidated by auto-dream |
| Session memory | Claude | Session | Per-session transcript |
| Team memory | Agents | Team | Shared across teammate-tool |
| Agent memory | Agent | Per-agent | Private |
The Auto-Memory → Auto-Dream Pipeline
auto-memory writes continuously: the first write triggers at 10K tokens, then again every 5K tokens or 3 tool calls. Over time, this produces redundant, contradictory notes. auto-dream consolidates them: it reads all session transcripts, resolves contradictions, merges duplicates, and rewrites memory into clean topic files. The dream runs as a forked-agent-pattern subagent with read-only code access.
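The write cadence described above can be sketched as a small predicate. This is a minimal illustration, not the actual implementation: `ExtractionState`, `should_write_memory`, and `record_write` are hypothetical names, and treating "5K/3 tool calls" as "5K tokens or 3 tool calls, whichever comes first" is an assumption.

```python
from dataclasses import dataclass

FIRST_TRIGGER_TOKENS = 10_000  # from the text: first write at 10K tokens
REPEAT_TOKENS = 5_000          # from the text: then every 5K tokens
REPEAT_TOOL_CALLS = 3          # from the text: 3 tool calls (the "or" is assumed)


@dataclass
class ExtractionState:
    """Hypothetical bookkeeping for when auto-memory last wrote."""
    has_written: bool = False
    tokens_at_last_write: int = 0
    tool_calls_at_last_write: int = 0


def should_write_memory(state, total_tokens, total_tool_calls):
    """Return True when the session has crossed the next write threshold."""
    if not state.has_written:
        # Before the first write, only the 10K-token threshold applies.
        return total_tokens >= FIRST_TRIGGER_TOKENS
    tokens_since = total_tokens - state.tokens_at_last_write
    calls_since = total_tool_calls - state.tool_calls_at_last_write
    return tokens_since >= REPEAT_TOKENS or calls_since >= REPEAT_TOOL_CALLS


def record_write(state, total_tokens, total_tool_calls):
    """Remember the counters at the moment of the most recent write."""
    state.has_written = True
    state.tokens_at_last_write = total_tokens
    state.tool_calls_at_last_write = total_tool_calls
```

Under these assumptions, a session that has written once at 10K tokens would write again after three further tool calls even if only 2K more tokens have accumulated.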
The four-gate trigger (24h, 5 sessions, 10-minute scan throttle, filesystem lock) ensures dreams run only when there's enough new material and the system is idle.
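One way to read the four gates is as a chain of cheap checks that each must pass before a dream starts. The sketch below is illustrative only: the function names, the gate order, and the interpretation of each gate (24 hours since the last dream, at least 5 new sessions, at most one session scan per 10 minutes, an exclusive lock file) are assumptions layered on the numbers given above.

```python
import os

MIN_SECONDS_SINCE_DREAM = 24 * 3600  # gate: 24h
MIN_NEW_SESSIONS = 5                 # gate: 5 sessions
SCAN_THROTTLE_SECONDS = 10 * 60      # gate: 10-minute scan throttle


def try_acquire_lock(lock_path):
    """Atomically create a lock file; return False if one already exists."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def should_dream(now, last_dream_at, last_scan_at, count_new_sessions, lock_path):
    """Return True only if every gate passes (timestamps are epoch seconds)."""
    # Time gate: not enough time has passed since the last dream.
    if now - last_dream_at < MIN_SECONDS_SINCE_DREAM:
        return False
    # Throttle gate: avoid rescanning sessions too frequently.
    if now - last_scan_at < SCAN_THROTTLE_SECONDS:
        return False
    # Session gate: the (possibly expensive) scan runs only after the cheap gates.
    if count_new_sessions() < MIN_NEW_SESSIONS:
        return False
    # Lock gate: only one dream may run at a time.
    return try_acquire_lock(lock_path)
```

A caller that gets `True` back holds the lock and would be responsible for deleting the lock file once the dream finishes; that cleanup is omitted here.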
KAIROS Memory: Append-Only Logs
kairos can't rewrite memory in place (perpetual sessions would corrupt intermediate states). Instead, it uses append-only daily logs (logs/YYYY/MM/YYYY-MM-DD.md). Auto-dream distills these into MEMORY.md and topic files nightly.
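The log layout above maps naturally onto a date-derived path plus an append-mode write. The following is a minimal sketch: `daily_log_path` and `append_entry` are hypothetical helpers, and the only detail taken from the text is the logs/YYYY/MM/YYYY-MM-DD.md pattern.

```python
from datetime import date
from pathlib import Path


def daily_log_path(root, day):
    """Build logs/YYYY/MM/YYYY-MM-DD.md under the given root directory."""
    return Path(root) / "logs" / f"{day:%Y}" / f"{day:%m}" / f"{day:%Y-%m-%d}.md"


def append_entry(root, day, text):
    """Append one entry to the day's log; existing content is never rewritten."""
    path = daily_log_path(root, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds to the end of the file.
    with path.open("a", encoding="utf-8") as f:
        f.write(text.rstrip("\n") + "\n")
    return path
```

Because writers only append, a reader that opens the file mid-session sees a valid prefix of the log rather than a half-rewritten document.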
Retrieval: LLM Reasoning, Not Embeddings
The system deliberately rejects vector search. Claude calls ls(), reasons about which files are relevant, then reads them. This is part of the grep-over-rag philosophy: LLM reasoning over filenames outperforms opaque vector matching for structured, human-readable files.
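The list-then-read flow can be sketched with the selection step left as a pluggable function. Everything here is hypothetical: `retrieve` and `keyword_chooser` are illustrative names, and `keyword_chooser` is a crude keyword stand-in for the model's judgment, included only so the example runs without an LLM.

```python
from pathlib import Path


def retrieve(memory_dir, question, choose_files):
    """List memory files, let the chooser pick by name, then read only those."""
    names = sorted(p.name for p in Path(memory_dir).iterdir() if p.is_file())
    chosen = choose_files(question, names)
    return {
        name: (Path(memory_dir) / name).read_text(encoding="utf-8")
        for name in chosen
        if name in names  # ignore anything the chooser invented
    }


def keyword_chooser(question, names):
    """Placeholder for model reasoning: pick files whose stem appears in the question."""
    q = question.lower()
    return [n for n in names if Path(n).stem.lower() in q]
```

The point of the shape is that the chooser sees only human-readable filenames, so descriptive topic file names carry the retrieval signal that an embedding index would otherwise provide.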
Hidden Cost
The extractMemories mechanism fires a background Opus API call per turn. This fire-and-forget call doubles effective token consumption (26M tokens instead of 13M per session). The cost is hidden from users.
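The doubling can be checked with a one-line model. This is a back-of-the-envelope sketch, assuming the background call consumes roughly as many tokens as the visible turn it shadows (a ratio of 1.0, inferred from "doubles"); `effective_tokens` is a hypothetical helper.

```python
def effective_tokens(visible_tokens, background_ratio=1.0):
    """Total tokens once background calls are counted alongside visible ones."""
    return round(visible_tokens * (1 + background_ratio))


# 13M visible tokens plus an equal background share gives the 26M cited above.
session_total = effective_tokens(13_000_000)
```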
Context Loading
system-prompt-assembly loads memory on every turn (not just at session start): global CLAUDE.md, project CLAUDE.md, modular rules, and the auto-memory index (first 200 lines only). The index acts as a pointer structure: topic files are loaded on demand.
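The per-turn assembly with a truncated index could look roughly like this. It is a sketch under assumptions: `assemble_memory_context` and `read_head` are hypothetical names, the file paths are supplied by the caller, and joining sections with blank lines is an arbitrary choice; only the load order and the 200-line cap come from the text.

```python
from itertools import islice
from pathlib import Path

INDEX_LINE_LIMIT = 200  # from the text: only the first 200 index lines load


def read_head(path, max_lines):
    """Read at most max_lines lines without loading the rest of the file."""
    with Path(path).open(encoding="utf-8") as f:
        return "".join(islice(f, max_lines))


def assemble_memory_context(global_md, project_md, rule_files, index_path):
    """Concatenate memory sources in the order listed above."""
    parts = [
        Path(global_md).read_text(encoding="utf-8"),
        Path(project_md).read_text(encoding="utf-8"),
    ]
    parts += [Path(r).read_text(encoding="utf-8") for r in rule_files]
    # The index is capped; topic files it points to are not read here.
    parts.append(read_head(index_path, INDEX_LINE_LIMIT))
    return "\n\n".join(parts)
```

Capping the index keeps the per-turn cost bounded no matter how much memory accumulates, at the price that entries past line 200 are invisible until the index is consolidated.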