Memory and Agent Systems

Episode ID: epi-20260409-abf1f376005e
Scope: shared
Created: 2026-04-09T20:23:27Z

Question

How do Claude Code's memory and multi-agent systems work together?

Summary

Claude Code has 5 memory systems at 3 scopes. Auto-memory extracts session notes (triggers at 10K tokens, then every 5K/3 tool calls, capped at 12K tokens). Auto-dream consolidates memory during idle time via 4-phase process (Orient, Gather Signal, Consolidate, Prune). KAIROS is the unreleased always-on daemon using tick-loop prompts, 15-second blocking budget, and append-only daily logs designed as a pair with auto-dream. Memory retrieval deliberately rejects vector search — uses LLM reasoning over filenames instead. The forked agent pattern underlies all background operations, achieving 92% prompt cache reuse. TeammateTool exposes 13 operations for full team lifecycle with file-lock task claiming. ULTRAPLAN offloads 30-minute planning to remote CCR, forming a fast/slow thinking pair with KAIROS.

Findings

Memory retrieval by LLM reasoning over filenames outperforms vector search for structured files
Forked agents achieve 92% cache reuse by sharing byte-identical API request prefixes
KAIROS + ULTRAPLAN form a System 1/System 2 architecture for ambient vs deep deliberation
TEAMMATE_MESSAGES_UI_CAP=50 was added after 292-agent test consumed 36.8 GB

Lessons

Auto-dream is the garbage collector — without it, memory grows stale, contradictory, and unnavigable
Every concrete engineering limit in the codebase has a production failure story behind it

References

none