Compaction Pipeline
Five-tier context compaction system that manages the token budget before each API call in query-ts Stage 1. Each tier is a different point on the Pareto frontier between minimizing cost and maximizing information preservation: the tiers are not "compress progressively harder" versions of one another, but distinct trade-off choices along two independent axes.
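The ordering logic can be sketched as a pipeline that runs cheap tiers first and stops as soon as the budget fits, so expensive tiers may never fire. This is a minimal illustration; the `Tier` shape and function names are assumptions, not the actual query-ts types.

```typescript
// Hypothetical sketch of tier ordering; names are illustrative.
interface Tier {
  name: string;
  // Returns tokens freed; cheap tiers run first so expensive ones may be skipped.
  apply: (tokensOverBudget: number) => number;
}

function runCompactionPipeline(tiers: Tier[], tokensUsed: number, budget: number): string[] {
  const fired: string[] = [];
  let used = tokensUsed;
  for (const tier of tiers) {
    if (used <= budget) break; // a cheaper tier already freed enough
    used -= tier.apply(used - budget);
    fired.push(tier.name);
  }
  return fired;
}

// Example: a free tier that frees 40K tokens makes the expensive tier unnecessary.
const tiers: Tier[] = [
  { name: "snip", apply: () => 40_000 },
  { name: "auto-compact", apply: () => 100_000 },
];
const fired = runCompactionPipeline(tiers, 160_000, 150_000);
```

Here only `"snip"` fires, which mirrors the design intent described below: if an earlier tier reduces enough, the expensive API-backed tier never runs.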
The 5 Tiers
Tier 1: Tool Result Budget (lines 369-394)
Caps oversized tool outputs to fit the context window and persists replacement records so --resume can restore them. The seenIds system locks file-read content once read, which avoids re-read costs but also freezes potentially malicious content in place for the rest of the session.
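A minimal sketch of the budgeting behavior described above; `capToolResult`, `seenIds`, and `ReplacementRecord` are hypothetical names, not the real query-ts identifiers.

```typescript
// Illustrative Tier 1 sketch (names are assumptions).
interface ReplacementRecord {
  toolUseId: string;
  originalLength: number; // persisted so --resume can restore the full output
}

const seenIds = new Set<string>(); // once locked, a result is never re-budgeted

function capToolResult(
  toolUseId: string,
  content: string,
  maxChars: number,
  records: ReplacementRecord[],
): string {
  if (seenIds.has(toolUseId)) return content; // frozen for the session, even if malicious
  seenIds.add(toolUseId);
  if (content.length <= maxChars) return content;
  records.push({ toolUseId, originalLength: content.length }); // replacement record for --resume
  return content.slice(0, maxChars) + `\n[truncated ${content.length - maxChars} chars]`;
}

// Example: a 100-char result capped to 10 chars leaves a replacement record.
const records: ReplacementRecord[] = [];
const capped = capToolResult("tool-1", "x".repeat(100), 10, records);
```

The second call with the same `toolUseId` returns content unmodified, which is the double-edged lock the security section returns to.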
Tier 2: Snip Compact (lines 401-410, flag: HISTORY_SNIP)
Cost: free. Removes entire older message blocks while preserving atomic tool-call/result pairs. Primarily used in headless/background sessions. The snipTokensFreed counter feeds forward to the auto-compact threshold check — if Snip freed enough tokens, Auto-Compact doesn't fire.
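The atomic-pair constraint can be sketched as follows; the `Message` shape and the pairing rule are assumptions based on the description, not the real implementation.

```typescript
// Hedged sketch of Snip: drop oldest messages without splitting tool-call/result pairs.
interface Message {
  role: "user" | "assistant" | "tool_result";
  toolUseId?: string; // links a tool_result to the assistant tool call
  tokens: number;
}

function snip(history: Message[], target: number): { kept: Message[]; snipTokensFreed: number } {
  let freed = 0;
  let i = 0;
  while (i < history.length && freed < target) {
    const msg = history[i];
    const next = history[i + 1];
    // Dropping a tool call must not orphan its result: drop both or neither.
    if (msg.toolUseId && next?.toolUseId === msg.toolUseId) {
      freed += msg.tokens + next.tokens;
      i += 2;
    } else {
      freed += msg.tokens;
      i += 1;
    }
  }
  return { kept: history.slice(i), snipTokensFreed: freed };
}

const history: Message[] = [
  { role: "assistant", toolUseId: "a", tokens: 50 },
  { role: "tool_result", toolUseId: "a", tokens: 500 },
  { role: "user", tokens: 20 },
];
const { kept, snipTokensFreed } = snip(history, 100);
```

The returned `snipTokensFreed` is the counter that feeds forward into the Auto-Compact threshold check.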
Tier 3: Microcompact (lines 413-426)
Cost: free. Selectively clears specific tool results in the COMPACTABLE_TOOLS set. MCP tools, Agent tools, and custom tools are never microcompacted. Read tool results with maxResultSizeChars: Infinity skip budgeting entirely and are locked for the session. The CACHED_MICROCOMPACT flag makes this cache-aware — tracks IDs of already-cached tool results.
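The eligibility rules above can be sketched as a single predicate. The set members, the `mcp__` prefix convention, and the agent-tool names are assumptions for illustration.

```typescript
// Sketch of microcompact eligibility (contents and conventions are assumptions).
const COMPACTABLE_TOOLS = new Set(["Read", "Bash", "Grep", "Glob"]); // illustrative members

function isMicrocompactable(toolName: string, cachedIds: Set<string>, toolUseId: string): boolean {
  if (toolName.startsWith("mcp__")) return false; // MCP tools are never microcompacted
  if (toolName === "Agent" || toolName === "Task") return false; // agent tools exempt
  if (!COMPACTABLE_TOOLS.has(toolName)) return false; // custom tools exempt
  // CACHED_MICROCOMPACT: skip results that are already in the prompt cache.
  return !cachedIds.has(toolUseId);
}
```

Note that the exemptions are all fail-open: anything not explicitly in the compactable set survives compaction, which is what makes them interesting in the security section below.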
Tier 4: Context Collapse (lines 440-447, flag: CONTEXT_COLLAPSE)
Cost: low. The most architecturally interesting tier: creates a read-time projection over the full REPL history. Original messages are never modified — a separate commit log of collapse decisions is replayed on every read. This means prompt-cache prefixes are never invalidated. Context Collapse runs before Auto-Compact precisely so that if it reduces enough, the expensive API call never fires.
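A read-time projection over an immutable log can be sketched in a few lines; all names here are hypothetical, and the point is only the shape: originals are never mutated, and the collapse log is replayed on every read.

```typescript
// Minimal sketch of a read-time projection over an immutable history.
interface CollapseCommit {
  index: number;   // which message to collapse
  summary: string; // what the reader sees instead
}

function projectHistory(original: readonly string[], log: CollapseCommit[]): string[] {
  const collapsed = new Map(log.map((c) => [c.index, c.summary]));
  // The original array is never mutated; only the projected view changes.
  return original.map((msg, i) => collapsed.get(i) ?? msg);
}

const original = ["read file A (8K tokens)", "edit file A", "run tests"];
const log: CollapseCommit[] = [{ index: 0, summary: "[collapsed: read of file A]" }];
const view = projectHistory(original, log);
```

Because collapse decisions live in a separate commit log, reverting one is just dropping a log entry; the underlying history is always recoverable.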
Tier 5: Auto-Compact (lines 453-543)
Cost: high — full additional API call. Threshold: context window minus 13,000 tokens. Circuit breaker: MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3. Inside Auto-Compact, Full Compaction (1,705 lines) runs: image stripping, API round grouping, thinking-block removal (internal-only build), and a PTL retry mechanism that drops the oldest 20% of message groups if the compact request itself is too long.
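The trigger arithmetic and the retry rule can be sketched directly from the numbers above; the helper names are assumptions, and "PTL" is read here as prompt-too-long, which the surrounding description implies.

```typescript
// Sketch of the Auto-Compact trigger and PTL retry (helper names are assumptions).
// Threshold: context window minus 13,000 tokens; Snip's freed tokens feed forward.
function shouldAutoCompact(tokensUsed: number, contextWindow: number, snipTokensFreed: number): boolean {
  return tokensUsed - snipTokensFreed > contextWindow - 13_000;
}

// On a too-long compact request, drop the oldest 20% of message groups and retry.
function ptlRetryGroups<T>(groups: T[]): T[] {
  const drop = Math.ceil(groups.length * 0.2);
  return groups.slice(drop);
}
```

For a 200K window the trigger point is 187K tokens; if Snip already freed 10K, a 195K session stays under it and the extra API call never fires.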
The 150K Threshold Bug
The auto-compaction threshold is hardcoded at 150,000 tokens server-side. For the previous 200K context window, that was 75% utilization — reasonable. For the 1M context window in Opus 4.6, the same threshold triggers at 15%. Reading three or four files triggers compaction almost immediately, defeating the purpose of the larger window.
No user-configurable threshold exists. The only workaround is a PreCompact hook that exits with code 2, which blocks all compaction entirely rather than adjusting the threshold (GitHub issues #34202 and #41037).
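A settings entry for that workaround might look like the following; the exact hook schema shown here is an assumption and should be checked against the current hooks documentation.

```json
{
  "hooks": {
    "PreCompact": [
      {
        "matcher": "auto",
        "hooks": [{ "type": "command", "command": "exit 2" }]
      }
    ]
  }
}
```

This is all-or-nothing: it suppresses every automatic compaction, and there is no variant that merely raises the 150K threshold.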
The Autocompact Death Loop
Before MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3 was added, 1,279 sessions accumulated 50+ consecutive autocompact failures, wasting approximately 250,000 API calls per day globally. The bug was documented internally on March 10 and shipped anyway. The circuit breaker was added after this production incident.
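The circuit breaker itself is a small consecutive-failure counter; the limit of 3 is from the source, while the plumbing around it is a sketch.

```typescript
// Sketch of the post-incident circuit breaker (counter plumbing is an assumption).
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

let consecutiveFailures = 0;

// Returns whether further autocompact attempts are still allowed.
function recordAutocompactResult(succeeded: boolean): boolean {
  consecutiveFailures = succeeded ? 0 : consecutiveFailures + 1;
  return consecutiveFailures < MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES;
}
```

The key property is that any success resets the counter, so only an unbroken run of failures trips the breaker; without it, a session that could never compact successfully retried forever.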
Security: Compaction as Attack Surface
The Straiker security analysis identified the compaction pipeline as an attack vector:
- A repository's CLAUDE.md (or any file Claude reads) contains instruction-like content
- Claude processes the file; content echoes in assistant messages
- The autocompact prompt instructs the model to "pay special attention to specific user feedback"
- Post-compaction, the compressed context presents the malicious instructions as "genuine user directives"
- The post-compaction model follows them — the model is not jailbroken, the context is weaponized
Three compaction exemptions make this exploitable:
- MCP tool results are never microcompacted
- Read tool results skip budgeting entirely
- Once content is locked via seenIds, that decision is frozen for the session
Currently Disabled Strategies
Community analysis confirmed several compaction strategies are currently inactive in public builds: microcompact, the 60-minute cold-cache threshold, PTL retry, and Context Collapse. This was corroborated by the autocompact death loop evidence — if advanced strategies were active, the death loop would not have accumulated 50+ failures.
Key Claims
- clm-20260409-80a79ecf044a: 5 tiers with different cost/preservation tradeoffs
- clm-20260409-4740377e3ccd: 150K threshold triggers at 15% of 1M context
- clm-20260409-705de7e82bf7: Compaction laundering attack vector
Sources
- src-20260409-6913a0b93c8b — Round 7: The Deepest Architecture Yet
- src-20260409-cbf9b6837f5f — Round 10: Quality Gap, CVE, Security