Message Normalization

Entity ID: ent-20260409-5b5e434ce848
Type: service
Scope: shared
Status: active
Aliases: message normalization, smooshing, capybara surgery

Description

Message normalization is a multi-pass preprocessing pipeline that transforms conversation messages before they are sent to the Anthropic API. The pipeline ensures messages conform to API requirements, optimizes token usage, works around model-specific behavioral quirks, and handles edge cases in multi-turn conversations. The transformations are heavily feature-gated and interwoven: rather than the clean 11-step sequence previously described, there are roughly 14-17 distinct operations across three phases.

Primary implementation: src/utils/messages.ts (function normalizeMessagesForAPI() at line 1989)

The actual pipeline

Phase 1: Pre-processing and filtering (lines 1996-2054)

1. Reorder attachments (reorderAttachmentsForAPI): bubble attachment messages upward until hitting an assistant message or tool_result boundary
2. Strip virtual messages: filter out messages marked isVirtual (REPL inner tool calls, display-only messages that shouldn't reach the API)
3. Build error strip map: identify synthetic API error messages from PDF/image/request-too-large errors; map which content blocks to strip from the preceding user message to avoid re-sending problematic content
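
As an illustration of the virtual-message filter in Phase 1, here is a minimal sketch. The SketchMessage type and the stripVirtualMessages name are hypothetical simplifications introduced for this example; only the isVirtual flag comes from the description above.

```typescript
// Hypothetical, simplified message shape; the real types are richer.
type SketchMessage = {
  role: "user" | "assistant";
  text: string;
  isVirtual?: boolean;
};

// Keep only messages that should reach the API.
// The input array is left untouched; a new array is returned.
function stripVirtualMessages(
  messages: readonly SketchMessage[],
): SketchMessage[] {
  return messages.filter((m) => m.isVirtual !== true);
}

const sampleTranscript: SketchMessage[] = [
  { role: "user", text: "run the script" },
  { role: "assistant", text: "(REPL inner tool call)", isVirtual: true },
  { role: "assistant", text: "Done." },
];

// Two messages remain; the display-only one is dropped.
const apiBoundMessages = stripVirtualMessages(sampleTranscript);
```

Returning a new array instead of mutating the input fits the rule that the stored transcript keeps the original messages.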

Phase 2: Core message processing (lines 2056-2293)

The main loop processes messages by type:

User messages (lines 2094-2199):

4. Strip unavailable tool references: remove tool_reference blocks for tools that no longer exist or have search disabled
5. Strip error-triggering content: remove document/image blocks from meta user messages that preceded API errors
6. Inject tool_reference turn boundary (gated on tengu_toolref_defer_j8m): add a text sibling "\n\nHuman: ..." after tool_reference blocks. This is the "capybara surgery": it prevents capybara models from sampling the stop sequence, which happened in roughly 10% of samples (A/B tested: 21/200 vs 0/200)
7. Merge consecutive user messages: Bedrock doesn't support multiple user messages in a row
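
The consecutive-user-message merge can be sketched as follows. This is a simplified illustration, not the actual mergeUserMessages() code: the ChatTurn type and function name are hypothetical, and messages are reduced to a single text field. The newline at the seam mirrors the behavior this page attributes to mergeUserMessages().

```typescript
// Hypothetical, simplified turn shape (real messages carry block arrays).
type ChatTurn = { role: "user" | "assistant"; text: string };

// Collapse runs of adjacent user turns into one turn, joining the
// texts with "\n" so "2+2" and "3+3" do not become "2+23+3".
function mergeConsecutiveUserTurns(turns: readonly ChatTurn[]): ChatTurn[] {
  const merged: ChatTurn[] = [];
  for (const turn of turns) {
    const prev = merged[merged.length - 1];
    if (prev !== undefined && prev.role === "user" && turn.role === "user") {
      merged[merged.length - 1] = {
        role: "user",
        text: `${prev.text}\n${turn.text}`,
      };
    } else {
      merged.push({ ...turn });
    }
  }
  return merged;
}

const mergedTurns = mergeConsecutiveUserTurns([
  { role: "user", text: "2+2" },
  { role: "user", text: "3+3" },
  { role: "assistant", text: "4 and 6" },
]);
```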

Assistant messages (lines 2201-2267):

8. Normalize tool_use inputs (normalizeToolInputForAPI): strip plan/metadata fields that shouldn't reach the API
9. Strip tool_search fields: remove caller field from tool_use blocks when tool search is disabled
10. Merge by message ID: merge with previous assistant message if same message.id (happens with concurrent agents/streaming)
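
The merge-by-message-ID step can be illustrated with a small sketch. The AssistantChunk type and the mergeByMessageId name are hypothetical, and content blocks are reduced to strings; the only behavior taken from the description is that a message is merged into the previous one when both share the same id.

```typescript
// Hypothetical shape: one streamed piece of an assistant message.
type AssistantChunk = { id: string; content: string[] };

// Fold each chunk into the previous entry when the ids match;
// otherwise start a new entry. Inputs are copied, not mutated.
function mergeByMessageId(chunks: readonly AssistantChunk[]): AssistantChunk[] {
  const merged: AssistantChunk[] = [];
  for (const chunk of chunks) {
    const prev = merged[merged.length - 1];
    if (prev !== undefined && prev.id === chunk.id) {
      prev.content.push(...chunk.content);
    } else {
      merged.push({ id: chunk.id, content: [...chunk.content] });
    }
  }
  return merged;
}

const mergedChunks = mergeByMessageId([
  { id: "msg_a", content: ["thinking..."] },
  { id: "msg_a", content: ["tool_use: read_file"] },
  { id: "msg_b", content: ["All done."] },
]);
```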

Attachment messages (lines 2269-2290):

11. Normalize attachments (normalizeAttachmentForAPI): convert to user/tool_result messages
12. Wrap with system reminder (gated on tengu_chair_sermon): ensure system reminder tag wrapping
13. Merge into adjacent user message

Phase 3: Post-processing (lines 2295-2370)

Relocate tool_reference siblings (gated on tengu_toolref_defer_j8m) — move text-block siblings off user messages containing tool_reference to prevent two-consecutive-human-turns pattern
Filter orphaned thinking-only messages — remove assistant messages containing ONLY thinking blocks with no corresponding non-thinking content. Prevents "thinking blocks cannot be modified" API error
Filter trailing thinking — strip thinking blocks from end of last assistant message (API rejects trailing thinking)
Filter whitespace-only assistant messages — remove assistant messages with only whitespace text blocks; merge resulting adjacent user messages
Ensure non-empty assistant content — insert placeholder text for non-final assistant messages with empty content
Merge adjacent user messages (gated on tengu_chair_sermon) — uses hoistToolResults() to put tool_results before other blocks
Smoosh system reminder siblings (gated on tengu_chair_sermon) — fold text blocks starting with <system-reminder> into the last adjacent tool_result via smooshIntoToolResult()
Sanitize error tool results — strip non-text blocks from is_error tool_results (API constraint)
Append message ID tags (gated on HISTORY_SNIP feature) — append [id:xxx] tags for snip tool visibility
Validate image sizes — check all images against API size limits
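
The two thinking-block filters in Phase 3 can be sketched as follows. This is a simplification under stated assumptions: the SketchBlock and AssistantSketch types and both function names are hypothetical, and "orphaned" is approximated per message, without looking at sibling messages.

```typescript
// Hypothetical, simplified block and message shapes.
type SketchBlock = { type: "thinking" | "text" | "tool_use"; text: string };
type AssistantSketch = { content: SketchBlock[] };

// Drop assistant messages whose content consists only of thinking blocks.
// Empty messages are kept here; a separate pass handles those.
function dropThinkingOnly(
  msgs: readonly AssistantSketch[],
): AssistantSketch[] {
  return msgs.filter(
    (m) => m.content.length === 0 || m.content.some((b) => b.type !== "thinking"),
  );
}

// Remove thinking blocks from the end of one message's content.
function stripTrailingThinking(msg: AssistantSketch): AssistantSketch {
  let end = msg.content.length;
  while (end > 0 && msg.content[end - 1]?.type === "thinking") {
    end--;
  }
  return { content: msg.content.slice(0, end) };
}
```

In this sketch, stripTrailingThinking would be applied only to the last assistant message, matching the description above.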

Post-normalization (at API call site, `src/services/api/claude.ts:1269-1315`)

Strip tool search fields (model-specific) — strip tool_reference blocks and caller fields when tool search disabled
Ensure tool result pairing — insert synthetic error tool_results for orphaned tool_use blocks; strip orphaned tool_results referencing non-existent tool_use
Strip advisor blocks (conditional on beta header)
Strip excess media items — silently drop oldest media items if >100 (API limit)
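
The tool result pairing step can be sketched as below. The function name and the placeholder error text are hypothetical; the field names (tool_use, tool_result, tool_use_id, is_error) follow the block names used on this page. The sketch works on flat lists and ignores message boundaries.

```typescript
type ToolUse = { type: "tool_use"; id: string };
type ToolResult = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// 1. Drop tool_results that reference a tool_use that does not exist.
// 2. Add a synthetic error tool_result for every unanswered tool_use.
function pairToolResults(
  uses: readonly ToolUse[],
  results: readonly ToolResult[],
): ToolResult[] {
  const useIds = new Set(uses.map((u) => u.id));
  const paired = results.filter((r) => useIds.has(r.tool_use_id));
  const answered = new Set(paired.map((r) => r.tool_use_id));
  for (const use of uses) {
    if (!answered.has(use.id)) {
      paired.push({
        type: "tool_result",
        tool_use_id: use.id,
        // Placeholder wording; the real synthetic message is not given here.
        content: "[no result recorded for this tool call]",
        is_error: true,
      });
    }
  }
  return paired;
}
```

Both branches discard or invent content without surfacing it to the user, which is the "silently drops content" cost noted under Design trade-offs.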

Smooshing

"Smooshing" is the internal term for merging adjacent same-role messages and folding content into tool results. The core function smooshIntoToolResult() (line 2534):

- Folds text/image/search_result/document content into an adjacent tool_result's content array
- Respects constraints: can't mix tool_reference with other types, can't smoosh images into error tool_results
- Preserves string shape for backward compatibility (string content stays string if all blocks are text)
- Merges adjacent text blocks within array content

The mergeUserMessages() helper (line 2411) joins text blocks with \n at seams to prevent concatenation artifacts (e.g., "2+23+3" → "2+2\n3+3").
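
The folding rules listed for smooshIntoToolResult() can be illustrated with a reduced sketch. This is not the real function: the types, the smooshSketch name, the null return for a rejected fold, and the "\n" separator between merged text blocks are all assumptions, and only text and image parts are modeled.

```typescript
type SmooshPart =
  | { type: "text"; text: string }
  | { type: "image"; data: string };

type ToolResultSketch = {
  type: "tool_result";
  is_error?: boolean;
  content: string | SmooshPart[];
};

// Fold extra parts into a tool_result. Returns null when the fold
// is not allowed (assumption: the caller then leaves the parts alone).
function smooshSketch(
  target: ToolResultSketch,
  extra: readonly SmooshPart[],
): ToolResultSketch | null {
  // Constraint: no images inside error tool_results.
  if (target.is_error === true && extra.some((p) => p.type === "image")) {
    return null;
  }
  const existing: SmooshPart[] =
    typeof target.content === "string"
      ? [{ type: "text", text: target.content }]
      : target.content;
  // Merge adjacent text blocks while appending.
  const folded: SmooshPart[] = [];
  for (const part of [...existing, ...extra]) {
    const last = folded[folded.length - 1];
    if (last !== undefined && last.type === "text" && part.type === "text") {
      folded[folded.length - 1] = {
        type: "text",
        text: `${last.text}\n${part.text}`,
      };
    } else {
      folded.push(part);
    }
  }
  // Preserve string shape: string in, all-text result, string out.
  const texts: string[] = [];
  for (const part of folded) {
    if (part.type === "text") texts.push(part.text);
  }
  if (typeof target.content === "string" && texts.length === folded.length) {
    return { ...target, content: texts.join("\n") };
  }
  return { ...target, content: folded };
}
```

With string content "ok" and one extra text part, the result keeps string content; adding an image part instead switches the content to an array.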

Capybara surgery

Capybara surgery is a targeted, model-version-specific patch. The tool_reference turn boundary injection (lines 2159-2185) adds TOOL_REFERENCE_TURN_BOUNDARY text siblings when user messages contain tool_reference blocks. Without it, capybara models sampled the stop sequence in roughly 10% of cases (A/B tested: 21/200 vs 0/200).

This is the only model-specific patch in the pipeline, and it is controlled by tengu_toolref_defer_j8m. When the gate is active, text siblings are relocated instead of a boundary being injected (an alternative approach to the same problem).
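
A minimal sketch of the injection variant follows. The boundary string is a placeholder because the real TOOL_REFERENCE_TURN_BOUNDARY text is elided on this page; the UserBlock type, the tool_name field, the function name, and the choice to append the sibling at the end of the content are assumptions made for illustration.

```typescript
// Placeholder only: the real boundary text is not reproduced here.
const BOUNDARY_PLACEHOLDER = "[turn boundary text]";

// Hypothetical, simplified user content blocks.
type UserBlock =
  | { type: "tool_reference"; tool_name: string }
  | { type: "text"; text: string };

// If the content has any tool_reference block, append one text
// sibling carrying the boundary; otherwise return a plain copy.
function injectTurnBoundary(content: readonly UserBlock[]): UserBlock[] {
  const hasToolReference = content.some((b) => b.type === "tool_reference");
  if (!hasToolReference) {
    return [...content];
  }
  return [...content, { type: "text", text: BOUNDARY_PLACEHOLDER }];
}

const withBoundary = injectTurnBoundary([
  { type: "tool_reference", tool_name: "search_docs" },
]);
```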

Asymmetric processing

Both user and assistant messages are normalized, but differently:

- User messages: stripped, merged, attachments processed, system reminders injected, error content removed
- Assistant messages: tool inputs normalized, merged by message.id, thinking blocks filtered

Critically, session transcripts store original messages — transformations only apply to the API-bound copy in normalizeMessagesForAPI(). The transcript is a faithful record; the API sees the cleaned version.
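
The split between stored transcript and API-bound copy can be sketched like this. The StoredTurn type and buildApiCopy name are hypothetical, and the only transformation shown is a stand-in (dropping whitespace-only turns); the point is that the work happens on copies.

```typescript
// Hypothetical, simplified stored message.
type StoredTurn = { role: "user" | "assistant"; text: string };

// Copy each turn first, then transform the copies. The transcript
// array and its objects are never modified.
function buildApiCopy(transcript: readonly StoredTurn[]): StoredTurn[] {
  return transcript
    .map((t) => ({ ...t }))
    .filter((t) => t.text.trim().length > 0);
}

const storedTranscript: StoredTurn[] = [
  { role: "user", text: "hello" },
  { role: "assistant", text: "   " },
  { role: "user", text: "still there?" },
];

const apiCopy = buildApiCopy(storedTranscript);
```

Because the transformed list exists only in memory, debugging an API issue means re-running normalization on the stored transcript, as the trade-offs table notes.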

Feature gates

The pipeline is heavily conditional:

Gate	Controls
`tengu_toolref_defer_j8m`	Tool_reference turn boundary injection vs relocation
`tengu_chair_sermon`	System reminder wrapping, adjacent user merge, smoosh SR siblings
`HISTORY_SNIP`	Message ID tag appending for snip tool
Tool Search (feature)	Tool_reference and caller field stripping
`ADVISOR_BETA_HEADER`	Advisor block stripping
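
One way to picture gate-conditional passes is a list of passes, each optionally tied to a gate name. This is a sketch under assumptions, not how normalizeMessagesForAPI() is structured: the GatedPass type, runGatedPasses, and both demo passes are hypothetical. Only the gate name string comes from the table above.

```typescript
type GatedPass<T> = {
  label: string;
  gate?: string; // undefined means the pass always runs
  run: (msgs: T[]) => T[];
};

// Run passes in order, skipping any whose gate is not enabled.
// Returns the output plus the labels of the passes that ran.
function runGatedPasses<T>(
  input: readonly T[],
  passes: readonly GatedPass<T>[],
  enabledGates: ReadonlySet<string>,
): { output: T[]; applied: string[] } {
  let current = [...input];
  const applied: string[] = [];
  for (const pass of passes) {
    if (pass.gate !== undefined && !enabledGates.has(pass.gate)) {
      continue;
    }
    current = pass.run(current);
    applied.push(pass.label);
  }
  return { output: current, applied };
}

// Demo passes over plain strings; the bodies are stand-ins.
const demoPasses: GatedPass<string>[] = [
  { label: "drop-empty", run: (m) => m.filter((s) => s.length > 0) },
  {
    label: "gated-demo-pass",
    gate: "tengu_chair_sermon",
    run: (m) => m.map((s) => `${s} [gated]`),
  },
];

const gateOff = runGatedPasses(["a", "", "b"], demoPasses, new Set<string>());
const gateOn = runGatedPasses(
  ["a", "", "b"],
  demoPasses,
  new Set(["tengu_chair_sermon"]),
);
```

Returning the applied labels is a design choice of this sketch; it makes it visible which transformations ran for a given gate configuration.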

What depends on it

Query loop — normalization runs as the final step before API submission
Prompt cache — consistent message formatting improves cache hit rates
Compaction — compaction produces summary messages that must survive normalization
System prompt assembly — generates system messages that normalization positions correctly
Frustration telemetry — runs before normalization (needs raw user message)

Design trade-offs

Decision	Trade-off
Multi-pass with feature gates	Flexible for A/B testing, but pipeline has complex conditional behavior — hard to predict which transformations run
Smooshing instead of strict role alternation	Preserves rich content structure, but the merging logic is fragile (acknowledged in code comments)
Capybara surgery as a named pattern	Explicit about being model-specific, but creates maintenance burden when models change
Transformations in-place (not pure functions)	Efficient (no copies), but side effects between passes create subtle ordering dependencies
Session stores originals, API sees transformed	Clean transcript, but debugging API issues requires reconstructing what normalization produced
Fail-safe sanitization (orphan cleanup, error stripping)	Prevents API errors, but silently drops content — problems may go unnoticed

Key claims

~14-17 distinct transformations across three phases (not a clean 11-step sequence)
Smooshing merges adjacent same-role messages and folds system reminders into tool results
Capybara surgery is the only model-specific patch; it prevents the stop sequence sampling that occurred in ~10% of cases
Session transcripts store originals; only the API-bound copy is transformed
Pipeline is heavily feature-gated for A/B testing

Relations

rel-norm-query: query loop --[calls]--> normalizeMessagesForAPI before API submission
rel-norm-cache: message normalization --[improves]--> prompt cache hit rates
rel-norm-compact: compaction --[produces]--> messages that must survive normalization
rel-norm-sysprompt: system-prompt-assembly --[generates]--> system reminders that normalization positions

Sources

src-20260409-a5fc157bc756, source code at src/utils/messages.ts, src/services/api/claude.ts