Message Normalization
- Entity ID: ent-20260409-5b5e434ce848
- Type: service
- Scope: shared
- Status: active
- Aliases: message normalization, smooshing, capybara surgery
Description
Message normalization is a multi-pass preprocessing pipeline that transforms conversation messages before sending them to the Anthropic API. The pipeline ensures messages conform to API requirements, optimizes token usage, fixes model-specific behavioral quirks, and handles edge cases in multi-turn conversations. The transformations are heavily feature-gated and interwoven — not a clean 11-step sequence as previously described, but roughly 14-17 distinct operations across three phases.
Primary implementation: src/utils/messages.ts (function normalizeMessagesForAPI() at line 1989)
The actual pipeline
Phase 1: Pre-processing and filtering (lines 1996-2054)
- Reorder attachments (reorderAttachmentsForAPI): bubble attachment messages upward until hitting an assistant message or tool_result boundary
- Strip virtual messages: filter out messages marked isVirtual (REPL inner tool calls, display-only messages that shouldn't reach the API)
- Build error strip map: identify synthetic API error messages from PDF/image/request-too-large errors; map which content blocks to strip from the preceding user message to avoid re-sending problematic content
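As a rough illustration of the filtering step, the sketch below drops virtual messages from a copy of the list. The `SketchMessage` shape and the `stripVirtualMessages` name are assumptions made for this example; only the `isVirtual` marker comes from the description above, and the real types in src/utils/messages.ts are certainly richer.

```typescript
// Hypothetical, simplified message shape used only for this sketch.
interface SketchMessage {
  role: "user" | "assistant";
  text: string;
  isVirtual?: boolean; // display-only or REPL-inner messages
}

// Return a new array without virtual messages; the input is not mutated.
function stripVirtualMessages(
  messages: readonly SketchMessage[],
): SketchMessage[] {
  return messages.filter((m) => m.isVirtual !== true);
}

const phase1Input: SketchMessage[] = [
  { role: "user", text: "run the script" },
  { role: "assistant", text: "(inner REPL tool call)", isVirtual: true },
  { role: "assistant", text: "Done." },
];
const phase1Output = stripVirtualMessages(phase1Input);
```

Here `phase1Output` keeps the two non-virtual messages, while `phase1Input` still holds all three.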
Phase 2: Core message processing (lines 2056-2293)
The main loop processes messages by type:
User messages (lines 2094-2199):
4. Strip unavailable tool references — remove tool_reference blocks for tools that no longer exist or have search disabled
5. Strip error-triggering content — remove document/image blocks from meta user messages that preceded API errors
6. Inject tool_reference turn boundary (gated on tengu_toolref_defer_j8m): add a text sibling "\n\nHuman: ..." after tool_reference blocks. This is the "capybara surgery": it prevents capybara models from sampling the stop sequence, which happened in roughly 10% of cases (A/B tested: 21/200 vs 0/200)
7. Merge consecutive user messages — Bedrock doesn't support multiple user messages in a row
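Step 4 can be pictured as a simple filter over content blocks. This is a minimal sketch, not the actual implementation: the `UserBlock` union, the `tool_name` field, and the `stripUnavailableToolReferences` helper are illustrative assumptions.

```typescript
// Hypothetical block shapes; the field name tool_name is an assumption.
type UserBlock =
  | { type: "text"; text: string }
  | { type: "tool_reference"; tool_name: string };

// Keep every non-reference block, and only those tool_reference blocks
// whose tool is still available.
function stripUnavailableToolReferences(
  content: readonly UserBlock[],
  availableTools: ReadonlySet<string>,
): UserBlock[] {
  return content.filter(
    (block) =>
      block.type !== "tool_reference" || availableTools.has(block.tool_name),
  );
}

const cleanedUserContent = stripUnavailableToolReferences(
  [
    { type: "text", text: "Found two tools." },
    { type: "tool_reference", tool_name: "grep" },
    { type: "tool_reference", tool_name: "removed_tool" },
  ],
  new Set(["grep"]),
);
```

The reference to `removed_tool` is dropped; the text block and the `grep` reference survive in their original order.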
Assistant messages (lines 2201-2267):
8. Normalize tool_use inputs (normalizeToolInputForAPI) — strip plan/metadata fields that shouldn't reach the API
9. Strip tool_search fields — remove caller field from tool_use blocks when tool search is disabled
10. Merge by message ID — merge with previous assistant message if same message.id (happens with concurrent agents/streaming)
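The merge-by-ID step (step 10) might look roughly like the following. It assumes fragments of one response arrive adjacent to each other and reduces content blocks to plain strings; `AssistantChunk` and `mergeAssistantById` are hypothetical names for this sketch.

```typescript
// Hypothetical, simplified assistant fragment.
interface AssistantChunk {
  id: string; // message.id shared by fragments of one API response
  content: string[]; // content blocks reduced to strings for brevity
}

// Fold each fragment into the previous one when both share a message id.
function mergeAssistantById(
  chunks: readonly AssistantChunk[],
): AssistantChunk[] {
  const merged: AssistantChunk[] = [];
  for (const chunk of chunks) {
    const prev = merged[merged.length - 1];
    if (prev !== undefined && prev.id === chunk.id) {
      prev.content.push(...chunk.content);
    } else {
      // Copy so the caller's objects are never mutated.
      merged.push({ id: chunk.id, content: [...chunk.content] });
    }
  }
  return merged;
}

const mergedById = mergeAssistantById([
  { id: "msg_1", content: ["a"] },
  { id: "msg_1", content: ["b"] },
  { id: "msg_2", content: ["c"] },
]);
```

The two `msg_1` fragments collapse into one entry with content `["a", "b"]`, followed by the untouched `msg_2` entry.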
Attachment messages (lines 2269-2290):
11. Normalize attachments (normalizeAttachmentForAPI) — convert to user/tool_result messages
12. Wrap with system reminder (gated on tengu_chair_sermon) — ensure system reminder tag wrapping
13. Merge into adjacent user message
Phase 3: Post-processing (lines 2295-2370)
- Relocate tool_reference siblings (gated on tengu_toolref_defer_j8m): move text-block siblings off user messages containing tool_reference to prevent the two-consecutive-human-turns pattern
- Filter orphaned thinking-only messages: remove assistant messages containing ONLY thinking blocks with no corresponding non-thinking content. Prevents the "thinking blocks cannot be modified" API error
- Filter trailing thinking: strip thinking blocks from the end of the last assistant message (the API rejects trailing thinking)
- Filter whitespace-only assistant messages: remove assistant messages with only whitespace text blocks; merge resulting adjacent user messages
- Ensure non-empty assistant content: insert placeholder text for non-final assistant messages with empty content
- Merge adjacent user messages (gated on tengu_chair_sermon): uses hoistToolResults() to put tool_results before other blocks
- Smoosh system reminder siblings (gated on tengu_chair_sermon): fold text blocks starting with <system-reminder> into the last adjacent tool_result via smooshIntoToolResult()
- Sanitize error tool results: strip non-text blocks from is_error tool_results (API constraint)
- Append message ID tags (gated on HISTORY_SNIP feature): append [id:xxx] tags for snip tool visibility
- Validate image sizes: check all images against API size limits
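The two thinking-related filters in this phase can be sketched as small helpers over a content array. The block shapes and function names below are assumptions for illustration; the real pipeline additionally decides which messages the helpers apply to.

```typescript
// Hypothetical, reduced block union for assistant content.
type AssistantBlock =
  | { type: "thinking"; thinking: string }
  | { type: "text"; text: string };

// Drop thinking blocks from the end of a content array.
function stripTrailingThinking(
  content: readonly AssistantBlock[],
): AssistantBlock[] {
  let end = content.length;
  while (end > 0 && content[end - 1]?.type === "thinking") {
    end--;
  }
  return content.slice(0, end);
}

// True when a message has content and every block is a thinking block.
function isThinkingOnly(content: readonly AssistantBlock[]): boolean {
  return content.length > 0 && content.every((b) => b.type === "thinking");
}

const trimmedContent = stripTrailingThinking([
  { type: "thinking", thinking: "plan" },
  { type: "text", text: "answer" },
  { type: "thinking", thinking: "afterthought" },
]);
```

Only the trailing thinking block is removed; the leading one stays because it precedes non-thinking content.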
Post-normalization (at API call site, src/services/api/claude.ts:1269-1315)
- Strip tool search fields (model-specific) — strip tool_reference blocks and caller fields when tool search disabled
- Ensure tool result pairing — insert synthetic error tool_results for orphaned tool_use blocks; strip orphaned tool_results referencing non-existent tool_use
- Strip advisor blocks (conditional on beta header)
- Strip excess media items — silently drop oldest media items if >100 (API limit)
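The tool result pairing step can be sketched for a single assistant/user exchange as follows. This is an assumption-laden illustration: `pairToolResults`, the reduced block shapes, and the placeholder error text are all hypothetical, and the real code works across the whole conversation.

```typescript
// Hypothetical, reduced block shapes.
interface ToolUse {
  type: "tool_use";
  id: string;
}
interface ToolResult {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

function pairToolResults(
  toolUses: readonly ToolUse[],
  results: readonly ToolResult[],
): ToolResult[] {
  const useIds = new Set(toolUses.map((u) => u.id));
  // Strip results that reference a tool_use that does not exist.
  const kept = results.filter((r) => useIds.has(r.tool_use_id));
  const answered = new Set(kept.map((r) => r.tool_use_id));
  // Add a synthetic error result for every tool_use left unanswered.
  for (const use of toolUses) {
    if (!answered.has(use.id)) {
      kept.push({
        type: "tool_result",
        tool_use_id: use.id,
        content: "[placeholder: no result recorded]", // illustrative text
        is_error: true,
      });
    }
  }
  return kept;
}

const pairedResults = pairToolResults(
  [
    { type: "tool_use", id: "a" },
    { type: "tool_use", id: "b" },
  ],
  [
    { type: "tool_result", tool_use_id: "a", content: "ok" },
    { type: "tool_result", tool_use_id: "z", content: "stale" },
  ],
);
```

The stale result for `z` is dropped and a synthetic error result is appended for `b`, so every tool_use ends up with exactly one tool_result.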
Smooshing
"Smooshing" is the internal term for merging adjacent same-role messages and folding content into tool results. The core function smooshIntoToolResult() (line 2534):
- Folds text/image/search_result/document content into an adjacent tool_result's content array
- Respects constraints: can't mix tool_reference with other types, can't smoosh images into error tool_results
- Preserves string shape for backward compatibility (string content stays string if all blocks are text)
- Merges adjacent text blocks within array content
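The behaviours listed above can be approximated in a short sketch. It is not the real smooshIntoToolResult(): the `smooshSketch` name, the reduced block types, and the "\n\n" separator are assumptions, and the tool_reference constraint is left out for brevity.

```typescript
// Hypothetical, reduced block shapes.
type FoldBlock =
  | { type: "text"; text: string }
  | { type: "image"; data: string };

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string | FoldBlock[];
  is_error?: boolean;
}

// Returns a new tool_result, or null when a constraint forbids folding.
function smooshSketch(
  target: ToolResultBlock,
  extra: readonly FoldBlock[],
): ToolResultBlock | null {
  // Images cannot be folded into error tool_results.
  if (target.is_error === true && extra.some((b) => b.type === "image")) {
    return null;
  }
  const wasString = typeof target.content === "string";
  const existing: FoldBlock[] =
    typeof target.content === "string"
      ? target.content === ""
        ? []
        : [{ type: "text", text: target.content }]
      : target.content;

  // Merge adjacent text blocks (the separator is an assumption).
  const merged: FoldBlock[] = [];
  for (const block of [...existing, ...extra]) {
    const prev = merged[merged.length - 1];
    if (prev !== undefined && prev.type === "text" && block.type === "text") {
      merged[merged.length - 1] = {
        type: "text",
        text: `${prev.text}\n\n${block.text}`,
      };
    } else {
      merged.push(block);
    }
  }

  // Preserve string shape when the original was a string and all is text.
  const texts: string[] = [];
  for (const b of merged) {
    if (b.type === "text") texts.push(b.text);
  }
  const allText = texts.length === merged.length;
  return {
    ...target,
    content: wasString && allText ? texts.join("\n\n") : merged,
  };
}
```

Returning null rather than throwing lets a caller fall back to leaving the sibling block where it was, which seems to match the "respects constraints" wording, though the real fallback behaviour is not described here.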
The mergeUserMessages() helper (line 2411) joins text blocks with \n at seams to prevent concatenation artifacts (e.g., "2+23+3" → "2+2\n3+3").
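One plausible reading of the seam rule is sketched below: when the last block of the first message and the first block of the second are both text, they are joined with a newline. `mergeAtSeam` and `SeamBlock` are hypothetical names, and whether the real helper produces one combined block or two separate ones is not stated above.

```typescript
// Hypothetical, reduced block shapes.
type SeamBlock =
  | { type: "text"; text: string }
  | { type: "image"; source: string };

// Concatenate two content arrays, joining text blocks at the seam with "\n".
function mergeAtSeam(
  a: readonly SeamBlock[],
  b: readonly SeamBlock[],
): SeamBlock[] {
  const lastA = a[a.length - 1];
  const firstB = b[0];
  if (
    lastA !== undefined &&
    firstB !== undefined &&
    lastA.type === "text" &&
    firstB.type === "text"
  ) {
    return [
      ...a.slice(0, -1),
      { type: "text", text: `${lastA.text}\n${firstB.text}` },
      ...b.slice(1),
    ];
  }
  return [...a, ...b];
}

const seamMerged = mergeAtSeam(
  [{ type: "text", text: "2+2" }],
  [{ type: "text", text: "3+3" }],
);
```

With the inputs shown, the result is a single text block containing "2+2\n3+3" instead of the run-together "2+23+3".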
Capybara surgery
Capybara surgery is a targeted, model-version-specific patch. The tool_reference turn boundary injection (lines 2159-2185) adds TOOL_REFERENCE_TURN_BOUNDARY text siblings when user messages contain tool_reference blocks. Without it, capybara models sampled the stop sequence in roughly 10% of cases.
This is the only model-specific patch in the pipeline, and it is controlled by tengu_toolref_defer_j8m. When the gate is active, text siblings are relocated instead of a boundary being injected (an alternative approach to the same problem).
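A minimal sketch of the injection variant follows. The boundary text is passed in as a parameter because the actual value of TOOL_REFERENCE_TURN_BOUNDARY is not reproduced here; `injectTurnBoundary`, the block shapes, and appending the sibling at the end of the content are assumptions.

```typescript
// Hypothetical, reduced block shapes.
type TurnBlock =
  | { type: "text"; text: string }
  | { type: "tool_reference"; tool_name: string };

// Append a boundary text sibling when the content holds a tool_reference.
function injectTurnBoundary(
  content: readonly TurnBlock[],
  boundary: string, // stands in for TOOL_REFERENCE_TURN_BOUNDARY
): TurnBlock[] {
  const hasToolRef = content.some((b) => b.type === "tool_reference");
  if (!hasToolRef) {
    return [...content];
  }
  return [...content, { type: "text", text: boundary }];
}

const withBoundary = injectTurnBoundary(
  [{ type: "tool_reference", tool_name: "grep" }],
  "<boundary placeholder>",
);
const withoutToolRef = injectTurnBoundary(
  [{ type: "text", text: "hello" }],
  "<boundary placeholder>",
);
```

Content without a tool_reference passes through unchanged, so the patch only affects the messages that triggered the stop-sequence behaviour.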
Asymmetric processing
Both user and assistant messages are normalized, but differently:
- User messages: stripped, merged, attachments processed, system reminders injected, error content removed
- Assistant messages: tool inputs normalized, merged by message.id, thinking blocks filtered
Critically, session transcripts store original messages — transformations only apply to the API-bound copy in normalizeMessagesForAPI(). The transcript is a faithful record; the API sees the cleaned version.
Feature gates
The pipeline is heavily conditional:
| Gate | Controls |
|---|---|
| tengu_toolref_defer_j8m | Tool_reference turn boundary injection vs relocation |
| tengu_chair_sermon | System reminder wrapping, adjacent user merge, smoosh SR siblings |
| HISTORY_SNIP | Message ID tag appending for snip tool |
| Tool Search (feature) | Tool_reference and caller field stripping |
| ADVISOR_BETA_HEADER | Advisor block stripping |
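To make the gating concrete, the sketch below runs a list of passes and skips those whose gate is off, recording which ones actually ran. The `GatedPass` and `runPipeline` names and the pass names are hypothetical; only the gate names come from the table above, and the real pipeline inlines these checks rather than using a pass registry.

```typescript
type Pass<T> = (messages: readonly T[]) => readonly T[];

interface GatedPass<T> {
  name: string;
  gate?: string; // undefined means the pass always runs
  run: Pass<T>;
}

// Apply passes in order, skipping any whose gate is disabled.
function runPipeline<T>(
  messages: readonly T[],
  passes: readonly GatedPass<T>[],
  isEnabled: (gate: string) => boolean,
): { result: readonly T[]; applied: string[] } {
  let current = messages;
  const applied: string[] = [];
  for (const pass of passes) {
    if (pass.gate !== undefined && !isEnabled(pass.gate)) {
      continue;
    }
    current = pass.run(current);
    applied.push(pass.name);
  }
  return { result: current, applied };
}

const gatedPasses: GatedPass<string>[] = [
  { name: "stripVirtual", run: (m) => m.filter((x) => x !== "virtual") },
  { name: "smooshSystemReminders", gate: "tengu_chair_sermon", run: (m) => m },
  {
    name: "relocateToolRefSiblings",
    gate: "tengu_toolref_defer_j8m",
    run: (m) => m,
  },
];
const outcome = runPipeline(
  ["a", "virtual", "b"],
  gatedPasses,
  (gate) => gate === "tengu_chair_sermon",
);
```

Returning the `applied` list alongside the result is one way to see which transformations ran for a given gate configuration.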
What depends on it
- Query loop — normalization runs as the final step before API submission
- Prompt cache — consistent message formatting improves cache hit rates
- Compaction — compaction produces summary messages that must survive normalization
- System prompt assembly — generates system messages that normalization positions correctly
- Frustration telemetry — runs before normalization (needs raw user message)
Design trade-offs
| Decision | Trade-off |
|---|---|
| Multi-pass with feature gates | Flexible for A/B testing, but pipeline has complex conditional behavior — hard to predict which transformations run |
| Smooshing instead of strict role alternation | Preserves rich content structure, but the merging logic is fragile (acknowledged in code comments) |
| Capybara surgery as a named pattern | Explicit about being model-specific, but creates maintenance burden when models change |
| Transformations in-place (not pure functions) | Efficient (no copies), but side effects between passes create subtle ordering dependencies |
| Session stores originals, API sees transformed | Clean transcript, but debugging API issues requires reconstructing what normalization produced |
| Fail-safe sanitization (orphan cleanup, error stripping) | Prevents API errors, but silently drops content — problems may go unnoticed |
Key claims
- ~14-17 distinct transformations across three phases (not a clean 11-step sequence)
- Smooshing merges adjacent same-role messages and folds system reminders into tool results
- Capybara surgery is the only model-specific patch — prevents stop sequence sampling at ~10%
- Session transcripts store originals; only the API-bound copy is transformed
- Pipeline is heavily feature-gated for A/B testing
Relations
- rel-norm-query: query loop --[calls]--> normalizeMessagesForAPI before API submission
- rel-norm-cache: message normalization --[improves]--> prompt cache hit rates
- rel-norm-compact: compaction --[produces]--> messages that must survive normalization
- rel-norm-sysprompt: system-prompt-assembly --[generates]--> system reminders that normalization positions
Sources
src-20260409-a5fc157bc756, source code at src/utils/messages.ts, src/services/api/claude.ts