Speaker Attribution Bug

Entity ID: ent-20260419-36ea004e743b
Type: issue
Scope: shared
Status: active
Aliases: speaker attribution, turn attribution corruption, self-injection, who-said-what bug

Description

Harness-level failure in which Claude Code emits messages to itself during internal reasoning, then attributes those messages to the user and defends the mislabeled instruction with conviction. Distinct from hallucination: the source is fabricated, not the content. Observed in long conversations near the context-window limit.

Key claims

Speaker attribution bug destroys user trust calibration
Speaker attribution is an identity error, not a hallucination
Self-injection hypothesis: model emits user-turn formatting tokens
Speaker attribution bug concentrates near context-window limit
Architectural fix for speaker attribution is cryptographic turn signing
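
Illustrative sketch of the turn-signing claim

The source names cryptographic turn signing as the architectural fix but does not describe a scheme. The following is a minimal sketch under stated assumptions, not the documented design: the harness holds a secret key that the model never sees, tags each turn with an HMAC-SHA256 over its role, content, and position, and verifies that tag before trusting a turn's role label. HMAC is a symmetric message authentication code, used here as a stand-in for a signature. The names SignedTurn, sign_turn, and verify_turn are hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedTurn:
    """One conversation turn plus the harness-issued tag (hypothetical type)."""
    role: str     # e.g. "user" or "assistant"
    content: str
    index: int    # position of the turn in the conversation
    tag: str      # hex HMAC-SHA256 over role, content, and index


def _payload(role: str, content: str, index: int) -> bytes:
    # Canonical JSON so that signing and verification hash identical bytes.
    record = {"role": role, "content": content, "index": index}
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_turn(key: bytes, role: str, content: str, index: int) -> SignedTurn:
    """Called by the harness (assumption: the key never enters the model context)."""
    tag = hmac.new(key, _payload(role, content, index), hashlib.sha256).hexdigest()
    return SignedTurn(role, content, index, tag)


def verify_turn(key: bytes, turn: SignedTurn) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(
        key, _payload(turn.role, turn.content, turn.index), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, turn.tag)
```

Under this sketch, text that the model emits with user-turn formatting carries no valid tag for the role "user", so the harness can reject it instead of treating it as a user instruction. Binding the role and index into the tag also means a genuine assistant turn cannot be relabeled as a user turn or moved to another position without failing verification.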

Relations

Speaker Attribution Bug --[related_to]--> Dumb Zone
Cryptographic Turn Signing --[fixed]--> Speaker Attribution Bug
Speaker Attribution Bug --[related_to]--> QueryEngine.ts
H100 Teardown Incident --[derived_from]--> Speaker Attribution Bug
LLM-as-Untrusted-Component --[informed_by]--> Speaker Attribution Bug

Sources

src-20260419-3e34d5830692