Server-Side Token Injection (v2.1.100+)
- Entity ID: ent-20260423-r31a000000014
- Type: mechanism
- Scope: private
- Status: active
Description
Starting in v2.1.100, an undocumented server-side mechanism causes approximately 20,000 extra cache_creation_input_tokens to be billed per request despite slightly smaller client payloads. Proxy-verified across 40+ sessions with clean bimodal distribution: ~50K tokens pre-v2.1.100 cluster vs. ~71K v2.1.100+ cluster. Tokens enter the model's context window, competing with user instructions. Cause is unconfirmed by Anthropic; community speculation includes expanded session memory injection, expanded safety classifier context, or a User-Agent-version-keyed server routing change. GitHub #46917 remains open.
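The bimodal split described above can be checked mechanically: assign each proxy-logged request to whichever cluster center it sits nearer. A minimal sketch, using the ~50K/~71K centers from the description; the sample token counts are invented for illustration, not real measurements.

```python
# Cluster centers reported in the description (approximate means).
PRE_CLUSTER = 50_000   # pre-v2.1.100
POST_CLUSTER = 71_000  # v2.1.100+

def classify(cache_creation_input_tokens: int) -> str:
    """Assign a request to the nearer cluster center."""
    pre_dist = abs(cache_creation_input_tokens - PRE_CLUSTER)
    post_dist = abs(cache_creation_input_tokens - POST_CLUSTER)
    return "pre-v2.1.100" if pre_dist <= post_dist else "v2.1.100+"

# Invented proxy-log token counts; a clean bimodal session history
# should classify with no ambiguous mid-range values.
samples = [49_800, 50_300, 71_200, 70_600, 51_100]
labels = [classify(s) for s in samples]
print(labels)
```

A real verification would also flag counts far from both centers, since those would undermine the clean-bimodal claim.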
Key claims
- v2.1.100 sends fewer bytes but is billed 20K more tokens than v2.1.98
- v2.1.100 phantom tokens are classified cache_creation_input_tokens
- Downgrade to v2.1.98 is the verified phantom-token workaround
- Worst-case resume can exceed 190K tokens before the user types a character
- The March 23 rate acceleration has three converging causes
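The first two claims reduce to reading the `usage` object out of a logged API response and comparing `cache_creation_input_tokens` against the pre-v2.1.100 baseline. A sketch of that check; the response fragment is a made-up example shaped like the Messages API usage block, with invented values.

```python
import json

# Made-up logged response fragment; field names follow the Messages API
# "usage" object, values are illustrative only.
logged_response = json.dumps({
    "usage": {
        "input_tokens": 1_200,
        "cache_creation_input_tokens": 71_000,
        "cache_read_input_tokens": 0,
        "output_tokens": 350,
    }
})

PRE_BASELINE = 50_000  # approximate pre-v2.1.100 cluster mean

usage = json.loads(logged_response)["usage"]
delta = usage["cache_creation_input_tokens"] - PRE_BASELINE
print(delta)  # ~20K excess would match the phantom-token claim
```

Running this over every request in a proxy capture, keyed by client version, is essentially the verification methodology the Relations section credits with the discovery.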
Relations
- Server-Side Token Injection (v2.1.100+) --[introduced_in]--> v2.1.100
- Server-Side Token Injection (v2.1.100+) --[present_in]--> v2.1.101
- GitHub Issue #46917 (Phantom Tokens) --[tracks]--> Server-Side Token Injection (v2.1.100+)
- Bimodal Token Distribution (Pre/Post v2.1.100) --[supports]--> Server-Side Token Injection (v2.1.100+)
- Proxy Verification Methodology --[discovered]--> Server-Side Token Injection (v2.1.100+)
- March 23 Rate Acceleration --[caused_by]--> Server-Side Token Injection (v2.1.100+)