Server-Side Token Injection (v2.1.100+)
- Entity ID: ent-20260423-r31a000000014
- Type: mechanism
- Scope: private
- Status: active
Description
Starting in v2.1.100, an undocumented server-side mechanism causes approximately 20,000 extra cache_creation_input_tokens to be billed per request despite slightly smaller client payloads. Proxy-verified across 40+ sessions with clean bimodal distribution: ~50K tokens pre-v2.1.100 cluster vs. ~71K v2.1.100+ cluster. Tokens enter the model's context window, competing with user instructions. Cause is unconfirmed by Anthropic; community speculation includes expanded session memory injection, expanded safety classifier context, or a User-Agent-version-keyed server routing change. GitHub #46917 remains open.
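The bimodal split described above can be checked mechanically: assign each proxy-logged request to whichever cluster center it sits nearer. A minimal sketch, using the ~50K/~71K centers from the description; the sample token counts are invented for illustration, not real measurements.

```python
# Cluster centers reported in the description (approximate means).
PRE_CLUSTER = 50_000   # pre-v2.1.100
POST_CLUSTER = 71_000  # v2.1.100+

def classify(cache_creation_input_tokens: int) -> str:
    """Assign a request to the nearer cluster center."""
    pre_dist = abs(cache_creation_input_tokens - PRE_CLUSTER)
    post_dist = abs(cache_creation_input_tokens - POST_CLUSTER)
    return "pre-v2.1.100" if pre_dist <= post_dist else "v2.1.100+"

# Invented proxy-log token counts; a clean bimodal session history
# should classify with no ambiguous mid-range values.
samples = [49_800, 50_300, 71_200, 70_600, 51_100]
labels = [classify(s) for s in samples]
print(labels)
```

A real verification would also flag counts far from both centers, since those would undermine the clean-bimodal claim.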
Key claims
- v2.1.100 sends fewer bytes but is billed 20K more tokens than v2.1.98
- v2.1.100 phantom tokens are classified cache_creation_input_tokens
- Downgrade to v2.1.98 is the verified phantom-token workaround
- Worst-case resume can exceed 190K tokens before the user types a character
- The March 23 rate acceleration has three converging causes
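The first two claims reduce to reading the `usage` object out of a logged API response and comparing `cache_creation_input_tokens` against the pre-v2.1.100 baseline. A sketch of that check; the response fragment is a made-up example shaped like the Messages API usage block, with invented values.

```python
import json

# Made-up logged response fragment; field names follow the Messages API
# "usage" object, values are illustrative only.
logged_response = json.dumps({
    "usage": {
        "input_tokens": 1_200,
        "cache_creation_input_tokens": 71_000,
        "cache_read_input_tokens": 0,
        "output_tokens": 350,
    }
})

PRE_BASELINE = 50_000  # approximate pre-v2.1.100 cluster mean

usage = json.loads(logged_response)["usage"]
delta = usage["cache_creation_input_tokens"] - PRE_BASELINE
print(delta)  # ~20K excess would match the phantom-token claim
```

Running this over every request in a proxy capture, keyed by client version, is essentially the verification methodology the Relations section credits with the discovery.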
Relations
- Server-Side Token Injection (v2.1.100+) --[introduced_in]--> v2.1.100
- Server-Side Token Injection (v2.1.100+) --[present_in]--> v2.1.101
- GitHub Issue #46917 (Phantom Tokens) --[tracks]--> Server-Side Token Injection (v2.1.100+)
- Bimodal Token Distribution (Pre/Post v2.1.100) --[supports]--> Server-Side Token Injection (v2.1.100+)
- Proxy Verification Methodology --[discovered]--> Server-Side Token Injection (v2.1.100+)
- March 23 Rate Acceleration --[caused_by]--> Server-Side Token Injection (v2.1.100+)