Token Doubling Effect

Entity ID: ent-20260410-d6e0425480cd
Type: concept
Scope: shared
Status: active

Description

A hidden cost phenomenon in which extractMemories fires a separate Opus API call after every turn. Each call retransmits the entire conversation with different tool definitions, which spawns a second, independent cache chain. A 20-turn session with 650K context therefore consumes ~26M tokens instead of ~13M (20 × 650K = 13M for the main chain, doubled by the second chain). The cost is invisible to users on the flat-rate Max plan but devastating on API billing.
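
The arithmetic behind the ~13M and ~26M figures can be sketched as follows. This is a rough illustration, not a billing model: it assumes every turn transmits the full 650K context on each chain and ignores any cache discounts. The function name session_tokens and its parameters are hypothetical and do not come from the source.

```python
def session_tokens(turns: int, context_tokens: int, chains: int = 1) -> int:
    """Total transmitted tokens, assuming each turn sends the full
    context once per chain (simplification: constant context size,
    no cache discounts)."""
    return turns * context_tokens * chains


# Main conversation chain only: 20 turns x 650K context.
main_only = session_tokens(turns=20, context_tokens=650_000)

# Main chain plus the second chain spawned by extractMemories.
with_extract = session_tokens(turns=20, context_tokens=650_000, chains=2)

print(f"main chain only:      {main_only:,}")
print(f"with extractMemories: {with_extract:,}")
```

Under these assumptions the second chain exactly doubles the total, which matches the ~13M versus ~26M figures in the description.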

Key claims

The 120K token-stop pattern is derived from the 200K × 83% compaction threshold.

Relations

Two-Version Output Efficiency Directive --[caused]--> Token Doubling Effect

Sources

none