Observability and Cost
What operators and users see — telemetry, cost tracking, token estimation, and how Claude Code accounts for hidden costs.
Cost tracking
The cost tracker (src/cost-tracker.ts) provides session-level token usage and USD cost accounting.
Five-category pricing model
Every API response flows through addToTotalSessionCost(), which calculates USD cost across five token categories:
| Category | What it covers |
|---|---|
| Input tokens | User messages, system prompt, tool results sent to API |
| Output tokens | Model-generated text, tool calls, thinking |
| Cache write | Tokens written to prompt cache (first time) |
| Cache read | Tokens read from prompt cache (subsequent turns) |
| Web search | Per-request cost for web search tool |
Pricing is model-specific, quoted as input/output USD per million tokens (MTok): Opus 4.6 at $5/$25 (standard) or $30/$150 (fast mode), Sonnet at $3/$15, and Haiku at $1/$5.
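The five-category calculation can be sketched as follows. This is a minimal illustration, not the code in src/cost-tracker.ts: the `Usage` and `Rates` shapes and the `usageCostUSD` name are hypothetical, and only the Sonnet input/output rates come from the text above. Cache and web search rates are not given here, so they are left as zero placeholders.

```typescript
// Hypothetical shapes for illustration; not taken from src/cost-tracker.ts.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheWriteTokens: number;
  cacheReadTokens: number;
  webSearchRequests: number;
}

interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
  cacheWritePerMTok: number;
  cacheReadPerMTok: number;
  perWebSearchRequest: number;
}

const TOKENS_PER_MTOK = 1_000_000;

// Sum the USD cost of one API response across the five categories.
function usageCostUSD(usage: Usage, rates: Rates): number {
  return (
    (usage.inputTokens / TOKENS_PER_MTOK) * rates.inputPerMTok +
    (usage.outputTokens / TOKENS_PER_MTOK) * rates.outputPerMTok +
    (usage.cacheWriteTokens / TOKENS_PER_MTOK) * rates.cacheWritePerMTok +
    (usage.cacheReadTokens / TOKENS_PER_MTOK) * rates.cacheReadPerMTok +
    usage.webSearchRequests * rates.perWebSearchRequest
  );
}

// Sonnet input/output rates from the text; the remaining values are
// placeholders (assumption), not real prices.
const sonnetRates: Rates = {
  inputPerMTok: 3,
  outputPerMTok: 15,
  cacheWritePerMTok: 0,
  cacheReadPerMTok: 0,
  perWebSearchRequest: 0,
};

// 1M input tokens ($3) plus 500K output tokens ($7.50).
const exampleCost = usageCostUSD(
  { inputTokens: 1_000_000, outputTokens: 500_000, cacheWriteTokens: 0, cacheReadTokens: 0, webSearchRequests: 0 },
  sonnetRates,
);
```

Note that web search is charged per request rather than per token, which is why it is the only term not divided by one million.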
Hidden costs
Advisor (sub-model) costs are tracked recursively. When the auto-mode classifier runs (a separate Sonnet inference per tool call in auto mode), its cost is read from the advisor usage in the API response and added to the session total. A session with 200 tool calls in auto mode therefore incurs 200 additional classifier inferences.
Memory extraction fires a background Opus API call per extraction to write session memory. This fire-and-forget call can double effective token consumption: a session that displays 13M tokens may actually have used about 26M.
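A rough sketch of recursive accumulation, assuming the response usage can carry nested sub-model usage. The `UsageNode` shape, the `advisorUsage` field name, and `totalCostByModel` are hypothetical and only illustrate the idea of walking nested usage into a per-model total.

```typescript
// Hypothetical shape: one usage record that may carry nested advisor usage.
interface UsageNode {
  model: string;
  costUSD: number;
  advisorUsage?: UsageNode[]; // assumed field name, for illustration only
}

// Walk the usage tree and add every node's cost to its model's running total.
function totalCostByModel(
  node: UsageNode,
  totals: Map<string, number> = new Map(),
): Map<string, number> {
  totals.set(node.model, (totals.get(node.model) ?? 0) + node.costUSD);
  for (const child of node.advisorUsage ?? []) {
    totalCostByModel(child, totals);
  }
  return totals;
}

// One main-model turn that triggered two classifier inferences.
const turnTotals = totalCostByModel({
  model: "opus",
  costUSD: 0.5,
  advisorUsage: [
    { model: "sonnet", costUSD: 0.25 },
    { model: "sonnet", costUSD: 0.25 },
  ],
});
```

Keeping the totals per model is what lets classifier overhead show up as its own line in a per-model breakdown instead of disappearing into the main model's cost.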
Session persistence
Cost state survives CLI restarts. saveCurrentSessionCosts() writes accumulated totals (USD cost, durations, token counts, lines changed, per-model breakdown) to the project config keyed by session ID. On resume, restoreCostStateForSession() reinstates the state.
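The save/restore cycle might look roughly like the sketch below. The `CostState` and `ProjectConfig` shapes and both function names are assumptions made for illustration; the real saveCurrentSessionCosts() and restoreCostStateForSession() signatures are not shown in this document. The config is modeled as an in-memory object, and a JSON round trip stands in for writing to and reading from disk.

```typescript
// Hypothetical state shape, based on the fields listed in the text.
interface CostState {
  totalCostUSD: number;
  apiDurationMs: number;
  wallDurationMs: number;
  linesAdded: number;
  linesRemoved: number;
  tokensByModel: Record<string, { input: number; output: number }>;
}

// Hypothetical project config: cost state keyed by session ID.
interface ProjectConfig {
  sessionCosts: Record<string, CostState>;
}

// Store a copy so later in-memory mutation does not change the saved state.
function saveCostState(config: ProjectConfig, sessionId: string, state: CostState): void {
  config.sessionCosts[sessionId] = JSON.parse(JSON.stringify(state)) as CostState;
}

// Return the saved state for a session, or undefined if none was saved.
function restoreCostState(config: ProjectConfig, sessionId: string): CostState | undefined {
  const saved = config.sessionCosts[sessionId];
  return saved ? (JSON.parse(JSON.stringify(saved)) as CostState) : undefined;
}

const demoConfig: ProjectConfig = { sessionCosts: {} };
saveCostState(demoConfig, "session-123", {
  totalCostUSD: 1.25,
  apiDurationMs: 42_000,
  wallDurationMs: 90_000,
  linesAdded: 10,
  linesRemoved: 3,
  tokensByModel: { sonnet: { input: 1000, output: 200 } },
});
const resumed = restoreCostState(demoConfig, "session-123");
```

Keying by session ID means resuming one session never picks up totals from another session in the same project.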
Visibility gating
Cost display is gated by billing role — only users with console billing access (admin or billing role) see dollar amounts. Claude.ai subscribers see subscription status instead. The /cost command renders: total USD, API/wall-clock durations, code change line counts, and per-model token breakdown.
Token estimation
The token estimation service (src/services/tokenEstimation.ts) provides fast approximate token counts for strings and message arrays. It is used throughout the system:
- Auto-compact trigger — estimates current conversation tokens to detect when compaction is needed
- Session memory extraction — estimates tokens since last extraction to decide trigger cadence
- System prompt assembly — estimates hidden context size
- Cost display — provides token counts for status line
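As a sketch of how a fast estimator can feed a compaction trigger, the example below uses the common rule of thumb of roughly four characters per token. That ratio, the `Message` shape, and all function names are assumptions; the actual heuristic in src/services/tokenEstimation.ts is not described here.

```typescript
// Assumption: about 4 characters per token, a rough rule of thumb only.
const CHARS_PER_TOKEN = 4;

// Approximate the token count of a string without calling a tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Hypothetical minimal message shape.
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Approximate the token count of a whole conversation.
function estimateMessageTokens(messages: Message[]): number {
  return messages.reduce((sum, message) => sum + estimateTokens(message.content), 0);
}

// Signal compaction once the estimate reaches a caller-supplied budget.
function shouldCompact(messages: Message[], tokenBudget: number): boolean {
  return estimateMessageTokens(messages) >= tokenBudget;
}

const conversation: Message[] = [
  { role: "user", content: "a".repeat(400) },      // about 100 tokens
  { role: "assistant", content: "b".repeat(800) }, // about 200 tokens
];
const needsCompaction = shouldCompact(conversation, 250);
```

An estimate like this trades accuracy for speed, which suits threshold checks that run on every turn and do not need exact counts.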
Telemetry
Analytics (event logging)
The analytics service (src/services/analytics/) is the central event bus. Every significant action logs via logEvent():
- Tool invocations and results
- Permission decisions
- Model responses and errors
- Session start/end
- Feature flag exposures
- Unknown model cost events
The queue-and-attachment pattern keeps events logged before initialization from being dropped: events queue in memory until the analytics sink is initialized, then drain to it.
Sinks: Datadog for real-time metrics, 1P event logging (OpenTelemetry) for structured events.
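The queue-then-drain behavior can be illustrated with a small sketch. The class name, the `attachSink` method, and the `logEvent` signature used here are assumptions for illustration; only the general pattern (buffer until a sink exists, then flush in order) comes from the description above.

```typescript
// Hypothetical event and sink types.
type AnalyticsEvent = { name: string; metadata: Record<string, unknown> };
type Sink = (event: AnalyticsEvent) => void;

class QueuedAnalytics {
  private pending: AnalyticsEvent[] = [];
  private sink: Sink | null = null;

  // Deliver immediately if a sink is attached, otherwise buffer in memory.
  logEvent(name: string, metadata: Record<string, unknown> = {}): void {
    const event: AnalyticsEvent = { name, metadata };
    if (this.sink) {
      this.sink(event);
    } else {
      this.pending.push(event);
    }
  }

  // Attach the sink and drain buffered events in their original order.
  attachSink(sink: Sink): void {
    this.sink = sink;
    const queued = this.pending;
    this.pending = [];
    for (const event of queued) {
      sink(event);
    }
  }
}

const received: string[] = [];
const analytics = new QueuedAnalytics();
analytics.logEvent("session_start");             // buffered: no sink yet
analytics.logEvent("tool_invoked", { tool: "bash" });
analytics.attachSink((event) => received.push(event.name)); // drains both
analytics.logEvent("session_end");               // delivered directly
```

With this design, callers can log from the first line of startup without waiting for sink initialization, and ordering is preserved across the drain.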
OpenTelemetry integration
The OpenTelemetry module (~400KB) is lazy-loaded to avoid a startup penalty. It provides:
- Counters — costCounter (USD), tokenCounter (per-category)
- Traces — Perfetto tracing for performance profiling (feature-gated)
- Logging — LoggerProvider for structured logs
GrowthBook experiment tracking
When GrowthBook features are accessed, experiment exposure is logged (deduped per session). This feeds into A/B test analysis: which variant the user was in, which features were evaluated.
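Per-session deduplication of exposure events can be sketched with a set of already-logged feature keys. The `ExposureTracker` class and its method names are hypothetical, and keying by feature name alone is an assumption; this is not GrowthBook's API or the project's actual implementation.

```typescript
// Hypothetical callback that forwards an exposure to the analytics layer.
type ExposureLogger = (feature: string, variant: string) => void;

// One tracker instance is assumed to live for the duration of one session.
class ExposureTracker {
  private seen: Set<string> = new Set();
  private log: ExposureLogger;

  constructor(log: ExposureLogger) {
    this.log = log;
  }

  // Log the first exposure of each feature; ignore repeats in this session.
  // Returns true when an event was actually logged.
  recordExposure(feature: string, variant: string): boolean {
    if (this.seen.has(feature)) {
      return false;
    }
    this.seen.add(feature);
    this.log(feature, variant);
    return true;
  }
}

const exposures: string[] = [];
const tracker = new ExposureTracker((feature, variant) => exposures.push(`${feature}:${variant}`));
tracker.recordExposure("new_prompt", "treatment"); // logged
tracker.recordExposure("new_prompt", "treatment"); // deduped
tracker.recordExposure("fast_mode", "control");    // logged
```

Deduplication keeps a feature that is evaluated on every turn from flooding the event stream while still recording which variant the session saw.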
What operators see
Status line
The StatusLine component in the terminal UI shows, in real time:
- Total cost (USD)
- API duration vs wall-clock duration
- Token counts (input/output)
- Lines added/removed
- Current model
/cost command
Session cost summary: total USD, per-model breakdown with input/output/cache-read/cache-write tokens, API and wall-clock durations, code change counts.
Diagnostics (/doctor)
System health check: config validation, auth status, MCP server connectivity, plugin status.
Session persistence
Session transcripts persist to ~/.claude/sessions/ as JSONL files. The SDK provides getSessionMessages() for programmatic access with pagination. Session info includes: summary, last modified, file size, git branch, cwd, custom title, tag.
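For readers unfamiliar with paginated access to JSONL, here is a minimal sketch of offset/limit paging over JSONL text. It does not reproduce getSessionMessages(), whose signature is not given here; `readJsonlPage`, the `Page` shape, and the sample records are all hypothetical.

```typescript
// Hypothetical page shape: the items plus the offset of the next page, if any.
interface Page<T> {
  items: T[];
  nextOffset: number | null;
}

// Parse one page of records from JSONL text (one JSON value per line).
function readJsonlPage<T>(jsonl: string, offset: number, limit: number): Page<T> {
  const lines = jsonl.split("\n").filter((line) => line.trim() !== "");
  const items = lines.slice(offset, offset + limit).map((line) => JSON.parse(line) as T);
  const next = offset + limit;
  return { items, nextOffset: next < lines.length ? next : null };
}

// Illustrative transcript content; real transcript records will differ.
const transcript = [
  JSON.stringify({ role: "user", text: "hello" }),
  JSON.stringify({ role: "assistant", text: "hi" }),
  JSON.stringify({ role: "user", text: "bye" }),
].join("\n");

const firstPage = readJsonlPage<{ role: string; text: string }>(transcript, 0, 2);
const secondPage = readJsonlPage<{ role: string; text: string }>(transcript, firstPage.nextOffset ?? 0, 2);
```

Because each line is an independent JSON value, a reader can parse only the lines in the requested window; a production version would stream the file rather than split it in memory.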
Cost optimization signals
The telemetry system surfaces optimization opportunities:
- Cache hit rates — prompt cache read vs write ratios indicate cache effectiveness
- Advisor costs — auto-mode classifier overhead visible in per-model breakdown
- Compaction frequency — how often auto-compact triggers indicates context management efficiency
- Unknown model costs — tengu_unknown_model_cost events flag unrecognized models hitting the fallback pricing tier
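One way to turn the cache signal into a number is the share of cache traffic served by reads. The formula and the `cacheReadRatio` name below are an illustrative choice, not a metric the telemetry system is confirmed to emit.

```typescript
// Fraction of cached-prompt tokens that were reads rather than writes.
// Returns null when there was no cache traffic at all.
function cacheReadRatio(cacheReadTokens: number, cacheWriteTokens: number): number | null {
  const total = cacheReadTokens + cacheWriteTokens;
  return total === 0 ? null : cacheReadTokens / total;
}

// 900K tokens read from cache against 100K written: reads dominate.
const ratio = cacheReadRatio(900_000, 100_000);
```

A ratio near 1 suggests the prompt prefix is stable and being reused across turns; a ratio near 0 suggests the cache is being rewritten more often than it is read.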
Cross-references
- auto-compact — context compression (driven by token estimates)
- piebald — GrowthBook experiment tracking
- service-layer — analytics as the central hub
- system-prompt-assembly — hidden context token costs