Observability and Cost

What operators and users see — telemetry, cost tracking, token estimation, and how Claude Code accounts for hidden costs.

Cost tracking

The cost tracker (src/cost-tracker.ts) provides session-level token usage and USD cost accounting.

Five-category pricing model

Every API response flows through addToTotalSessionCost(), which calculates USD cost across five categories: four token categories plus a per-request web search charge.

Category	What it covers
Input tokens	User messages, system prompt, tool results sent to API
Output tokens	Model-generated text, tool calls, thinking
Cache write	Tokens written to prompt cache (first time)
Cache read	Tokens read from prompt cache (subsequent turns)
Web search	Per-request cost for web search tool

Pricing is model-specific, quoted as input/output USD per million tokens (MTok): Opus 4.6 at $5/$25 in standard mode or $30/$150 in fast mode, Sonnet at $3/$15, and Haiku at $1/$5.
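The five-category calculation can be sketched as below. This is a minimal illustration, not the code in src/cost-tracker.ts: the type and function names are hypothetical, the $5/$25 Opus figures quoted above are read as input/output prices, and the cache and web search prices are placeholders chosen only so the example runs.

```typescript
// Hypothetical shapes for illustration; the real tracker's types are not shown here.
interface ModelPricing {
  inputPerMTok: number; // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
  cacheWritePerMTok: number; // USD per million tokens written to the prompt cache
  cacheReadPerMTok: number; // USD per million tokens read from the prompt cache
  webSearchPerRequest: number; // USD per web search request (not token-based)
}

interface ResponseUsage {
  inputTokens: number;
  outputTokens: number;
  cacheWriteTokens: number;
  cacheReadTokens: number;
  webSearchRequests: number;
}

const EXAMPLE_OPUS_PRICING: ModelPricing = {
  inputPerMTok: 5, // standard-mode figure from the text
  outputPerMTok: 25, // standard-mode figure from the text
  cacheWritePerMTok: 6.25, // placeholder, not taken from the source
  cacheReadPerMTok: 0.5, // placeholder, not taken from the source
  webSearchPerRequest: 0.01, // placeholder, not taken from the source
};

// Sum the four token categories (priced per million tokens) and add the
// per-request web search charge.
function usageToUSD(usage: ResponseUsage, pricing: ModelPricing): number {
  const tokenCost =
    (usage.inputTokens * pricing.inputPerMTok +
      usage.outputTokens * pricing.outputPerMTok +
      usage.cacheWriteTokens * pricing.cacheWritePerMTok +
      usage.cacheReadTokens * pricing.cacheReadPerMTok) /
    1e6;
  return tokenCost + usage.webSearchRequests * pricing.webSearchPerRequest;
}
```

Under these assumed prices, a response with one million input tokens and 200,000 output tokens costs $5 for input plus $5 for output.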

Hidden costs

Advisor (sub-model) costs are recursively tracked. When the auto-mode classifier runs (a separate Sonnet inference per tool call in auto mode), its cost is extracted from the API response's advisor usage and recursively accumulated. A session with 200 tool calls in auto mode incurs 200 additional classifier inferences.
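One way to picture the recursive accumulation is a usage record that may carry nested advisor usage. The sketch below assumes such a shape; the advisorUsage field name, the UsageNode type, and the per-inference costs are all hypothetical, and the real response format is not shown in this document.

```typescript
// Hypothetical usage record: each inference may carry usage from advisor
// (sub-model) inferences it triggered, which may themselves be nested.
interface UsageNode {
  model: string;
  costUSD: number; // cost of this inference alone
  advisorUsage?: UsageNode[]; // assumed field name for nested advisor usage
}

// Walk the tree and add every inference's cost to a per-model total.
function accumulateCost(
  node: UsageNode,
  totals: Map<string, number> = new Map<string, number>(),
): Map<string, number> {
  totals.set(node.model, (totals.get(node.model) ?? 0) + node.costUSD);
  for (const child of node.advisorUsage ?? []) {
    accumulateCost(child, totals);
  }
  return totals;
}

// Collapse the per-model totals into one session-level figure.
function sumTotals(totals: Map<string, number>): number {
  let sum = 0;
  totals.forEach((value) => {
    sum += value;
  });
  return sum;
}
```

Keeping the totals keyed by model is what lets classifier overhead show up as its own line in a per-model breakdown instead of being folded into the main model's cost.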

Memory extraction fires a background Opus API call for each extraction to write session memory. This fire-and-forget call can double effective token consumption: a user who sees 13M tokens may actually have used 26M.

Session persistence

Cost state survives CLI restarts. saveCurrentSessionCosts() writes accumulated totals (USD cost, durations, token counts, lines changed, per-model breakdown) to the project config keyed by session ID. On resume, restoreCostStateForSession() reinstates the state.
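The save-and-restore cycle can be sketched with an in-memory object standing in for the project config. The function names, the SessionCostState fields, and the config layout here are assumptions for illustration; the actual signatures of saveCurrentSessionCosts() and restoreCostStateForSession() are not shown in this document.

```typescript
// Hypothetical snapshot of the totals listed above.
interface SessionCostState {
  totalCostUSD: number;
  apiDurationMs: number;
  wallDurationMs: number;
  linesAdded: number;
  linesRemoved: number;
  tokensByModel: Record<string, { input: number; output: number }>;
}

// Stand-in for the project config: cost snapshots keyed by session ID.
interface ProjectConfigSketch {
  sessionCosts: Record<string, SessionCostState>;
}

// Store a deep copy so later in-memory mutations do not alter the saved snapshot.
function saveSessionCosts(
  config: ProjectConfigSketch,
  sessionId: string,
  state: SessionCostState,
): void {
  config.sessionCosts[sessionId] = JSON.parse(JSON.stringify(state));
}

// Return a copy of the saved snapshot, or undefined for an unknown session.
function restoreSessionCosts(
  config: ProjectConfigSketch,
  sessionId: string,
): SessionCostState | undefined {
  const saved = config.sessionCosts[sessionId];
  return saved ? JSON.parse(JSON.stringify(saved)) : undefined;
}
```

Keying by session ID means resuming one session cannot pick up totals from another session in the same project.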

Visibility gating

Cost display is gated by billing role — only users with console billing access (admin or billing role) see dollar amounts. Claude.ai subscribers see subscription status instead. The /cost command renders: total USD, API/wall-clock durations, code change line counts, and per-model token breakdown.

Token estimation

The token estimation service (src/services/tokenEstimation.ts) provides fast approximate token counts for strings and message arrays. It is used throughout the system:

Auto-compact trigger — estimates current conversation tokens to detect when compaction is needed
Session memory extraction — estimates tokens since last extraction to decide trigger cadence
System prompt assembly — estimates hidden context size
Cost display — provides token counts for status line
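To make the idea of a fast approximate count concrete, the sketch below uses a characters-per-token heuristic and a threshold check of the kind an auto-compact trigger might apply. The four-characters-per-token ratio, the 0.8 threshold, and all names are assumptions; the document does not say how tokenEstimation.ts actually estimates.

```typescript
// Assumed heuristic: roughly four characters per token. This is a common
// rule of thumb, not a value taken from the source.
const ASSUMED_CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

interface ChatMessage {
  role: string;
  content: string;
}

// Estimate a whole conversation by summing per-message estimates.
function estimateMessageTokens(messages: ChatMessage[]): number {
  return messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
}

// Hypothetical trigger: compact once the estimate reaches a fraction of the
// context window.
function shouldCompact(
  messages: ChatMessage[],
  contextWindowTokens: number,
  thresholdFraction: number = 0.8,
): boolean {
  return estimateMessageTokens(messages) >= contextWindowTokens * thresholdFraction;
}
```

An estimate like this trades accuracy for speed: it needs no tokenizer or API call, so it can run on every turn.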

Telemetry

Analytics (event logging)

The analytics service (src/services/analytics/) is the central event bus. Every significant action logs via logEvent():

Tool invocations and results
Permission decisions
Model responses and errors
Session start/end
Feature flag exposures
Unknown model cost events

The queue-and-attachment pattern keeps early events from being dropped: events logged before the analytics sink is initialized queue in memory, then drain once the sink is attached.
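The pattern can be sketched as a small buffer in front of the sink. The class and method names below are hypothetical and the real logEvent() signature is not shown here; the point is only the ordering guarantee.

```typescript
interface AnalyticsEvent {
  name: string;
  metadata: Record<string, unknown>;
}

type AnalyticsSink = (event: AnalyticsEvent) => void;

class EventQueue {
  private pending: AnalyticsEvent[] = [];
  private sink: AnalyticsSink | null = null;

  // Forward immediately if a sink is attached, otherwise buffer in memory.
  log(eventName: string, metadata: Record<string, unknown> = {}): void {
    const event: AnalyticsEvent = { name: eventName, metadata };
    if (this.sink) {
      this.sink(event);
    } else {
      this.pending.push(event);
    }
  }

  // Attach the sink, then drain buffered events in the order they were logged.
  attachSink(sink: AnalyticsSink): void {
    this.sink = sink;
    const queued = this.pending;
    this.pending = [];
    for (const event of queued) {
      sink(event);
    }
  }
}
```

Because callers only ever call log(), they do not need to know whether initialization has finished.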

Sinks: Datadog for real-time metrics, and first-party (1P) event logging (OpenTelemetry) for structured events.

OpenTelemetry integration

The OpenTelemetry module (~400KB) is lazy-loaded to avoid a startup penalty. It provides:

Counters: costCounter (USD), tokenCounter (per-category)
Traces: Perfetto tracing for performance profiling (feature-gated)
Logging: LoggerProvider for structured logs

GrowthBook experiment tracking

When GrowthBook features are accessed, experiment exposure is logged (deduped per session). This feeds into A/B test analysis: which variant the user was in, which features were evaluated.
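Per-session deduplication can be sketched with a set of features already seen. This does not use the GrowthBook SDK; the class name, the callback shape, and the choice to dedupe on feature name alone are assumptions for illustration.

```typescript
type ExposureSink = (feature: string, variant: string) => void;

// One tracker instance is assumed to live for the duration of one session.
class ExposureTracker {
  private readonly seen = new Set<string>();
  private readonly emit: ExposureSink;

  constructor(emit: ExposureSink) {
    this.emit = emit;
  }

  // Log the exposure the first time a feature is accessed in this session.
  // Returns true if an exposure event was emitted, false if it was deduped.
  recordAccess(feature: string, variant: string): boolean {
    if (this.seen.has(feature)) {
      return false;
    }
    this.seen.add(feature);
    this.emit(feature, variant);
    return true;
  }
}
```

Deduplicating at the source keeps a feature that is evaluated on every turn from flooding the event stream with identical exposures.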

What operators see

Status line

The StatusLine component in the terminal UI shows in real time:

Total cost (USD)
API duration vs wall-clock duration
Token counts (input/output)
Lines added/removed
Current model

/cost command

Session cost summary: total USD, per-model breakdown with input/output/cache-read/cache-write tokens, API and wall-clock durations, code change counts.

Diagnostics (/doctor)

System health check: config validation, auth status, MCP server connectivity, plugin status.

Session persistence

Session transcripts persist to ~/.claude/sessions/ as JSONL files. The SDK provides getSessionMessages() for programmatic access with pagination. Session info includes: summary, last modified, file size, git branch, cwd, custom title, tag.

Cost optimization signals

The telemetry system surfaces optimization opportunities:

Cache hit rates: prompt cache read vs write ratios indicate cache effectiveness
Advisor costs: auto-mode classifier overhead visible in per-model breakdown
Compaction frequency: how often auto-compact triggers indicates context management efficiency
Unknown model costs: tengu_unknown_model_cost events flag unrecognized models hitting the fallback pricing tier
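As an example of turning these signals into a number, the cache read share can be derived from the cache-read and cache-write token counts in the per-model breakdown. The function name and the exact definition of the ratio are assumptions; the document does not specify how cache effectiveness is computed.

```typescript
// Share of cache traffic that was served from the prompt cache rather than
// written to it. Returns null when there was no cache activity at all.
function cacheReadShare(cacheReadTokens: number, cacheWriteTokens: number): number | null {
  const total = cacheReadTokens + cacheWriteTokens;
  return total === 0 ? null : cacheReadTokens / total;
}
```

Under this definition, a share close to 1 suggests most cached context is being reused across turns, while a share close to 0 suggests the cache is mostly being rewritten.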

Cross-references

auto-compact — context compression (driven by token estimates)
piebald — GrowthBook experiment tracking
service-layer — analytics as the central hub
system-prompt-assembly — hidden context token costs