Telemetry and Observability
Claude Code has extensive internal observability — 1,000+ telemetry event types, frustration detection, feature flag-controlled experiments, and model quality metrics. This infrastructure serves both product development and safety monitoring.
Telemetry Architecture
All telemetry events are prefixed with tengu_ (the primary analytics namespace, also possibly a model codename). Events cover:
- Tool grants and denials
- YOLO mode decisions
- Permission changes
- Model switches
- Cache hit/miss rates
- Session duration and turn counts
- Background task execution
The telemetry pipeline sends events to Anthropic's analytics infrastructure. The GrowthBook feature flag system polls for updates every 6 hours (external) or 20 minutes (internal Anthropic).
Frustration Detection
userPromptKeywords.ts uses a regex to detect profanity, insults, and frustration expressions. This fires a telemetry event — it does NOT change Claude's in-session behavior.
Why regex instead of LLM inference: Zero milliseconds, zero cost, zero latency. At Claude Code's scale, adding 300-500ms per prompt for sentiment analysis would degrade the experience it's trying to measure.
What it enables: Correlation of frustration spikes with model versions, features, and task types. When Capybara v8 was deployed and false claims jumped to 29-30%, frustration telemetry would have been a leading indicator.
Privacy concern: The specific words that trigger the regex are sent upstream. This is not explained in Anthropic's privacy documentation.
Model Quality Metrics
- False claims rate: Tracked per model version (Capybara v4: 16.7%, v8: 29-30%)
- Tool call failure rate: 16.3% overall
- Thinking depth: Collapsed 67-75% during Feb-March 2026 regression
- Read:edit ratio: Collapsed from 6.6:1 to 2.0:1 during quality regression
Feature Flag System
79+ GrowthBook runtime flags control feature rollout. Examples:
- tengu_onyx_plover — auto-dream, auto-memory
- tengu_amber_flint — agent teams
- tengu_speculation — speculative execution
- tengu_chomp_inflection — prompt suggestion (public prediction display)
- tengu_session_memory — session summarization
- tengu_kairos_cron — scheduled tasks
Additionally, 9 build-time flags (EXTRACT_MEMORIES, AGENT_TRIGGERS, etc.) control code compilation. These are binary — either compiled in or stripped out.
The Quality Regression Crisis
The Feb-March 2026 regression demonstrated what happens when behavioral engineering breaks down: - Thinking depth collapsed 67-75% - Read:edit ratio collapsed from 6.6:1 to 2.0:1 (Claude stopped reading code before editing) - Three streaming hang bugs appeared - The @MODEL_LAUNCH counterweights lost effectiveness
Telemetry should have caught this earlier — but the regression was in reasoning quality, not in quantitative metrics that are easy to instrument.
Related Entities
frustration-telemetry— the regex-based detectiongrowthbook— feature flag systemmodel-codenames— Capybara, Fennec, Numbat, Tengumodel-launch-annotation— behavioral counterweights