Model Behavioral Engineering

Claude Code's approach to controlling model behavior is prompt-based rather than deterministic — "behavioral nudges" encoded in the system prompt rather than hard-coded logic. This is both the simplest viable approach and the one most vulnerable to model regression.

The @MODEL_LAUNCH Pattern

The core mechanism is the @MODEL_LAUNCH annotation: a comment convention in constants/prompts.ts that tags system prompt instructions as temporary model-specific workarounds. Each annotation documents:

  - The specific model generation (e.g., Capybara v8)
  - The specific failure mode (over-commenting, premature abstraction, false claims, assertiveness)
  - The prompt-based counterweight
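As a concrete illustration, the convention might look like the following sketch. The constant names, comment wording, and prompt text are assumptions for illustration, not quotes from the leaked source:

```typescript
// Hypothetical reconstruction of the @MODEL_LAUNCH convention described
// for constants/prompts.ts; all names and wording are illustrative.

// @MODEL_LAUNCH(capybara-v8): over-comments generated code.
// Delete this counterweight once a later model no longer needs it.
const MINIMIZE_COMMENTS =
  "Do not add code comments unless the user asks for them.";

// @MODEL_LAUNCH(capybara-v8): reaches for premature abstraction.
const AVOID_OVERENGINEERING =
  "Prefer the simplest implementation; avoid speculative abstractions.";

const SYSTEM_PROMPT: string = [
  "You are an agentic coding assistant.",
  MINIMIZE_COMMENTS,
  AVOID_OVERENGINEERING,
].join("\n");
```

Because each workaround is tagged with the model generation that motivated it, a model upgrade turns cleanup into a grep for stale @MODEL_LAUNCH tags rather than archaeology.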

Community consensus: this is the single most transferable engineering lesson from the entire leak. When building on LLMs, tag your behavioral workarounds so they're visible, documented, and disposable on model upgrade.

Known Behavioral Counterweights

  - Over-commenting → instruction to minimize code comments (impact: visible to all users)
  - Premature abstraction → instruction against over-engineering (impact: visible to all users)
  - False claims (29-30%) → three-layer verification gate (impact: Ant-only, USER_TYPE === 'ant')
  - Assertiveness → counterweight against volunteering unrequested observations (impact: explains the v1.x → v2.x behavior change)
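A minimal sketch of how the Ant-only gating could work, assuming a USER_TYPE flag as described above. The function name, instruction text, and the shape of the check are hypothetical:

```typescript
// Illustrative sketch of gating the verification instruction on user type;
// the real structure of the leaked code is not known.
type UserType = "ant" | "external";

const VERIFICATION_GATE =
  "Before claiming a task is complete, re-read the diff, run the tests, " +
  "and state which checks you actually performed.";

function behavioralInstructions(userType: UserType): string[] {
  const base = [
    "Minimize code comments.",          // over-commenting counterweight
    "Avoid speculative abstractions.",  // premature-abstraction counterweight
  ];
  // The false-claims mitigation ships only to internal ("ant") users.
  return userType === "ant" ? [...base, VERIFICATION_GATE] : base;
}
```

Under this structure, external users receive the universally visible counterweights but not the verification gate, which is the asymmetry flagged under Design Tensions below.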

The No-Planner Decision

Claude Code shipped without a planner, RAG, DAG, or specialized routing — relying on the LLM itself as the planner. As PromptLayer summarized: "The real lesson from the Claude Code source is not what Anthropic built. It is what they chose not to build."

Many behaviors that could theoretically be enforced deterministically are instead prompt-based, because modern models are good at instruction following. This makes the system simpler but creates a dependency on model quality — any model regression ripples through the entire behavioral stack.

The Frustration Telemetry

userPromptKeywords.ts uses a regex (not LLM inference) to detect frustration expressions in user input. The regex fires a telemetry event; it does not change Claude's behavior in-session. The design rationale:

  - The regex runs in effectively zero time and costs nothing
  - LLM sentiment analysis would add 300-500ms of latency per prompt
  - Frustration spikes are a leading indicator of model regression
  - When Capybara v8's false-claims rate jumped to 29-30%, frustration events would have spiked before the regression was formally identified
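The mechanism can be sketched as follows. The pattern, event name, and emitter interface are assumptions; the actual contents of userPromptKeywords.ts are not reproduced here:

```typescript
// Illustrative sketch of regex-based frustration telemetry; the real
// pattern and event name in userPromptKeywords.ts are unknown.
const FRUSTRATION_RE =
  /\b(ugh|useless|stop it|you broke|not what i asked)\b/i;

// Fire-and-forget: a match emits telemetry but never alters the
// model's in-session behavior.
function recordFrustration(
  prompt: string,
  emit: (event: string) => void,
): void {
  if (FRUSTRATION_RE.test(prompt)) {
    emit("frustration_detected"); // hypothetical event name
  }
}
```

The key design property is that detection is a side channel: aggregated event counts feed regression monitoring, while the prompt itself flows to the model unmodified.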

The Quality Regression Crisis (Feb-March 2026)

During the crisis, thinking depth collapsed by 67-75%, and the read:edit ratio fell from 6.6:1 to 2.0:1 (Claude stopped reading code before editing it). These are the consequences of behavioral engineering breaking down: when the model regresses, every prompt-based behavioral counterweight loses effectiveness simultaneously.
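The read:edit ratio is simple to compute from tool-call counts, which makes it a cheap regression indicator. A sketch, with an illustrative alarm threshold (the 6.6:1 and 2.0:1 figures come from the text above; the threshold and names are assumptions):

```typescript
// Sketch of the read:edit ratio as a regression signal; the alarm
// threshold of 2.5 is illustrative, not from the leaked source.
interface ToolCounts {
  reads: number; // file-read tool calls
  edits: number; // file-edit tool calls
}

function readEditRatio(c: ToolCounts): number {
  return c.edits === 0 ? Infinity : c.reads / c.edits;
}

function looksRegressed(c: ToolCounts, alarmRatio = 2.5): boolean {
  // A healthy baseline near 6.6:1 falling toward 2.0:1 means the
  // model is editing code it has not read.
  return readEditRatio(c) <= alarmRatio;
}
```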

Design Tensions

  1. Simplicity vs. reliability — prompt-based controls are simple to implement but vulnerable to model changes. Deterministic controls are complex but reliable.
  2. Internal vs. external quality — the three-layer verification gate (false claims mitigation) is Ant-only, meaning external users get a worse experience.
  3. Accumulation vs. cleanup — without @MODEL_LAUNCH annotations, the system prompt becomes archaeology. With them, it becomes a managed technical debt ledger.