Anti-Distillation Defenses

Entity ID: ent-20260409-88dfb37a6820
Type: service
Scope: shared
Status: active
Aliases: anti-distillation, distillation defense

Description

Anti-Distillation Defenses are a set of four layered protections designed to prevent competitors from distilling Claude Code's behavior into their own models or products. Distillation -- the practice of training a smaller model to replicate a larger model's outputs by observing its behavior -- is a significant commercial threat to AI products. Claude Code employs these defenses to make automated behavior extraction unreliable, even though tools like piebald have demonstrated that prompt extraction from compiled bundles remains feasible.

The Four Defense Layers

1. Fake Tools in Tool Registry

The tool registry includes decoy tool definitions that appear in API responses but do not correspond to real functionality. These fake tools have plausible names, descriptions, and parameter schemas, but invoking them produces no useful output. A competitor attempting to distill Claude Code's behavior by enumerating its tools would incorporate these fake tools into their training data, introducing systematic errors into their clone. The toolsearch-system's lazy-loading architecture further obscures the full tool inventory, since not all tools are visible at any given time.
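
The decoy idea can be illustrated with a minimal sketch. Everything named here is hypothetical (ToolDef, REGISTRY, the tool names, and the convention that a missing handler marks a decoy); the source does not describe how the real registry stores or flags fake tools.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ToolDef:
    name: str
    description: str
    # Hypothetical convention: a decoy has no handler behind it.
    handler: Optional[Callable[[dict], str]]


def read_file(params: dict) -> str:
    # Stand-in for a real tool implementation.
    return f"contents of {params['path']}"


REGISTRY = {
    "read_file": ToolDef("read_file", "Read a file from disk", read_file),
    # Decoy: plausible name and description, no real functionality.
    "sync_workspace_index": ToolDef(
        "sync_workspace_index", "Refresh the workspace index", None
    ),
}


def list_tools() -> list[str]:
    """What an enumerating client sees: real tools and decoys look alike."""
    return sorted(REGISTRY)


def invoke(name: str, params: dict) -> str:
    tool = REGISTRY[name]
    if tool.handler is None:
        return ""  # decoys produce no useful output
    return tool.handler(params)
```

An enumerator that records list_tools() picks up both entries, while only read_file does anything when invoked.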

2. Signed Chain-of-Thought

Claude Code's chain-of-thought (CoT) reasoning is cryptographically signed before being included in the conversation transcript. This signature allows the system to verify that reasoning traces originated from a genuine Claude model rather than being replayed or synthesized. A clone trained on captured CoT tokens cannot produce valid signatures for the reasoning it emits, so its traces fail verification and attempts to mimic Claude Code's reasoning patterns become detectable.
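
A rough sketch of sign-then-verify for a reasoning trace follows. The source only says the trace is "cryptographically signed"; the use of HMAC-SHA256, the key, and the function names sign_trace and verify_trace are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical key for the demo; a real system would not hard-code one.
SIGNING_KEY = b"demo-key-not-real"


def sign_trace(trace: str, key: bytes = SIGNING_KEY) -> str:
    """Return a hex HMAC-SHA256 tag over the reasoning trace."""
    return hmac.new(key, trace.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_trace(trace: str, signature: str, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_trace(trace, key), signature)
```

A trace altered after signing, or a synthesized trace paired with a made-up signature, fails verify_trace because the verifier recomputes the tag from the text it actually received.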

3. Zig-Native DRM (cch Hash)

A DRM mechanism implemented as a native Zig module via native-attestation computes a cch hash that validates the runtime environment. This hash incorporates factors such as the installation path, the npm package signature, and runtime characteristics. If the hash check fails, certain premium features are degraded or disabled. The choice of Zig (rather than C or Rust) for this module is notable -- it produces small, dependency-free binaries that are difficult to reverse-engineer.
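
The general shape of an environment hash can be sketched as below. This is not the cch algorithm: the real module is native Zig and its inputs, encoding, and hash function are not documented in the source. SHA-256, the length-prefixed encoding, and the names environment_hash and premium_features_enabled are all assumptions; Python is used only for readability.

```python
import hashlib
import hmac


def _feed(h, text: str) -> None:
    # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    data = text.encode("utf-8")
    h.update(len(data).to_bytes(4, "big"))
    h.update(data)


def environment_hash(
    install_path: str,
    package_signature: str,
    runtime_traits: dict[str, str],
) -> str:
    """Combine the environment factors into one hex digest (illustrative only)."""
    h = hashlib.sha256()
    _feed(h, install_path)
    _feed(h, package_signature)
    for key in sorted(runtime_traits):  # sorted so dict order does not matter
        _feed(h, key)
        _feed(h, runtime_traits[key])
    return h.hexdigest()


def premium_features_enabled(expected_hash: str, actual_hash: str) -> bool:
    # On mismatch a caller would degrade or disable features, not crash.
    return hmac.compare_digest(expected_hash, actual_hash)
```

Changing any one input, such as copying the installation to a different path, yields a different digest and therefore a failed check.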

4. Protocol Isolation via system-prompt-fingerprinting

The communication protocol between Claude Code and the Anthropic API includes session-specific tokens and request signatures that are not part of the public API specification. Together, these tokens and signatures make it harder to reverse-engineer Claude Code's behavior by observing API interactions. A distillation system that intercepts API traffic would need to replicate these protocol-level details to maintain a functional connection.
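
One way a session-bound request signature could work is sketched here. The actual protocol is not public, so the canonicalization, the choice of HMAC-SHA256 keyed by the session token, the example path, and the names sign_request and accept_request are hypothetical.

```python
import hashlib
import hmac
import json


def sign_request(session_token: bytes, method: str, path: str, body: dict) -> str:
    """Sign method, path, and body with a key tied to one session."""
    # Canonical JSON so the same logical body always signs identically.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    message = "\n".join([method.upper(), path, canonical]).encode("utf-8")
    return hmac.new(session_token, message, hashlib.sha256).hexdigest()


def accept_request(
    session_token: bytes, method: str, path: str, body: dict, signature: str
) -> bool:
    """Server-side check: recompute the signature for this session and compare."""
    expected = sign_request(session_token, method, path, body)
    return hmac.compare_digest(expected, signature)
```

Because the key is session-specific, a request captured in one session and replayed under another, or replayed with a modified body, is rejected.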

Effectiveness and Limitations

Layer	Protects Against	Known Limitations
Fake tools	Automated tool enumeration	Human review can identify decoys
Signed CoT	CoT replay attacks	Does not prevent behavior-level distillation (ignoring CoT)
Zig DRM	Unauthorized redistribution	Binary can be patched or bypassed
Protocol isolation	API traffic interception	Protocol can be reverse-engineered over time

These defenses raise the cost and complexity of distillation but do not make it impossible. Community tools like piebald and ccunpacked-dev have demonstrated that significant behavioral and structural information can be extracted despite these protections. The defenses are best understood as speed bumps rather than impenetrable walls.

Integration

The anti-distillation defenses span multiple subsystems: fake tools are registered in the same tool registry used by the plugin-system and toolsearch-system, signed CoT flows through queryengine-ts, the Zig DRM module runs during system-prompt-assembly, and protocol isolation is enforced at the API communication layer within the five-layer-architecture.

Key claims

none yet

Relations

none yet

Sources

src-20260409-a5fc157bc756