Claude Mythos Preview
- Entity ID: ent-20260409-1f3a059ab9f1
- Type: concept
- Scope: shared
- Status: active
- Aliases: Mythos, Mythos Preview, Project Glasswing
Description
Claude Mythos Preview is an unreleased frontier model announced on April 7, 2026, through Project Glasswing, Anthropic's first public system card for a model intentionally withheld from release because of its capabilities. Mythos represents a qualitative leap in both coding ability and offensive security capability: its benchmark results substantially exceed those of all prior Claude models, and it also scores best, by a significant margin, on Anthropic's alignment evaluation.
Benchmark Performance
| Benchmark | Mythos Score | Previous Best (Opus 4) | Notes |
|---|---|---|---|
| SWE-bench Verified | 93.9% | ~72% | Near-human software engineering performance |
| Firefox 147 exploits | 181 working exploits | 2 working exploits | Full browser exploitation capability |
| Control flow hijack | 10/10 patched targets | 0/10 patched targets | Complete bypass of patched vulnerabilities |
| Alignment evaluation | Best by significant margin | -- | Described as most aligned model Anthropic has produced |
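As a rough reading of the SWE-bench Verified row, the gain can be restated as a drop in unresolved tasks. The sketch below uses only the two figures from the table. The baseline is listed as "~72%", so the resulting ratio is indicative at best, and the function and variable names are illustrative.

```python
def unresolved_rate(resolved_pct: float) -> float:
    """Share of benchmark tasks left unresolved, in percent."""
    return 100.0 - resolved_pct


mythos_pct = 93.9    # Mythos score from the table
baseline_pct = 72.0  # "~72%" in the table; treated as exact only for illustration

# Absolute gain in percentage points.
gain_points = mythos_pct - baseline_pct

# How many times smaller the unresolved share becomes.
failure_ratio = unresolved_rate(baseline_pct) / unresolved_rate(mythos_pct)

print(f"Gain: {gain_points:.1f} percentage points")
print(
    f"Unresolved tasks: {unresolved_rate(baseline_pct):.1f}% "
    f"-> {unresolved_rate(mythos_pct):.1f}%"
)
print(f"Roughly {failure_ratio:.1f}x fewer unresolved tasks")
```

Under that assumption the gain is about 21.9 percentage points, and the unresolved share falls from about 28% to about 6.1%, a reduction of roughly 4.6x.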
Offensive Security Capabilities
The most striking aspect of Mythos is its offensive security performance. In controlled evaluations, Mythos generated 181 working exploits against Firefox 147 -- a fully patched, current-release browser -- compared to just 2 from Claude Opus 4. It achieved full control flow hijack on all 10 patched target programs in the evaluation suite, meaning it could take complete control of program execution despite the presence of security patches.
Beyond the benchmark suite, Mythos independently discovered a 23-year-old Linux kernel vulnerability that had evaded detection by the entire security research community. It also found and produced a working proof-of-concept for a Vim CVE in under 2 minutes, demonstrating the ability to find real-world vulnerabilities at superhuman speed.
Why It Was Withheld
Anthropic chose not to release Mythos, making it the first model for which the company published a system card explaining a decision to withhold. The reasoning centered on the offensive security capabilities: the model's coding and alignment improvements are desirable, but the ability to generate 181 working browser exploits and discover kernel vulnerabilities could cause serious harm if widely available. The Project Glasswing system card frames this as a case study in responsible scaling, in which capability improvements cross a threshold that requires new deployment safeguards.
Implications for Claude Code
If Mythos were deployed as the backing model for claude-code, the coding ability reflected in its 93.9% SWE-bench Verified score would likely improve autonomous coding accuracy substantially. The yolo-classifier and permission-pipeline would need significant hardening, however, to account for a model capable of discovering and potentially exploiting vulnerabilities in the code it is asked to work on. The threat model of the three-layer-verification system would likewise need to expand to cover a model that understands offensive security at an expert level.
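One way to picture that kind of hardening is a deny-by-default gate in front of every tool action. The sketch below is purely illustrative: `Action`, `Decision`, `gate`, and the action categories are hypothetical names invented here, and nothing in it reflects how the yolo-classifier or permission-pipeline is actually implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK_HUMAN = "ask_human"
    DENY = "deny"


@dataclass(frozen=True)
class Action:
    kind: str           # hypothetical categories: "read", "edit", "exec", "network"
    target: str         # path or command the model wants to act on
    in_workspace: bool  # True if the target lies inside the project directory


# Hypothetical policy tables (assumptions, not taken from any real tool).
AUTO_APPROVED = {"read", "edit"}
NEEDS_REVIEW = {"exec", "network"}


def gate(action: Action) -> Decision:
    """Deny by default; auto-approve only low-risk actions inside the workspace."""
    if action.kind in AUTO_APPROVED and action.in_workspace:
        return Decision.ALLOW
    if action.kind in NEEDS_REVIEW or action.kind in AUTO_APPROVED:
        # Risky kinds, and low-risk kinds outside the workspace, go to a human.
        return Decision.ASK_HUMAN
    # Action kinds the policy does not recognize are rejected outright.
    return Decision.DENY


print(gate(Action("edit", "src/main.py", True)))
print(gate(Action("exec", "make test", True)))
print(gate(Action("ptrace", "pid 1", False)))
```

The design choice being illustrated is that any action the policy does not recognize is rejected instead of being waved through, which matters more as the model's security expertise grows.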
Key claims
- none yet
Relations
- none yet
Sources
src-20260409-09a1b2325b23