Claude Mythos Preview

Claude / Anthropic

42 scores
33 benchmarks
$25 / $125 per 1M tokens (cost in/out)
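As a rough illustration of the listed pricing, the sketch below estimates the cost of a single request. It assumes the two figures are USD per 1M input tokens and per 1M output tokens respectively (the currency is not stated beyond the `$` sign). The helper name `estimate_cost_usd` and the constant names are hypothetical and not part of any API.

```python
# Listed prices, ASSUMED to be USD per 1M tokens (input / output).
INPUT_USD_PER_MTOK = 25
OUTPUT_USD_PER_MTOK = 125
TOKENS_PER_MTOK = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper for illustration only.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Sum in integer units first, then scale down to dollars once.
    total = (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens would come to about $0.11.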

Metadata

Claude Closed/API

Aliases: anthropic-claude-mythos-preview, anthropic/claude-mythos-preview, claude-mythos-preview, mythos-preview, Mythos Preview

Benchmark Results

| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| BrowseComp | Agentic | 2 | 86.9% | 2026-04-16 |
| OSWorld-Verified | Agentic | 1 | 0.80 | 2026-05-06 |
| OSWorld-Verified | Agentic | 1 | 79.6% | 2026-04-16 |
| AAV Capsid Packaging Prediction | Biology | 1 | 0.8 | 2026-05-28 |
| BioPipelineBench Verified | Biology | 1 | 88.1% | 2026-05-28 |
| Black-box RNA Sequence Design | Biology | 1 | 11.2 | 2026-05-28 |
| DNA Synthesis Screening Evasion | Biology | 1 | 0.8 | 2026-05-28 |
| LABBench2 Clinical Trials | Biology | 1 | 86.3% | 2026-05-28 |
| LABBench2 Patent Questions | Biology | 2 | 64.3% | 2026-05-28 |
| Long-form Virology Task 1 | Biology | 1 | 0.8 | 2026-05-28 |
| Long-form Virology Task 2 | Biology | 1 | 0.9 | 2026-05-28 |
| ProteinGym Hard | Biology | 1 | 43.1% | 2026-05-28 |
| Protocol Troubleshooting (Anthropic Internal) | Biology | 1 | 69.6% | 2026-05-28 |
| scBench | Biology | 1 | 58.2% | 2026-05-28 |
| SpatialBench | Biology | 1 | 53.8% | 2026-05-28 |
| Structural Biology Open-Ended | Biology | 1 | 81.6% | 2026-05-28 |
| VCT Virology Capabilities Test | Biology | 1 | 0.6 | 2026-05-28 |
| Organic Chemistry (Anthropic Internal) | Chemistry | 1 | 86.5% | 2026-05-28 |
| Terminal-Bench 2.0 | Coding | 1 | 82% | 2026-04-16 |
| CyberGym | Cybersecurity | 1 | 83.1% | 2026-05-28 |
| CyberGym | Cybersecurity | 1 | 0.83 | 2026-05-06 |
| CyberGym | Cybersecurity | 1 | 83.1% | 2026-04-16 |
| ExploitBench v8-bench | Cybersecurity | 1 | 9.9 points | 2026-05-28 |
| ExploitBench v8-bench | Cybersecurity | 2 | 9.55 points | 2026-05-28 |
| ExploitBench v8-bench | Cybersecurity | 1 | 9.9 points | 2026-05-15 |
| ExploitBench v8-bench | Cybersecurity | 2 | 9.55 points | 2026-05-15 |
| Firefox 147 JS Exploitation | Cybersecurity | 1 | 70.8% | 2026-05-28 |
| BenchLM | General Knowledge | 1 | 99 | 2026-05-06 |
| Humanity's Last Exam | Intelligence | 1 | 64.7% | 2026-04-16 |
| Graphwalks BFS >128k | Long Context | 1 | 0.80 | 2026-05-06 |
| USAMO25 | Mathematics | 1 | 0.98 | 2026-05-06 |
| CharXiv-R | Multimodal | 1 | 0.93 | 2026-05-06 |
| CharXiv-R | Multimodal | 1 | 93.2% | 2026-04-16 |
| FigQA | Multimodal | 1 | 0.89 | 2026-05-06 |
| GPQA Diamond | Reasoning | 1 | 94.6% | 2026-04-16 |
| BioMysteryBench Human-Difficult | Science | 2 | 29.6% | 2026-05-28 |
| BioMysteryBench Human-Difficult | Science | 1 | 29.6% | 2026-04-29 |
| BioMysteryBench Human-Solvable | Science | 1 | 82.6% | 2026-05-28 |
| BioMysteryBench Human-Solvable | Science | 1 | 82.6% | 2026-04-29 |
| DeepSearchQA | Search | 1 | 94.4% | 2026-05-28 |
| SWE-bench Pro | Software Engineering | 1 | 77.8% | 2026-04-16 |
| SWE-bench Verified | Software Engineering | 1 | 93.9% | 2026-04-16 |