Claude 3.7 Sonnet (thinking)
Claude / Anthropic
17 scores
17 benchmarks
$3 / $15 per 1M tokens (cost in/out)
Metadata
Claude Closed/API
Aliases: anthropic-claude-3.7-sonnet-thinking, anthropic/claude-3.7-sonnet:thinking, claude-3.7-sonnet:thinking
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| MultiChallenge | Agentic | 16 | 51.58 | 2026-05-06 |
| Tau2-Bench Telecom | Agentic | 166 | 54.7% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 139 | 21.2% | 2026-05-11 |
| SciCode | Coding | 93 | 40.3% | 2026-05-11 |
| TutorBench | Education | 21 | 46.45 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 110 | 34.71 | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 147 | 10.3% | 2026-05-11 |
| MMLU-Pro | Intelligence | 46 | 83.7% | 2026-05-11 |
| AIME 2025 | Math | 126 | 56.3% | 2026-05-11 |
| MMSI-Bench | Multimodal | 18 | 30.2% | 2026-05-28 |
| Visual-Language Understanding | Multimodal | 12 | 48.23 | 2026-05-06 |
| EnigmaEval | Reasoning | 17 | 4.23 | 2026-05-06 |
| GPQA Diamond | Reasoning | 131 | 77.2% | 2026-05-11 |
| Humanity's Last Exam (Text Only) | Reasoning | 33 | 7.89 | 2026-05-06 |
| MultiNRC | Reasoning | 21 | 27.77 | 2026-05-06 |
| CritPt | Science | 90 | 0.9% | 2026-05-11 |
| LiveSQLBench | Text to SQL | 19 | 26.55 | 2026-05-06 |