Llama 3.1 70B Instruct
Llama / Meta
36 scores
32 benchmarks
Cost (in/out): $0.4 / $0.4 per 1M tokens
Metadata
Llama, open source
Aliases: llama-3.1-70b-instruct, meta-llama-llama-3.1-70b-instruct, meta-llama/llama-3.1-70b-instruct
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| Clembench Text v3.0 | Agentic | 20 | 46.80 | 2026-05-06 |
| PinchBench | Agentic | 65 | 0.32 | 2026-05-06 |
| OpenUGI | Alignment | 1060 | 20.83 | 2026-05-06 |
| Stick To Your Role! | Alignment | 7 | 0.77 | 2026-05-06 |
| BigCodeBench | Coding | 21 | 46.10 | 2026-05-06 |
| MathTutorBench | Education | 3 | 0.5522 | 2026-05-27 |
| BizFinBench | Finance | 20 | 55.09 | 2026-05-27 |
| INVESTORBENCH | Finance | 4 | 38.946% | 2026-05-27 |
| MMLU (CoT) | General Knowledge | 2 | 0.86 | 2026-05-06 |
| Open LLM Leaderboard v2 | General Knowledge | 62 | 43.41 | 2026-05-06 |
| HealthBench Hard | Healthcare | 36 | 0.29 | 2026-05-27 |
| HREF | Instruction Following | 1 | 48.98 | 2026-05-06 |
| FACTS Grounding | Intelligence | 15 | 0.33 | 2026-05-06 |
| MuSR | Intelligence | 491 | 17.69 | 2026-05-06 |
| AraGen v3 | Language | 25 | 50 | 2026-05-06 |
| HindiGen v1 | Language | 13 | 70.45 | 2026-05-06 |
| Open Japanese LLM Leaderboard | Language | 33 | 66.38 | 2026-05-06 |
| Open Japanese LLM Leaderboard | Language | 461 | 49.97 | 2026-05-06 |
| MATH Level 5 | Math | 524 | 38.07 | 2026-05-06 |
| GSM-8K (CoT) | Mathematics | 1 | 0.95 | 2026-05-06 |
| MATH (CoT) | Mathematics | 1 | 0.68 | 2026-05-06 |
| Multilingual MGSM (CoT) | Mathematics | 2 | 0.87 | 2026-05-06 |
| BRIDGE Medical Leaderboard | Medical | 15 | 50.52 | 2026-05-27 |
| BRIDGE Medical Leaderboard | Medical | 119 | 39.09 | 2026-05-27 |
| BRIDGE Medical Leaderboard | Medical | 175 | 35.1 | 2026-05-27 |
| MEDIC Benchmark | Medical | 44 | 63.03 average normalized public table score | 2026-05-27 |
| BenchBench | Meta | 6 | 0.93 | 2026-05-06 |
| LanguageBench | Multilingual | 19 | 0.51 | 2026-05-06 |
| DROP | Reasoning | 13 | 0.80 | 2026-05-06 |
| ThaiSafetyBench | Safety | 17 | 24.49% overall ASR | 2026-05-28 |
| ChemBench | Science | 22 | 0.53 | 2026-05-06 |
| ChemBench | Science | 26 | 0.51 | 2026-05-06 |
| API-Bank | Tool Use | 2 | 0.90 | 2026-05-06 |
| Gorilla Benchmark API Bench | Tool Use | 2 | 0.30 | 2026-05-06 |
| VNTL Leaderboard | Translation | 12 | 69.79 | 2026-05-06 |
| K-MetBench | Weather | 33 | 59.9% accuracy | 2026-05-28 |