LocalScore
Mozilla Builders local LLM hardware benchmark measuring prompt processing speed, generation speed, time to first token, and an aggregate LocalScore for model-and-accelerator configurations.
13 rows
Primary metric: performance_score
Sampled: 2026-05-06
Metadata
Metrics
LocalScore, Average prompt processing speed, Average generation speed, Average time to first token (lower is better)
| Rank | Subject | LocalScore | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | NVIDIA GeForce RTX 3090 - Llama 3.2 1B Instruct (Q4_K - Medium) | 3661.57 | — | Imported | 2026-05-06 |
| 2 | NVIDIA GeForce RTX 4060 Ti - Llama 3.2 1B Instruct (Q4_K - Medium) | 2327.86 | — | Imported | 2026-05-06 |
| 3 | Apple M4 Max 12P+4E+40GPU - Llama 3.2 1B Instruct (Q4_K - Medium) | 1335.64 | — | Imported | 2026-05-06 |
| 4 | NVIDIA GeForce RTX 3090 - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 1007.73 | — | Imported | 2026-05-06 |
| 5 | AMD Radeon RX 6650 XT - Llama 3.2 1B Instruct (Q4_K - Medium) | 907.29 | — | Imported | 2026-05-06 |
| 6 | NVIDIA GeForce RTX 3090 - Qwen2.5 14B Instruct (Q4_K - Medium) | 572.57 | — | Imported | 2026-05-06 |
| 7 | NVIDIA GeForce RTX 4060 Ti - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 565.37 | — | Imported | 2026-05-06 |
| 8 | Apple M1 Pro 8P+2E+16GPU - Llama 3.2 1B Instruct (Q4_K - Medium) | 458.91 | — | Imported | 2026-05-06 |
| 9 | Apple M4 Max 12P+4E - Llama 3.2 1B Instruct (Q4_K - Medium) | 378.05 | — | Imported | 2026-05-06 |
| 10 | NVIDIA GeForce RTX 4060 Ti - Qwen2.5 14B Instruct (Q4_K - Medium) | 316.56 | — | Imported | 2026-05-06 |
| 11 | Apple M4 Max 12P+4E+40GPU - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 250.42 | — | Imported | 2026-05-06 |
| 12 | AMD Radeon RX 6650 XT - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 180.64 | — | Imported | 2026-05-06 |
| 13 | AMD EPYC 7352 24-Core Processor (znver2) - Llama 3.2 1B Instruct (Q4_K - Medium) | 87.96 | — | Imported | 2026-05-06 |
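Because each Subject combines an accelerator and a model, comparing hardware is easier once the two are separated. The following is a minimal sketch, not part of the LocalScore tooling: it assumes the Subject string always has the form `accelerator - model` and splits on the first ` - ` only, since the quantization label `(Q4_K - Medium)` contains the same separator. The names `ROWS`, `split_subject`, and `best_per_model` are hypothetical, and `ROWS` holds only a few rows copied from the table.

```python
from collections import defaultdict

# A subset of (Subject, LocalScore) pairs copied from the table above.
ROWS = [
    ("NVIDIA GeForce RTX 3090 - Llama 3.2 1B Instruct (Q4_K - Medium)", 3661.57),
    ("NVIDIA GeForce RTX 4060 Ti - Llama 3.2 1B Instruct (Q4_K - Medium)", 2327.86),
    ("NVIDIA GeForce RTX 3090 - Meta Llama 3.1 8B Instruct (Q4_K - Medium)", 1007.73),
    ("NVIDIA GeForce RTX 4060 Ti - Meta Llama 3.1 8B Instruct (Q4_K - Medium)", 565.37),
    ("AMD Radeon RX 6650 XT - Meta Llama 3.1 8B Instruct (Q4_K - Medium)", 180.64),
]


def split_subject(subject):
    """Split a Subject into (accelerator, model) on the first ' - ' only."""
    accelerator, model = subject.split(" - ", 1)
    return accelerator, model


def best_per_model(rows):
    """Return the highest-scoring accelerator for each model."""
    by_model = defaultdict(list)
    for subject, score in rows:
        accelerator, model = split_subject(subject)
        by_model[model].append((score, accelerator))
    # max() compares the (score, accelerator) tuples by score first.
    return {model: max(entries)[1] for model, entries in by_model.items()}


if __name__ == "__main__":
    for model, accelerator in best_per_model(ROWS).items():
        print(f"{model}: {accelerator}")
```

Grouping by model before ranking keeps the comparison like for like, since the table mixes 1B, 8B, and 14B models in a single ranking.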