LocalScore

LocalScore is a Mozilla Builders benchmark for local LLM hardware. It measures prompt processing speed, generation speed, and time to first token, and reports an aggregate LocalScore for each model-and-accelerator configuration.

Rows: 13
Primary metric: performance_score
Sampled: 2026-05-06

Metadata

Metrics

LocalScore, Average prompt processing speed, Average generation speed, Average time to first token (lower is better)

Latest Results

Rows are imported from accelerator_model_performance_scores in the public LocalScore SQLite database. LocalScore documentation describes 1,000 as excellent, 250 as passable, and below 100 as likely a poor user experience.
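The reference points quoted above can be turned into a rough labeling helper. This is a minimal sketch, not part of LocalScore itself: the function name `localscore_band` and the label strings are hypothetical. Treating 1,000 and 250 as lower band edges, and labeling the 100 to 250 range "below passable", are assumptions, since the documentation only names the three anchor values.

```python
def localscore_band(score: float) -> str:
    """Map a LocalScore to a rough label.

    Only the anchor values 1,000 (excellent), 250 (passable) and
    "below 100" (likely poor) come from the text above. Using them as
    band edges, and the label for 100-250, are assumptions.
    """
    if score >= 1000:
        return "excellent"
    if score >= 250:
        return "passable"
    if score >= 100:
        return "below passable"  # assumed label; not named in the source
    return "likely poor"


# A few scores taken from the results table on this page.
samples = [3661.57, 572.57, 180.64, 87.96]
for value in samples:
    print(f"{value:>8.2f}  {localscore_band(value)}")
```

Under these assumed edges, the 11th-ranked row (250.42) falls just inside "passable".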

Rank | Subject | LocalScore | Model Match | Provenance | Sampled
--- | --- | --- | --- | --- | ---
1 | NVIDIA GeForce RTX 3090 - Llama 3.2 1B Instruct (Q4_K - Medium) | 3661.57 | | Imported | 2026-05-06
2 | NVIDIA GeForce RTX 4060 Ti - Llama 3.2 1B Instruct (Q4_K - Medium) | 2327.86 | | Imported | 2026-05-06
3 | Apple M4 Max 12P+4E+40GPU - Llama 3.2 1B Instruct (Q4_K - Medium) | 1335.64 | | Imported | 2026-05-06
4 | NVIDIA GeForce RTX 3090 - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 1007.73 | | Imported | 2026-05-06
5 | AMD Radeon RX 6650 XT - Llama 3.2 1B Instruct (Q4_K - Medium) | 907.29 | | Imported | 2026-05-06
6 | NVIDIA GeForce RTX 3090 - Qwen2.5 14B Instruct (Q4_K - Medium) | 572.57 | | Imported | 2026-05-06
7 | NVIDIA GeForce RTX 4060 Ti - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 565.37 | | Imported | 2026-05-06
8 | Apple M1 Pro 8P+2E+16GPU - Llama 3.2 1B Instruct (Q4_K - Medium) | 458.91 | | Imported | 2026-05-06
9 | Apple M4 Max 12P+4E - Llama 3.2 1B Instruct (Q4_K - Medium) | 378.05 | | Imported | 2026-05-06
10 | NVIDIA GeForce RTX 4060 Ti - Qwen2.5 14B Instruct (Q4_K - Medium) | 316.56 | | Imported | 2026-05-06
11 | Apple M4 Max 12P+4E+40GPU - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 250.42 | | Imported | 2026-05-06
12 | AMD Radeon RX 6650 XT - Meta Llama 3.1 8B Instruct (Q4_K - Medium) | 180.64 | | Imported | 2026-05-06
13 | AMD EPYC 7352 24-Core Processor (znver2) - Llama 3.2 1B Instruct (Q4_K - Medium) | 87.96 | | Imported | 2026-05-06
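
Each Subject value packs the accelerator and the model into one string, so comparing accelerators on the same model means splitting it first. The sketch below assumes the first " - " in the string is the separator, which holds for the 13 rows here but is not a documented format. The helper name `split_subject` is hypothetical, and only a few rows from the table are copied in.

```python
from collections import defaultdict

# (Subject, LocalScore) pairs copied from the table above.
rows = [
    ("NVIDIA GeForce RTX 3090 - Llama 3.2 1B Instruct (Q4_K - Medium)", 3661.57),
    ("Apple M4 Max 12P+4E+40GPU - Llama 3.2 1B Instruct (Q4_K - Medium)", 1335.64),
    ("NVIDIA GeForce RTX 3090 - Meta Llama 3.1 8B Instruct (Q4_K - Medium)", 1007.73),
    ("AMD Radeon RX 6650 XT - Meta Llama 3.1 8B Instruct (Q4_K - Medium)", 180.64),
    ("AMD EPYC 7352 24-Core Processor (znver2) - Llama 3.2 1B Instruct (Q4_K - Medium)", 87.96),
]


def split_subject(subject: str) -> tuple[str, str]:
    """Split 'accelerator - model' on the FIRST ' - ' only.

    The quantization suffix '(Q4_K - Medium)' also contains ' - ',
    so splitting on every occurrence would cut the model name apart.
    """
    accelerator, _, model = subject.partition(" - ")
    return accelerator, model


# Group scores by model so accelerators can be compared like for like.
by_model = defaultdict(list)
for subject, score in rows:
    accelerator, model = split_subject(subject)
    by_model[model].append((accelerator, score))

for model, entries in by_model.items():
    best_accelerator, best_score = max(entries, key=lambda e: e[1])
    print(f"{model}: best is {best_accelerator} ({best_score})")
```

Grouping by model matters because scores for a 1B model and an 8B or 14B model are not directly comparable across rows.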