VLMbench Performance
Vision-language model throughput benchmark reporting peak tokens per second, time to first token (TTFT), time per output token (TPOT), and worker count on an RTX PRO 6000 Blackwell vLLM setup.
Rows: 5
Primary metric: best_toks_per_s
Sampled: 2026-05-28
Metadata
Metrics
Best Tok/s, Workers, TTFT (lower is better), TPOT (lower is better)
| Rank | Subject | Best Tok/s | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | lightonai/LightOnOCR-2-1B | 2439.8 tok/s | — | Imported | 2026-05-28 |
| 2 | Qwen/Qwen3-VL-2B-Instruct | 2409.3 tok/s | — | Imported | 2026-05-28 |
| 3 | PaddlePaddle/PaddleOCR-VL | 2341.9 tok/s | — | Imported | 2026-05-28 |
| 4 | deepseek-ai/DeepSeek-OCR | 1195.8 tok/s | — | Imported | 2026-05-28 |
| 5 | Qwen/Qwen3-VL-8B-Instruct | 953.8 tok/s | Qwen3 VL 8B Instruct (`qwen-qwen3-vl-8b-instruct`) | Imported | 2026-05-28 |
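
For readers unfamiliar with the metrics listed above, the following is a minimal sketch of how TTFT, TPOT, and tokens per second are commonly derived from per-token arrival timestamps for a single request. It is an illustration under stated assumptions, not the benchmark's actual measurement code: the function `latency_stats`, the `LatencyStats` type, and the sample timestamps are all hypothetical, and the benchmark may aggregate across requests and workers differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LatencyStats:
    ttft_s: float      # time to first token, in seconds
    tpot_s: float      # mean time per output token after the first, in seconds
    toks_per_s: float  # generated tokens divided by total request time


def latency_stats(request_start: float, token_times: Sequence[float]) -> LatencyStats:
    """Compute common latency metrics for one request (hypothetical helper).

    request_start: timestamp when the request was sent.
    token_times:   arrival timestamp of each generated token, in order.
    """
    if not token_times:
        raise ValueError("need at least one generated token")
    n = len(token_times)
    # TTFT: delay between sending the request and receiving the first token.
    ttft = token_times[0] - request_start
    # TPOT: average gap between consecutive tokens, excluding the first token.
    tpot = (token_times[-1] - token_times[0]) / (n - 1) if n > 1 else 0.0
    # Throughput over the whole request, including the wait for the first token.
    total = token_times[-1] - request_start
    return LatencyStats(ttft_s=ttft, tpot_s=tpot, toks_per_s=n / total)


# Made-up timestamps for illustration only (not benchmark data).
stats = latency_stats(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(stats)
```

Under these definitions, lower TTFT and TPOT are better and higher tokens per second is better, which matches the direction notes in the metrics list.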