InferenceLatency.com
Live provider-level inference latency, throughput, cost, and reliability tracker for hosted language-model APIs.
Rows: 3
Primary metric: throughput_tokens_per_sec
Sampled: 2026-05-27
Metadata
Metrics
Throughput, Latency (lower is better), Availability, Error Rate (lower is better), Cost per 1K tokens (lower is better)
| Rank | Subject | Throughput | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Claude Claude Sonnet 4 | 22.1 tokens/s | — | Imported | 2026-05-27 |
| 2 | DeepSeek deepseek-chat | 11.88 tokens/s | — | Imported | 2026-05-27 |
| 3 | OpenAI GPT-4o | 10.26 tokens/s | — | Imported | 2026-05-27 |
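
The ranking in the table appears to follow the primary metric, `throughput_tokens_per_sec`, sorted from highest to lowest. As a minimal sketch of that ordering, assuming the three sampled values above are the only inputs, the ranks can be reproduced as follows. The `Row` class and the `rank_by_throughput` helper are hypothetical names used for illustration and are not part of any InferenceLatency.com API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One leaderboard entry (hypothetical structure, for illustration only)."""
    subject: str
    throughput_tokens_per_sec: float


# Values copied from the table above, deliberately listed out of order.
rows = [
    Row("OpenAI GPT-4o", 10.26),
    Row("Claude Claude Sonnet 4", 22.1),
    Row("DeepSeek deepseek-chat", 11.88),
]


def rank_by_throughput(entries):
    """Return (rank, row) pairs with the highest throughput first."""
    ordered = sorted(
        entries,
        key=lambda r: r.throughput_tokens_per_sec,
        reverse=True,  # throughput is a higher-is-better metric
    )
    return list(enumerate(ordered, start=1))


ranked = rank_by_throughput(rows)
for rank, row in ranked:
    print(f"{rank}. {row.subject}: {row.throughput_tokens_per_sec} tokens/s")
```

For the metrics marked "lower is better" (latency, error rate, cost per 1K tokens), the same sketch would presumably apply with `reverse=False`.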