InferenceLatency.com

Live provider-level inference latency, throughput, cost, and reliability tracker for hosted language-model APIs.

3 rows
throughput_tokens_per_sec (primary metric)
2026-05-27 (sampled)

Metadata

Metrics

Throughput, Latency (lower is better), Availability, Error Rate (lower is better), Cost per 1K tokens (lower is better)

Latest Results

Rows are imported from public InferenceLatency.com JSON endpoints. The primary score is throughput when present; latency, availability, error rate, and cost are preserved as metrics.

Rank | Subject | Throughput | Model Match | Provenance | Sampled
1 | Claude (Claude Sonnet 4) | 22.1 tokens/s | | Imported | 2026-05-27
2 | DeepSeek (deepseek-chat) | 11.88 tokens/s | | Imported | 2026-05-27
3 | OpenAI (GPT-4o) | 10.26 tokens/s | | Imported | 2026-05-27
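
The ranking rule described under Latest Results (throughput as the primary score when present) can be sketched roughly as follows. This is a minimal illustration, not the site's actual import code: the `Row` class and `rank_rows` function are hypothetical names, only `throughput_tokens_per_sec` comes from the page itself, and placing rows without a throughput value last with no rank is an assumption, since the page does not say how such rows are handled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Row:
    # Hypothetical row shape; only throughput_tokens_per_sec is named on the page.
    subject: str
    model: str
    throughput_tokens_per_sec: Optional[float] = None
    # Latency, availability, error rate, and cost are carried along untouched.
    metrics: dict = field(default_factory=dict)


def rank_rows(rows):
    """Return (rank, row) pairs ordered by throughput, highest first.

    Rows with no throughput value go last with rank None. That fallback is
    an assumption made for this sketch, not documented behaviour.
    """
    scored = [r for r in rows if r.throughput_tokens_per_sec is not None]
    unscored = [r for r in rows if r.throughput_tokens_per_sec is None]
    scored.sort(key=lambda r: r.throughput_tokens_per_sec, reverse=True)
    ranked = [(i, r) for i, r in enumerate(scored, start=1)]
    ranked += [(None, r) for r in unscored]
    return ranked


# Values copied from the 2026-05-27 sample listed above, in shuffled order.
rows = [
    Row("OpenAI", "GPT-4o", 10.26),
    Row("Claude", "Claude Sonnet 4", 22.1),
    Row("DeepSeek", "deepseek-chat", 11.88),
]
ranking = rank_rows(rows)
for rank, row in ranking:
    print(rank, row.subject, row.model, row.throughput_tokens_per_sec, "tokens/s")
```

Keeping the secondary metrics in a separate `metrics` mapping mirrors the page's statement that they are preserved but do not drive the rank.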