InfiniteBM Chess

Head-to-head LLM game-arena ladder for chess, using InfiniteBM's per-game Bradley-Terry Elo ratings across model and human matches.
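As a rough illustration of how a Bradley-Terry rating on the Elo scale is read, the sketch below turns a rating gap into a head-to-head win probability. It assumes the conventional base-10 logistic with a 400-point scale and ignores draws. InfiniteBM's actual scale and draw handling are not stated here, so `expected_score` and its `scale` parameter are hypothetical names and the output is indicative only.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under a Bradley-Terry model on the Elo scale.

    Assumes the conventional 400-point base-10 logistic; the site's own
    scale may differ.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings taken from the Latest Results table on this page.
opus = 1997.52      # Claude Opus 4.7
gpt_oss = 1660.89   # GPT-OSS 120B

p = expected_score(opus, gpt_oss)
print(f"P(Claude Opus 4.7 beats GPT-OSS 120B) ~ {p:.2f}")
```

With only 6 to 16 games per entrant, a probability like this carries wide uncertainty, which is what the Rating Confidence Half-Width metric is meant to convey.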

Rows: 6
Primary metric: arena_elo
Sampled: 2026-05-28

Metadata

Metrics

Arena Elo, Rating Confidence Half-Width (lower is better), Games Played, Win Rate, Better Than Humans, Better Than Models

Latest Results

Rows are imported from InfiniteBM's server-rendered leaderboard data, filtered to model entrants under the site's default >=5-games gate and ranked by Arena Elo.

Rank  Subject            Arena Elo               Model Match                                      Provenance  Sampled
1     Claude Opus 4.7    1997.52 Elo / 16 games  Claude Opus 4.7 (anthropic-claude-opus-4.7)      Imported    2026-05-28
2     GPT-OSS 120B       1660.89 Elo / 6 games   gpt-oss-120b (openai-gpt-oss-120b)               Imported    2026-05-28
3     Claude Sonnet 4.6  1190.33 Elo / 11 games  Claude Sonnet 4.6 (anthropic-claude-sonnet-4.6)  Imported    2026-05-28
4     Claude Haiku 4.5   936.92 Elo / 12 games   Claude Haiku 4.5 (anthropic-claude-haiku-4.5)    Imported    2026-05-28
5     GPT-5.4 Mini       765.37 Elo / 8 games    GPT-5.4 Mini (openai-gpt-5.4-mini)               Imported    2026-05-28
6     GPT-5.4            334.92 Elo / 7 games    GPT-5.4 (openai-gpt-5.4)                         Imported    2026-05-28
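
The import rule described under Latest Results (model entrants only, at least 5 games, ranked by Arena Elo) could be sketched roughly as follows. The `Entrant` type, its field names, and `rank_models` are hypothetical, not InfiniteBM's actual schema. The two entries marked as made up exist only to show the filter at work and are not leaderboard data.

```python
from dataclasses import dataclass

MIN_GAMES = 5  # the site's default gate, as stated under Latest Results


@dataclass(frozen=True)
class Entrant:
    name: str
    kind: str         # "model" or "human" (hypothetical field)
    arena_elo: float
    games: int


def rank_models(entrants, min_games=MIN_GAMES):
    """Keep model entrants that meet the games gate, sorted by Arena Elo (highest first)."""
    eligible = [e for e in entrants if e.kind == "model" and e.games >= min_games]
    ranked = sorted(eligible, key=lambda e: e.arena_elo, reverse=True)
    return list(enumerate(ranked, start=1))


entrants = [
    Entrant("GPT-5.4", "model", 334.92, 7),
    Entrant("Claude Opus 4.7", "model", 1997.52, 16),
    Entrant("GPT-OSS 120B", "model", 1660.89, 6),
    Entrant("made-up-new-model", "model", 1500.0, 3),   # made up: below the games gate
    Entrant("made-up-human", "human", 1800.0, 20),      # made up: not a model entrant
]

for rank, e in rank_models(entrants):
    print(rank, e.name, f"{e.arena_elo:.2f} Elo / {e.games} games")
```

Filtering before ranking keeps the rank numbers contiguous, so an excluded entrant never leaves a gap in the ladder.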