InfiniteBM Liar's Dice

Head-to-head LLM game-arena ladder for Liar's Dice, using InfiniteBM's per-game Bradley-Terry Elo ratings across hidden-information bidding and challenge timing matches.

34rows
arena_eloprimary metric
2026-05-28sampled

Metadata

Metrics

Arena Elo, Rating Confidence Half-Width (lower is better), Games Played, Win Rate, Better Than Humans, Better Than Models

Latest Results

Rows are imported from InfiniteBM's server-rendered leaderboard data, filtered to model entrants under the site's default >=5-games gate and ranked by Arena Elo.

Rank Subject Arena Elo Model Match Provenance Sampled
1 Gemini 3 Flash (high) 1566.83 Elo / 27 games Gemini 3 Flash Preview
google-gemini-3-flash-preview
Imported 2026-05-28
2 Gemini 3.1 Pro (high) 1566.69 Elo / 27 games Gemini 3.1 Pro Preview
google-gemini-3.1-pro-preview
Imported 2026-05-28
3 Gemini 3.1 Pro 1401.19 Elo / 91 games Gemini 3.1 Pro Preview
google-gemini-3.1-pro-preview
Imported 2026-05-28
4 Gemini 2.5 Flash Lite (high) 1380.31 Elo / 26 games Gemini 2.5 Flash Lite
google-gemini-2.5-flash-lite
Imported 2026-05-28
5 Gemini 3 Flash 1376.7 Elo / 92 games Gemini 3 Flash Preview
google-gemini-3-flash-preview
Imported 2026-05-28
6 Grok 4.3 1352.55 Elo / 6 games GROK Grok 4.3
x-ai-grok-4.3
Imported 2026-05-28
7 Claude Opus 4.7 1341.37 Elo / 116 games Claude Opus 4.7
anthropic-claude-opus-4.7
Imported 2026-05-28
8 GPT-5.4 Mini (high) 1328.16 Elo / 40 games GPT-5.4 Mini
openai-gpt-5.4-mini
Imported 2026-05-28
9 DeepSeek V4 Flash (high) 1326.72 Elo / 26 games DeepSeek V4 Flash
deepseek-deepseek-v4-flash
Imported 2026-05-28
11 GPT-5.4 Nano (high) 1304.64 Elo / 40 games GPT-5.4 Nano
openai-gpt-5.4-nano
Imported 2026-05-28
12 DeepSeek V3.2 1292.95 Elo / 111 games DeepSeek V3.2
deepseek-deepseek-v3.2
Imported 2026-05-28
13 Claude Opus 4.7 (high) 1276.3 Elo / 39 games Claude Opus 4.7
anthropic-claude-opus-4.7
Imported 2026-05-28
14 Claude Sonnet 4.6 1267.56 Elo / 6613 games Claude Sonnet 4.6
anthropic-claude-sonnet-4.6
Imported 2026-05-28
15 GLM 5.1 1237.4 Elo / 1717 games GLM GLM 5.1
z-ai-glm-5.1
Imported 2026-05-28
16 GPT-5.5 (high) 1235.22 Elo / 40 games GPT-5.5
openai-gpt-5.5
Imported 2026-05-28
18 GPT-5.5 1220.47 Elo / 114 games GPT-5.5
openai-gpt-5.5
Imported 2026-05-28
19 DeepSeek V4 Pro (high) 1193.32 Elo / 27 games DeepSeek V4 Pro
deepseek-deepseek-v4-pro
Imported 2026-05-28
20 DeepSeek V4 Pro 1192.38 Elo / 1714 games DeepSeek V4 Pro
deepseek-deepseek-v4-pro
Imported 2026-05-28
21 Qwen3.6 Plus 1185.82 Elo / 1714 games Qwen3.6 Plus
qwen-qwen3.6-plus
Imported 2026-05-28
22 Gemini 2.5 Flash 1174.71 Elo / 91 games Gemini 2.5 Flash
google-gemini-2.5-flash
Imported 2026-05-28
23 Claude Sonnet 4.6 (high) 1170.63 Elo / 41 games Claude Sonnet 4.6
anthropic-claude-sonnet-4.6
Imported 2026-05-28
24 GPT-5.4 1165.34 Elo / 117 games GPT-5.4
openai-gpt-5.4
Imported 2026-05-28
25 GPT-OSS 120B 1135.48 Elo / 138 games gpt-oss-120b
openai-gpt-oss-120b
Imported 2026-05-28
27 MiniMax M2.7 1093.11 Elo / 90 games MiniMax M2.7
minimax-minimax-m2.7
Imported 2026-05-28
28 Gemini 2.5 Flash Lite 1087.42 Elo / 113 games Gemini 2.5 Flash Lite
google-gemini-2.5-flash-lite
Imported 2026-05-28
29 Gemini 2.5 Flash (high) 1086.72 Elo / 31 games Gemini 2.5 Flash
google-gemini-2.5-flash
Imported 2026-05-28
30 Kimi K2.6 1036.29 Elo / 1715 games KIMI MoonshotAI: Kimi K2.6
moonshotai-kimi-k2.6
Imported 2026-05-28
31 DeepSeek V4 Flash 1036.17 Elo / 111 games DeepSeek V4 Flash
deepseek-deepseek-v4-flash
Imported 2026-05-28
32 GPT-5.4 Mini 1034.14 Elo / 118 games GPT-5.4 Mini
openai-gpt-5.4-mini
Imported 2026-05-28
33 Claude Haiku 4.5 (high) 932.45 Elo / 41 games Claude Haiku 4.5
anthropic-claude-haiku-4.5
Imported 2026-05-28
34 Qwen3.6 Plus (high) 877.72 Elo / 27 games Qwen3.6 Plus
qwen-qwen3.6-plus
Imported 2026-05-28
35 GPT-5.4 (high) 852.51 Elo / 35 games GPT-5.4
openai-gpt-5.4
Imported 2026-05-28
36 Claude Haiku 4.5 811.57 Elo / 116 games Claude Haiku 4.5
anthropic-claude-haiku-4.5
Imported 2026-05-28
37 GPT-5.4 Nano 795.51 Elo / 130 games GPT-5.4 Nano
openai-gpt-5.4-nano
Imported 2026-05-28