IOI
Based on the International Olympiad in Informatics
54 rows
Primary metric: score
Sampled: 2026-05-26
Metadata
Metrics
Score, Std. error (lower is better), Latency (lower is better), Cost per test (lower is better)
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT 5.4 2026-03-05 | 67.834% | GPT-5.4 openai-gpt-5.4 | Imported | 2026-05-26 |
| 2 | GPT 5.2 2025-12-11 | 54.833% | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-26 |
| 3 | Claude Opus 4.7 | 47.084% | Claude Opus 4.7 anthropic-claude-opus-4.7 | Imported | 2026-05-26 |
| 4 | Qwen 3.7 Max | 46.75% | Qwen3.7 Max qwen-qwen3.7-max | Imported | 2026-05-26 |
| 5 | GPT 5.3 Codex | 43.834% | GPT-5.3-Codex openai-gpt-5.3-codex | Imported | 2026-05-26 |
| 6 | Gemini 3 Flash Preview | 39.084% | Gemini 3 Flash Preview google-gemini-3-flash-preview | Imported | 2026-05-26 |
| 7 | Gemini 3 Pro Preview | 38.834% | Gemini 3 google-gemini-3 | Imported | 2026-05-26 |
| 8 | DeepSeek V4 Pro | 35.833% | DeepSeek V4 Pro deepseek-deepseek-v4-pro | Imported | 2026-05-26 |
| 9 | Grok 4.20 0309 Reasoning | 30.166% | Grok 4.20 x-ai-grok-4.20 | Imported | 2026-05-26 |
| 10 | Grok 4 0709 | 26.167% | Grok 4 x-ai-grok-4 | Imported | 2026-05-26 |
| 11 | Claude Opus 4.5 20251101 | 23.584% | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-26 |
| 12 | GLM 5 Thinking | 22% | GLM 5 z-ai-glm-5 | Imported | 2026-05-26 |
| 13 | GPT 5.1 2025-11-13 | 21.5% | GPT-5.1 openai-gpt-5.1 | Imported | 2026-05-26 |
| 14 | GPT 5.1 Codex Max | 21.416% | GPT-5.1-Codex-Max openai-gpt-5.1-codex-max | Imported | 2026-05-26 |
| 15 | Claude Opus 4.5 20251101 Thinking | 20.25% | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-26 |
| 16 | GPT 5 2025-08-07 | 20% | GPT-5 openai-gpt-5 | Imported | 2026-05-26 |
| 17 | Claude Sonnet 4.5 20250929 Thinking | 18.334% | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-26 |
| 18 | Kimi K2.5 Thinking | 17.667% | MoonshotAI: Kimi K2.5 moonshotai-kimi-k2.5 | Imported | 2026-05-26 |
| 19 | Gemini 2.5 Pro | 17.084% | Gemini 2.5 Pro google-gemini-2.5-pro | Imported | 2026-05-26 |
| 20 | Qwen 3 Max | 15.666% | Qwen3 Max qwen-qwen3-max | Imported | 2026-05-26 |
| 21 | Grok 4.3 | 15.334% | Grok 4.3 x-ai-grok-4.3 | Imported | 2026-05-26 |
| 22 | GPT 5.4 Nano 2026-03-17 | 15.25% | GPT-5.4 Nano openai-gpt-5.4-nano | Imported | 2026-05-26 |
| 23 | DeepSeek V3P2 | 14.416% | — | Imported | 2026-05-26 |
| 24 | Qwen 3 Max 2026-01-23 | 13.75% | — | Imported | 2026-05-26 |
| 25 | Claude Opus 4.1 20250805 | 12.516% | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-26 |
| 26 | Grok 4 Fast Reasoning | 11.5% | Grok 4 Fast x-ai-grok-4-fast | Imported | 2026-05-26 |
| 27 | DeepSeek V3P2 Thinking | 10.667% | — | Imported | 2026-05-26 |
| 28 | GPT 5 Codex | 9.75% | GPT-5 Codex openai-gpt-5-codex | Imported | 2026-05-26 |
| 29 | Qwen 3 Max Preview | 7.75% | — | Imported | 2026-05-26 |
| 30 | Grok 4.1 Fast Non Reasoning | 7.667% | Grok 4.1 Fast x-ai-grok-4.1-fast | Imported | 2026-05-26 |
| 31 | GLM 4.7 | 7.584% | GLM 4.7 z-ai-glm-4.7 | Imported | 2026-05-26 |
| 32 | GPT 5 Mini 2025-08-07 | 6.75% | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-26 |
| 33 | MiniMax M2.5 Lightning | 6.666% | — | Imported | 2026-05-26 |
| 34 | Claude Sonnet 4 20250514 | 6.5% | Claude Sonnet 4 anthropic-claude-sonnet-4 | Imported | 2026-05-26 |
| 35 | GPT 5.4 Mini 2026-03-17 | 6.417% | GPT-5.4 Mini openai-gpt-5.4-mini | Imported | 2026-05-26 |
| 36 | Claude Haiku 4.5 20251001 Thinking | 6.166% | Claude Haiku 4.5 anthropic-claude-haiku-4.5 | Imported | 2026-05-26 |
| 37 | MiniMax M2.7 | 4.916% | MiniMax M2.7 minimax-minimax-m2.7 | Imported | 2026-05-26 |
| 38 | O4 Mini 2025-04-16 | 4.834% | o4 Mini openai-o4-mini | Imported | 2026-05-26 |
| 39 | Claude Sonnet 4 20250514 Thinking | 4.584% | — | Imported | 2026-05-26 |
| 40 | GLM 4.6 | 4.334% | GLM 4.6 z-ai-glm-4.6 | Imported | 2026-05-26 |
| 41 | Grok Code Fast 1 | 4.333% | Grok Code Fast 1 x-ai-grok-code-fast-1 | Imported | 2026-05-26 |
| 42 | Mistral Large 2512 | 4% | Mistral: Mistral Large 3 2512 mistralai-mistral-large-2512 | Imported | 2026-05-26 |
| 43 | Grok 4 Fast Non Reasoning | 3.834% | Grok 4 Fast x-ai-grok-4-fast | Imported | 2026-05-26 |
| 44 | GPT 5.1 Codex | 3.666% | GPT-5.1-Codex openai-gpt-5.1-codex | Imported | 2026-05-26 |
| 45 | Grok 4.1 Fast Reasoning | 3.084% | Grok 4.1 Fast x-ai-grok-4.1-fast | Imported | 2026-05-26 |
| 46 | GLM 4.5 | 2.917% | GLM 4.5 z-ai-glm-4.5 | Imported | 2026-05-26 |
| 47 | Gemini 2.5 Flash | 2.611% | Gemini 2.5 Flash google-gemini-2.5-flash | Imported | 2026-05-26 |
| 48 | Labs Devstral Small 2512 | 2.5% | — | Imported | 2026-05-26 |
| 49 | MiniMax M2.1 | 2.334% | MiniMax M2.1 minimax-minimax-m2.1 | Imported | 2026-05-26 |
| 50 | DeepSeek V3 0324 | 1.667% | DeepSeek V3 0324 deepseek-deepseek-chat-v3-0324 | Imported | 2026-05-26 |
| 51 | Kimi K2 Instruct | 1.25% | MoonshotAI: Kimi K2 0711 moonshotai-kimi-k2 | Imported | 2026-05-26 |
| 52 | Devstral 2512 | 0.666% | Mistral: Devstral 2 2512 mistralai-devstral-2512 | Imported | 2026-05-26 |
| 53 | Magistral Medium 2509 | 0.666% | — | Imported | 2026-05-26 |
| 54 | Qwen 3 235B A22b | 0% | Qwen3 235B A22B qwen-qwen3-235b-a22b | Imported | 2026-05-26 |