SWE-Arena

Arena-style leaderboard for software-engineering agents, reporting Elo, win rate, conversation efficiency, consistency, and graph-derived ranking metrics.

Rows: 6
Primary metric: elo_score
Sampled: 2026-05-27

Metadata

Metrics

Elo Score, Win Rate, Conversation Efficiency Index, Conversation Consistency Index, Bradley-Terry Coefficient, Eigenvector Centrality, Newman Modularity Score, PageRank Score

Latest Results

Rows parsed from SWE-Arena's public Hugging Face agent arena JSON. The benchmark tracks software-engineering agent arena scores and graph metrics.

Rank | Subject | Elo Score | Model Match | Provenance | Sampled
1 | Qwen Code | 1030.53 | | Imported | 2026-05-27
2 | Gemini CLI | 1018.07 | | Imported | 2026-05-27
3 | OpenAI Codex | 1009.15 | | Imported | 2026-05-27
4 | Claude Code | 1003.31 | | Imported | 2026-05-27
5 | Grok CLI | 969.47 | | Imported | 2026-05-27
6 | Kimi Code | 969.47 | | Imported | 2026-05-27
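
To give a feel for what the Elo gaps in the table mean, the sketch below converts a pair of ratings into an expected head-to-head score. It assumes the conventional logistic Elo formula with a scale of 400; the page does not say which Elo variant or scale SWE-Arena uses, so the function name `expected_score`, the `scale` parameter, and the resulting probabilities are illustrative assumptions rather than the leaderboard's own method.

```python
# Elo scores copied from the results table above.
ELO = {
    "Qwen Code": 1030.53,
    "Gemini CLI": 1018.07,
    "OpenAI Codex": 1009.15,
    "Claude Code": 1003.31,
    "Grok CLI": 969.47,
    "Kimi Code": 969.47,
}


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard logistic Elo model.

    ASSUMPTION: scale=400 is the conventional chess value, not something
    confirmed by the SWE-Arena page.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    leader = "Qwen Code"
    for name, rating in ELO.items():
        if name == leader:
            continue
        p = expected_score(ELO[leader], rating)
        print(f"{leader} vs {name}: expected score {p:.3f}")
```

Under this assumption, the roughly 61-point spread between the first and last rows corresponds to only a modest edge for the leader, which suggests the six agents are closely matched.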