Rogo Big Finance Bench

Vendor-reported 928-question finance-agent benchmark spanning vertical-specific skills, metrics, financial-statement analysis, and forecasting workflows.

10 rows
rubric_score (primary metric)
2026-05-28 (sampled)

Metadata

Metrics

Rubric Score, Final-Answer Accuracy

Latest Results

Rows are imported from the public Figure 1 leaderboard in Rogo's Big Finance Bench announcement. Scores are vendor-reported and should be treated as announcement results until the paper, subset, and harness are released.

| Rank | Subject | Rubric Score | Model Match | Provenance | Sampled |
| --- | --- | --- | --- | --- | --- |
| 1 | Claude Opus 4.7 | 59% rubric / 41% final | Claude Opus 4.7 (anthropic-claude-opus-4.7) | Imported | 2026-05-28 |
| 2 | GPT-5.5 | 59% rubric / 44% final | GPT-5.5 (openai-gpt-5.5) | Imported | 2026-05-28 |
| 3 | Claude Sonnet 4.6 | 59% rubric / 38% final | Claude Sonnet 4.6 (anthropic-claude-sonnet-4.6) | Imported | 2026-05-28 |
| 4 | GLM 5.1 | 55% rubric / 36% final | GLM 5.1 (z-ai-glm-5.1) | Imported | 2026-05-28 |
| 5 | Qwen 3.6 27B | 47% rubric / 30% final | Qwen3.6 27B (qwen-qwen3.6-27b) | Imported | 2026-05-28 |
| 6 | Kimi K2-6 | 45% rubric / 27% final | MoonshotAI: Kimi K2.6 (moonshotai-kimi-k2.6) | Imported | 2026-05-28 |
| 7 | Gemini 3 Flash | 43% rubric / 26% final | Gemini 3 Flash Preview (google-gemini-3-flash-preview) | Imported | 2026-05-28 |
| 8 | Gemini 3.1 Pro | 41% rubric / 35% final | Gemini 3.1 Pro Preview (google-gemini-3.1-pro-preview) | Imported | 2026-05-28 |
| 9 | Gemma 4.5 1B | 35% rubric / 21% final | | Imported | 2026-05-28 |
| 10 | GPT-5.4 Mini | 22% rubric / 7% final | GPT-5.4 Mini (openai-gpt-5.4-mini) | Imported | 2026-05-28 |