FinBen

FinBen: Evaluates financial analysis, accounting, market reasoning, and quantitative business tasks.

7 rows
cumulative_return primary metric
2026-05-27 sampled

Metadata

Metrics

Cumulative Return, Cumulative Return 95% CI (lower is better), Sharpe Ratio, Sharpe Ratio 95% CI (lower is better), Daily Volatility (lower is better), Annualized Volatility (lower is better), Max Drawdown (lower is better)
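
To make the metric names above concrete, here is a minimal sketch of how such figures are commonly computed from a series of daily returns. It is not FinBen's own evaluation code. The function name `trading_metrics`, the 252-day annualization factor, the zero risk-free rate, and the use of compounded simple returns are all assumptions made for illustration, and the paper's exact definitions may differ (for example, it may sum log returns instead). The confidence-interval metrics are omitted.

```python
import math
from statistics import mean, stdev

# Assumed annualization factor (trading days per year); not taken from FinBen.
TRADING_DAYS = 252


def trading_metrics(daily_returns, risk_free_daily=0.0):
    """Compute common trading metrics from simple daily returns.

    daily_returns: list of floats, e.g. 0.01 for +1%. Needs at least two
    values because the sample standard deviation is undefined otherwise.
    risk_free_daily: assumed daily risk-free rate (hypothetical default of 0).
    """
    equity = 1.0   # portfolio value, starting at 1
    peak = 1.0     # highest portfolio value seen so far
    max_drawdown = 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        # Drawdown is the fractional drop from the running peak.
        max_drawdown = max(max_drawdown, (peak - equity) / peak)

    daily_vol = stdev(daily_returns)  # sample standard deviation
    annualized_vol = daily_vol * math.sqrt(TRADING_DAYS)
    excess = mean(daily_returns) - risk_free_daily
    sharpe = (
        excess / daily_vol * math.sqrt(TRADING_DAYS)
        if daily_vol > 0
        else float("nan")
    )

    return {
        "cumulative_return": equity - 1.0,
        "sharpe_ratio": sharpe,
        "daily_volatility": daily_vol,
        "annualized_volatility": annualized_vol,
        "max_drawdown": max_drawdown,
    }


if __name__ == "__main__":
    # Made-up returns for illustration only; not FinBen data.
    example = [0.10, -0.50, 0.20]
    for name, value in trading_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, a higher cumulative return or Sharpe ratio is better, while lower volatility and a smaller maximum drawdown are better, which matches the direction notes in the metric list.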

Latest Results

Rows are transcribed from Table 4 of the public FinBen paper (NeurIPS 2024), which reports average FinTrade stock-trading performance across 10 stocks. The primary score is cumulative return.

| Rank | Subject | Cumulative Return | Model Match | Provenance | Sampled |
|------|---------|-------------------|-------------|------------|---------|
| 1 | GPT-4 | 28.19% | GPT-4 (openai-gpt-4) | Imported | 2026-05-27 |
| 2 | gemini | 14.95% | | Imported | 2026-05-27 |
| 3 | GPT3.5-Turbo | 4.48% | GPT-3.5 Turbo (openai-gpt-3.5-turbo) | Imported | 2026-05-27 |
| 4 | llama2-70B | 4.02% | | Imported | 2026-05-27 |
| 5 | llama3-70B | -2.57% | | Imported | 2026-05-27 |
| 6 | Buy & Hold | -4.0% | | Imported | 2026-05-27 |
| 7 | GPT-4o | -5.54% | GPT-4o (openai-gpt-4o) | Imported | 2026-05-27 |