INVESTORBENCH
Financial decision-making benchmark for investment agents, evaluating portfolio or trading decisions rather than only financial QA.
14 rows
Primary metric: cumulative_return
Sampled: 2026-05-27
Metadata
Metrics
Average stock cumulative return, Average stock Sharpe ratio, Average stock annualized volatility (lower is better), Average stock maximum drawdown (lower is better)
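The four metrics listed above can be sketched for a single stock's series of daily returns as follows. This is a minimal illustration using common textbook definitions, not necessarily the benchmark's own formulas: the function names, the 252-day annualization factor, the zero default risk-free rate, and the use of compounded simple returns are all assumptions.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor, not taken from the benchmark


def cumulative_return(returns):
    """Compound simple per-period returns into a total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def annualized_volatility(returns):
    """Sample standard deviation of returns, scaled to a yearly horizon."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return math.sqrt(variance) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(returns, risk_free_daily=0.0):
    """Annualized mean excess return divided by annualized volatility."""
    excess = [r - risk_free_daily for r in returns]
    vol = annualized_volatility(excess)
    if vol == 0.0:
        return float("nan")  # undefined when returns never vary
    return (sum(excess) / len(excess)) * TRADING_DAYS / vol


def max_drawdown(returns):
    """Largest peak-to-trough loss of the equity curve, as a positive fraction."""
    value = peak = 1.0
    worst = 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst
```

Under these assumptions, the "average stock" figures would be the plain mean of each per-stock value across the evaluated stocks. Volatility and maximum drawdown are reported as positive magnitudes here, which matches the "lower is better" reading in the metric list.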
| Rank | Subject | Average stock cumulative return | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Qwen2.5-72B-Instruct | 46.153% | Qwen2.5 72B Instruct qwen-qwen-2.5-72b-instruct | Imported | 2026-05-27 |
| 2 | GPT-4 | 43.696% | GPT-4 openai-gpt-4 | Imported | 2026-05-27 |
| 3 | GPT-4o | 39.031% | GPT-4o openai-gpt-4o | Imported | 2026-05-27 |
| 4 | Llama-3.1-70B-Instruct | 38.946% | Llama 3.1 70B Instruct meta-llama-llama-3.1-70b-instruct | Imported | 2026-05-27 |
| 5 | Yi-1.5-34B-Chat | 37.966% | — | Imported | 2026-05-27 |
| 6 | Buy & Hold | 34.099% | — | Imported | 2026-05-27 |
| 7 | Qwen-2.5-Instruct-7B | 29.515% | — | Imported | 2026-05-27 |
| 8 | DeepSeek-V2-Lite (15.7B) | 28.745% | — | Imported | 2026-05-27 |
| 9 | DeepSeek-67B-Chat | 26.941% | — | Imported | 2026-05-27 |
| 10 | Llama-3.1-8B-Instruct | 25.463% | Llama 3.1 8B Instruct meta-llama-llama-3.1-8b-instruct | Imported | 2026-05-27 |
| 11 | GPT-o1-preview | 25.057% | — | Imported | 2026-05-27 |
| 12 | Yi-1.5-9B-Chat | 22.913% | — | Imported | 2026-05-27 |
| 13 | Qwen2.5-32B-Instruct | 20.884% | — | Imported | 2026-05-27 |
| 14 | Palmyra-Fin-70B | -0.453% | — | Imported | 2026-05-27 |