Fin-RATE
Financial analytics benchmark over SEC filings covering detail/reasoning QA, enterprise comparison QA, and longitudinal tracking QA.
17 rows
Primary metric: macro_accuracy
Sampled: 2026-05-28
Metadata
Metrics
Macro Accuracy, DR-QA Accuracy, EC-QA Accuracy, LT-QA Accuracy
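
The page does not define how Macro Accuracy is derived from the three task-level metrics. A minimal sketch, assuming the usual meaning of macro averaging (an unweighted mean of the DR-QA, EC-QA, and LT-QA accuracies), could look like the following. The function name `macro_accuracy` and the example values are hypothetical and are not taken from the leaderboard.

```python
from statistics import mean


def macro_accuracy(per_task_accuracy: dict[str, float]) -> float:
    """Return the unweighted mean of per-task accuracies.

    Assumption: "macro" means each task counts equally, regardless of
    how many questions it contains. The benchmark may define it differently.
    """
    if not per_task_accuracy:
        raise ValueError("need at least one task accuracy")
    return mean(per_task_accuracy.values())


# Hypothetical per-task accuracies for illustration only.
example = {"DR-QA": 0.50, "EC-QA": 0.25, "LT-QA": 0.75}
print(f"{macro_accuracy(example):.2%}")  # prints 50.00%
```

Under this assumption a model cannot reach a high macro score by doing well on only the largest task, since each task carries the same weight.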
| Rank | Subject | Macro Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-5-websearch | 43.37% | GPT-5 (`openai-gpt-5`) | Imported | 2026-05-28 |
| 2 | GPT-4.1 | 33.24% | GPT-4.1 (`openai-gpt-4.1`) | Imported | 2026-05-28 |
| 3 | GPT-4.1-websearch | 31.80% | GPT-4.1 (`openai-gpt-4.1`) | Imported | 2026-05-28 |
| 4 | Qwen3-235B | 24.39% | Qwen3 235B A22B (`qwen-qwen3-235b-a22b`) | Imported | 2026-05-28 |
| 5 | Fin-R1 | 23.51% | — | Imported | 2026-05-28 |
| 6 | GPT-OSS-20B | 18.69% | gpt-oss-20b (`openai-gpt-oss-20b`) | Imported | 2026-05-28 |
| 7 | MIMO-V2-Flash | 18.05% | MiMo-V2-Flash (`xiaomi-mimo-v2-flash`) | Imported | 2026-05-28 |
| 8 | Qwen3-30B-A3B-Instruct-2507 | 17.63% | Qwen3 30B A3B Instruct 2507 (`qwen-qwen3-30b-a3b-instruct-2507`) | Imported | 2026-05-28 |
| 9 | Llama-3.3-70B-Instruct | 16.76% | Llama 3.3 70B Instruct (`meta-llama-llama-3.3-70b-instruct`) | Imported | 2026-05-28 |
| 10 | DeepSeek-V3.2 | 16.32% | DeepSeek V3.2 (`deepseek-deepseek-v3.2`) | Imported | 2026-05-28 |
| 11 | DeepSeek-R1 | 15.53% | R1 (`deepseek-r1`) | Imported | 2026-05-28 |
| 12 | Fino1-14B | 13.13% | — | Imported | 2026-05-28 |
| 13 | Qwen3-14B | 11.25% | Qwen3 14B (`qwen-qwen3-14b`) | Imported | 2026-05-28 |
| 14 | DeepSeek-V3 | 9.81% | DeepSeek V3 (`deepseek-deepseek-chat`) | Imported | 2026-05-28 |
| 15 | Qwen3-8B | 5.48% | Qwen3 8B (`qwen-qwen3-8b`) | Imported | 2026-05-28 |
| 16 | FinanceConnect-13B | 2.65% | — | Imported | 2026-05-28 |
| 17 | TouchstoneGPT-7B-Instruct | 0.41% | — | Imported | 2026-05-28 |