FinQA
FinQA: Evaluates financial analysis, accounting, market reasoning, and quantitative business tasks.
13 rows
Primary metric: execution_accuracy
Sampled: 2026-05-27
Metadata
Metrics
Execution accuracy, Program accuracy
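
As a rough illustration of how the two metrics listed above could be computed, here is a minimal sketch. It assumes execution accuracy is the percentage of examples whose executed answer matches the gold answer, and program accuracy is the percentage whose predicted program matches the gold program. The function names, the numeric tolerance, and the token-level exact match are assumptions made for illustration. The benchmark's official evaluation script may round answers differently and may treat mathematically equivalent programs as matching.

```python
import math


def execution_accuracy(predicted, gold, rel_tol=1e-4):
    """Percentage of examples whose executed answer matches the gold answer.

    `predicted` holds floats, or None where the predicted program failed to
    execute. The tolerance is an illustrative assumption, not the official rule.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    hits = 0
    for p, g in zip(predicted, gold):
        # A failed execution (None) always counts as a miss.
        if p is not None and math.isclose(p, g, rel_tol=rel_tol, abs_tol=1e-9):
            hits += 1
    return 100.0 * hits / len(gold)


def program_accuracy(predicted_programs, gold_programs):
    """Percentage of examples whose predicted program equals the gold program.

    Programs are compared as token lists by exact match, which is a
    simplification: equivalent but differently written programs count as misses.
    """
    if len(predicted_programs) != len(gold_programs):
        raise ValueError("program lists must have the same length")
    if not gold_programs:
        return 0.0
    hits = sum(1 for p, g in zip(predicted_programs, gold_programs) if p == g)
    return 100.0 * hits / len(gold_programs)
```

Under these assumptions a system can score higher on execution accuracy than on program accuracy, because a different program can still produce the correct answer.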
| Rank | Subject | Execution accuracy (%) | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Human Expert Performance | 91.16 | — | Imported | 2026-05-27 |
| 2 | FinQANet-Gold (RoBERTa-large) | 70.0 | — | Imported | 2026-05-27 |
| 3 | FinQANet (RoBERTa-large) | 61.24 | — | Imported | 2026-05-27 |
| 4 | FinQANet (RoBERTa-base) | 56.1 | — | Imported | 2026-05-27 |
| 5 | FinQANet (BERT-large) | 53.52 | — | Imported | 2026-05-27 |
| 6 | General Crowd Performance | 50.68 | — | Imported | 2026-05-27 |
| 7 | FinQANet (FinBert) | 50.1 | — | Imported | 2026-05-27 |
| 8 | FinQANet (BERT-base) | 50.0 | — | Imported | 2026-05-27 |
| 9 | Retriever + NeRd (BERT-base) | 48.57 | — | Imported | 2026-05-27 |
| 10 | Pre-Trained Longformer (base) | 21.9 | — | Imported | 2026-05-27 |
| 11 | Retriever + Seq2seq | 19.71 | — | Imported | 2026-05-27 |
| 12 | TF-IDF + Single Op | 1.01 | — | Imported | 2026-05-27 |
| 13 | Retriever + Direct Generation | 0.3 | — | Imported | 2026-05-27 |