TAT-QA
TAT-QA evaluates question answering and numerical reasoning over hybrid financial tables and associated text.
Rows: 28
Primary metric: F1
Sampled: 2026-05-06
Metadata
Metrics: Exact Match, F1
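As a rough illustration of the two listed metrics, the sketch below computes a simplified exact match and token-overlap F1 for one prediction against one gold answer. It is not the benchmark's official evaluation script, which may apply additional handling (for example of numeric scale) that is not reproduced here. The function names `normalize`, `exact_match`, and `token_f1` and the normalization rules are assumptions chosen for illustration.

```python
from collections import Counter

# Assumed normalization: lowercase, trim surrounding punctuation, drop articles.
# Hypothetical rules for illustration only; the official scorer may differ.
_TRIM = ",.;:!?\"'()"
_ARTICLES = {"a", "an", "the"}


def normalize(text: str) -> list[str]:
    """Return a list of normalized tokens for a free-text answer."""
    tokens = (tok.strip(_TRIM) for tok in text.lower().split())
    return [tok for tok in tokens if tok and tok not in _ARTICLES]


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall (bag of tokens)."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

These functions return values in the range 0 to 1. The F1 column in the table is on a 0 to 100 scale, so a per-example average would be multiplied by 100 to be comparable.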
| Rank | Subject | F1 | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Human Performance | 90.80 | — | Imported | 2026-05-06 |
| 2 | TAT-LLM (70B) | 88.40 | — | Imported | 2026-05-06 |
| 3 | TAT-LLM (13B) | 85.90 | — | Imported | 2026-05-06 |
| 4 | TAT-LLM (7B) | 85.10 | — | Imported | 2026-05-06 |
| 5 | MATATA-8B | 84.90 | — | Imported | 2026-05-06 |
| 6 | Code Generation for Table-Text Question using LLM (70B) | 84.70 | — | Imported | 2026-05-06 |
| 7 | AeNER: Attention-enhanced Numerical Embeddings for Reasoning | 83.20 | — | Imported | 2026-05-06 |
| 8 | MATATA-3.8B | 82.40 | — | Imported | 2026-05-06 |
| 9 | Code Generation for Table-Text Question using LLM (13B) | 81.80 | — | Imported | 2026-05-06 |
| 10 | Encore | 80.10 | — | Imported | 2026-05-06 |
| 11 | KFEX-N: A Table-Text QA Model with Knowledge-Fused Encoder & EX-N Tree Decoder | 79.50 | — | Imported | 2026-05-06 |
| 12 | MVGE: Multi-View Graph Encoder for Answering Hybrid Numerical Reasoning Question | 79.10 | — | Imported | 2026-05-06 |
| 13 | RegHNT: Relational graph neural network with special multitask decoder | 77.90 | — | Imported | 2026-05-06 |
| 14 | Code Generation for Table-Text Question using LLM (7B) | 77.30 | — | Imported | 2026-05-06 |
| 15 | UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation | 76.00 | — | Imported | 2026-05-06 |
| 16 | SoarGraph: Semantic-Oriented Hierarchical Graphs | 75.30 | — | Imported | 2026-05-06 |
| 17 | RSTQA: Rounds Specified numerical reasoning for Table-Text QA | 75.00 | — | Imported | 2026-05-06 |
| 18 | MHST: Multi-Head with Sequence to Expression Tree | 72.70 | — | Imported | 2026-05-06 |
| 19 | UniPCQA | 72.20 | — | Imported | 2026-05-06 |
| 20 | GANO: GNN for Tabular and Textual QA with Numerical Reasoning | 72.10 | — | Imported | 2026-05-06 |
| 21 | TBC | 68.70 | — | Imported | 2026-05-06 |
| 22 | FinMath: Injecting a Tree-structured Solver for Question Answering over Financial Reports | 68.20 | — | Imported | 2026-05-06 |
| 23 | KIQA: Knowledge-infused QA Model for Table and Text | 67.40 | — | Imported | 2026-05-06 |
| 24 | GenQA: Generative model for QA from table and text | 65.60 | — | Imported | 2026-05-06 |
| 25 | LETTER: Logic Enhanced Table-Text Reasoning | 64.30 | — | Imported | 2026-05-06 |
| 26 | OPERA-H | 63.80 | — | Imported | 2026-05-06 |
| 27 | TeaBReaC-pretrained T5-3B | 63.80 | — | Imported | 2026-05-06 |
| 28 | Baseline - TagOp | 58.00 | — | Imported | 2026-05-06 |