DataBench

Real-world tabular question-answering benchmark over many datasets, used in SemEval 2025 Task 8.

Rows: 37
Primary metric: accuracy
Sampled: 2026-05-27

Metadata

Metrics

DataBench accuracy, DataBench Lite accuracy
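
Both metrics are accuracies, that is, the share of questions answered correctly. The sketch below illustrates the idea under a simplifying assumption: answers are compared as trimmed, lowercased strings. The helper names `normalize` and `accuracy` are hypothetical and the sample answers are made up; the official evaluation may compare answers differently (for example, by answer type).

```python
def normalize(answer):
    """Trim and lowercase an answer so trivially different strings compare equal."""
    return str(answer).strip().lower()


def accuracy(predictions, references):
    """Return the fraction of predictions that match their reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)


# Made-up example: three of four answers match after normalization.
preds = ["Yes", "42", "paris", "3.5"]
golds = ["yes", "42", "London", "3.5"]
print(f"{accuracy(preds, golds):.2%}")  # prints 75.00%
```

Scores in the results table are this kind of ratio expressed as a percentage.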

Latest Results

Rows are transcribed from the open-model rankings in the public SemEval-2025 Task 8 paper. The primary score is DataBench Subtask A accuracy.

Rank | Subject | DataBench accuracy | Model Match | Provenance | Sampled
1 | TeleAI | 95.02% | | Imported | 2026-05-27
2 | SRPOL AIS | 89.66% | | Imported | 2026-05-27
3 | HITSZ-HLT | 86.97% | | Imported | 2026-05-27
4 | G-MACT | 86.02% | | Imported | 2026-05-27
5 | SBU-NLP | 85.63% | | Imported | 2026-05-27
6 | Oseibrefo-Liang | 84.67% | | Imported | 2026-05-27
7 | ITU-NLP | 84.1% | | Imported | 2026-05-27
8 | grazh | 83.72% | | Imported | 2026-05-27
9 | Howard University-AI4PC | 81.42% | | Imported | 2026-05-27
10 | QleverAnswering-PUCRS | 81.03% | | Imported | 2026-05-27
11 | I2R-NLP | 80.65% | | Imported | 2026-05-27
12 | anotheroption | 80.08% | | Imported | 2026-05-27
13 | Exploration Lab IITK | 79.69% | | Imported | 2026-05-27
14 | CCNUNLP | 79.5% | | Imported | 2026-05-27
15 | Sherlok | 79.31% | | Imported | 2026-05-27
16 | Saama Technologies | 78.35% | | Imported | 2026-05-27
17 | ScottyPoseidon | 76.63% | | Imported | 2026-05-27
18 | MINDS | 72.41% | | Imported | 2026-05-27
19 | MRT | 70.5% | | Imported | 2026-05-27
20 | Dataground | 68.97% | | Imported | 2026-05-27
21 | IUST_Champs | 68.77% | | Imported | 2026-05-27
22 | LyS Group | 67.62% | | Imported | 2026-05-27
23 | NexGenius | 65.64% | | Imported | 2026-05-27
24 | Tree-Search | 64.56% | | Imported | 2026-05-27
25 | TableWise | 63.98% | | Imported | 2026-05-27
26 | Myo Thiha | 62.45% | | Imported | 2026-05-27
27 | tabaqa_team | 54.79% | | Imported | 2026-05-27
28 | nevvton | 52.87% | | Imported | 2026-05-27
29 | Basharat Ali | 43.1% | | Imported | 2026-05-27
30 | AlphaPro | 38.46% | | Imported | 2026-05-27
31 | CAILMD-24 | 36.4% | | Imported | 2026-05-27
32 | baseline | 26.0% | | Imported | 2026-05-27
33 | Laughter (open rank 32) | 10.54% | | Imported | 2026-05-27
34 | Laughter (open rank 33) | 8.24% | | Imported | 2026-05-27
35 | TQASSN | 7.85% | | Imported | 2026-05-27
36 | SUT | 3.7% | | Imported | 2026-05-27
37 | fahimebehzadi | 1.64% | | Imported | 2026-05-27