SQA

SQA (Sequential Question Answering): Measures structured-data reasoning over tables, spreadsheets, charts, databases, or data analysis tasks.

Rows: 12
Primary metric: dev_accuracy
Sampled: 2026-05-27

Metadata

Metrics

Dev accuracy
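As a rough illustration of what a dev accuracy figure of this kind summarizes, the sketch below scores predictions against gold answers on a development split. It assumes each answer is a set of table cell values and that a question counts as correct only when the normalized sets match exactly. That matching rule, along with the names `normalize` and `dev_accuracy`, is an assumption made for illustration and not the scoring script behind the numbers reported here.

```python
from __future__ import annotations

from typing import Iterable, Sequence


def normalize(answer: Iterable[str]) -> frozenset[str]:
    """Lowercase and strip each cell value so trivial formatting differences are ignored."""
    return frozenset(cell.strip().lower() for cell in answer)


def dev_accuracy(
    predictions: Sequence[Iterable[str]],
    gold: Sequence[Iterable[str]],
) -> float:
    """Fraction of questions whose predicted answer set equals the gold answer set.

    Assumption: exact set match after normalization; real evaluators may differ.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty split")
    correct = sum(normalize(p) == normalize(g) for p, g in zip(predictions, gold))
    return correct / len(gold)


if __name__ == "__main__":
    # Hypothetical toy data: three questions, two answered correctly.
    preds = [["Paris"], ["1", "2"], ["x"]]
    golds = [["paris "], ["2", "1"], ["y"]]
    print(f"dev accuracy: {dev_accuracy(preds, golds):.4f}")
```

Answers are compared as sets so that the order of returned cells does not affect the score.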

Latest Results

Source: Hugging Face google/tapas-large-finetuned-sqa model card, SQA dev accuracy table.

| Rank | Subject | Dev accuracy | Model Match | Provenance | Sampled |
|------|---------|--------------|-------------|------------|---------|
| 1 | TAPAS Large (reset) | 0.7289 | | Imported | 2026-05-27 |
| 2 | TAPAS Large (no reset) | 0.7223 | | Imported | 2026-05-27 |
| 3 | TAPAS Base (reset) | 0.6874 | | Imported | 2026-05-27 |
| 4 | TAPAS Base (no reset) | 0.6737 | | Imported | 2026-05-27 |
| 5 | TAPAS Medium (reset) | 0.6561 | | Imported | 2026-05-27 |
| 6 | TAPAS Medium (no reset) | 0.6464 | | Imported | 2026-05-27 |
| 7 | TAPAS Small (reset) | 0.6155 | | Imported | 2026-05-27 |
| 8 | TAPAS Small (no reset) | 0.5876 | | Imported | 2026-05-27 |
| 9 | TAPAS Mini (reset) | 0.5148 | | Imported | 2026-05-27 |
| 10 | TAPAS Mini (no reset) | 0.4574 | | Imported | 2026-05-27 |
| 11 | TAPAS Tiny (reset) | 0.2375 | | Imported | 2026-05-27 |
| 12 | TAPAS Tiny (no reset) | 0.2004 | | Imported | 2026-05-27 |