SQA
SQA measures structured-data reasoning over tables, spreadsheets, charts, databases, and other data analysis tasks.
Rows: 12
Primary metric: dev_accuracy
Sampled: 2026-05-27
Metrics
Dev accuracy
| Rank | Subject | Dev accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | TAPAS Base (reset) | 0.874 | — | Imported | 2026-05-27 |
| 2 | TAPAS Large (reset) | 0.7289 | — | Imported | 2026-05-27 |
| 3 | TAPAS Large (no reset) | 0.7223 | — | Imported | 2026-05-27 |
| 4 | TAPAS Base (no reset) | 0.6737 | — | Imported | 2026-05-27 |
| 5 | TAPAS Medium (reset) | 0.6561 | — | Imported | 2026-05-27 |
| 6 | TAPAS Medium (no reset) | 0.6464 | — | Imported | 2026-05-27 |
| 7 | TAPAS Small (reset) | 0.6155 | — | Imported | 2026-05-27 |
| 8 | TAPAS Small (no reset) | 0.5876 | — | Imported | 2026-05-27 |
| 9 | TAPAS Mini (reset) | 0.5148 | — | Imported | 2026-05-27 |
| 10 | TAPAS Mini (no reset) | 0.4574 | — | Imported | 2026-05-27 |
| 11 | TAPAS Tiny (reset) | 0.2375 | — | Imported | 2026-05-27 |
| 12 | TAPAS Tiny (no reset) | 0.2004 | — | Imported | 2026-05-27 |
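
Each model size in the table appears twice, once with "reset" and once with "no reset". As a minimal sketch (not part of the leaderboard itself), the per-size difference can be computed from the listed dev accuracy values. The `scores` dictionary and the `reset_gain` helper are hypothetical names introduced for illustration; the numbers are copied from the rows above as they stand.

```python
# Dev accuracy values copied verbatim from the table above.
# Keys are (model size, variant); both names are illustrative only.
scores = {
    ("Base", "reset"): 0.874,
    ("Base", "no reset"): 0.6737,
    ("Large", "reset"): 0.7289,
    ("Large", "no reset"): 0.7223,
    ("Medium", "reset"): 0.6561,
    ("Medium", "no reset"): 0.6464,
    ("Small", "reset"): 0.6155,
    ("Small", "no reset"): 0.5876,
    ("Mini", "reset"): 0.5148,
    ("Mini", "no reset"): 0.4574,
    ("Tiny", "reset"): 0.2375,
    ("Tiny", "no reset"): 0.2004,
}


def reset_gain(scores):
    """Return {size: reset accuracy minus no-reset accuracy}, rounded to 4 places."""
    sizes = {size for size, _variant in scores}
    return {
        size: round(scores[(size, "reset")] - scores[(size, "no reset")], 4)
        for size in sizes
    }


gains = reset_gain(scores)

# Print sizes ordered from largest to smallest gain.
for size, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{size:<6} {gain:+.4f}")
```

With the values as listed, the "reset" variant scores higher than "no reset" for every size.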