WikiSQL
WikiSQL: Measures structured-data reasoning over tables, spreadsheets, charts, databases, or data analysis tasks.
36 rows
Primary metric: test_execution_accuracy
Sampled: 2026-05-27
Metadata
Metrics
Test execution accuracy, Dev execution accuracy, Test logical form accuracy, Dev logical form accuracy
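The two metric families listed above can diverge: a predicted query may return the right answer without matching the reference query's form. The sketch below illustrates the difference on a toy table. It is not the benchmark's evaluation script. The `players` table, the query pairs, and all function names are hypothetical, and the whitespace/case-normalized string comparison is only a stand-in for a proper comparison of query structure.

```python
import sqlite3


def build_toy_table():
    """Create a hypothetical in-memory table to run queries against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE players (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO players VALUES (?, ?)",
        [("Ann", 34), ("Bo", 28), ("Cy", 31)],
    )
    return conn


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries return the same rows, ignoring row order."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as wrong
    gold_rows = conn.execute(gold_sql).fetchall()
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


def logical_form_match(predicted_sql, gold_sql):
    """Simplified stand-in: compare queries after normalizing case and spaces."""
    def normalize(query):
        return " ".join(query.lower().split())
    return normalize(predicted_sql) == normalize(gold_sql)


def accuracy(flags):
    """Percentage of True values in a non-empty list of booleans."""
    return 100.0 * sum(flags) / len(flags)


# Hypothetical (predicted, gold) query pairs.
GOLD = "SELECT name FROM players WHERE age > 30"
PAIRS = [
    ("SELECT name FROM players WHERE age > 30", GOLD),   # same form, same result
    ("SELECT name FROM players WHERE age >= 31", GOLD),  # different form, same result
    ("SELECT name FROM players WHERE age < 30", GOLD),   # different form, wrong result
]

conn = build_toy_table()
execution_accuracy = accuracy([execution_match(conn, p, g) for p, g in PAIRS])
logical_form_accuracy = accuracy([logical_form_match(p, g) for p, g in PAIRS])
print(f"execution accuracy:    {execution_accuracy:.1f}")
print(f"logical form accuracy: {logical_form_accuracy:.1f}")
```

On this toy data the second pair counts toward execution accuracy but not toward logical form accuracy, which is why the two metrics are listed separately.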
| Rank | Subject | Test execution accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | SeaD +Execution-Guided Decoding (Xu 2021) (Ant Group, Ada & ZhiXiaoBao) | 93.0 | — | Imported | 2026-05-27 |
| 2 | SDSQL +Execution-Guided Decoding (Hui 2020) (Alibaba Group) | 92.7 | — | Imported | 2026-05-27 |
| 3 | IE-SQL +Execution-Guided Decoding (Ma 2020) (Ping An Life, AI Team) | 92.5 | — | Imported | 2026-05-27 |
| 4 | HydraNet +Execution-Guided Decoding (Lyu 2020) (Microsoft Dynamics 365 AI) (code) | 92.2 | — | Imported | 2026-05-27 |
| 5 | BRIDGE^ +Execution-Guided Decoding (Lin 2020) (Salesforce Research) | 91.9 | — | Imported | 2026-05-27 |
| 6 | X-SQL +Execution-Guided Decoding (He 2019) | 91.8 | — | Imported | 2026-05-27 |
| 7 | SDSQL (Hui 2020) (Alibaba Group) | 91.4 | — | Imported | 2026-05-27 |
| 8 | BRIDGE^ (Lin 2020) (Salesforce Research) | 91.1 | — | Imported | 2026-05-27 |
| 9 | Text2SQLGen + EG (Mellah 2021) (Novelis.io Research) | 91.0 | — | Imported | 2026-05-27 |
| 10 | SeqGenSQL+EG (Li 2020) | 90.5 | — | Imported | 2026-05-27 |
| 11 | SeqGenSQL (Li 2020) | 90.3 | — | Imported | 2026-05-27 |
| 12 | (Guo 2019) +Execution-Guided Decoding with BERT-Base-Uncased^ | 90.1 | — | Imported | 2026-05-27 |
| 13 | SeaD (Xu 2021) (Ant Group, Ada & ZhiXiaoBao) | 90.1 | — | Imported | 2026-05-27 |
| 14 | SQLova +Execution-Guided Decoding (Hwang 2019) | 89.6 | — | Imported | 2026-05-27 |
| 15 | TAPEX (Liu 2022) | 89.5 | — | Imported | 2026-05-27 |
| 16 | (Guo 2019) with BERT-Base-Uncased^ | 89.2 | — | Imported | 2026-05-27 |
| 17 | HydraNet (Lyu 2020) (Microsoft Dynamics 365 AI) (code) | 89.2 | — | Imported | 2026-05-27 |
| 18 | IE-SQL (Ma 2020) (Ping An Life, AI Team) | 88.8 | — | Imported | 2026-05-27 |
| 19 | X-SQL (He 2019) | 88.7 | — | Imported | 2026-05-27 |
| 20 | IncSQL +Execution-Guided Decoding (Shi 2018) | 87.1 | — | Imported | 2026-05-27 |
| 21 | SQLova (Hwang 2019) | 86.2 | — | Imported | 2026-05-27 |
| 22 | HardEM (Min 2019) | 83.9 | — | Imported | 2026-05-27 |
| 23 | Execution-Guided Decoding (Wang 2018) | 83.8 | — | Imported | 2026-05-27 |
| 24 | IncSQL (Shi 2018) | 83.7 | — | Imported | 2026-05-27 |
| 25 | Auxiliary Mapping Task (Chang 2019) | 81.7 | — | Imported | 2026-05-27 |
| 26 | MQAN (ordered) (McCann 2018) | 81.4 | — | Imported | 2026-05-27 |
| 27 | MQAN (unordered) (McCann 2018) | 81.4 | — | Imported | 2026-05-27 |
| 28 | LatentAlignment (Wang 2019) | 79.3 | — | Imported | 2026-05-27 |
| 29 | Coarse2Fine (Dong 2018) | 78.5 | — | Imported | 2026-05-27 |
| 30 | TypeSQL (Yu 2018) | 73.5 | — | Imported | 2026-05-27 |
| 31 | (Guo 2018) | 69.0 | — | Imported | 2026-05-27 |
| 32 | PT-MAML (Huang 2018) | 68.0 | — | Imported | 2026-05-27 |
| 33 | SQLNet (Xu 2017) | 68.0 | — | Imported | 2026-05-27 |
| 34 | Wang 2017^ | 66.8 | — | Imported | 2026-05-27 |
| 35 | Seq2SQL (Zhong 2017) | 59.4 | — | Imported | 2026-05-27 |
| 36 | Baseline (Zhong 2017) | 35.9 | — | Imported | 2026-05-27 |