ARC Challenge
ARC Challenge: Evaluates broad language-model knowledge, reasoning, commonsense, instruction following, or exam-style accuracy.
10 rows
Primary metric: arc_challenge_test_accuracy
Sampled: 2026-05-27
Metadata
Metrics: ARC Challenge test accuracy, ARC Easy test accuracy
| Rank | Subject | ARC Challenge test accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | DGEM | 27.11 | — | Imported | 2026-05-27 |
| 2 | TableILP | 26.97 | — | Imported | 2026-05-27 |
| 3 | BiDAF | 26.54 | — | Imported | 2026-05-27 |
| 4 | DGEM-OpenIE | 26.41 | — | Imported | 2026-05-27 |
| 5 | Guess-all random | 25.02 | — | Imported | 2026-05-27 |
| 6 | DecompAttn | 24.34 | — | Imported | 2026-05-27 |
| 7 | TupleInference | 23.83 | — | Imported | 2026-05-27 |
| 8 | IR (using ARC Corpus) | 20.26 | — | Imported | 2026-05-27 |
| 9 | PMI (dataset definition) | 2.03 | — | Imported | 2026-05-27 |
| 10 | IR (dataset definition) | 1.02 | — | Imported | 2026-05-27 |
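
Since the table includes a "Guess-all random" row, one way to read it is to look at each system's margin over that row. The following is a minimal sketch, not part of the leaderboard itself: the scores are copied from the table above, and the names `scores`, `BASELINE`, and `margin_over_baseline` are hypothetical helpers introduced for illustration.

```python
# Scores copied from the table above (ARC Challenge test accuracy, in percent).
scores = {
    "DGEM": 27.11,
    "TableILP": 26.97,
    "BiDAF": 26.54,
    "DGEM-OpenIE": 26.41,
    "Guess-all random": 25.02,
    "DecompAttn": 24.34,
    "TupleInference": 23.83,
    "IR (using ARC Corpus)": 20.26,
    "PMI (dataset definition)": 2.03,
    "IR (dataset definition)": 1.02,
}

# Assumption: the "Guess-all random" row is used as the reference point.
BASELINE = "Guess-all random"


def margin_over_baseline(scores, baseline=BASELINE):
    """Return each subject's accuracy minus the baseline's, in percentage points."""
    base = scores[baseline]
    return {
        name: round(acc - base, 2)
        for name, acc in scores.items()
        if name != baseline
    }


margins = margin_over_baseline(scores)

# Subjects that score strictly above the random-guess row, in table order.
above = [name for name, m in margins.items() if m > 0]

# Print margins from largest to smallest.
for name, m in sorted(margins.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:28s} {m:+.2f}")
```

On these numbers, only the four top-ranked rows land above the random-guess row, and each does so by roughly two percentage points or less.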