ARC Challenge

ARC Challenge: The Challenge Set of the AI2 Reasoning Challenge (ARC). It evaluates exam-style accuracy on multiple-choice, grade-school science questions that call for knowledge, reasoning, and commonsense.

Rows: 10
Primary metric: arc_challenge_test_accuracy
Sampled: 2026-05-27

Metadata

Metrics

ARC Challenge test accuracy, ARC Easy test accuracy

Latest Results

The rows below are the baseline test-accuracy results reported in the ARC paper's table of baselines; the paper names ARC Easy as the Additional Set.

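| Rank | Subject | ARC Challenge test accuracy (%) | Model Match | Provenance | Sampled |
|------|---------|---------------------------------|-------------|------------|---------|
| 1 | DGEM | 27.11 | | Imported | 2026-05-27 |
| 2 | TableILP | 26.97 | | Imported | 2026-05-27 |
| 3 | BiDAF | 26.54 | | Imported | 2026-05-27 |
| 4 | DGEM-OpenIE | 26.41 | | Imported | 2026-05-27 |
| 5 | Guess-all random | 25.02 | | Imported | 2026-05-27 |
| 6 | DecompAttn | 24.34 | | Imported | 2026-05-27 |
| 7 | TupleInference | 23.83 | | Imported | 2026-05-27 |
| 8 | IR (using ARC Corpus) | 20.26 | | Imported | 2026-05-27 |
| 9 | PMI (dataset definition) | 2.03 | | Imported | 2026-05-27 |
| 10 | IR (dataset definition) | 1.02 | | Imported | 2026-05-27 |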
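
To make the accuracy figures easier to interpret, here is a minimal sketch of the partial-credit scoring described in the ARC paper: a system earns 1 point for a single correct choice and 1/k points when it reports a k-way tie that includes the correct answer. The function names `arc_score` and `arc_accuracy` and the toy answer key are hypothetical, chosen only for illustration; this is not the official evaluation script.

```python
from fractions import Fraction


def arc_score(selected, correct):
    """Score one question (hypothetical helper).

    `selected` is the set of answer labels the system reports.
    Credit is 1/k if the correct label is among the k selected labels,
    otherwise 0.
    """
    if correct in selected:
        return Fraction(1, len(selected))
    return Fraction(0)


def arc_accuracy(predictions, answer_key):
    """Return accuracy as a percentage over all questions."""
    total = sum(arc_score(sel, ans) for sel, ans in zip(predictions, answer_key))
    return float(100 * total / len(answer_key))


# Toy data (made up for illustration, not taken from the dataset).
answer_key = ["A", "C", "B", "D"]

# A "guess-all" system selects every option on every question.
guess_all = [{"A", "B", "C", "D"}] * len(answer_key)

print(arc_accuracy(guess_all, answer_key))
```

On four-option questions a guess-all system scores exactly 25% under this rule, which is close to the 25.02 reported for the Guess-all random row. The small difference would be consistent with a few questions having a number of options other than four, though that is an inference rather than something this page states.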