OpenBookQA

OpenBookQA is a question-answering dataset modeled after open book exams for assessing human understanding. It contains 5,957 multiple-choice elementary-level science questions that probe understanding of 1,326 core science facts and their application to novel situations. Answering a question requires combining an open book fact with broad common knowledge through multi-hop reasoning.

Rows: 5
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Normalized Score

Latest Results

Rank  Subject                Score  Model Match                                     Provenance     Sampled
1     Phi-3.5-MoE-instruct   0.90   -                                               Self-reported  2026-05-06
2     Phi-3.5-mini-instruct  0.79   -                                               Self-reported  2026-05-06
2     Phi 4 Mini             0.79   -                                               Self-reported  2026-05-06
4     Mistral NeMo Instruct  0.61   Mistral: Mistral Nemo (mistralai-mistral-nemo)  Self-reported  2026-05-06
5     Hermes 3 70B           0.49   -                                               Self-reported  2026-05-06
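
The Rank column appears to use standard competition ranking: the two entries tied at 0.79 share rank 2, and the next entry takes rank 4. This rule is inferred from the rows above and is not documented on the page. A minimal Python sketch of the rule, where `competition_rank` is a hypothetical helper name and the scores are copied from the table:

```python
def competition_rank(scores):
    """Return 1-based ranks; tied scores share a rank and the next rank is skipped."""
    ordered = sorted(scores, reverse=True)
    first_position = {}
    for index, score in enumerate(ordered):
        # Record only the first (best) position at which each score appears.
        first_position.setdefault(score, index + 1)
    return [first_position[score] for score in scores]


# Subject and score pairs copied from the Latest Results table.
rows = [
    ("Phi-3.5-MoE-instruct", 0.90),
    ("Phi-3.5-mini-instruct", 0.79),
    ("Phi 4 Mini", 0.79),
    ("Mistral NeMo Instruct", 0.61),
    ("Hermes 3 70B", 0.49),
]

ranks = competition_rank([score for _, score in rows])
for rank, (subject, score) in zip(ranks, rows):
    print(f"{rank} {subject} {score:.2f}")
```

The ranks this produces match the Rank column for all five rows, which is consistent with the inferred rule but does not confirm how the page actually computes it.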