Gaokao-Bench

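Gaokao-Bench evaluates language-model knowledge and reasoning on exam-style questions drawn from the Chinese college entrance examination (Gaokao). It reports accuracy on objective questions and score rates on subjective questions.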

13 rows
Primary metric: objective_overall_accuracy
Sampled: 2026-05-27

Metadata

Metrics

Objective Overall, Objective Chinese, Objective English, Objective Science Math, Objective Humanities Math, Objective Physics, Objective Chemistry, Objective Biology, Objective Politics, Objective History, Objective Geography, Subjective Overall, Subjective Chinese, Subjective English, Subjective Science Math, Subjective Humanities Math, Subjective Physics, Subjective Chemistry, Subjective Biology, Subjective Politics, Subjective History, Subjective Geography

Latest Results

Rows are parsed from the public GAOKAO-Bench README objective and subjective score-rate tables. The primary score is objective overall accuracy.

Rank  Subject               Objective Overall  Model Match                           Provenance  Sampled
1     GPT-4-0314            72.2%              GPT-4 (openai-gpt-4)                  Imported    2026-05-27
2     GPT-4-0613            71.6%              GPT-4 (openai-gpt-4)                  Imported    2026-05-27
3     Gemini-Pro            57.9%              -                                     Imported    2026-05-27
4     ERNIE-Bot-0615        56.6%              -                                     Imported    2026-05-27
5     GPT-3.5-turbo-0301    53.2%              GPT-3.5 Turbo (openai-gpt-3.5-turbo)  Imported    2026-05-27
6     ERNIE-Bot-turbo-0725  45.6%              -                                     Imported    2026-05-27
7     Baichuan2-13b-Chat    43.9%              -                                     Imported    2026-05-27
8     ChatGLM2-6b           42.7%              -                                     Imported    2026-05-27
9     Baichuan2-7b-Chat     40.5%              -                                     Imported    2026-05-27
10    ChatGLM-6b            30.8%              -                                     Imported    2026-05-27
11    Baichuan2-7b-Base     27.2%              -                                     Imported    2026-05-27
12    LLaMA-7b              21.1%              -                                     Imported    2026-05-27
13    Vicuna-7b             21.0%              -                                     Imported    2026-05-27
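
The note under Latest Results says rows are parsed from README score-rate tables and ranked by objective overall accuracy. A minimal Python sketch of that idea follows. It assumes a pipe-delimited table whose first two columns are the model name and the overall score; the actual README layout may differ. The names `Row`, `parse_score_table`, `rank`, and `SAMPLE` are hypothetical, and the sample rows reuse four scores from the results above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    subject: str
    objective_overall: float  # percentage, e.g. 72.2


def to_percent(cell: str) -> Optional[float]:
    """Return the numeric value of a cell like '72.2%', or None if it is not a number."""
    try:
        return float(cell.rstrip("%"))
    except ValueError:
        return None


def parse_score_table(table_text: str) -> list[Row]:
    """Parse a pipe-delimited table (assumed layout: model name, overall score)."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 2:
            continue
        score = to_percent(cells[1])
        if score is None:
            # Header row or |---|---| separator row: no numeric score to read.
            continue
        rows.append(Row(cells[0], score))
    return rows


def rank(rows: list[Row]) -> list[tuple[int, str, float]]:
    """Order rows by objective overall accuracy, highest first, and number them from 1."""
    ordered = sorted(rows, key=lambda r: r.objective_overall, reverse=True)
    return [(i, r.subject, r.objective_overall) for i, r in enumerate(ordered, start=1)]


# Hypothetical input; scores are copied from the results listed above.
SAMPLE = """
| Model | Overall |
| --- | --- |
| Vicuna-7b | 21% |
| GPT-4-0613 | 71.6% |
| GPT-4-0314 | 72.2% |
| Gemini-Pro | 57.9% |
"""

ranked = rank(parse_score_table(SAMPLE))
for position, subject, score in ranked:
    print(f"{position} {subject} {score:.1f}%")
```

Because Python's sort is stable, models with equal scores would keep their input order; a real importer would probably need an explicit tie-breaking rule.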