CMMLU

CMMLU (Chinese Massive Multitask Language Understanding) is a Chinese-language benchmark that evaluates the knowledge and reasoning capabilities of large language models across 67 subjects. It covers the natural sciences, social sciences, engineering, and humanities, using multiple-choice questions that range from basic to advanced professional level.

Rows: 5
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Normalized Score
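
This page does not state how Score is computed. For a multiple-choice benchmark it is commonly the fraction of questions answered correctly, so the sketch below assumes that definition. The record layout `(subject, predicted, gold)` and both function names are hypothetical, chosen only for illustration; Normalized Score is not defined here and is not modeled.

```python
from collections import defaultdict


def _is_correct(predicted, gold):
    # Compare answer letters case-insensitively, ignoring surrounding whitespace.
    return predicted.strip().upper() == gold.strip().upper()


def overall_accuracy(records):
    """Fraction of correct answers over all records (assumed meaning of Score)."""
    records = list(records)
    if not records:
        raise ValueError("no records to score")
    hits = sum(1 for _, predicted, gold in records if _is_correct(predicted, gold))
    return hits / len(records)


def accuracy_by_subject(records):
    """Per-subject accuracy, useful for seeing which topics drag the score down."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, predicted, gold in records:
        total[subject] += 1
        if _is_correct(predicted, gold):
            correct[subject] += 1
    return {subject: correct[subject] / total[subject] for subject in total}


# Hypothetical example data: (subject, model's answer, reference answer).
example = [
    ("math", "A", "A"),
    ("math", "B", "C"),
    ("history", "d", "D"),
    ("history", "A", "A"),
]
print(overall_accuracy(example))
print(accuracy_by_subject(example))
```

Under this assumption a score of 0.90 would mean nine in ten questions answered correctly. Whether the leaderboard averages over all questions or over subjects is not stated, so both views are shown.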

Latest Results

| Rank | Subject | Score | Model Match | Provenance | Sampled |
|------|---------|-------|-------------|------------|---------|
| 1 | Qwen2 72B Instruct | 0.90 | - | Self-reported | 2026-05-06 |
| 2 | LongCat-Flash-Chat | 0.84 | - | Self-reported | 2026-05-06 |
| 3 | LongCat-Flash-Lite | 0.82 | - | Self-reported | 2026-05-06 |
| 4 | MiniCPM-SALA | 0.82 | - | Self-reported | 2026-05-06 |
| 5 | ERNIE 4.5 | 0.40 | ERNIE 4.5 300B A47B (baidu-ernie-4.5-300b-a47b) | Self-reported | 2026-05-06 |