Global-MMLU-Lite

A lightweight version of the Global MMLU benchmark that evaluates language models across multiple languages while addressing cultural and linguistic biases in multilingual evaluation.

Rows: 14
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Normalized Score

Latest Results

Rank | Subject | Score | Model Match | Provenance | Sampled
1 | Gemini 2.5 Pro Preview 06-05 | 0.89 | Gemini 2.5 Pro Preview 06-05 (google-gemini-2.5-pro-preview) | Self-reported | 2026-05-06
2 | Gemini 2.5 Pro | 0.89 | Gemini 2.5 Pro (google-gemini-2.5-pro) | Self-reported | 2026-05-06
3 | Gemini 2.5 Flash | 0.88 | Gemini 2.5 Flash (google-gemini-2.5-flash) | Self-reported | 2026-05-06
4 | Gemini 2.5 Flash-Lite | 0.81 | Gemini 2.5 Flash Lite (google-gemini-2.5-flash-lite) | Self-reported | 2026-05-06
5 | Gemini 2.0 Flash-Lite | 0.78 | Gemini 2.0 Flash Lite (google-gemini-2.0-flash-lite-001) | Self-reported | 2026-05-06
6 | Gemma 3 27B | 0.75 | Gemma 3 27B (google-gemma-3-27b-it) | Self-reported | 2026-05-06
7 | Gemma 3 12B | 0.69 | Gemma 3 12B (google-gemma-3-12b-it) | Self-reported | 2026-05-06
8 | Gemini Diffusion | 0.69 | - | Self-reported | 2026-05-06
9 | Gemma 3n E4B Instructed LiteRT Preview | 0.65 | Gemma 3n 4B (google-gemma-3n-e4b-it) | Self-reported | 2026-05-06
9 | Gemma 3n E4B Instructed | 0.65 | Gemma 3n 4B (google-gemma-3n-e4b-it) | Self-reported | 2026-05-06
11 | Gemma 3n E2B Instructed | 0.59 | Gemma 3n 2B (google-gemma-3n-e2b-it) | Self-reported | 2026-05-06
11 | Gemma 3n E2B Instructed LiteRT (Preview) | 0.59 | Gemma 3n 2B (google-gemma-3n-e2b-it) | Self-reported | 2026-05-06
13 | Gemma 3 4B | 0.55 | Gemma 3 4B (google-gemma-3-4b-it) | Self-reported | 2026-05-06
14 | Gemma 3 1B | 0.34 | - | Self-reported | 2026-05-06