MMAU

MMAU evaluates audio understanding across sound, music, and speech tasks with test-mini and test splits.

29 rows
Primary metric: avg_test
Sampled: 2026-05-06

Metadata

Metrics

Sound Test-mini, Sound Test, Music Test-mini, Music Test, Speech Test-mini, Speech Test, Avg Test-mini, Avg Test

Latest Results

Rows are ranked by Avg Test from the parsed MMAU-v05.15.25 leaderboard JSON. Source display names and source types are preserved.

Rank | Subject | Avg Test | Model Match | Provenance | Sampled
1 | Audio-Thinker | 75.98 | | Imported | 2026-05-06
2 | Nova 2 Omni | 75.28 | | Imported | 2026-05-06
3 | Step-Audio-2 | 73.86 | | Imported | 2026-05-06
4 | MiMo-Audio | 72.59 | | Imported | 2026-05-06
5 | Audio Flamingo 3 | 72.42 | | Imported | 2026-05-06
6 | Qwen2.5-Omni | 71.00 | | Imported | 2026-05-06
7 | Step-Audio-2-mini | 70.23 | | Imported | 2026-05-06
8 | Gemini 2.5 Pro | 69.36 | Gemini 2.5 Pro (google-gemini-2.5-pro) | Imported | 2026-05-06
9 | Gemini 2.5 Flash | 67.39 | Gemini 2.5 Flash (google-gemini-2.5-flash) | Imported | 2026-05-06
10 | Gemini 2.0 Flash | 67.03 | Gemini 2.0 Flash (google-gemini-2.0-flash) | Imported | 2026-05-06
11 | DeSTA2.5-Audio | 65.21 | | Imported | 2026-05-06
12 | Kimi-Audio | 64.40 | | Imported | 2026-05-06
13 | Audio Reasoner | 63.78 | | Imported | 2026-05-06
14 | Phi-4-multimodal | 62.81 | | Imported | 2026-05-06
15 | Gemini 2.5 Flash Lite | 61.61 | Gemini 2.5 Flash Lite (google-gemini-2.5-flash-lite) | Imported | 2026-05-06
16 | Audio Flamingo 2 | 61.06 | | Imported | 2026-05-06
17 | GPT-4o Audio | 60.82 | GPT-4o Audio (openai-gpt-4o-audio-preview) | Imported | 2026-05-06
18 | Qwen2-Audio-Instruct | 57.40 | | Imported | 2026-05-06
19 | Gemma 3n | 55.20 | | Imported | 2026-05-06
20 | Gemma 3n | 52.06 | | Imported | 2026-05-06
21 | GPT-4o mini Audio | 51.03 | GPT-4 (openai-gpt-4) | Imported | 2026-05-06
22 | M2UGen | 39.76 | | Imported | 2026-05-06
23 | MusiLingo | 38.29 | | Imported | 2026-05-06
24 | SALMONN | 36.23 | | Imported | 2026-05-06
25 | MuLLaMa | 25.91 | | Imported | 2026-05-06
26 | GAMA-IT | 22.22 | | Imported | 2026-05-06
27 | GAMA | 21.68 | | Imported | 2026-05-06
28 | LTU | 17.23 | | Imported | 2026-05-06
29 | Audio Flamingo Chat | 15.59 | | Imported | 2026-05-06
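
The ranking rule stated under Latest Results (sort by Avg Test, highest first) can be sketched in a few lines of Python. This is only an illustration: the JSON shape and the field names `name` and `avg_test` are assumptions, and the actual MMAU-v05.15.25 leaderboard file may be structured differently. The four sample scores are taken from the rows above.

```python
import json

# Hypothetical JSON shape: a list of objects with "name" and "avg_test" keys.
# The real leaderboard file may use different field names or nesting.
SAMPLE_JSON = """
[
  {"name": "Step-Audio-2", "avg_test": 73.86},
  {"name": "Audio-Thinker", "avg_test": 75.98},
  {"name": "Qwen2.5-Omni", "avg_test": 71},
  {"name": "Nova 2 Omni", "avg_test": 75.28}
]
"""


def rank_by_avg_test(raw_json):
    """Return (rank, name, score) tuples sorted by avg_test, highest first."""
    rows = json.loads(raw_json)
    ordered = sorted(rows, key=lambda row: row["avg_test"], reverse=True)
    # Ranks start at 1; scores are cast to float so 71 and 71.0 compare alike.
    return [
        (rank, row["name"], float(row["avg_test"]))
        for rank, row in enumerate(ordered, start=1)
    ]


ranked = rank_by_avg_test(SAMPLE_JSON)
for rank, name, score in ranked:
    # Two-decimal formatting keeps whole-number scores aligned with the rest.
    print(f"{rank} {name} {score:.2f}")
```

Python's `sorted` is stable, so entries with equal scores would keep their order from the source file; how the real leaderboard breaks ties is not stated here.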