OmniGAIA

Omni-modal general AI assistant benchmark evaluating long-horizon multimodal agents over vision, audio, language, web search, browser, and code-execution tasks.

Rows: 18
Primary metric: Overall
Sampled: 2026-05-06

Metadata

Metrics

Overall, Easy, Medium, Hard, Geography, Technology, History, Finance, Sports, Arts, Movies, Science, Food

Latest Results

Rows are parsed from results.json in the public leaderboard Space. Scores are Pass@1 accuracy percentages, reported overall and broken down by difficulty level and domain category.
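As a rough illustration of how a Pass@1 accuracy percentage is typically computed, the sketch below assumes each task is attempted exactly once and graded as correct or incorrect. The function name pass_at_1 and the sample outcomes are hypothetical and are not taken from the benchmark's own evaluation code or data.

```python
def pass_at_1(outcomes):
    """Return Pass@1 accuracy as a percentage.

    `outcomes` holds one boolean per task: True if the single
    attempt at that task was graded correct.
    """
    if not outcomes:
        raise ValueError("need at least one task outcome")
    solved = sum(1 for ok in outcomes if ok)
    return 100.0 * solved / len(outcomes)


# Hypothetical per-task outcomes for illustration only.
example_outcomes = [True, False, True, True]
score = round(pass_at_1(example_outcomes), 2)
print(score)
```

Under this assumption, a per-difficulty or per-domain score would be the same calculation restricted to the tasks in that subset.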

Rank  Subject                Overall  Model Match  Provenance  Sampled
1     Orchestra-o1-GPT-5       72.80               Imported    2026-05-06
2     Gemini-3-Pro             62.50               Imported    2026-05-06
3     Qwen3.5-Omni-Plus        57.20               Imported    2026-05-06
4     Gemini-3-Flash           51.70               Imported    2026-05-06
5     Qwen3.5-Omni-Flash       33.90               Imported    2026-05-06
6     Gemini-2.5-Pro           30.80               Imported    2026-05-06
7     OmniAtlas-Qwen-3         20.80               Imported    2026-05-06
8     Qwen-3-Omni              13.30               Imported    2026-05-06
9     OmniAtlas-Qwen-2.5       13.30               Imported    2026-05-06
10    LongCat-Flash-Omni       11.10               Imported    2026-05-06
11    OmniAtlas-Qwen-2.5       10.30               Imported    2026-05-06
12    Gemini-2.5-Flash-Lite     8.60               Imported    2026-05-06
13    Ming-Flash-Omni           8.30               Imported    2026-05-06
14    Ming-Lite-Omni-1.5        3.90               Imported    2026-05-06
15    Qwen-2.5-Omni             3.60               Imported    2026-05-06
16    MiniCPM-O-2.6             3.10               Imported    2026-05-06
17    Baichuan-Omni-1.5         2.80               Imported    2026-05-06
18    Qwen-2.5-Omni             1.40               Imported    2026-05-06
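
For readers who want to work with the rows above programmatically, here is a minimal sketch that parses whitespace-separated leaderboard rows into typed records. It assumes every row carries exactly five populated fields (rank, subject, overall score, provenance, sample date), that the Model Match column is empty, and that subject names contain no spaces. The Row type and parse_row helper are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class Row(NamedTuple):
    rank: int
    subject: str
    overall: float  # Pass@1 accuracy, in percent
    provenance: str
    sampled: str    # ISO date string, kept as text


def parse_row(line: str) -> Row:
    """Split one leaderboard line on whitespace and convert the numeric fields."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    rank, subject, overall, provenance, sampled = fields
    return Row(int(rank), subject, float(overall), provenance, sampled)


# The top two rows, copied from the table.
lines = [
    "1 Orchestra-o1-GPT-5 72.80 Imported 2026-05-06",
    "2 Gemini-3-Pro 62.50 Imported 2026-05-06",
]
rows = [parse_row(line) for line in lines]

# Gap in Overall score between rank 1 and rank 2, in percentage points.
gap = round(rows[0].overall - rows[1].overall, 2)
print(gap)
```

Splitting on arbitrary whitespace keeps the parser indifferent to whether the rows are single-space separated or padded into aligned columns.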