MCP Atlas | BenchmarkList

Metadata

ID: mcp_atlas
Category: Agentic
Release: 2026-01-01

Metrics

Score, Confidence Interval Upper, Max Score

Showing 5 latest source slices.

Slice 1: Self-reported, sampled 2026-05-28
Rank	Subject	Score	Model Match	Provenance	Sampled
1	Claude Opus 4.8	82.2%	Claude Opus 4.8 anthropic-claude-opus-4.8	Self-reported	2026-05-28
2	Claude Opus 4.7	79.1%	Claude Opus 4.7 anthropic-claude-opus-4.7	Self-reported	2026-05-28
3	Gemini 3.1 Pro Preview	78.2%	Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview	Self-reported	2026-05-28
4	GPT-5.5	75.3%	GPT-5.5 openai-gpt-5.5	Self-reported	2026-05-28

Slice 2: Self-reported, sampled 2026-05-28
Rank	Subject	Score	Model Match	Provenance	Sampled
1	Qwen3.7 Max	76.4%	Qwen3.7 Max qwen-qwen3.7-max	Self-reported	2026-05-28
2	Claude Opus 4.6 Max	75.8%	Claude Opus 4.6 anthropic-claude-opus-4.6	Self-reported	2026-05-28
3	Qwen3.6 Plus	74.1%	Qwen3.6 Plus qwen-qwen3.6-plus	Self-reported	2026-05-28
4	DeepSeek V4 Pro Max	73.6%	DeepSeek V4 Pro deepseek-deepseek-v4-pro	Self-reported	2026-05-28
5	GLM-5.1 Thinking	71.8%	GLM GLM 5.1 z-ai-glm-5.1	Self-reported	2026-05-28
6	Kimi K2.6 Thinking	66.6%	KIMI MoonshotAI: Kimi K2.6 moonshotai-kimi-k2.6	Self-reported	2026-05-28

Slice 3: Imported, sampled 2026-05-06
Rank	Subject	Score	Model Match	Provenance	Sampled
1	Muse Spark	82.20	—	Imported	2026-05-06
1	claude-opus-4-7 (max)	79.10	Claude Opus 4.7 anthropic-claude-opus-4.7	Imported	2026-05-06
1	gemini-3.1-pro-preview (high)	78.20	Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview	Imported	2026-05-06
2	claude-opus-4-6 (max)	76.80	Claude Opus 4.6 anthropic-claude-opus-4.6	Imported	2026-05-06
2	glm-5p1	75.60	GLM GLM 5.1 z-ai-glm-5.1	Imported	2026-05-06
2	gpt-5.5 (xhigh)	75.30	GPT-5.5 openai-gpt-5.5	Imported	2026-05-06
5	gpt-5.4 (xhigh)	70.60	GPT-5.4 openai-gpt-5.4	Imported	2026-05-06
5	gemini-3-pro-preview	70.30	Gemini 3 google-gemini-3	Imported	2026-05-06
6	claude-opus-4-5 (high)	69.80	Claude Opus 4.5 anthropic-claude-opus-4.5	Imported	2026-05-06
7	claude-sonnet-4-6	69.50	Claude Sonnet 4.6 anthropic-claude-sonnet-4.6	Imported	2026-05-06
7	gpt-5.2 (xhigh)	67.60	GPT-5.2 openai-gpt-5.2	Imported	2026-05-06
9	kimi-k2p5	64.40	—	Imported	2026-05-06
11	gemini-3-flash-preview	62.00	Gemini 3 Flash Preview google-gemini-3-flash-preview	Imported	2026-05-06
12	claude-sonnet-4-5 (thinking)	59.50	Claude Sonnet 4.5 anthropic-claude-sonnet-4.5	Imported	2026-05-06
13	glm-4p7	58.10	GLM GLM 4.7 z-ai-glm-4.7	Imported	2026-05-06
13	gemini-3.1-flash-lite (high)	57.10	—	Imported	2026-05-06
13	gpt-5.4-mini (xhigh)	56.70	GPT-5.4 Mini openai-gpt-5.4-mini	Imported	2026-05-06
18	gpt-5.1 (high)	50.10	GPT-5.1 openai-gpt-5.1	Imported	2026-05-06
18	o3-pro	44.50	o3 Pro openai-o3-pro	Imported	2026-05-06
19	claude-haiku-4-5	40.20	Claude Haiku 4.5 anthropic-claude-haiku-4.5	Imported	2026-05-06

Slice 4: Launch post, sampled 2026-04-23
Rank	Subject	Score	Model Match	Provenance	Sampled
1	Claude Opus 4.7	79.1%	Claude Opus 4.7 anthropic-claude-opus-4.7	Launch post	2026-04-23
2	Gemini 3.1 Pro Preview	78.2%	Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview	Launch post	2026-04-23
3	GPT-5.5	75.3%	GPT-5.5 openai-gpt-5.5	Launch post	2026-04-23
4	GPT-5.4	70.6%	GPT-5.4 openai-gpt-5.4	Launch post	2026-04-23

Slice 5: Launch post, sampled 2026-04-16
Rank	Subject	Score	Model Match	Provenance	Sampled
1	Claude Opus 4.7	77.3%	Claude Opus 4.7 anthropic-claude-opus-4.7	Launch post	2026-04-16
2	Claude Opus 4.6	75.8%	Claude Opus 4.6 anthropic-claude-opus-4.6	Launch post	2026-04-16
3	Gemini 3.1 Pro Preview	73.9%	Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview	Launch post	2026-04-16
4	GPT-5.4	68.1%	GPT-5.4 openai-gpt-5.4	Launch post	2026-04-16
