MCP-Bench | BenchmarkList

Overall Score

Rank	Subject	Overall Score	Model Match	Provenance	Sampled
1	gpt-5	0.75	—	Imported	2026-05-06
2	o3	0.71	—	Imported	2026-05-06
3	gpt-oss-120b	0.69	—	Imported	2026-05-06
4	gemini-2.5-pro	0.69	—	Imported	2026-05-06
5	claude-sonnet-4	0.68	—	Imported	2026-05-06
6	qwen3-235b-a22b-2507	0.68	—	Imported	2026-05-06
7	glm-4.5	0.67	—	Imported	2026-05-06
8	gpt-oss-20b	0.65	—	Imported	2026-05-06
9	kimi-k2	0.63	—	Imported	2026-05-06
10	qwen3-30b-a3b-instruct-2507	0.63	—	Imported	2026-05-06
11	gemini-2.5-flash-lite	0.60	—	Imported	2026-05-06
12	gpt-4o	0.59	—	Imported	2026-05-06
13	gemma-3-27b-it	0.58	—	Imported	2026-05-06
14	llama-3-3-70b-instruct	0.56	—	Imported	2026-05-06
15	gpt-4o-mini	0.56	—	Imported	2026-05-06
16	mistral-small-2503	0.53	—	Imported	2026-05-06
17	llama-3-1-70b-instruct	0.51	—	Imported	2026-05-06
18	nova-micro-v1	0.51	—	Imported	2026-05-06
19	llama-3-2-90b-vision-instruct	0.49	—	Imported	2026-05-06
20	llama-3-1-8b-instruct	0.43	—	Imported	2026-05-06