InfiniteBench | BenchmarkList

Metadata

Metrics: Average score, Retrieve.PassKey, Retrieve.Number, Retrieve.KV, En.Sum, En.QA, En.MC, En.Dia, Zh.QA, Code.Debug, Code.Run, Math.Calc, Math.Find

Rank	Subject	Average score	Model Match	Provenance	Sampled
1	GPT-4	46.099167%	GPT-4 openai-gpt-4	Imported	2026-05-27
2	Claude 2	37.843333%	—	Imported	2026-05-27
3	Kimi-Chat	35.325%	—	Imported	2026-05-27
4	Yi-34B-200K	27.406667%	—	Imported	2026-05-27
5	Yi-6B-200K	24.584167%	—	Imported	2026-05-27
6	YaRN-Mistral-7B	21.460833%	—	Imported	2026-05-27
7	ChatGLM-3-6B-128K	19.4525%	—	Imported	2026-05-27
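
The table above is tab-separated, so it can be loaded and sanity-checked with a few lines of Python. The sketch below is only an illustration: the rows are copied from the table, while the helper names `load_rows` and `ranks_match_scores` are hypothetical and not part of any published tooling for this list. It assumes the "Average score" column is a plain percentage and that Rank is meant to follow that column in descending order.

```python
import csv
import io

# Rows copied verbatim from the leaderboard table above (tab-separated).
LEADERBOARD_TSV = "\n".join([
    "Rank\tSubject\tAverage score\tModel Match\tProvenance\tSampled",
    "1\tGPT-4\t46.099167%\tGPT-4 openai-gpt-4\tImported\t2026-05-27",
    "2\tClaude 2\t37.843333%\t—\tImported\t2026-05-27",
    "3\tKimi-Chat\t35.325%\t—\tImported\t2026-05-27",
    "4\tYi-34B-200K\t27.406667%\t—\tImported\t2026-05-27",
    "5\tYi-6B-200K\t24.584167%\t—\tImported\t2026-05-27",
    "6\tYaRN-Mistral-7B\t21.460833%\t—\tImported\t2026-05-27",
    "7\tChatGLM-3-6B-128K\t19.4525%\t—\tImported\t2026-05-27",
])


def load_rows(tsv_text):
    """Parse the table into dicts with a numeric rank and average (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for record in reader:
        rows.append({
            "rank": int(record["Rank"]),
            "subject": record["Subject"],
            # Strip the trailing percent sign before converting to float.
            "average": float(record["Average score"].rstrip("%")),
        })
    return rows


def ranks_match_scores(rows):
    """Return True if sorting by average (descending) reproduces ranks 1..N."""
    by_score = sorted(rows, key=lambda r: r["average"], reverse=True)
    return [r["rank"] for r in by_score] == list(range(1, len(rows) + 1))


rows = load_rows(LEADERBOARD_TSV)
print(ranks_match_scores(rows))  # True
```

The check only confirms that the listed ranks are consistent with the listed averages; it says nothing about how each average was derived from the individual task scores.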