HREF | BenchmarkList

Metadata

ID: href
Category: Instruction Following
Release: 2024-12-20
Source: Source page
Snapshot: Snapshot source
Post: Announcement post

Metrics

Average, Brainstorm, Open QA, Closed QA, Extract, Generation, Rewrite, Summarize, Classify, Reasoning Over Numerical Data, Multi-Document Synthesis, Fact Checking or Attributed QA

Latest Results

Rank	Subject	Average	Model Match	Provenance	Sampled
1	meta-llama/Llama-3.1-70B-Instruct	48.98	Llama 3.1 70B Instruct (meta-llama-llama-3.1-70b-instruct)	Imported	2026-05-06
1	mistralai/Mistral-Large-Instruct-2407	48.39	—	Imported	2026-05-06
1	Qwen/Qwen2.5-72B-Instruct	46.21	Qwen2.5 72B Instruct (qwen-qwen-2.5-72b-instruct)	Imported	2026-05-06
4	allenai/Llama-3.1-Tulu-3-70B	43.68	—	Imported	2026-05-06
4	mistralai/Mistral-Small-Instruct-2409	42.87	—	Imported	2026-05-06
4	Qwen/Qwen1.5-110B-Chat	40.76	—	Imported	2026-05-06
7	meta-llama/Meta-Llama-3.1-8B-Instruct	38.57	—	Imported	2026-05-06
8	allenai/OLMo-2-1124-13B-Instruct	35.60	—	Imported	2026-05-06
8	01-ai/Yi-1.5-34B-Chat	35.10	—	Imported	2026-05-06
8	Qwen/Qwen2-72B-Instruct	33.71	—	Imported	2026-05-06
8	allenai/Llama-3.1-Tulu-3-8B	33.54	—	Imported	2026-05-06
12	microsoft/Phi-3-medium-4k-instruct	30.91	—	Imported	2026-05-06
12	allenai/OLMo-2-1124-7B-Instruct	28.49	—	Imported	2026-05-06
14	meta-llama/Llama-2-70b-chat-hf	23.90	—	Imported	2026-05-06
14	allenai/tulu-2-dpo-70b	22.67	—	Imported	2026-05-06
14	mistralai/Mistral-7B-Instruct-v0.3	22.66	—	Imported	2026-05-06
17	allenai/tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm	19.57	—	Imported	2026-05-06
17	meta-llama/Llama-2-13b-chat-hf	19.56	—	Imported	2026-05-06
17	WizardLMTeam/WizardLM-13B-V1.2	17.90	—	Imported	2026-05-06
20	meta-llama/Llama-2-7b-chat-hf	15.38	—	Imported	2026-05-06
20	allenai/tulu-2-dpo-13b	14.81	—	Imported	2026-05-06
22	lmsys/vicuna-13b-v1.5	12.99	—	Imported	2026-05-06
22	allenai/Llama-3.1-Tulu-3-70B-DPO	11.91	—	Imported	2026-05-06
24	allenai/tulu-2-dpo-7b	10.12	—	Imported	2026-05-06
24	lmsys/vicuna-7b-v1.5	9.50	—	Imported	2026-05-06
26	allenai/OLMo-7B-0724-Instruct-hf	7.50	—	Imported	2026-05-06
26	allenai/OLMo-7B-SFT	6.61	—	Imported	2026-05-06
26	nomic-ai/gpt4all-13b-snoozy	6.12	—	Imported	2026-05-06
29	TheBloke/koala-13B-HF	5.66	—	Imported	2026-05-06
29	mosaicml/mpt-7b-chat	5.53	—	Imported	2026-05-06
31	TheBloke/koala-7B-HF	4.09	—	Imported	2026-05-06
31	databricks/dolly-v2-12b	3.53	—	Imported	2026-05-06
31	databricks/dolly-v2-7b	3.44	—	Imported	2026-05-06
34	OpenAssistant/oasst-sft-1-pythia-12b	2.18	—	Imported	2026-05-06