RewardBench

Reward model benchmark evaluating preference models across chat, hard chat, safety, reasoning, and prior preference-evaluation sets.

Rows: 188
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Chat, Chat Hard, Safety, Reasoning, Prior Sets (0.5 weight)
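The metric list gives Prior Sets a 0.5 weight but does not spell out how the sections combine into Score. The sketch below is one plausible reading, not a confirmed formula: it assumes Score is a weighted mean of the five section scores, with Prior Sets weighted 0.5 and every other section weighted 1.0, and that a section with no value is dropped from both numerator and denominator. The function name `overall_score` and the example section values are hypothetical and are not taken from the table.

```python
from typing import Mapping, Optional

# Assumed weights: every section counts fully except Prior Sets (0.5 weight).
WEIGHTS = {
    "Chat": 1.0,
    "Chat Hard": 1.0,
    "Safety": 1.0,
    "Reasoning": 1.0,
    "Prior Sets": 0.5,
}


def overall_score(sections: Mapping[str, Optional[float]]) -> float:
    """Weighted mean of the section scores that are present.

    Sections that are absent or None are skipped (an assumption), so a
    model without Prior Sets is averaged over the remaining four sections.
    """
    total = 0.0
    weight_sum = 0.0
    for name, weight in WEIGHTS.items():
        value = sections.get(name)
        if value is None:
            continue
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no section scores provided")
    return total / weight_sum


# Hypothetical section scores, for illustration only.
example = {"Chat": 90.0, "Chat Hard": 80.0, "Safety": 85.0,
           "Reasoning": 95.0, "Prior Sets": 70.0}
with_prior = overall_score(example)  # (90 + 80 + 85 + 95 + 0.5 * 70) / 4.5
without_prior = overall_score({k: v for k, v in example.items() if k != "Prior Sets"})
```

Under these assumptions, a weak Prior Sets result moves the overall score only half as much as an equally weak result in any other section.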

Latest Results

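Rows ranked by highest RewardBench Score, except ranks 182-188 (the allenai RM-RB2 models), which appear at the end of the table out of score order.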

Rank Subject Score Model Match Provenance Sampled
1 infly/INF-ORM-Llama3.1-70B 95.11 Imported 2026-05-06
2 ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1 94.99 Imported 2026-05-06
3 nicolinho/QRM-Gemma-2-27B 94.44 Imported 2026-05-06
4 Skywork/Skywork-Reward-Gemma-2-27B-v0.2 94.26 Imported 2026-05-06
5 nvidia/Llama-3.1-Nemotron-70B-Reward * 94.11 Imported 2026-05-06
6 Skywork/Skywork-Reward-Gemma-2-27B ⚠️ 93.80 Imported 2026-05-06
7 SF-Foundation/TextEval-Llama3.1-70B * ⚠️ 93.48 Imported 2026-05-06
8 meta-metrics/MetaMetrics-RM-v1.0 93.42 Imported 2026-05-06
9 Skywork/Skywork-Critic-Llama-3.1-70B ⚠️ 93.31 Imported 2026-05-06
10 nicolinho/QRM-Llama3.1-8B-v2 93.14 Imported 2026-05-06
11 Skywork/Skywork-Reward-Llama-3.1-8B-v0.2 93.13 Imported 2026-05-06
12 nicolinho/QRM-Llama3.1-8B ⚠️ 93.06 Imported 2026-05-06
13 LxzGordon/URM-LLaMa-3.1-8B ⚠️ 92.94 Imported 2026-05-06
14 Salesforce/SFR-LLaMa-3.1-70B-Judge-r * 92.72 Imported 2026-05-06
15 R-I-S-E/RISE-Judge-Qwen2.5-32B 92.66 Imported 2026-05-06
16 Skywork/Skywork-Reward-Llama-3.1-8B ⚠️ 92.52 Imported 2026-05-06
17 AtlaAI/Selene-1 92.41 Imported 2026-05-06
18 general-preference/GPM-Llama-3.1-8B ⚠️ 92.24 Imported 2026-05-06
19 nvidia/Nemotron-4-340B-Reward * 92.00 Imported 2026-05-06
20 Ray2333/GRM-Llama3-8B-rewardmodel-ft ⚠️ 91.54 Imported 2026-05-06
21 nicolinho/QRM-Llama3-8B ⚠️ 91.10 Imported 2026-05-06
22 SF-Foundation/TextEval-OffsetBias-12B * 91.05 Imported 2026-05-06
23 Ray2333/GRM-llama3.2-3B-rewardmodel-ft 90.92 Imported 2026-05-06
24 Salesforce/SFR-nemo-12B-Judge-r * 90.27 Imported 2026-05-06
25 internlm/internlm2-20b-reward 90.16 Imported 2026-05-06
26 Skywork/Skywork-VL-Reward-7B 90.07 Imported 2026-05-06
27 facebook/Self-taught-evaluator-llama3.1-70B * 90.01 Imported 2026-05-06
28 LxzGordon/URM-LLaMa-3-8B 89.91 Imported 2026-05-06
29 NCSOFT/Llama-3-OffsetBias-RM-8B 89.42 Imported 2026-05-06
30 AtlaAI/Selene-1-Mini-Llama-3.1-8B 89.13 Imported 2026-05-06
31 Skywork/Skywork-Critic-Llama-3.1-8B 88.96 Imported 2026-05-06
32 nvidia/Llama3-70B-SteerLM-RM * 88.77 Imported 2026-05-06
33 Salesforce/SFR-LLaMa-3.1-8B-Judge-r * 88.65 Imported 2026-05-06
34 facebook/Self-taught-Llama-3-70B * 88.63 Imported 2026-05-06
35 RLHFlow/ArmoRM-Llama3-8B-v0.1 88.60 Imported 2026-05-06
36 Ray2333/GRM-gemma2-2B-rewardmodel-ft 88.39 Imported 2026-05-06
37 google/gemini-1.5-pro-0514 * 88.20 Imported 2026-05-06
38 R-I-S-E/RISE-Judge-Qwen2.5-7B 88.19 Imported 2026-05-06
39 Cohere May 2024 * 88.16 Imported 2026-05-06
40 google/flame-1.0-24B-july-2024 * 87.81 Imported 2026-05-06
41 internlm/internlm2-7b-reward 87.59 Imported 2026-05-06
42 ZiyiYe/Con-J-Qwen2-7B ⚠️ 87.12 Imported 2026-05-06
43 google/gemini-1.5-pro-0924 86.78 Imported 2026-05-06
44 openai/gpt-4o-2024-08-06 86.73 GPT-4o (2024-08-06) openai-gpt-4o-2024-08-06 Imported 2026-05-06
45 RLHFlow/pair-preference-model-LLaMA3-8B 85.75 Imported 2026-05-06
46 Ray2333/GRM-llama3-8B-sftreg 85.42 Imported 2026-05-06
47 opencompass/CompassJudger-1-32B-Instruct 85.22 Imported 2026-05-06
48 Cohere March 2024 * 85.11 Imported 2026-05-06
49 Ray2333/GRM-llama3-8B-distill 84.64 Imported 2026-05-06
50 Ray2333/GRM-Gemma-2B-rewardmodel-ft ⚠️ 84.47 Imported 2026-05-06
51 openai/gpt-4-0125-preview 84.34 GPT-4 openai-gpt-4 Imported 2026-05-06
52 mattshumer/Reflection-70B 84.22 Imported 2026-05-06
53 Anthropic/claude-3-5-sonnet-20240620 84.17 Imported 2026-05-06
54 meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo 84.12 Imported 2026-05-06
55 opencompass/CompassJudger-1-14B-Instruct 84.09 Imported 2026-05-06
56 meta-llama/Meta-Llama-3.1-70B-Instruct 84.05 Imported 2026-05-06
57 NCSOFT/Llama-3-OffsetBias-8B 83.97 Imported 2026-05-06
58 openai/gpt-4-turbo-2024-04-09 83.95 GPT-4 Turbo openai-gpt-4-turbo Imported 2026-05-06
59 sfairXC/FsfairX-LLaMA3-RM-v0.1 83.38 Imported 2026-05-06
60 openai/gpt-4o-2024-05-13 83.27 GPT-4o (2024-05-13) openai-gpt-4o-2024-05-13 Imported 2026-05-06
61 opencompass/CompassJudger-1-7B-Instruct 83.17 Imported 2026-05-06
62 internlm/internlm2-1_8b-reward 82.17 Imported 2026-05-06
63 CIR-AMS/BTRM_Qwen2_7b_0613 81.72 Imported 2026-05-06
64 openbmb/Eurus-RM-7b 81.59 Imported 2026-05-06
65 Nexusflow/Starling-RM-34B 81.33 Imported 2026-05-06
66 google/gemma-2-27b-it 80.90 Gemma 2 27B google-gemma-2-27b-it Imported 2026-05-06
67 google/gemini-1.5-flash-001 80.54 Imported 2026-05-06
68 Ray2333/Gemma-2B-rewardmodel-ft ⚠️ 80.48 Imported 2026-05-06
69 allenai/tulu-v2.5-13b-preference-mix-rm 80.27 Imported 2026-05-06
70 Anthropic/claude-3-opus-20240229 80.08 Imported 2026-05-06
71 openai/gpt-4o-mini-2024-07-18 80.07 GPT-4o-mini (2024-07-18) openai-gpt-4o-mini-2024-07-18 Imported 2026-05-06
72 weqweasdas/RM-Mistral-7B 79.82 Imported 2026-05-06
73 NousResearch/Hermes-3-Llama-3.1-70B 78.47 Hermes 3 70B Instruct nousresearch-hermes-3-llama-3.1-70b Imported 2026-05-06
74 hendrydong/Mistral-RM-for-RAFT-GSHF-v0 78.47 Imported 2026-05-06
75 meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo 78.08 Imported 2026-05-06
76 Ray2333/reward-model-Mistral-7B-instruct-Unifie... 76.61 Imported 2026-05-06
77 Ahjeong/MMPO_Gemma_7b_gamma1.1_epoch3 76.52 Imported 2026-05-06
78 stabilityai/stablelm-2-12b-chat 76.42 Imported 2026-05-06
79 meta-llama/Meta-Llama-3-70B-Instruct 76.27 Llama 3 70B Instruct meta-llama-llama-3-70b-instruct Imported 2026-05-06
80 allenai/tulu-2-dpo-70b 76.21 Imported 2026-05-06
81 gemini-1.5-flash-8b 76.01 Imported 2026-05-06
82 Ahjeong/MMPO_Gemma_7b 75.87 Imported 2026-05-06
83 PoLL/gpt-3.5-turbo-0125_claude-3-sonnet-2024022... 75.78 GPT-3.5 Turbo openai-gpt-3.5-turbo Imported 2026-05-06
84 allenai/llama-3-tulu-2-dpo-70b 74.96 Imported 2026-05-06
85 NousResearch/Nous-Hermes-2-Mistral-7B-DPO 74.81 Imported 2026-05-06
86 Anthropic/claude-3-sonnet-20240229 74.58 Imported 2026-05-06
87 mistralai/Mixtral-8x7B-Instruct-v0.1 74.55 Mistral: Mixtral 8x7B Instruct mistralai-mixtral-8x7b-instruct Imported 2026-05-06
88 prometheus-eval/prometheus-8x7b-v2.0 74.51 Imported 2026-05-06
89 Ray2333/GRM-Gemma-2B-sftreg 74.51 Imported 2026-05-06
90 general-preference/GPM-Gemma-2B 74.49 Imported 2026-05-06
91 0-hero/Matter-0.1-7B-boost-DPO-preview 74.48 Imported 2026-05-06
92 allenai/tulu-v2.5-70b-uf-rm 73.98 Imported 2026-05-06
93 HuggingFaceH4/zephyr-7b-alpha 73.92 Imported 2026-05-06
94 upstage/SOLAR-10.7B-Instruct-v1.0 73.91 Imported 2026-05-06
95 allenai/tulu-2-dpo-13b 73.68 Imported 2026-05-06
96 opencompass/CompassJudger-1-1.5B-Instruct 73.44 Imported 2026-05-06
97 allenai/llama-3-tulu-2-8b-uf-mean-rm 73.42 Imported 2026-05-06
98 HuggingFaceH4/starchat2-15b-v0.1 73.22 Imported 2026-05-06
99 Ray2333/Gemma-2B-rewardmodel-baseline 72.90 Imported 2026-05-06
100 Anthropic/claude-3-haiku-20240307 72.89 Imported 2026-05-06
101 HuggingFaceH4/zephyr-7b-beta 72.81 Imported 2026-05-06
102 allenai/llama-3-tulu-2-dpo-8b 72.75 Imported 2026-05-06
103 0-hero/Matter-0.1-7B-DPO-preview 72.47 Imported 2026-05-06
104 jondurbin/bagel-dpo-34b-v0.5 72.15 Imported 2026-05-06
105 allenai/tulu-2-dpo-7b 72.12 Imported 2026-05-06
106 prometheus-eval/prometheus-7b-v2.0 72.04 Imported 2026-05-06
107 stabilityai/stablelm-zephyr-3b 71.46 Imported 2026-05-06
108 NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO 71.38 Imported 2026-05-06
109 ai2/tulu-2-7b-rm-v0-nectar-binarized-700k.json 71.27 Imported 2026-05-06
110 berkeley-nest/Starling-RM-7B-alpha 71.13 Imported 2026-05-06
111 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 70.58 Imported 2026-05-06
112 CohereForAI/c4ai-command-r-plus 70.57 Imported 2026-05-06
113 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 70.19 Imported 2026-05-06
114 allenai/llama-3-tulu-2-70b-uf-mean-rm 70.19 Imported 2026-05-06
115 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 70.08 Imported 2026-05-06
116 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 70.04 Imported 2026-05-06
117 weqweasdas/RM-Gemma-7B 69.67 Imported 2026-05-06
118 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 69.45 Imported 2026-05-06
119 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 69.24 Imported 2026-05-06
120 weqweasdas/RM-Gemma-7B-4096 69.22 Imported 2026-05-06
121 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 69.05 Imported 2026-05-06
122 openbmb/UltraRM-13b 69.03 Imported 2026-05-06
123 OpenAssistant/oasst-rm-2.1-pythia-1.4b-epoch-2.5 69.01 Imported 2026-05-06
124 openbmb/Eurus-7b-kto 69.00 Imported 2026-05-06
125 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 68.95 Imported 2026-05-06
126 Qwen/Qwen1.5-14B-Chat 68.64 Imported 2026-05-06
127 ai2/tulu-2-7b-rm-v0-nectar-binarized-3.8m-check... 68.08 Imported 2026-05-06
128 RLHFlow/LLaMA3-iterative-DPO-final 67.83 Imported 2026-05-06
129 HuggingFaceH4/zephyr-7b-gemma-v0.1 67.58 Imported 2026-05-06
130 ai2/tulu-2-7b-rm-v0-nectar-binarized.json 67.56 Imported 2026-05-06
131 Qwen/Qwen1.5-7B-Chat 67.50 Imported 2026-05-06
132 openbmb/MiniCPM-2B-dpo-fp32 67.30 Imported 2026-05-06
133 mightbe/Better-PairRM 67.30 Imported 2026-05-06
134 allenai/OLMo-7B-Instruct 67.27 Imported 2026-05-06
135 Qwen/Qwen1.5-72B-Chat 67.23 Imported 2026-05-06
136 ai2/tulu-2-7b-rm-v0.json 66.55 Imported 2026-05-06
137 Qwen/Qwen1.5-MoE-A2.7B-Chat 66.44 Imported 2026-05-06
138 RLHFlow/RewardModel-Mistral-7B-for-DPA-v1 66.33 Imported 2026-05-06
139 stabilityai/stablelm-2-zephyr-1_6b 65.74 Imported 2026-05-06
140 meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo 65.65 Imported 2026-05-06
141 weqweasdas/RM-Gemma-2B 65.49 Imported 2026-05-06
142 openai/gpt-3.5-turbo-0125 65.34 GPT-3.5 Turbo openai-gpt-3.5-turbo Imported 2026-05-06
143 allenai/tulu-v2.5-70b-preference-mix-rm 65.16 Imported 2026-05-06
144 wenbopan/Faro-Yi-9B-DPO 64.61 Imported 2026-05-06
145 meta-llama/Meta-Llama-3-8B-Instruct 64.50 Llama 3 8B Instruct meta-llama-llama-3-8b-instruct Imported 2026-05-06
146 ai2/llama-2-chat-ultrafeedback-60k.jsonl 64.40 Imported 2026-05-06
147 IDEA-CCNL/Ziya-LLaMA-7B-Reward 63.78 Imported 2026-05-06
148 PKU-Alignment/beaver-7b-v2.0-reward 63.66 Imported 2026-05-06
149 stabilityai/stable-code-instruct-3b 62.16 Imported 2026-05-06
150 OpenAssistant/oasst-rm-2-pythia-6.9b-epoch-1 61.50 Imported 2026-05-06
151 OpenAssistant/reward-model-deberta-v3-large-v2 61.26 Imported 2026-05-06
152 llm-blender/PairRM-hf 60.87 Imported 2026-05-06
153 PKU-Alignment/beaver-7b-v2.0-cost 59.57 Imported 2026-05-06
154 ContextualAI/archangel_sft-kto_llama13b 59.52 Imported 2026-05-06
155 ContextualAI/archangel_sft-kto_llama30b 59.01 Imported 2026-05-06
156 Qwen/Qwen1.5-1.8B-Chat 58.90 Imported 2026-05-06
157 ai2/llama-2-chat-7b-nectar-3.8m.json 58.43 Imported 2026-05-06
158 PKU-Alignment/beaver-7b-v1.0-cost 57.98 Imported 2026-05-06
159 ContextualAI/archangel_sft-dpo_llama30b 56.18 Imported 2026-05-06
160 ContextualAI/archangel_sft-kto_pythia1-4b 55.81 Imported 2026-05-06
161 ContextualAI/archangel_sft-kto_pythia6-9b 55.61 Imported 2026-05-06
162 ContextualAI/archangel_sft-kto_pythia2-8b 54.97 Imported 2026-05-06
163 Qwen/Qwen1.5-4B-Chat 54.77 Imported 2026-05-06
164 ContextualAI/archangel_sft-dpo_llama13b 54.00 Imported 2026-05-06
165 ContextualAI/archangel_sft-kto_llama7b 53.88 Imported 2026-05-06
166 ContextualAI/archangel_sft-dpo_llama7b 53.04 Imported 2026-05-06
167 Qwen/Qwen1.5-0.5B-Chat 52.98 Imported 2026-05-06
168 ContextualAI/archangel_sft-dpo_pythia2-8b 52.86 Imported 2026-05-06
169 my_model/ 52.67 Imported 2026-05-06
170 ContextualAI/archangel_sft-dpo_pythia6-9b 52.63 Imported 2026-05-06
171 ai2/llama-2-chat-nectar-180k.json 52.35 Imported 2026-05-06
172 ContextualAI/archangel_sft-dpo_pythia1-4b 52.33 Imported 2026-05-06
173 stanfordnlp/SteamSHP-flan-t5-xl 51.35 Imported 2026-05-06
174 SultanR/SmolTulu-1.7b-RM 50.94 Imported 2026-05-06
175 ContextualAI/archangel_sft-kto_pythia12-0b 50.53 Imported 2026-05-06
176 weqweasdas/hh_rlhf_rm_open_llama_3b 50.27 Imported 2026-05-06
177 ContextualAI/archangel_sft-dpo_pythia12-0b 50.09 Imported 2026-05-06
178 random 50 Imported 2026-05-06
179 stanfordnlp/SteamSHP-flan-t5-large 49.62 Imported 2026-05-06
180 allenai/tulu-v2.5-13b-uf-rm 48.06 Imported 2026-05-06
181 PKU-Alignment/beaver-7b-v1.0-reward 47.27 Imported 2026-05-06
182 allenai/Llama-3.1-70B-Instruct-RM-RB2 90.21 Imported 2026-05-06
183 allenai/Llama-3.1-8B-Instruct-RM-RB2 88.85 Imported 2026-05-06
184 allenai/Llama-3.1-8B-Base-RM-RB2 84.63 Imported 2026-05-06
185 allenai/Llama-3.1-Tulu-3-8B-SFT-RM-RB2 85.51 Imported 2026-05-06
186 allenai/Llama-3.1-Tulu-3-8B-DPO-RM-RB2 84.31 Imported 2026-05-06
187 allenai/Llama-3.1-Tulu-3-8B-RL-RM-RB2 83.69 Imported 2026-05-06
188 allenai/Llama-3.1-Tulu-3-70B-SFT-RM-RB2 88.92 Imported 2026-05-06