Vectara HHEM Hallucination Leaderboard

A leaderboard that uses Vectara's Hughes Hallucination Evaluation Model (HHEM) to measure hallucination and factual consistency in document summarization.

Rows: 102
Primary metric: factual_consistency_rate
Sampled: 2026-05-06

Metadata

Metrics

Factual Consistency Rate, Hallucination Rate (lower is better), Answer Rate, Average Summary Length

Latest Results

Rows are parsed from the public vectara/results Hugging Face dataset. Rank follows the leaderboard convention of lower hallucination rate first; score is factual consistency rate.
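The ranking rule can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's actual import code: the `Row` class and both function names are hypothetical, hallucination rate is assumed to be the complement of factual consistency rate (100 minus the score), and ties are assumed to break on subject name, which matches the ordering of tied rows in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical container for one leaderboard row."""
    subject: str
    factual_consistency_rate: float  # percent, 0-100


def hallucination_rate(row: Row) -> float:
    # Assumption: hallucination rate is the complement of the
    # factual consistency rate, both expressed in percent.
    return round(100.0 - row.factual_consistency_rate, 2)


def rank_rows(rows: list[Row]) -> list[tuple[int, Row]]:
    # Lower hallucination rate ranks first. The subject-name tie-break
    # is an assumption inferred from the order of tied rows.
    ordered = sorted(rows, key=lambda r: (hallucination_rate(r), r.subject))
    return list(enumerate(ordered, start=1))


# Three rows taken from the table, deliberately out of order.
rows = [
    Row("microsoft/Phi-4-", 96.30),
    Row("mistralai/ministral-3b-2512", 75.80),
    Row("antgroup/finix_s1_32b-", 98.20),
]

ranked = rank_rows(rows)
for rank, row in ranked:
    print(rank, row.subject, row.factual_consistency_rate, hallucination_rate(row))
```

Because the two rates are complements under this assumption, sorting by ascending hallucination rate gives the same order as sorting by descending factual consistency rate.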

| Rank | Subject | Factual Consistency Rate | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | antgroup/finix_s1_32b- | 98.20 | | Imported | 2026-05-06 |
| 2 | openai/gpt-5.4-nano-2026-03-17 | 96.90 | GPT-5.4 Nano, openai-gpt-5.4-nano | Imported | 2026-05-06 |
| 3 | google/gemini-2.5-flash-lite- | 96.70 | Gemini 2.5 Flash Lite, google-gemini-2.5-flash-lite | Imported | 2026-05-06 |
| 4 | microsoft/Phi-4- | 96.30 | Phi 4, microsoft-phi-4 | Imported | 2026-05-06 |
| 5 | meta-llama/Llama-3.3-70B-Instruct-Turbo- | 95.90 | | Imported | 2026-05-06 |
| 6 | snowflake/snowflake-arctic-instruct- | 95.70 | | Imported | 2026-05-06 |
| 7 | google/gemma-3-12b-it- | 95.60 | Gemma 3 12B, google-gemma-3-12b-it | Imported | 2026-05-06 |
| 8 | mistralai/mistral-large-2411 | 95.50 | Mistral Large 2411, mistralai-mistral-large-2411 | Imported | 2026-05-06 |
| 9 | qwen/qwen3-8b- | 95.20 | Qwen3 8B, qwen-qwen3-8b | Imported | 2026-05-06 |
| 10 | amazon/nova-2-lite-v1:0- | 94.90 | | Imported | 2026-05-06 |
| 11 | amazon/nova-pro-v1:0- | 94.90 | | Imported | 2026-05-06 |
| 12 | mistralai/mistral-small-2501 | 94.90 | | Imported | 2026-05-06 |
| 13 | google/gemma-4-26b-a4b-it- | 94.80 | Gemma 4 26B A4B, google-gemma-4-26b-a4b-it | Imported | 2026-05-06 |
| 14 | ibm-granite/granite-4.0-h-small- | 94.80 | | Imported | 2026-05-06 |
| 15 | ai21labs/jamba-mini-2- | 94.70 | | Imported | 2026-05-06 |
| 16 | deepseek-ai/DeepSeek-V3.2-Exp- | 94.70 | | Imported | 2026-05-06 |
| 17 | qwen/qwen3-14b- | 94.60 | Qwen3 14B, qwen-qwen3-14b | Imported | 2026-05-06 |
| 18 | amazon/nova-micro-v1:0- | 94.50 | | Imported | 2026-05-06 |
| 19 | deepseek-ai/DeepSeek-V3.1- | 94.50 | | Imported | 2026-05-06 |
| 20 | openai/gpt-5.4-mini-2026-03-17 | 94.50 | GPT-5.4 Mini, openai-gpt-5.4-mini | Imported | 2026-05-06 |
| 21 | openai/gpt-4.1-2025-04-14 | 94.40 | GPT-4.1, openai-gpt-4.1 | Imported | 2026-05-06 |
| 22 | qwen/qwen3-4b- | 94.30 | | Imported | 2026-05-06 |
| 23 | xai-org/grok-3- | 94.20 | | Imported | 2026-05-06 |
| 24 | qwen/qwen3-32b- | 94.10 | Qwen3 32B, qwen-qwen3-32b | Imported | 2026-05-06 |
| 25 | amazon/nova-lite-v1:0- | 93.90 | | Imported | 2026-05-06 |
| 26 | deepseek-ai/DeepSeek-V3- | 93.90 | | Imported | 2026-05-06 |
| 27 | deepseek-ai/DeepSeek-V3.2- | 93.70 | DeepSeek V3.2, deepseek-deepseek-v3.2 | Imported | 2026-05-06 |
| 28 | google/gemma-3-4b-it- | 93.60 | Gemma 3 4B, google-gemma-3-4b-it | Imported | 2026-05-06 |
| 29 | CohereLabs/command-r-plus-08-2024 | 93.10 | | Imported | 2026-05-06 |
| 30 | arcee-ai/trinity-large-preview- | 93.10 | A Trinity Large Preview, arcee-ai-trinity-large-preview | Imported | 2026-05-06 |
| 31 | google/gemini-2.5-pro- | 93 | Gemini 2.5 Pro, google-gemini-2.5-pro | Imported | 2026-05-06 |
| 32 | openai/gpt-5.4-2026-03-05 | 93 | GPT-5.4, openai-gpt-5.4 | Imported | 2026-05-06 |
| 33 | mistralai/ministral-3b-2410 | 92.70 | | Imported | 2026-05-06 |
| 34 | google/gemma-3-27b-it- | 92.60 | Gemma 3 27B, google-gemma-3-27b-it | Imported | 2026-05-06 |
| 35 | google/gemma-4-31b-it- | 92.60 | Gemma 4 31B, google-gemma-4-31b-it | Imported | 2026-05-06 |
| 36 | mistralai/ministral-8b-2410 | 92.60 | | Imported | 2026-05-06 |
| 37 | meta-llama/Llama-4-Scout-17B-16E-Instruct- | 92.30 | Llama 4 Scout, meta-llama-llama-4-scout | Imported | 2026-05-06 |
| 38 | google/gemini-2.5-flash- | 92.20 | Gemini 2.5 Flash, google-gemini-2.5-flash | Imported | 2026-05-06 |
| 39 | google/gemini-3.1-flash-lite-preview- | 91.80 | Gemini 3.1 Flash Lite Preview, google-gemini-3.1-flash-lite-preview | Imported | 2026-05-06 |
| 40 | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8- | 91.80 | | Imported | 2026-05-06 |
| 41 | openai/gpt-5.4-pro-2026-03-05 | 91.70 | GPT-5.4 Pro, openai-gpt-5.4-pro | Imported | 2026-05-06 |
| 42 | openai/gpt-5.2-low-2025-12-11 | 91.60 | GPT-5.2, openai-gpt-5.2 | Imported | 2026-05-06 |
| 43 | MiniMaxAI/minimax-m2p5- | 90.90 | | Imported | 2026-05-06 |
| 44 | CohereLabs/command-a-03-2025 | 90.70 | | Imported | 2026-05-06 |
| 45 | qwen/qwen3-235b-a22b- | 90.70 | Qwen3 235B A22B, qwen-qwen3-235b-a22b | Imported | 2026-05-06 |
| 46 | qwen/qwen3-next-80b-a3b-thinking- | 90.70 | Qwen3 Next 80B A3B Thinking, qwen-qwen3-next-80b-a3b-thinking | Imported | 2026-05-06 |
| 47 | zai-org/GLM-4.5-AIR-FP8- | 90.70 | GLM GLM 4.5 Air, z-ai-glm-4.5-air | Imported | 2026-05-06 |
| 48 | zai-org/glm-4p7-flash- | 90.70 | GLM GLM 4.7 Flash, z-ai-glm-4.7-flash | Imported | 2026-05-06 |
| 49 | CohereLabs/c4ai-aya-expanse-8b- | 90.50 | | Imported | 2026-05-06 |
| 50 | zai-org/GLM-4.6- | 90.50 | GLM GLM 4.6, z-ai-glm-4.6 | Imported | 2026-05-06 |
| 51 | nvidia/Nemotron-3-Nano-30B-A3B- | 90.40 | Nemotron 3 Nano 30B A3B, nvidia-nemotron-3-nano-30b-a3b | Imported | 2026-05-06 |
| 52 | openai/gpt-4o-2024-08-06 | 90.40 | GPT-4o (2024-08-06), openai-gpt-4o-2024-08-06 | Imported | 2026-05-06 |
| 53 | ai21labs/jamba-large-1.7-2025-07 | 90.30 | | Imported | 2026-05-06 |
| 54 | anthropic/claude-haiku-4-5-20251001 | 90.20 | Claude Haiku 4.5, anthropic-claude-haiku-4.5 | Imported | 2026-05-06 |
| 55 | zai-org/glm-5- | 89.90 | GLM GLM 5, z-ai-glm-5 | Imported | 2026-05-06 |
| 56 | anthropic/claude-sonnet-4-20250514 | 89.70 | | Imported | 2026-05-06 |
| 57 | google/gemini-3.1-pro-preview- | 89.60 | Gemini 3.1 Pro Preview, google-gemini-3.1-pro-preview | Imported | 2026-05-06 |
| 58 | openai/gpt-5-nano-2025-08-07 | 89.50 | GPT-5 Nano, openai-gpt-5-nano | Imported | 2026-05-06 |
| 59 | qwen/qwen3.5-35b-a3b- | 89.50 | Qwen3.5-35B-A3B, qwen-qwen3.5-35b-a3b | Imported | 2026-05-06 |
| 60 | qwen/qwen3.5-flash-2026-02-23 | 89.50 | | Imported | 2026-05-06 |
| 61 | anthropic/claude-sonnet-4-6- | 89.40 | Claude Sonnet 4.6, anthropic-claude-sonnet-4.6 | Imported | 2026-05-06 |
| 62 | ibm-granite/granite-3.3-8b-instruct- | 89.40 | | Imported | 2026-05-06 |
| 63 | qwen/qwen3.5-plus-2026-02-15 | 89.30 | | Imported | 2026-05-06 |
| 64 | openai/gpt-5.2-high-2025-12-11 | 89.20 | GPT-5.2, openai-gpt-5.2 | Imported | 2026-05-06 |
| 65 | CohereLabs/c4ai-aya-expanse-32b- | 89.10 | | Imported | 2026-05-06 |
| 66 | anthropic/claude-opus-4-5-20251101 | 89.10 | Claude Opus 4.5, anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 67 | openai/gpt-5.1-low-2025-11-13 | 89.10 | GPT-5.1, openai-gpt-5.1 | Imported | 2026-05-06 |
| 68 | qwen/qwen3.5-122b-a10b- | 88.80 | Qwen3.5-122B-A10B, qwen-qwen3.5-122b-a10b | Imported | 2026-05-06 |
| 69 | deepseek-ai/DeepSeek-R1- | 88.70 | R1, deepseek-r1 | Imported | 2026-05-06 |
| 70 | zai-org/glm-4p7- | 88.30 | GLM GLM 4.7, z-ai-glm-4.7 | Imported | 2026-05-06 |
| 71 | MiniMaxAI/minimax-m2p1- | 88.20 | | Imported | 2026-05-06 |
| 72 | anthropic/claude-opus-4-1-20250805 | 88.20 | Claude Opus 4.1, anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 73 | anthropic/claude-opus-4-20250514 | 88 | | Imported | 2026-05-06 |
| 74 | anthropic/claude-opus-4-7- | 88 | Claude Opus 4.7, anthropic-claude-opus-4.7 | Imported | 2026-05-06 |
| 75 | anthropic/claude-sonnet-4-5-20250929 | 88 | Claude Sonnet 4.5, anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 76 | openai/gpt-5.1-high-2025-11-13 | 87.90 | GPT-5.1, openai-gpt-5.1 | Imported | 2026-05-06 |
| 77 | qwen/qwen3.5-27b- | 87.90 | Qwen3.5-27B, qwen-qwen3.5-27b | Imported | 2026-05-06 |
| 78 | anthropic/claude-opus-4-6- | 87.80 | Claude Opus 4.6, anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 79 | inceptionlabs/mercury-2- | 87.70 | | Imported | 2026-05-06 |
| 80 | MiniMaxAI/minimax-m2p7- | 87.10 | | Imported | 2026-05-06 |
| 81 | openai/gpt-5-mini-2025-08-07 | 87.10 | GPT-5 Mini, openai-gpt-5-mini | Imported | 2026-05-06 |
| 82 | google/gemini-3-flash-preview- | 86.50 | Gemini 3 Flash Preview, google-gemini-3-flash-preview | Imported | 2026-05-06 |
| 83 | google/gemini-3-pro-preview- | 86.40 | Gemini 3, google-gemini-3 | Imported | 2026-05-06 |
| 84 | moonshotai/Kimi-K2.5- | 85.80 | KIMI MoonshotAI: Kimi K2.5, moonshotai-kimi-k2.5 | Imported | 2026-05-06 |
| 85 | openai/gpt-oss-120b- | 85.80 | gpt-oss-120b, openai-gpt-oss-120b | Imported | 2026-05-06 |
| 86 | mistralai/mistral-large-2512 | 85.50 | Mistral: Mistral Large 3 2512, mistralai-mistral-large-2512 | Imported | 2026-05-06 |
| 87 | ai21labs/jamba-mini-1.7-2025-07 | 85.30 | | Imported | 2026-05-06 |
| 88 | openai/gpt-5-minimal-2025-08-07 | 85.30 | GPT-5, openai-gpt-5 | Imported | 2026-05-06 |
| 89 | openai/gpt-5-high-2025-08-07 | 84.90 | GPT-5, openai-gpt-5 | Imported | 2026-05-06 |
| 90 | xai-org/grok-4-1-fast-non-reasoning- | 82.20 | | Imported | 2026-05-06 |
| 91 | moonshotai/Kimi-K2-Instruct-0905 | 82.10 | KIMI MoonshotAI: Kimi K2 0905, moonshotai-kimi-k2-0905 | Imported | 2026-05-06 |
| 92 | openai/o4-mini-high-2025-04-16 | 81.40 | o4 Mini High, openai-o4-mini-high | Imported | 2026-05-06 |
| 93 | openai/o4-mini-low-2025-04-16 | 81.40 | | Imported | 2026-05-06 |
| 94 | xai-org/grok-4-1-fast-reasoning- | 80.80 | | Imported | 2026-05-06 |
| 95 | mistralai/ministral-14b-2512 | 80.60 | Mistral: Ministral 3 14B 2512, mistralai-ministral-14b-2512 | Imported | 2026-05-06 |
| 96 | xai-org/grok-4-fast-non-reasoning- | 80.30 | | Imported | 2026-05-06 |
| 97 | xai-org/grok-4-fast-reasoning- | 79.80 | | Imported | 2026-05-06 |
| 98 | mistralai/ministral-8b-2512 | 78.30 | Mistral: Ministral 3 8B 2512, mistralai-ministral-8b-2512 | Imported | 2026-05-06 |
| 99 | mistralai/mistral-medium-2508 | 77.30 | | Imported | 2026-05-06 |
| 100 | openai/o3-pro- | 76.70 | o3 Pro, openai-o3-pro | Imported | 2026-05-06 |
| 101 | microsoft/Phi-4-mini-instruct- | 76.50 | | Imported | 2026-05-06 |
| 102 | mistralai/ministral-3b-2512 | 75.80 | Mistral: Ministral 3 3B 2512, mistralai-ministral-3b-2512 | Imported | 2026-05-06 |