Vectara HHEM Hallucination Leaderboard
A leaderboard that uses Vectara's Hughes Hallucination Evaluation Model (HHEM) to measure hallucination and factual consistency in document summarization.
102 rows
Primary metric: factual_consistency_rate
Sampled: 2026-05-06
Metadata
Metrics
Factual Consistency Rate, Hallucination Rate (lower is better), Answer Rate, Average Summary Length
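The table reports only the Factual Consistency Rate, so the Hallucination Rate listed among the metrics has to be derived. The sketch below assumes the two metrics are complementary percentages (hallucination rate = 100 minus factual consistency rate); that relationship is an assumption, not stated on this page. The helper name `hallucination_rate` and the three-row sample are illustrative only, with scores copied from the table.

```python
def hallucination_rate(factual_consistency_rate: float) -> float:
    """Return the complement of a factual consistency rate, both in percent.

    Assumption: the two metrics sum to 100. This page does not state that.
    """
    if not 0.0 <= factual_consistency_rate <= 100.0:
        raise ValueError("rate must be a percentage between 0 and 100")
    # Round to two decimals to match the precision used in the table.
    return round(100.0 - factual_consistency_rate, 2)


# A few rows copied from the leaderboard (subject -> factual consistency rate).
sample = {
    "antgroup/finix_s1_32b-": 98.20,
    "openai/gpt-5.4-nano-2026-03-17": 96.90,
    "mistralai/ministral-3b-2512": 75.80,
}

rates = {model: hallucination_rate(fcr) for model, fcr in sample.items()}

for model, rate in rates.items():
    print(f"{model}: {rate:.2f}% hallucination rate")
```

Because lower is better for the hallucination rate, sorting by it in ascending order gives the same ranking as sorting by factual consistency rate in descending order.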
| Rank | Subject | Factual Consistency Rate (%) | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | antgroup/finix_s1_32b- | 98.20 | — | Imported | 2026-05-06 |
| 2 | openai/gpt-5.4-nano-2026-03-17 | 96.90 | GPT-5.4 Nano openai-gpt-5.4-nano | Imported | 2026-05-06 |
| 3 | google/gemini-2.5-flash-lite- | 96.70 | Gemini 2.5 Flash Lite google-gemini-2.5-flash-lite | Imported | 2026-05-06 |
| 4 | microsoft/Phi-4- | 96.30 | Phi 4 microsoft-phi-4 | Imported | 2026-05-06 |
| 5 | meta-llama/Llama-3.3-70B-Instruct-Turbo- | 95.90 | — | Imported | 2026-05-06 |
| 6 | snowflake/snowflake-arctic-instruct- | 95.70 | — | Imported | 2026-05-06 |
| 7 | google/gemma-3-12b-it- | 95.60 | Gemma 3 12B google-gemma-3-12b-it | Imported | 2026-05-06 |
| 8 | mistralai/mistral-large-2411 | 95.50 | Mistral Large 2411 mistralai-mistral-large-2411 | Imported | 2026-05-06 |
| 9 | qwen/qwen3-8b- | 95.20 | Qwen3 8B qwen-qwen3-8b | Imported | 2026-05-06 |
| 10 | amazon/nova-2-lite-v1:0- | 94.90 | — | Imported | 2026-05-06 |
| 11 | amazon/nova-pro-v1:0- | 94.90 | — | Imported | 2026-05-06 |
| 12 | mistralai/mistral-small-2501 | 94.90 | — | Imported | 2026-05-06 |
| 13 | google/gemma-4-26b-a4b-it- | 94.80 | Gemma 4 26B A4B google-gemma-4-26b-a4b-it | Imported | 2026-05-06 |
| 14 | ibm-granite/granite-4.0-h-small- | 94.80 | — | Imported | 2026-05-06 |
| 15 | ai21labs/jamba-mini-2- | 94.70 | — | Imported | 2026-05-06 |
| 16 | deepseek-ai/DeepSeek-V3.2-Exp- | 94.70 | — | Imported | 2026-05-06 |
| 17 | qwen/qwen3-14b- | 94.60 | Qwen3 14B qwen-qwen3-14b | Imported | 2026-05-06 |
| 18 | amazon/nova-micro-v1:0- | 94.50 | — | Imported | 2026-05-06 |
| 19 | deepseek-ai/DeepSeek-V3.1- | 94.50 | — | Imported | 2026-05-06 |
| 20 | openai/gpt-5.4-mini-2026-03-17 | 94.50 | GPT-5.4 Mini openai-gpt-5.4-mini | Imported | 2026-05-06 |
| 21 | openai/gpt-4.1-2025-04-14 | 94.40 | GPT-4.1 openai-gpt-4.1 | Imported | 2026-05-06 |
| 22 | qwen/qwen3-4b- | 94.30 | — | Imported | 2026-05-06 |
| 23 | xai-org/grok-3- | 94.20 | — | Imported | 2026-05-06 |
| 24 | qwen/qwen3-32b- | 94.10 | Qwen3 32B qwen-qwen3-32b | Imported | 2026-05-06 |
| 25 | amazon/nova-lite-v1:0- | 93.90 | — | Imported | 2026-05-06 |
| 26 | deepseek-ai/DeepSeek-V3- | 93.90 | — | Imported | 2026-05-06 |
| 27 | deepseek-ai/DeepSeek-V3.2- | 93.70 | DeepSeek V3.2 deepseek-deepseek-v3.2 | Imported | 2026-05-06 |
| 28 | google/gemma-3-4b-it- | 93.60 | Gemma 3 4B google-gemma-3-4b-it | Imported | 2026-05-06 |
| 29 | CohereLabs/command-r-plus-08-2024 | 93.10 | — | Imported | 2026-05-06 |
| 30 | arcee-ai/trinity-large-preview- | 93.10 | Trinity Large Preview arcee-ai-trinity-large-preview | Imported | 2026-05-06 |
| 31 | google/gemini-2.5-pro- | 93.00 | Gemini 2.5 Pro google-gemini-2.5-pro | Imported | 2026-05-06 |
| 32 | openai/gpt-5.4-2026-03-05 | 93.00 | GPT-5.4 openai-gpt-5.4 | Imported | 2026-05-06 |
| 33 | mistralai/ministral-3b-2410 | 92.70 | — | Imported | 2026-05-06 |
| 34 | google/gemma-3-27b-it- | 92.60 | Gemma 3 27B google-gemma-3-27b-it | Imported | 2026-05-06 |
| 35 | google/gemma-4-31b-it- | 92.60 | Gemma 4 31B google-gemma-4-31b-it | Imported | 2026-05-06 |
| 36 | mistralai/ministral-8b-2410 | 92.60 | — | Imported | 2026-05-06 |
| 37 | meta-llama/Llama-4-Scout-17B-16E-Instruct- | 92.30 | Llama 4 Scout meta-llama-llama-4-scout | Imported | 2026-05-06 |
| 38 | google/gemini-2.5-flash- | 92.20 | Gemini 2.5 Flash google-gemini-2.5-flash | Imported | 2026-05-06 |
| 39 | google/gemini-3.1-flash-lite-preview- | 91.80 | Gemini 3.1 Flash Lite Preview google-gemini-3.1-flash-lite-preview | Imported | 2026-05-06 |
| 40 | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8- | 91.80 | — | Imported | 2026-05-06 |
| 41 | openai/gpt-5.4-pro-2026-03-05 | 91.70 | GPT-5.4 Pro openai-gpt-5.4-pro | Imported | 2026-05-06 |
| 42 | openai/gpt-5.2-low-2025-12-11 | 91.60 | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-06 |
| 43 | MiniMaxAI/minimax-m2p5- | 90.90 | — | Imported | 2026-05-06 |
| 44 | CohereLabs/command-a-03-2025 | 90.70 | — | Imported | 2026-05-06 |
| 45 | qwen/qwen3-235b-a22b- | 90.70 | Qwen3 235B A22B qwen-qwen3-235b-a22b | Imported | 2026-05-06 |
| 46 | qwen/qwen3-next-80b-a3b-thinking- | 90.70 | Qwen3 Next 80B A3B Thinking qwen-qwen3-next-80b-a3b-thinking | Imported | 2026-05-06 |
| 47 | zai-org/GLM-4.5-AIR-FP8- | 90.70 | GLM 4.5 Air z-ai-glm-4.5-air | Imported | 2026-05-06 |
| 48 | zai-org/glm-4p7-flash- | 90.70 | GLM 4.7 Flash z-ai-glm-4.7-flash | Imported | 2026-05-06 |
| 49 | CohereLabs/c4ai-aya-expanse-8b- | 90.50 | — | Imported | 2026-05-06 |
| 50 | zai-org/GLM-4.6- | 90.50 | GLM 4.6 z-ai-glm-4.6 | Imported | 2026-05-06 |
| 51 | nvidia/Nemotron-3-Nano-30B-A3B- | 90.40 | Nemotron 3 Nano 30B A3B nvidia-nemotron-3-nano-30b-a3b | Imported | 2026-05-06 |
| 52 | openai/gpt-4o-2024-08-06 | 90.40 | GPT-4o (2024-08-06) openai-gpt-4o-2024-08-06 | Imported | 2026-05-06 |
| 53 | ai21labs/jamba-large-1.7-2025-07 | 90.30 | — | Imported | 2026-05-06 |
| 54 | anthropic/claude-haiku-4-5-20251001 | 90.20 | Claude Haiku 4.5 anthropic-claude-haiku-4.5 | Imported | 2026-05-06 |
| 55 | zai-org/glm-5- | 89.90 | GLM 5 z-ai-glm-5 | Imported | 2026-05-06 |
| 56 | anthropic/claude-sonnet-4-20250514 | 89.70 | — | Imported | 2026-05-06 |
| 57 | google/gemini-3.1-pro-preview- | 89.60 | Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview | Imported | 2026-05-06 |
| 58 | openai/gpt-5-nano-2025-08-07 | 89.50 | GPT-5 Nano openai-gpt-5-nano | Imported | 2026-05-06 |
| 59 | qwen/qwen3.5-35b-a3b- | 89.50 | Qwen3.5-35B-A3B qwen-qwen3.5-35b-a3b | Imported | 2026-05-06 |
| 60 | qwen/qwen3.5-flash-2026-02-23 | 89.50 | — | Imported | 2026-05-06 |
| 61 | anthropic/claude-sonnet-4-6- | 89.40 | Claude Sonnet 4.6 anthropic-claude-sonnet-4.6 | Imported | 2026-05-06 |
| 62 | ibm-granite/granite-3.3-8b-instruct- | 89.40 | — | Imported | 2026-05-06 |
| 63 | qwen/qwen3.5-plus-2026-02-15 | 89.30 | — | Imported | 2026-05-06 |
| 64 | openai/gpt-5.2-high-2025-12-11 | 89.20 | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-06 |
| 65 | CohereLabs/c4ai-aya-expanse-32b- | 89.10 | — | Imported | 2026-05-06 |
| 66 | anthropic/claude-opus-4-5-20251101 | 89.10 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 67 | openai/gpt-5.1-low-2025-11-13 | 89.10 | GPT-5.1 openai-gpt-5.1 | Imported | 2026-05-06 |
| 68 | qwen/qwen3.5-122b-a10b- | 88.80 | Qwen3.5-122B-A10B qwen-qwen3.5-122b-a10b | Imported | 2026-05-06 |
| 69 | deepseek-ai/DeepSeek-R1- | 88.70 | R1 deepseek-r1 | Imported | 2026-05-06 |
| 70 | zai-org/glm-4p7- | 88.30 | GLM 4.7 z-ai-glm-4.7 | Imported | 2026-05-06 |
| 71 | MiniMaxAI/minimax-m2p1- | 88.20 | — | Imported | 2026-05-06 |
| 72 | anthropic/claude-opus-4-1-20250805 | 88.20 | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 73 | anthropic/claude-opus-4-20250514 | 88.00 | — | Imported | 2026-05-06 |
| 74 | anthropic/claude-opus-4-7- | 88.00 | Claude Opus 4.7 anthropic-claude-opus-4.7 | Imported | 2026-05-06 |
| 75 | anthropic/claude-sonnet-4-5-20250929 | 88.00 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 76 | openai/gpt-5.1-high-2025-11-13 | 87.90 | GPT-5.1 openai-gpt-5.1 | Imported | 2026-05-06 |
| 77 | qwen/qwen3.5-27b- | 87.90 | Qwen3.5-27B qwen-qwen3.5-27b | Imported | 2026-05-06 |
| 78 | anthropic/claude-opus-4-6- | 87.80 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 79 | inceptionlabs/mercury-2- | 87.70 | — | Imported | 2026-05-06 |
| 80 | MiniMaxAI/minimax-m2p7- | 87.10 | — | Imported | 2026-05-06 |
| 81 | openai/gpt-5-mini-2025-08-07 | 87.10 | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-06 |
| 82 | google/gemini-3-flash-preview- | 86.50 | Gemini 3 Flash Preview google-gemini-3-flash-preview | Imported | 2026-05-06 |
| 83 | google/gemini-3-pro-preview- | 86.40 | Gemini 3 google-gemini-3 | Imported | 2026-05-06 |
| 84 | moonshotai/Kimi-K2.5- | 85.80 | MoonshotAI: Kimi K2.5 moonshotai-kimi-k2.5 | Imported | 2026-05-06 |
| 85 | openai/gpt-oss-120b- | 85.80 | gpt-oss-120b openai-gpt-oss-120b | Imported | 2026-05-06 |
| 86 | mistralai/mistral-large-2512 | 85.50 | Mistral: Mistral Large 3 2512 mistralai-mistral-large-2512 | Imported | 2026-05-06 |
| 87 | ai21labs/jamba-mini-1.7-2025-07 | 85.30 | — | Imported | 2026-05-06 |
| 88 | openai/gpt-5-minimal-2025-08-07 | 85.30 | GPT-5 openai-gpt-5 | Imported | 2026-05-06 |
| 89 | openai/gpt-5-high-2025-08-07 | 84.90 | GPT-5 openai-gpt-5 | Imported | 2026-05-06 |
| 90 | xai-org/grok-4-1-fast-non-reasoning- | 82.20 | — | Imported | 2026-05-06 |
| 91 | moonshotai/Kimi-K2-Instruct-0905 | 82.10 | MoonshotAI: Kimi K2 0905 moonshotai-kimi-k2-0905 | Imported | 2026-05-06 |
| 92 | openai/o4-mini-high-2025-04-16 | 81.40 | o4 Mini High openai-o4-mini-high | Imported | 2026-05-06 |
| 93 | openai/o4-mini-low-2025-04-16 | 81.40 | — | Imported | 2026-05-06 |
| 94 | xai-org/grok-4-1-fast-reasoning- | 80.80 | — | Imported | 2026-05-06 |
| 95 | mistralai/ministral-14b-2512 | 80.60 | Mistral: Ministral 3 14B 2512 mistralai-ministral-14b-2512 | Imported | 2026-05-06 |
| 96 | xai-org/grok-4-fast-non-reasoning- | 80.30 | — | Imported | 2026-05-06 |
| 97 | xai-org/grok-4-fast-reasoning- | 79.80 | — | Imported | 2026-05-06 |
| 98 | mistralai/ministral-8b-2512 | 78.30 | Mistral: Ministral 3 8B 2512 mistralai-ministral-8b-2512 | Imported | 2026-05-06 |
| 99 | mistralai/mistral-medium-2508 | 77.30 | — | Imported | 2026-05-06 |
| 100 | openai/o3-pro- | 76.70 | o3 Pro openai-o3-pro | Imported | 2026-05-06 |
| 101 | microsoft/Phi-4-mini-instruct- | 76.50 | — | Imported | 2026-05-06 |
| 102 | mistralai/ministral-3b-2512 | 75.80 | Mistral: Ministral 3 3B 2512 mistralai-ministral-3b-2512 | Imported | 2026-05-06 |