kluster.ai LLM Hallucination Detection Leaderboard
Hallucination-detection leaderboard reporting RAG hallucination rates on HaluEval-QA and non-RAG hallucination rates on UltraChat-style prompts.
Rows: 17
Primary metric: combined_non_hallucination_rate
Sampled: 2026-05-06
Metadata
Metrics
Combined Non-Hallucination Rate (higher is better), RAG Hallucination Rate (lower is better), Non-RAG Hallucination Rate (lower is better), RAG Method 1 Hallucination Rate (lower is better), RAG Method 2 Hallucination Rate (lower is better), RAG Method 3 Hallucination Rate (lower is better)
| Rank | Subject | Combined Non-Hallucination Rate | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | gemini-2.5-pro | 99.03 | Gemini 2.5 Pro (`google-gemini-2.5-pro`) | Imported | 2026-05-06 |
| 2 | claude-sonnet-4 | 98.59 | Claude Sonnet 4 (`anthropic-claude-sonnet-4`) | Imported | 2026-05-06 |
| 3 | DeepSeek-R1-0528 | 98.48 | R1 0528 (`deepseek-deepseek-r1-0528`) | Imported | 2026-05-06 |
| 4 | klusterai-Meta-Llama-3.3-70B-Instruct-Turbo | 98.39 | — | Imported | 2026-05-06 |
| 5 | Llama-4-Maverick-17B-128E-Instruct-FP8 | 97.98 | — | Imported | 2026-05-06 |
| 6 | gemma-3-27b-it | 97.91 | Gemma 3 27B (`google-gemma-3-27b-it`) | Imported | 2026-05-06 |
| 7 | DeepSeek-V3-0324 | 97.22 | DeepSeek V3 0324 (`deepseek-deepseek-chat-v3-0324`) | Imported | 2026-05-06 |
| 8 | kimi-k2 | 97.03 | MoonshotAI: Kimi K2 0711 (`moonshotai-kimi-k2`) | Imported | 2026-05-06 |
| 9 | gpt-4o | 96.66 | GPT-4o (`openai-gpt-4o`) | Imported | 2026-05-06 |
| 10 | Llama-4-Scout-17B-16E-Instruct | 96.64 | Llama 4 Scout (`meta-llama-llama-4-scout`) | Imported | 2026-05-06 |
| 11 | Qwen3-235B-A22B-FP8 | 95.88 | Qwen3 235B A22B (`qwen-qwen3-235b-a22b`) | Imported | 2026-05-06 |
| 12 | Qwen3-235B-A22B-2507-FP8 | 95.83 | Qwen3 235B A22B (`qwen-qwen3-235b-a22b`) | Imported | 2026-05-06 |
| 13 | Mistral-Small-24B-Instruct-2501 | 93.70 | Mistral: Mistral Small 3 (`mistralai-mistral-small-24b-instruct-2501`) | Imported | 2026-05-06 |
| 14 | Qwen2.5-VL-7B-Instruct | 93.05 | — | Imported | 2026-05-06 |
| 15 | Mistral-Nemo-Instruct-2407 | 90.31 | Mistral: Mistral Nemo (`mistralai-mistral-nemo`) | Imported | 2026-05-06 |
| 16 | klusterai-Meta-Llama-3.1-8B-Instruct-Turbo | 89.70 | — | Imported | 2026-05-06 |
| 17 | Magistral-Small-2506 | 81.66 | — | Imported | 2026-05-06 |
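
The Rank column appears to follow the Combined Non-Hallucination Rate in descending order. The following is a minimal sketch of that ordering, using a handful of scores copied from the table above. The names `scores`, `rank_by_score`, and `spread` are hypothetical and chosen for illustration only; the leaderboard's actual ranking code is not shown in the source.

```python
# Scores copied from the table above: subject -> combined non-hallucination rate.
# Only a subset of rows is included to keep the sketch short.
scores = {
    "gpt-4o": 96.66,
    "Magistral-Small-2506": 81.66,
    "gemini-2.5-pro": 99.03,
    "DeepSeek-R1-0528": 98.48,
    "claude-sonnet-4": 98.59,
}


def rank_by_score(table):
    """Return subjects ordered from highest to lowest score.

    Assumption: rank is determined solely by the primary metric,
    with higher values ranking first.
    """
    return sorted(table, key=table.get, reverse=True)


ranking = rank_by_score(scores)

# Gap between the best and worst score in this subset, in the table's own units.
spread = round(max(scores.values()) - min(scores.values()), 2)

for position, subject in enumerate(ranking, start=1):
    print(position, subject, scores[subject])
```

On this subset the resulting order matches the relative order of the same subjects in the table, which is consistent with ranking by the primary metric alone.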