kluster.ai LLM Hallucination Detection Leaderboard

Hallucination-detection leaderboard reporting RAG hallucination rates on HaluEval-QA and non-RAG hallucination rates on UltraChat-style prompts.

17 rows
Primary metric: combined_non_hallucination_rate
Sampled: 2026-05-06

Metadata

Metrics

Combined Non-Hallucination Rate (higher is better), RAG Hallucination Rate (lower is better), Non-RAG Hallucination Rate (lower is better), RAG Method 1 Hallucination Rate (lower is better), RAG Method 2 Hallucination Rate (lower is better), RAG Method 3 Hallucination Rate (lower is better)

Latest Results

Rows are parsed from public Space CSV files. The score (Combined Non-Hallucination Rate) is computed as 100 minus the mean of the RAG and non-RAG hallucination rates, so higher is better. The source hallucination-rate columns are preserved.
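As a rough illustration of the scoring rule described above, the following Python sketch computes the combined score from two hallucination rates and ranks rows by it. The CSV column names (`model`, `rag_rate`, `non_rag_rate`), the helper names, and the sample values are all assumptions made for the example; they are not taken from the actual Space CSV files or from the leaderboard data.

```python
import csv
import io


def combined_non_hallucination_rate(rag_rate: float, non_rag_rate: float) -> float:
    """Return 100 minus the mean of the two hallucination rates (in percent)."""
    return 100.0 - (rag_rate + non_rag_rate) / 2.0


def rank_rows(csv_text: str) -> list[tuple[int, str, float]]:
    """Parse CSV text, score each row, and rank by score (highest first).

    Column names are hypothetical; the real CSV schema may differ.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    scored = [
        (
            row["model"],
            combined_non_hallucination_rate(
                float(row["rag_rate"]), float(row["non_rag_rate"])
            ),
        )
        for row in reader
    ]
    # Higher combined score means fewer hallucinations, so sort descending.
    scored.sort(key=lambda item: item[1], reverse=True)
    return [
        (rank, name, round(score, 2))
        for rank, (name, score) in enumerate(scored, start=1)
    ]


# Made-up sample data, for illustration only.
SAMPLE_CSV = """model,rag_rate,non_rag_rate
model-a,2.0,1.0
model-b,0.5,0.5
"""

if __name__ == "__main__":
    for rank, name, score in rank_rows(SAMPLE_CSV):
        print(rank, name, score)
```

Averaging the two rates gives RAG and non-RAG behavior equal weight in the combined score, which matches the description above. A weighted variant would need a different formula.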

| Rank | Subject | Combined Non-Hallucination Rate | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | gemini-2.5-pro | 99.03 | Gemini 2.5 Pro (google-gemini-2.5-pro) | Imported | 2026-05-06 |
| 2 | claude-sonnet-4 | 98.59 | Claude Sonnet 4 (anthropic-claude-sonnet-4) | Imported | 2026-05-06 |
| 3 | DeepSeek-R1-0528 | 98.48 | R1 0528 (deepseek-deepseek-r1-0528) | Imported | 2026-05-06 |
| 4 | klusterai-Meta-Llama-3.3-70B-Instruct-Turbo | 98.39 | | Imported | 2026-05-06 |
| 5 | Llama-4-Maverick-17B-128E-Instruct-FP8 | 97.98 | | Imported | 2026-05-06 |
| 6 | gemma-3-27b-it | 97.91 | Gemma 3 27B (google-gemma-3-27b-it) | Imported | 2026-05-06 |
| 7 | DeepSeek-V3-0324 | 97.22 | DeepSeek V3 0324 (deepseek-deepseek-chat-v3-0324) | Imported | 2026-05-06 |
| 8 | kimi-k2 | 97.03 | KIMI MoonshotAI: Kimi K2 0711 (moonshotai-kimi-k2) | Imported | 2026-05-06 |
| 9 | gpt-4o | 96.66 | GPT-4o (openai-gpt-4o) | Imported | 2026-05-06 |
| 10 | Llama-4-Scout-17B-16E-Instruct | 96.64 | Llama 4 Scout (meta-llama-llama-4-scout) | Imported | 2026-05-06 |
| 11 | Qwen3-235B-A22B-FP8 | 95.88 | Qwen3 235B A22B (qwen-qwen3-235b-a22b) | Imported | 2026-05-06 |
| 12 | Qwen3-235B-A22B-2507-FP8 | 95.83 | Qwen3 235B A22B (qwen-qwen3-235b-a22b) | Imported | 2026-05-06 |
| 13 | Mistral-Small-24B-Instruct-2501 | 93.70 | Mistral: Mistral Small 3 (mistralai-mistral-small-24b-instruct-2501) | Imported | 2026-05-06 |
| 14 | Qwen2.5-VL-7B-Instruct | 93.05 | | Imported | 2026-05-06 |
| 15 | Mistral-Nemo-Instruct-2407 | 90.31 | Mistral: Mistral Nemo (mistralai-mistral-nemo) | Imported | 2026-05-06 |
| 16 | klusterai-Meta-Llama-3.1-8B-Instruct-Turbo | 89.70 | | Imported | 2026-05-06 |
| 17 | Magistral-Small-2506 | 81.66 | | Imported | 2026-05-06 |