PlaceboBench | BenchmarkList

Metadata

Non-Hallucination Rate (higher is better), Hallucination Rate (lower is better), Hallucinations per Answer (lower is better), Sample Count
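
The page lists these metrics without defining them. As a rough sketch only, assuming each answer is annotated with a count of hallucinations and that the Non-Hallucination Rate is the percentage of answers with a count of zero, the four metrics could be derived as shown below. The function name `summarize`, the input format, and the definitions themselves are hypothetical and not confirmed by the benchmark.

```python
from typing import Dict, List


def summarize(hallucinations_per_answer: List[int]) -> Dict[str, float]:
    """Derive the four listed metrics from per-answer hallucination counts.

    Assumption (not confirmed by the source): an answer is "non-hallucinated"
    when its count is zero, and both rates are expressed as percentages.
    """
    n = len(hallucinations_per_answer)
    if n == 0:
        raise ValueError("need at least one answer")
    # Answers with no hallucinations at all.
    clean = sum(1 for count in hallucinations_per_answer if count == 0)
    return {
        "non_hallucination_rate": 100.0 * clean / n,
        "hallucination_rate": 100.0 * (n - clean) / n,
        "hallucinations_per_answer": sum(hallucinations_per_answer) / n,
        "sample_count": n,
    }


# Hypothetical input: four answers, two of which contain hallucinations.
print(summarize([0, 2, 0, 1]))
```

Under these assumptions the two rates always sum to 100, while Hallucinations per Answer can exceed 1 when individual answers contain several hallucinations.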

Rank	Subject	Non-Hallucination Rate (%)	Model Match	Provenance	Sampled
1	gemini-3-pro-preview	73.913	Gemini 3 google-gemini-3	Imported	2026-05-27
2	gpt-5.2	63.2353	GPT-5.2 openai-gpt-5.2	Imported	2026-05-27
3	claude-sonnet-4-5	62.3188	Claude Sonnet 4.5 anthropic-claude-sonnet-4.5	Imported	2026-05-27
4	accounts/fireworks/models/kimi-k2p5	53.6232	—	Imported	2026-05-27
5	gemini-3-flash-preview	44.9275	Gemini 3 Flash Preview google-gemini-3-flash-preview	Imported	2026-05-27
6	gpt-5-mini	39.1304	GPT-5 Mini openai-gpt-5-mini	Imported	2026-05-27
7	claude-opus-4-6	36.2319	Claude Opus 4.6 anthropic-claude-opus-4.6	Imported	2026-05-27