CRAG
CRAG (Comprehensive RAG Benchmark) is a factual question answering benchmark consisting of 4,409 question-answer pairs across 5 domains (finance, sports, music, movie, open domain) and 8 question categories. The benchmark includes mock APIs to simulate web and Knowledge Graph search, designed to represent the diverse and dynamic nature of real-world QA tasks with temporal dynamism ranging from years to seconds. It evaluates retrieval-augmented generation systems for trustworthy question answering.
3rows
scoreprimary metric
2026-05-06sampled
Metadata
Metrics
Score, Normalized Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Nova Pro | 0.50 | Nova Pro 1.0 amazon-nova-pro-v1 | Self-reported | 2026-05-06 |
| 2 | Nova Lite | 0.44 | Nova Lite 1.0 amazon-nova-lite-v1 | Self-reported | 2026-05-06 |
| 3 | Nova Micro | 0.43 | Nova Micro 1.0 amazon-nova-micro-v1 | Self-reported | 2026-05-06 |
No matching rows.