STaRK
Semi-structured Retrieval Benchmark over textual and relational knowledge bases, covering Amazon product search, academic paper search, and biomedicine inquiries.
13 rows
Primary metric: average_mrr
Sampled: 2026-05-06
Metadata
Metrics
Average MRR, STARK-AMAZON Hit@1, STARK-AMAZON Hit@5, STARK-AMAZON R@20, STARK-AMAZON MRR, STARK-MAG Hit@1, STARK-MAG Hit@5, STARK-MAG R@20, STARK-MAG MRR, STARK-PRIME Hit@1, STARK-PRIME Hit@5, STARK-PRIME R@20, STARK-PRIME MRR
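The metric names above follow standard retrieval conventions, which can be sketched as below. This is a minimal illustration and not the benchmark's own evaluation code: the function names and the toy query are hypothetical. It assumes Hit@k means at least one relevant item in the top k, R@20 means recall over the top 20, and MRR means the mean reciprocal rank of the first relevant item. It also assumes Average MRR is the mean of the three per-dataset MRR values, and that the leaderboard reports scores multiplied by 100.

```python
from typing import Sequence, Set


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if any relevant item appears in the top-k results, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item (ranks start at 1), or 0.0 if none."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean; used for MRR over queries and for averaging datasets."""
    return sum(values) / len(values)


# Hypothetical single query: ranked candidate IDs and the ground-truth set.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

h1 = hit_at_k(ranked, relevant, 1)      # top-1 is "d3", not relevant
h5 = hit_at_k(ranked, relevant, 5)      # "d1" is within the top 5
r2 = recall_at_k(ranked, relevant, 2)   # one of two relevant items in top 2
rr = reciprocal_rank(ranked, relevant)  # first relevant item is at rank 2
```

Per-dataset MRR would then be `mean` of `reciprocal_rank` over all queries in that dataset, and under the assumption above, Average MRR would be `mean` of the three dataset-level values.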
| Rank | Subject | Average MRR (%) | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | AvaTaR(gpt-4-turbo) | 48.51 | — | Imported | 2026-05-06 |
| 2 | Claude3 Reranker | 46.81 | — | Imported | 2026-05-06 |
| 3 | GPT4 Reranker | 45.51 | — | Imported | 2026-05-06 |
| 4 | GritLM-7b | 42.07 | — | Imported | 2026-05-06 |
| 5 | multi-ada-002 | 41.05 | — | Imported | 2026-05-06 |
| 6 | ada-002 | 38.27 | — | Imported | 2026-05-06 |
| 7 | voyage-l2-instruct | 33.95 | — | Imported | 2026-05-06 |
| 8 | ColBERTv2 | 33.14 | — | Imported | 2026-05-06 |
| 9 | BM25 | 28.86 | — | Imported | 2026-05-06 |
| 10 | LLM2Vec | 25.15 | — | Imported | 2026-05-06 |
| 11 | ANCE (roberta) | 25.06 | — | Imported | 2026-05-06 |
| 12 | QAGNN (roberta) | 22.08 | — | Imported | 2026-05-06 |
| 13 | DPR (roberta) | 14.05 | — | Imported | 2026-05-06 |