MedBrowseComp
Medical browsing and search benchmark for multi-hop clinical research questions over live or web-grounded medical sources.
10 rows
Primary metric: real_accuracy
Sampled: 2026-05-27
Metadata
Metrics: Accuracy, Real Accuracy, Correct Count, Question Count
| Rank | Subject | Real Accuracy (correct / questions) | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | perplexity_deep_research | 14/48 | — | Imported | 2026-05-27 |
| 2 | Gemini Pro | 12/48 | — | Imported | 2026-05-27 |
| 3 | Gemini Pro + tools | 111/453 | — | Imported | 2026-05-27 |
| 4 | perplexity_sonar_pro | 9/48 | — | Imported | 2026-05-27 |
| 5 | Gemini 2.0 Flash + tools | 75/453 | — | Imported | 2026-05-27 |
| 6 | Sonar Pro | 74/453 | — | Imported | 2026-05-27 |
| 7 | Gemini Pro (param) | 4/48 | — | Imported | 2026-05-27 |
| 8 | Gemini 2.0 Flash | 31/453 | — | Imported | 2026-05-27 |
| 9 | Gemini Pro (param) | 23/453 | — | Imported | 2026-05-27 |
| 10 | GPT-4.1 + tools | 19/453 | — | Imported | 2026-05-27 |
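
The ranking above can be reproduced from the raw counts. The sketch below assumes that Real Accuracy is simply Correct Count divided by Question Count, which the fractions in the table suggest but the page does not state outright. The names `ROWS`, `real_accuracy`, and `rank_rows` are illustrative, not part of any benchmark tooling. Note that the rows use two different denominators (48 and 453), so the ratios are presumably not all computed over the same question set.

```python
from fractions import Fraction

# (subject, correct_count, question_count), copied from the table above.
ROWS = [
    ("perplexity_deep_research", 14, 48),
    ("Gemini Pro", 12, 48),
    ("Gemini Pro + tools", 111, 453),
    ("perplexity_sonar_pro", 9, 48),
    ("Gemini 2.0 Flash + tools", 75, 453),
    ("Sonar Pro", 74, 453),
    ("Gemini Pro (param)", 4, 48),
    ("Gemini 2.0 Flash", 31, 453),
    ("Gemini Pro (param)", 23, 453),
    ("GPT-4.1 + tools", 19, 453),
]


def real_accuracy(correct: int, total: int) -> Fraction:
    """Assumed definition: correct answers over questions asked (exact ratio)."""
    return Fraction(correct, total)


def rank_rows(rows):
    """Sort rows by the assumed real-accuracy ratio, highest first."""
    return sorted(rows, key=lambda r: real_accuracy(r[1], r[2]), reverse=True)


if __name__ == "__main__":
    for rank, (subject, correct, total) in enumerate(rank_rows(ROWS), start=1):
        pct = 100 * correct / total
        print(f"{rank:>2}  {subject:<26} {correct:>3}/{total:<3}  {pct:5.1f}%")
```

Using `Fraction` keeps the comparison exact, so near-ties such as 75/453 versus 74/453 are ordered without floating-point rounding.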