MMLU Professional Medicine
Evaluates clinical, biomedical, medical-exam, coding, or healthcare-document reasoning.
Rows: 5
Primary metric: accuracy
Sampled: 2026-05-27
MMLU Professional Medicine accuracy
| Rank | Subject | MMLU Professional Medicine accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-4 (5-shot) | 93.75% | GPT-4.5 openai-gpt-4.5-preview | Imported | 2026-05-27 |
| 2 | GPT-4 (zero-shot) | 93.01% | GPT-4 openai-gpt-4 | Imported | 2026-05-27 |
| 3 | Flan-PaLM 540B (few-shot) | 83.8% | — | Imported | 2026-05-27 |
| 4 | GPT-3.5 (zero-shot) | 70.22% | — | Imported | 2026-05-27 |
| 5 | GPT-3.5 (5-shot) | 69.85% | — | Imported | 2026-05-27 |
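
To make the accuracy figures easier to interpret, here is a minimal Python sketch that ranks the rows above and converts percentages to question counts and back. The names `Result`, `rank`, `accuracy_pct`, and `implied_correct` are hypothetical helpers, not part of any leaderboard tooling. The test-set size of 272 questions is an assumption; this page does not state it.

```python
from dataclasses import dataclass

# ASSUMPTION: number of test questions in the subset; not stated on this page.
ASSUMED_TOTAL_QUESTIONS = 272


@dataclass(frozen=True)
class Result:
    subject: str
    accuracy: float  # accuracy in percent, copied from the table


# Rows copied from the table above.
RESULTS = [
    Result("GPT-4 (5-shot)", 93.75),
    Result("GPT-4 (zero-shot)", 93.01),
    Result("Flan-PaLM 540B (few-shot)", 83.8),
    Result("GPT-3.5 (zero-shot)", 70.22),
    Result("GPT-3.5 (5-shot)", 69.85),
]


def rank(results):
    """Return results sorted from highest to lowest accuracy."""
    return sorted(results, key=lambda r: r.accuracy, reverse=True)


def accuracy_pct(correct, total):
    """Accuracy as a percentage: correct answers divided by total questions."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def implied_correct(accuracy, total=ASSUMED_TOTAL_QUESTIONS):
    """Nearest whole number of correct answers implied by a percentage."""
    return round(accuracy / 100.0 * total)


if __name__ == "__main__":
    for position, result in enumerate(rank(RESULTS), start=1):
        correct = implied_correct(result.accuracy)
        print(f"{position}. {result.subject}: {result.accuracy}% "
              f"(about {correct}/{ASSUMED_TOTAL_QUESTIONS} under the assumed size)")
```

Under the assumed size, a gap of about 0.37 percentage points (for example, between the two GPT-3.5 rows) corresponds to a single question, so small differences between adjacent ranks may not be meaningful.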