MedQA
MedQA: Evaluates clinical, biomedical, medical-exam, coding, or healthcare-document reasoning.
11 rows
Primary metric: usmle_test_accuracy
Sampled: 2026-05-27
Metadata
Metrics: USMLE test accuracy, USMLE dev accuracy, TWMLE dev accuracy, TWMLE test accuracy
| Rank | Subject | USMLE test accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | BioBERT-Large | 36.7 | — | Imported | 2026-05-27 |
| 2 | BioRoBERTa-Base | 36.1 | — | Imported | 2026-05-27 |
| 3 | IR-Custom | 36.1 | — | Imported | 2026-05-27 |
| 4 | IR-ES | 35.5 | — | Imported | 2026-05-27 |
| 5 | RoBERTa-Large | 35.0 | — | Imported | 2026-05-27 |
| 6 | BERT-Base-En | 34.3 | — | Imported | 2026-05-27 |
| 7 | BioBERT-Base | 34.1 | — | Imported | 2026-05-27 |
| 8 | clinicalBERT-Base | 32.4 | — | Imported | 2026-05-27 |
| 9 | PMI | 31.1 | — | Imported | 2026-05-27 |
| 10 | Max-out | 28.6 | — | Imported | 2026-05-27 |
| 11 | Chance | 25.0 | — | Imported | 2026-05-27 |