HealthAdminBench
Healthcare administration agent benchmark for prior authorization, appeals, durable medical equipment, payer portals, fax, and EHR-adjacent workflows.
7 rows
task_success_rate (primary metric)
2026-05-27 (sampled)
Metadata
Metrics
Task Success Rate, Score, Max Score, Avg. Steps (lower is better), Avg. Time (lower is better)
| Rank | Subject | Task Success Rate | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | openai-cua | 84.94% | — | Imported | 2026-05-27 |
| 2 | anthropic-cua | 81.77% | — | Imported | 2026-05-27 |
| 3 | gemini-3.1 | 73.39% | — | Imported | 2026-05-27 |
| 4 | qwen-3 | 58.38% | — | Imported | 2026-05-27 |
| 5 | kimi-k2-5 | 55.98% | — | Imported | 2026-05-27 |
| 6 | claude-opus-4-6 | 49.24% | — | Imported | 2026-05-27 |
| 7 | gpt-5.4 | 43.49% | — | Imported | 2026-05-27 |
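
For readers who want to work with these results programmatically, the following is a minimal sketch that re-derives the ranking from the Task Success Rate column and reports the gap between the top and bottom entries. The values are copied from the table above; the names `ROWS`, `rank_by_success_rate`, and `spread` are hypothetical and are not part of any HealthAdminBench tooling.

```python
# Illustrative only: rows transcribed from the leaderboard table above.
# Each entry is (subject, task success rate in percent).
ROWS = [
    ("openai-cua", 84.94),
    ("anthropic-cua", 81.77),
    ("gemini-3.1", 73.39),
    ("qwen-3", 58.38),
    ("kimi-k2-5", 55.98),
    ("claude-opus-4-6", 49.24),
    ("gpt-5.4", 43.49),
]


def rank_by_success_rate(rows):
    """Return rows sorted from highest to lowest task success rate."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


def spread(rows):
    """Return the difference in percentage points between best and worst rate."""
    rates = [rate for _, rate in rows]
    return max(rates) - min(rates)


ranked = rank_by_success_rate(ROWS)

if __name__ == "__main__":
    for position, (subject, rate) in enumerate(ranked, start=1):
        print(f"{position}. {subject}: {rate:.2f}%")
    print(f"Spread: {spread(ROWS):.2f} percentage points")
```

Sorting on the primary metric alone reproduces the Rank column; the other listed metrics (Score, Max Score, Avg. Steps, Avg. Time) are not shown in the table and so are left out of the sketch.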