Health Memory Arena
Event-driven longitudinal health-agent benchmark over synthetic patient trajectories, evaluating lookup, trend, comparison, anomaly, and explanation capabilities.
17 rows
Primary metric: total_score
Sampled: 2026-05-06
Metadata
Metrics: Total Score, Lookup, Trend, Comparison, Anomaly, Explanation
Showing 2 latest source slices.
| Rank | Subject | Total Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Mirobody (smart) | 61.40 | — | Imported | 2026-05-06 |
| 2 | Mirobody (expert) | 58.20 | — | Imported | 2026-05-06 |
| 3 | LLM-only (gemini-3.1) | 57.90 | — | Imported | 2026-05-06 |
| 4 | Mirobody (general) | 54.80 | — | Imported | 2026-05-06 |
| 5 | HippoRAG (k=10) | 52.20 | — | Imported | 2026-05-06 |
| 6 | LLM-only (gpt-5.4) | 51.70 | — | Imported | 2026-05-06 |
| 7 | Dyggraph | 51.60 | — | Imported | 2026-05-06 |
| 8 | LLM-only (glm-5) | 47.10 | — | Imported | 2026-05-06 |
| 9 | LLM-only (sonnet-4.6) | 41.40 | — | Imported | 2026-05-06 |
| 10 | LLM-only (minimax-m2.5) | 39.70 | — | Imported | 2026-05-06 |

| Rank | Subject | Total Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Mirobody (smart-general) | 62.10 | — | Imported | 2026-05-06 |
| 2 | Mirobody (smart-expert) | 59.00 | — | Imported | 2026-05-06 |
| 3 | LLM-only (sonnet-4.6) | 57.60 | — | Imported | 2026-05-06 |
| 4 | LLM-only (gpt-5.4) | 53.10 | — | Imported | 2026-05-06 |
| 5 | LLM-only (glm-5.1) | 53.00 | — | Imported | 2026-05-06 |
| 6 | LLM-only (minimax-m2.7) | 48.50 | — | Imported | 2026-05-06 |
| 7 | Mirobody (general) | 38.40 | — | Imported | 2026-05-06 |
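
The rank column restarts at 1 partway through the rows, which is consistent with the two source slices mentioned above, each ranked separately by the primary metric. The following is a minimal sketch of that ranking, assuming ranks are assigned by sorting Total Score in descending order within each slice. The slice labels `1` and `2` and the function name `rank_within_slices` are hypothetical and used here only for illustration; the scores are copied from the rows above.

```python
# (slice_id, subject, total_score); slice_id is a hypothetical label
# for the two groups of rows, not a name taken from the leaderboard.
ROWS = [
    (1, "Mirobody (smart)", 61.40),
    (1, "Mirobody (expert)", 58.20),
    (1, "LLM-only (gemini-3.1)", 57.90),
    (1, "Mirobody (general)", 54.80),
    (1, "HippoRAG (k=10)", 52.20),
    (1, "LLM-only (gpt-5.4)", 51.70),
    (1, "Dyggraph", 51.60),
    (1, "LLM-only (glm-5)", 47.10),
    (1, "LLM-only (sonnet-4.6)", 41.40),
    (1, "LLM-only (minimax-m2.5)", 39.70),
    (2, "Mirobody (smart-general)", 62.10),
    (2, "Mirobody (smart-expert)", 59.00),
    (2, "LLM-only (sonnet-4.6)", 57.60),
    (2, "LLM-only (gpt-5.4)", 53.10),
    (2, "LLM-only (glm-5.1)", 53.00),
    (2, "LLM-only (minimax-m2.7)", 48.50),
    (2, "Mirobody (general)", 38.40),
]


def rank_within_slices(rows):
    """Group rows by slice, then rank each group by score, highest first."""
    by_slice = {}
    for slice_id, subject, score in rows:
        by_slice.setdefault(slice_id, []).append((subject, score))

    ranked = {}
    for slice_id, entries in by_slice.items():
        ordered = sorted(entries, key=lambda entry: entry[1], reverse=True)
        ranked[slice_id] = [
            (rank, subject, score)
            for rank, (subject, score) in enumerate(ordered, start=1)
        ]
    return ranked


ranked = rank_within_slices(ROWS)
for slice_id, entries in ranked.items():
    print(f"Slice {slice_id}")
    for rank, subject, score in entries:
        print(f"  {rank:>2}  {subject:<28} {score:.2f}")
```

Under that assumption, the sketch reproduces the ranks shown in the table: ten rows in the first group and seven in the second, for 17 rows in total.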