ToolAlpaca
ToolAlpaca evaluates tool calling, API use, function selection, structured arguments, and multi-step tool workflows.
5 rows
Primary metric: real_world_api_overall
Sampled: 2026-05-27
Metadata
Metrics
Real-world API Overall, Real-world API Procedure, Real-world API Response, Simulated Tools Overall, Simulated Tools Procedure, Simulated Tools Response, Simulated Tools Human Accept
| Rank | Subject | Real-world API Overall | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-3.5 | 72.8% | — | Imported | 2026-05-27 |
| 2 | ToolAlpaca-13B | 61.4% | — | Imported | 2026-05-27 |
| 3 | ToolAlpaca-7B | 55.3% | — | Imported | 2026-05-27 |
| 4 | Vicuna-13B | 12.3% | — | Imported | 2026-05-27 |
| 5 | Vicuna-7B | 7.9% | — | Imported | 2026-05-27 |