Berkeley Function-Calling Leaderboard
Measures AI models' ability to correctly call and use functions in various contexts
Rows: 109
Primary metric: overall_accuracy
Sampled: 2026-05-27
Metadata
Metrics
Overall Accuracy, Non-Live AST Accuracy, Live Accuracy, Multi Turn Accuracy, Web Search Accuracy, Memory Accuracy, Total Cost (lower is better), Latency Mean (lower is better)
| Rank | Subject | Overall Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Claude-Opus-4-5-20251101 (FC) | 77.47% | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-27 |
| 2 | Claude-Sonnet-4-5-20250929 (FC) | 73.24% | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-27 |
| 3 | Gemini-3-Pro-Preview (Prompt) | 72.51% | Gemini 3 google-gemini-3 | Imported | 2026-05-27 |
| 4 | GLM-4.6 (FC thinking) | 72.38% | GLM 4.6 z-ai-glm-4.6 | Imported | 2026-05-27 |
| 5 | Grok-4-1-fast-reasoning (FC) | 69.57% | Grok 4.1 Fast x-ai-grok-4.1-fast | Imported | 2026-05-27 |
| 6 | Claude-Haiku-4-5-20251001 (FC) | 68.7% | Claude Haiku 4.5 anthropic-claude-haiku-4.5 | Imported | 2026-05-27 |
| 7 | Gemini-3-Pro-Preview (FC) | 68.14% | Gemini 3 google-gemini-3 | Imported | 2026-05-27 |
| 8 | o3-2025-04-16 (Prompt) | 63.05% | o3 openai-o3 | Imported | 2026-05-27 |
| 9 | Grok-4-0709 (Prompt) | 62.97% | Grok 4 x-ai-grok-4 | Imported | 2026-05-27 |
| 10 | Grok-4-0709 (FC) | 61.38% | Grok 4 x-ai-grok-4 | Imported | 2026-05-27 |
| 11 | Moonshotai-Kimi-K2-Instruct (FC) | 59.06% | MoonshotAI: Kimi K2 0711 moonshotai-kimi-k2 | Imported | 2026-05-27 |
| 12 | Grok-4-1-fast-non-reasoning (FC) | 58.29% | Grok 4.1 Fast x-ai-grok-4.1-fast | Imported | 2026-05-27 |
| 13 | Command A Reasoning (FC) | 57.06% | — | Imported | 2026-05-27 |
| 14 | DeepSeek-V3.2-Exp (Prompt + Thinking) | 56.73% | DeepSeek V3.2 Exp deepseek-deepseek-v3.2-exp | Imported | 2026-05-27 |
| 15 | Gemini-2.5-Flash (FC) | 56.24% | Gemini 2.5 Flash google-gemini-2.5-flash | Imported | 2026-05-27 |
| 16 | GPT-5.2-2025-12-11 (FC) | 55.87% | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-27 |
| 17 | GPT-5-mini-2025-08-07 (FC) | 55.46% | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-27 |
| 18 | xLAM-2-32b-fc-r (FC) | 54.66% | — | Imported | 2026-05-27 |
| 19 | DeepSeek-V3.2-Exp (FC) | 54.12% | DeepSeek V3.2 Exp deepseek-deepseek-v3.2-exp | Imported | 2026-05-27 |
| 20 | GPT-4.1-2025-04-14 (FC) | 53.96% | GPT-4.1 openai-gpt-4.1 | Imported | 2026-05-27 |
| 21 | o4-mini-2025-04-16 (FC) | 53.24% | o4 Mini openai-o4-mini | Imported | 2026-05-27 |
| 22 | xLAM-2-70b-fc-r (FC) | 53.07% | — | Imported | 2026-05-27 |
| 23 | Qwen3-235B-A22B-Instruct-2507 (Prompt) | 52.15% | Qwen3 235B A22B Instruct 2507 qwen-qwen3-235b-a22b-2507 | Imported | 2026-05-27 |
| 24 | GPT-5-nano-2025-08-07 (FC) | 51.45% | GPT-5 Nano openai-gpt-5-nano | Imported | 2026-05-27 |
| 25 | Nanbeige4-3B-Thinking-2511 (FC) | 51.4% | — | Imported | 2026-05-27 |
| 26 | Gemini-2.5-Flash (Prompt) | 50.9% | Gemini 2.5 Flash google-gemini-2.5-flash | Imported | 2026-05-27 |
| 27 | GPT-4.1-mini-2025-04-14 (FC) | 50.45% | GPT-4.1 Mini openai-gpt-4.1-mini | Imported | 2026-05-27 |
| 28 | o4-mini-2025-04-16 (Prompt) | 50.26% | o4 Mini openai-o4-mini | Imported | 2026-05-27 |
| 29 | Qwen3-32B (FC) | 48.71% | Qwen3 32B qwen-qwen3-32b | Imported | 2026-05-27 |
| 30 | o3-2025-04-16 (FC) | 48.56% | o3 openai-o3 | Imported | 2026-05-27 |
| 31 | Qwen3-235B-A22B-Instruct-2507 (FC) | 47.99% | Qwen3 235B A22B Instruct 2507 qwen-qwen3-235b-a22b-2507 | Imported | 2026-05-27 |
| 32 | Nanbeige3.5-Pro-Thinking (FC) | 47.68% | — | Imported | 2026-05-27 |
| 33 | Qwen3-32B (Prompt) | 46.78% | Qwen3 32B qwen-qwen3-32b | Imported | 2026-05-27 |
| 34 | xLAM-2-8b-fc-r (FC) | 46.68% | — | Imported | 2026-05-27 |
| 35 | Command A (FC) | 46.49% | Command A cohere-command-a | Imported | 2026-05-27 |
| 36 | BitAgent-Bounty-8B | 46.23% | — | Imported | 2026-05-27 |
| 37 | Arch-Agent-32B | 45.37% | — | Imported | 2026-05-27 |
| 38 | GPT-5.2-2025-12-11 (Prompt) | 45.27% | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-27 |
| 39 | Qwen3-8B (FC) | 42.57% | Qwen3 8B qwen-qwen3-8b | Imported | 2026-05-27 |
| 40 | ToolACE-2-8B (FC) | 42.44% | — | Imported | 2026-05-27 |
| 41 | Qwen3-30B-A3B-Instruct-2507 (FC) | 41.39% | Qwen3 30B A3B Instruct 2507 qwen-qwen3-30b-a3b-instruct-2507 | Imported | 2026-05-27 |
| 42 | xLAM-2-3b-fc-r (FC) | 41.22% | — | Imported | 2026-05-27 |
| 43 | Qwen3-14B (FC) | 41.03% | Qwen3 14B qwen-qwen3-14b | Imported | 2026-05-27 |
| 44 | Qwen3-8B (Prompt) | 40.43% | Qwen3 8B qwen-qwen3-8b | Imported | 2026-05-27 |
| 45 | GPT-4.1-2025-04-14 (Prompt) | 39.38% | GPT-4.1 openai-gpt-4.1 | Imported | 2026-05-27 |
| 46 | mistral-large-2411 (FC) | 38.37% | Mistral Large 2411 mistralai-mistral-large-2411 | Imported | 2026-05-27 |
| 47 | Qwen3-14B (Prompt) | 37.77% | Qwen3 14B qwen-qwen3-14b | Imported | 2026-05-27 |
| 48 | Mistral-Medium-2505 | 37.69% | — | Imported | 2026-05-27 |
| 49 | Mistral-Medium-2505 (FC) | 37.56% | — | Imported | 2026-05-27 |
| 50 | Llama-4-Maverick-17B-128E-Instruct-FP8 (FC) | 37.29% | — | Imported | 2026-05-27 |
| 51 | Mistral-small-2506 (FC) | 37.15% | — | Imported | 2026-05-27 |
| 52 | Gemini-2.5-Flash-Lite (FC) | 36.87% | Gemini 2.5 Flash Lite google-gemini-2.5-flash-lite | Imported | 2026-05-27 |
| 53 | Qwen3-30B-A3B-Instruct-2507 (Prompt) | 36.7% | Qwen3 30B A3B Instruct 2507 qwen-qwen3-30b-a3b-instruct-2507 | Imported | 2026-05-27 |
| 54 | Qwen3-4B-Instruct-2507 (FC) | 35.68% | — | Imported | 2026-05-27 |
| 55 | Qwen3-4B-Instruct-2507 (Prompt) | 35.52% | — | Imported | 2026-05-27 |
| 56 | Arch-Agent-3B | 35.36% | — | Imported | 2026-05-27 |
| 57 | Claude-Opus-4-5-20251101 (Prompt) | 33.47% | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-27 |
| 58 | GPT-4.1-nano-2025-04-14 (FC) | 33.05% | GPT-4.1 Nano openai-gpt-4.1-nano | Imported | 2026-05-27 |
| 59 | Mistral-Small-2506 (Prompt) | 32.38% | — | Imported | 2026-05-27 |
| 60 | Arch-Agent-1.5B | 32.14% | — | Imported | 2026-05-27 |
| 61 | Command R7B (FC) | 32.07% | Command R7B (12-2024) cohere-command-r7b-12-2024 | Imported | 2026-05-27 |
| 62 | Llama-3.3-70B-Instruct (FC) | 31.9% | Llama 3.3 70B Instruct meta-llama-llama-3.3-70b-instruct | Imported | 2026-05-27 |
| 63 | mistral-large-2411 (Prompt) | 31.84% | Mistral Large 2411 mistralai-mistral-large-2411 | Imported | 2026-05-27 |
| 64 | Hammer2.1-7b (FC) | 31.67% | — | Imported | 2026-05-27 |
| 65 | xLAM-2-1b-fc-r (FC) | 30.44% | — | Imported | 2026-05-27 |
| 66 | Gemma-3-12b-it (Prompt) | 30.43% | Gemma 3 12B google-gemma-3-12b-it | Imported | 2026-05-27 |
| 67 | GPT-4.1-mini-2025-04-14 (Prompt) | 29.73% | GPT-4.1 Mini openai-gpt-4.1-mini | Imported | 2026-05-27 |
| 68 | Hammer2.1-3b (FC) | 29.71% | — | Imported | 2026-05-27 |
| 69 | Gemma-3-27b-it (Prompt) | 29.47% | Gemma 3 27B google-gemma-3-27b-it | Imported | 2026-05-27 |
| 70 | Phi-4 (Prompt) | 28.79% | Phi 4 microsoft-phi-4 | Imported | 2026-05-27 |
| 71 | Qwen3-1.7B (FC) | 28.41% | — | Imported | 2026-05-27 |
| 72 | Llama-4-Scout-17B-16E-Instruct (FC) | 28.13% | Llama 4 Scout meta-llama-llama-4-scout | Imported | 2026-05-27 |
| 73 | Gemini-2.5-Flash-Lite (Prompt) | 28.03% | Gemini 2.5 Flash Lite google-gemini-2.5-flash-lite | Imported | 2026-05-27 |
| 74 | CoALM-70B | 27.99% | — | Imported | 2026-05-27 |
| 75 | Hammer2.1-1.5b (FC) | 27.88% | — | Imported | 2026-05-27 |
| 76 | palmyra-x-004 (FC) | 27.87% | — | Imported | 2026-05-27 |
| 77 | GPT-5-mini-2025-08-07 (Prompt) | 27.83% | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-27 |
| 78 | Open-Mistral-Nemo-2407 (FC) | 27.63% | — | Imported | 2026-05-27 |
| 79 | GPT-5-nano-2025-08-07 (Prompt) | 27.55% | GPT-5 Nano openai-gpt-5-nano | Imported | 2026-05-27 |
| 80 | Amazon-Nova-2-Lite-v1:0 (FC) | 27.1% | — | Imported | 2026-05-27 |
| 81 | Granite-3.1-8B-Instruct (FC) | 27.1% | — | Imported | 2026-05-27 |
| 82 | Falcon3-10B-Instruct (FC) | 27.01% | — | Imported | 2026-05-27 |
| 83 | Granite-3.2-8B-Instruct (FC) | 26.87% | — | Imported | 2026-05-27 |
| 84 | CoALM-8B | 26.81% | — | Imported | 2026-05-27 |
| 85 | Llama-3.1-8B-Instruct (Prompt) | 25.83% | Llama 3.1 8B Instruct meta-llama-llama-3.1-8b-instruct | Imported | 2026-05-27 |
| 86 | MiniCPM3-4B-FC (FC) | 25.55% | — | Imported | 2026-05-27 |
| 87 | Claude-Haiku-4-5-20251001 (Prompt) | 25.26% | Claude Haiku 4.5 anthropic-claude-haiku-4.5 | Imported | 2026-05-27 |
| 88 | Amazon-Nova-Pro-v1:0 (FC) | 24.97% | — | Imported | 2026-05-27 |
| 89 | Claude-Sonnet-4-5-20250929 (Prompt) | 24.9% | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-27 |
| 90 | GPT-4.1-nano-2025-04-14 (Prompt) | 24.88% | GPT-4.1 Nano openai-gpt-4.1-nano | Imported | 2026-05-27 |
| 91 | Falcon3-7B-Instruct (FC) | 24.03% | — | Imported | 2026-05-27 |
| 92 | Qwen3-0.6B (FC) | 23.93% | — | Imported | 2026-05-27 |
| 93 | Granite-20b-FunctionCalling (FC) | 23.23% | — | Imported | 2026-05-27 |
| 94 | Qwen3-0.6B (Prompt) | 22.38% | — | Imported | 2026-05-27 |
| 95 | Amazon-Nova-Micro-v1:0 (FC) | 22.29% | — | Imported | 2026-05-27 |
| 96 | RZN-T (Prompt) | 22.25% | — | Imported | 2026-05-27 |
| 97 | MiniCPM3-4B (Prompt) | 22.08% | — | Imported | 2026-05-27 |
| 98 | Llama-3.2-3B-Instruct (FC) | 21.95% | Llama 3.2 3B Instruct meta-llama-llama-3.2-3b-instruct | Imported | 2026-05-27 |
| 99 | Bielik-11B-v2.3-Instruct (Prompt) | 21.9% | — | Imported | 2026-05-27 |
| 100 | Hammer2.1-0.5b (FC) | 21.22% | — | Imported | 2026-05-27 |
| 101 | Gemma-3-4b-it (Prompt) | 19.62% | Gemma 3 4B google-gemma-3-4b-it | Imported | 2026-05-27 |
| 102 | Open-Mistral-Nemo-2407 (Prompt) | 19.31% | — | Imported | 2026-05-27 |
| 103 | Granite-4.0-350m (FC) | 18.98% | — | Imported | 2026-05-27 |
| 104 | Falcon3-3B-Instruct (FC) | 16.25% | — | Imported | 2026-05-27 |
| 105 | Ministral-8B-Instruct-2410 (FC) | 11.1% | — | Imported | 2026-05-27 |
| 106 | Falcon3-1B-Instruct (FC) | 11.08% | — | Imported | 2026-05-27 |
| 107 | Llama-3.2-1B-Instruct (FC) | 10.82% | Llama 3.2 1B Instruct meta-llama-llama-3.2-1b-instruct | Imported | 2026-05-27 |
| 108 | Llama-3.1-Nemotron-Ultra-253B-v1 (FC) | 10% | — | Imported | 2026-05-27 |
| 109 | Gemma-3-1b-it (Prompt) | 7.17% | — | Imported | 2026-05-27 |