Step 3.5 Flash
StepFun / StepFun
21 scores
14 benchmarks
$0.1 / $0.3 per 1M tokens (cost in/out)
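As a rough illustration of what the listed prices mean per request, the arithmetic can be sketched as below. This assumes "in/out" refers to input and output tokens and that prices are in USD; the helper name `estimate_cost` and the token counts are hypothetical and not part of any StepFun API.

```python
# Prices taken from the listing, assumed to be USD per 1M tokens.
INPUT_PRICE_PER_M = 0.1   # assumed: price for input (prompt) tokens
OUTPUT_PRICE_PER_M = 0.3  # assumed: price for output (completion) tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 200k input tokens and 50k output tokens.
    print(f"${estimate_cost(200_000, 50_000):.4f}")
```

Actual billing may differ (for example through caching discounts or provider-specific rounding), so treat the result as an estimate only.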
Metadata
StepFun Closed/API
Aliases: step-3.5-flash, stepfun-step-3.5-flash, stepfun/step-3.5-flash
| Benchmark | Category | Rank | Score | Sampled |
|---|---|---|---|---|
| PinchBench | Agentic | 26 | 0.85 | 2026-05-06 |
| Tau2-Bench Telecom | Agentic | 25 | 94.4% | 2026-05-11 |
| Tau2-Bench Telecom | Agentic | 69 | 87.4% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 82 | 32.6% | 2026-05-11 |
| Terminal-Bench Hard | Agentic | 109 | 27.3% | 2026-05-11 |
| WildClawBench | Agentic | 13 | 26.70 | 2026-05-06 |
| SciCode | Coding | 92 | 40.4% | 2026-05-11 |
| SciCode | Coding | 129 | 38.5% | 2026-05-11 |
| ALL Bench LLM | General Knowledge | 33 | 18.37 | 2026-05-06 |
| Artificial Analysis Intelligence Index | Intelligence | 88 | 38.47 | 2026-05-11 |
| Artificial Analysis Intelligence Index | Intelligence | 91 | 37.8 | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 57 | 22.6% | 2026-05-11 |
| Humanity's Last Exam | Intelligence | 75 | 19.1% | 2026-05-11 |
| IMO-AnswerBench | Mathematics | 4 | 0.85 | 2026-05-06 |
| ALL Bench Multimodal | Multimodal | 31 | 20.61 | 2026-05-06 |
| Artificial Analysis Openness Index | Openness | 70 | 50 | 2026-05-11 |
| GPQA Diamond | Reasoning | 74 | 83.1% | 2026-05-11 |
| GPQA Diamond | Reasoning | 82 | 82.6% | 2026-05-11 |
| LiveSecBench | Safety | 21 | 52.1 | 2026-05-27 |
| CritPt | Science | 56 | 2.5% | 2026-05-11 |
| CritPt | Science | 57 | 2.3% | 2026-05-11 |