FinSearchComp
Financial data-search benchmark evaluating model+search-tool agents on time-sensitive, simple historical, and complex historical data retrieval for global and Greater China markets.
Rows: 24
Primary metric: avg_accuracy
Sampled: 2026-05-28
Metadata
Metrics
Average Accuracy, Time-Sensitive Accuracy, Simple Historical Accuracy, Complex Historical Accuracy
Global Data Performance

| Rank | Subject | Average Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Grok 4 (web) | 68.9% | — | Imported | 2026-05-28 |
| 2 | GPT-5-Thinking (web) | 63.9% | — | Imported | 2026-05-28 |
| 3 | Gemini 2.5 pro (web) | 42.6% | — | Imported | 2026-05-28 |
| 4 | DouBao (web) | 39.1% | — | Imported | 2026-05-28 |
| 5 | Qwen3-235B-A22B-2507 (web) | 37.4% | — | Imported | 2026-05-28 |
| 6 | YuanBao (DeepSeek V3) (web) | 30.5% | — | Imported | 2026-05-28 |
| 7 | YuanBao (HunYuan-T1-Thinking) (web) | 29.8% | — | Imported | 2026-05-28 |
| 7 | YuanBao (DeepSeek R1) (web) | 29.8% | — | Imported | 2026-05-28 |
| 7 | DouBao-Thinking (web) | 29.8% | — | Imported | 2026-05-28 |
| 10 | Kimi k2 (web) | 29.5% | — | Imported | 2026-05-28 |
| 11 | DeepSeek R1 (web) | 17.2% | — | Imported | 2026-05-28 |
| 12 | ERNIE X1 (web) | 16.6% | — | Imported | 2026-05-28 |

Greater China Data Performance

| Rank | Subject | Average Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | DouBao (web) | 54.2% | — | Imported | 2026-05-28 |
| 2 | YuanBao (DeepSeek R1) (web) | 52.5% | — | Imported | 2026-05-28 |
| 3 | Grok 4 (web) | 51.9% | — | Imported | 2026-05-28 |
| 4 | YuanBao (HunYuan-T1-Thinking) (web) | 50.5% | — | Imported | 2026-05-28 |
| 5 | DouBao-Thinking (web) | 49.0% | — | Imported | 2026-05-28 |
| 6 | YuanBao (DeepSeek V3) (web) | 48.8% | — | Imported | 2026-05-28 |
| 7 | GPT-5-Thinking (web) | 46.4% | — | Imported | 2026-05-28 |
| 8 | ERNIE X1 (web) | 40.8% | — | Imported | 2026-05-28 |
| 9 | DeepSeek R1 (web) | 40.5% | — | Imported | 2026-05-28 |
| 10 | Kimi k2 (web) | 38.3% | — | Imported | 2026-05-28 |
| 11 | Gemini 2.5 pro (web) | 36.8% | — | Imported | 2026-05-28 |
| 12 | Qwen3-235B-A22B-2507 (web) | 21.9% | — | Imported | 2026-05-28 |
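
The ranks above appear to follow standard competition ranking: in the Global Data Performance rows, three subjects at 29.8% share rank 7, and the next subject is rank 10 rather than 8. The following is a minimal Python sketch of that scheme, assuming ties are decided on the displayed average accuracy. The function name `competition_rank` is illustrative and is not part of the benchmark's tooling; the scores are copied from the Global Data Performance rows.

```python
from typing import List, Tuple


def competition_rank(scores: List[Tuple[str, float]]) -> List[Tuple[int, str, float]]:
    """Rank by descending score; tied scores share a rank and later ranks skip ahead."""
    # sorted() is stable, so tied subjects keep their input order.
    ordered = sorted(scores, key=lambda item: item[1], reverse=True)
    ranked: List[Tuple[int, str, float]] = []
    for position, (name, score) in enumerate(ordered, start=1):
        if ranked and ranked[-1][2] == score:
            rank = ranked[-1][0]  # same score as the previous row: reuse its rank
        else:
            rank = position       # otherwise the rank is the 1-based position
        ranked.append((rank, name, score))
    return ranked


# Average accuracy (%) from the Global Data Performance rows.
global_scores = [
    ("Grok 4", 68.9),
    ("GPT-5-Thinking", 63.9),
    ("Gemini 2.5 pro", 42.6),
    ("DouBao", 39.1),
    ("Qwen3-235B-A22B-2507", 37.4),
    ("YuanBao (DeepSeek V3)", 30.5),
    ("YuanBao (HunYuan-T1-Thinking)", 29.8),
    ("YuanBao (DeepSeek R1)", 29.8),
    ("DouBao-Thinking", 29.8),
    ("Kimi k2", 29.5),
    ("DeepSeek R1", 17.2),
    ("ERNIE X1", 16.6),
]

global_ranks = [rank for rank, _, _ in competition_rank(global_scores)]
print(global_ranks)
```

Running the same function on the Greater China scores gives ranks 1 through 12 with no repeats, since no two subjects share an average accuracy there.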