SECQUE
SEC filing retrieval and question-answering benchmark for financial RAG over public company disclosures.
7 rows
Primary metric: strict_accuracy
Sampled: 2026-05-28
Metadata
Metrics
Strict Accuracy, Normalized Accuracy, Financial Prompt Strict Accuracy, Financial Prompt Normalized Accuracy, Baseline CoT Strict Accuracy, Baseline CoT Normalized Accuracy, Financial CoT Strict Accuracy, Financial CoT Normalized Accuracy, Flipped Strict Accuracy, Flipped Normalized Accuracy, Average Tokens by Model (lower is better)
| Rank | Subject | Strict Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | GPT-4o | 0.69 | GPT-4o (`openai-gpt-4o`) | Imported | 2026-05-28 |
| 2 | Llama-3.3-70B-Instruct | 0.65 | Llama 3.3 70B Instruct (`meta-llama-llama-3.3-70b-instruct`) | Imported | 2026-05-28 |
| 3 | GPT-4o-mini | 0.64 | GPT-4o-mini (`openai-gpt-4o-mini`) | Imported | 2026-05-28 |
| 4 | Qwen2.5-32B-Instruct | 0.61 | — | Imported | 2026-05-28 |
| 5 | Phi-4 | 0.56 | Phi 4 (`microsoft-phi-4`) | Imported | 2026-05-28 |
| 6 | Meta-Llama-3.1-8B-Instruct | 0.48 | — | Imported | 2026-05-28 |
| 7 | Mistral-Nemo-Instruct-2407 | 0.46 | Mistral: Mistral Nemo (`mistralai-mistral-nemo`) | Imported | 2026-05-28 |