Roboflow Vision Evals - Visual Understanding

Metrics: Score; Passed; Avg Eval Time (lower is better)

Rank	Subject	Score	Model Match	Provenance	Sampled
1	Gemini 3.5 Flash	83.58%	Gemini 3.5 Flash (google-gemini-3.5-flash)	Imported	2026-05-22
2	Gemini 3.1 Pro (Tools)	80.60%	Gemini 3.1 Pro Preview Custom Tools (google-gemini-3.1-pro-preview-customtools)	Imported	2026-05-22
3	Gemini 3 Flash (Tools)	79.10%	Gemini 3 Flash Preview (google-gemini-3-flash-preview)	Imported	2026-05-22
4	Gemini 3.1 Pro	77.61%	Gemini 3.1 Pro Preview (google-gemini-3.1-pro-preview)	Imported	2026-05-22
5	GPT-5.4	76.12%	GPT-5.4 (openai-gpt-5.4)	Imported	2026-05-22