MSRVTT-QA | BenchmarkList

Accuracy

Rank	Subject	Accuracy	Model Match	Provenance	Sampled
1	Flash-VStream	72.4%	—	Imported	2026-05-27
2	PLLaVA (34B)	68.7%	—	Imported	2026-05-27
3	Elysium	67.5%	—	Imported	2026-05-27
4	SlowFast-LLaVA-34B	67.4%	—	Imported	2026-05-27
5	Tarsier (34B)	66.4%	—	Imported	2026-05-27
6	LinVT-Qwen2-VL (7B)	66.2%	—	Imported	2026-05-27
7	TS-LLaVA-34B	66.2%	—	Imported	2026-05-27
8	PPLLaVA-7B	64.3%	—	Imported	2026-05-27
9	IG-VLM	63.8%	—	Imported	2026-05-27
10	ST-LLM	63.2%	—	Imported	2026-05-27
11	CAT-7B	62.1%	—	Imported	2026-05-27
12	VideoGPT+	60.6%	—	Imported	2026-05-27
13	Vista-LLaMA-7B	60.5%	—	Imported	2026-05-27
14	MiniGPT4-video-7B	59.73%	—	Imported	2026-05-27
15	LLaVA-Mini	59.5%	—	Imported	2026-05-27
16	Video-LaVIT	59.3%	—	Imported	2026-05-27
17	Video-LLaVA-7B	59.2%	—	Imported	2026-05-27
18	LLaMA-VID-13B (2 Token)	58.9%	—	Imported	2026-05-27
19	LLaMA-VID-7B (2 Token)	57.7%	—	Imported	2026-05-27
20	SUM-shot+Vicuna	56.8%	—	Imported	2026-05-27
21	Omni-VideoAssistant	55.3%	—	Imported	2026-05-27
22	Chat-UniVi-7B	55%	—	Imported	2026-05-27
23	VideoChat2	54.1%	—	Imported	2026-05-27
24	MovieChat	52.7%	—	Imported	2026-05-27
25	BT-Adapter (zero-shot)	51.2%	—	Imported	2026-05-27
26	BT-Adapter (zero-shot)	51.2%	—	Imported	2026-05-27
27	Video-ChatGPT-7B	49.3%	—	Imported	2026-05-27
28	Video Chat-7B	45%	—	Imported	2026-05-27
29	LLaMA Adapter-7B	43.8%	—	Imported	2026-05-27
30	Video LLaMA-7B	29.6%	—	Imported	2026-05-27