CG-Bench

Clue-grounded long-video question-answering benchmark evaluating MCQ accuracy, clue-grounding credibility metrics, and open-ended answer accuracy.

22 rows
Primary metric: open_ended_accuracy
Sampled: 2026-05-28

Metadata

Metrics

MCQ Clue Accuracy, MCQ Long-Video Accuracy, Credibility mIoU, Credibility Recall@IoU, Credibility Accuracy@IoU, Credibility CRR, Open-Ended Accuracy
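As a rough illustration of what an IoU-based credibility metric measures, the sketch below computes the temporal intersection-over-union between predicted and annotated clue intervals, then averages it over questions. It assumes clues are lists of (start, end) times in seconds with start <= end. The function names `temporal_iou` and `mean_iou` are hypothetical, and the official CG-Bench scoring code may differ in its details.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec), assumed start <= end


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping intervals so that lengths are not double-counted."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals: List[Interval]) -> float:
    """Total time covered by a set of possibly overlapping intervals."""
    return sum(end - start for start, end in merge(intervals))


def temporal_iou(pred: List[Interval], truth: List[Interval]) -> float:
    """Intersection-over-union of two sets of time intervals."""
    union = total_length(pred + truth)
    if union == 0:
        return 0.0
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    intersection = total_length(pred) + total_length(truth) - union
    return intersection / union


def mean_iou(pairs: List[Tuple[List[Interval], List[Interval]]]) -> float:
    """Average IoU over (predicted, annotated) pairs, one pair per question."""
    if not pairs:
        return 0.0
    return sum(temporal_iou(p, t) for p, t in pairs) / len(pairs)
```

Threshold-based variants such as Recall@IoU would then count how often the per-question IoU reaches a chosen threshold. The thresholds themselves are not stated on this page.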

Latest Results

Rows are imported from the official CG-Bench mini-set leaderboard. The mini set covers 1,118 videos and 3,000 questions, a reduced subset intended for faster evaluation.

| Rank | Subject | Open-Ended Accuracy | MCQ Long-Video Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|---|
| 1 | GPT-4o-08-06 | 39.2% | 44.9% | GPT-4o (openai-gpt-4o) | Imported | 2026-05-28 |
| 2 | Claude3.5-Sonnet | 35.6% | 40.3% | Claude 3.5 Sonnet (anthropic-claude-3.5-sonnet) | Imported | 2026-05-28 |
| 3 | InternVL2.5 | 34.2% | 44.2% | | Imported | 2026-05-28 |
| 4 | Qwen2-VL | 33.7% | 45.3% | | Imported | 2026-05-28 |
| 5 | Gemini-1.5-Pro | 28.7% | 37.8% | | Imported | 2026-05-28 |
| 6 | VITA | 28% | 33% | | Imported | 2026-05-28 |
| 7 | MiniCPM-v2.6 | 26.3% | 29.9% | | Imported | 2026-05-28 |
| 8 | Kangaroo | 25.9% | 31.2% | | Imported | 2026-05-28 |
| 9 | LLaVA-OneVision | 25% | 30.9% | | Imported | 2026-05-28 |
| 10 | GPT-4mini-08-06 | 24.9% | 32.6% | GPT-4 (openai-gpt-4) | Imported | 2026-05-28 |
| 11 | Video-CCAM | 24.8% | 29.1% | | Imported | 2026-05-28 |
| 12 | Gemini-1.5-Flash | 24.6% | 33.5% | | Imported | 2026-05-28 |
| 13 | LongVA | 24.2% | 28.7% | | Imported | 2026-05-28 |
| 14 | ViLA | 23.8% | 28.1% | | Imported | 2026-05-28 |
| 15 | InternVL-Chat-v1.5 | 22.9% | 28.5% | | Imported | 2026-05-28 |
| 16 | Chat-UniVi-v1.5 | 21.8% | 26.7% | | Imported | 2026-05-28 |
| 17 | ShareGPT4Video | 21.5% | 27.1% | | Imported | 2026-05-28 |
| 18 | Qwen-VL-Chat | 20.1% | 20.7% | | Imported | 2026-05-28 |
| 19 | ST-LLM | 20% | 24.7% | | Imported | 2026-05-28 |
| 20 | Videochat2 | 18.4% | 19.1% | | Imported | 2026-05-28 |
| 21 | VideoLLaMA | 16% | 18% | | Imported | 2026-05-28 |
| 22 | Video-LLaVA | 12% | 16.8% | | Imported | 2026-05-28 |
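The leaderboard is ordered by open-ended accuracy, the primary metric, so the order can change when the MCQ long-video score is used instead. The sketch below re-ranks the top five rows both ways and computes the gap between the two scores. The values are copied from the rows above. The helper names `rank_by` and `gap` are illustrative and not part of any CG-Bench tooling.

```python
from typing import List, Tuple

# (subject, open-ended accuracy %, MCQ long-video accuracy %), top five rows
Row = Tuple[str, float, float]
rows: List[Row] = [
    ("GPT-4o-08-06", 39.2, 44.9),
    ("Claude3.5-Sonnet", 35.6, 40.3),
    ("InternVL2.5", 34.2, 44.2),
    ("Qwen2-VL", 33.7, 45.3),
    ("Gemini-1.5-Pro", 28.7, 37.8),
]


def rank_by(rows: List[Row], column: int) -> List[str]:
    """Return subject names sorted by the given score column, best first."""
    ordered = sorted(rows, key=lambda r: r[column], reverse=True)
    return [r[0] for r in ordered]


def gap(row: Row) -> float:
    """MCQ long-video accuracy minus open-ended accuracy, in points."""
    return round(row[2] - row[1], 1)


by_open_ended = rank_by(rows, 1)
by_mcq_long = rank_by(rows, 2)
gaps = {r[0]: gap(r) for r in rows}

print("Open-ended order:", by_open_ended)
print("MCQ long order:  ", by_mcq_long)
print("Gap (points):    ", gaps)
```

Among these five rows, Qwen2-VL has the highest MCQ long-video accuracy yet ranks fourth on open-ended accuracy, which is why the choice of primary metric affects the ordering.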