ActivityNet-QA

ActivityNet-QA is a video question-answering benchmark in the category of tasks that evaluate temporal, video, speech, or audio understanding beyond static text and image inputs.

Rows: 3
Primary metric: overall_accuracy
Sampled: 2026-05-27

Metadata

Metrics

Overall accuracy, Motion accuracy, Spatial relation accuracy, Temporal relation accuracy, Free-form accuracy, WUPS@0.9, WUPS@0.0

Latest Results

Rows are parsed from the baseline table in the arXiv LaTeX source of the ActivityNet-QA paper. Overall accuracy is used as the primary score; the question-type accuracies and WUPS metrics are preserved alongside it.

Rank  Subject  Overall accuracy  Model Match  Provenance  Sampled
1     E-SA     31.8%             -            Imported    2026-05-27
2     E-MN     27.1%             -            Imported    2026-05-27
3     E-VQA    25.1%             -            Imported    2026-05-27
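
The import and ranking step described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used for this page: the LaTeX row layout, the LATEX_ROWS sample, and the function names parse_rows and rank_rows are all assumptions, and only the overall accuracy values shown in the table above are included.

```python
import re

# Hypothetical LaTeX table rows. The real baseline table has more columns
# (question-type accuracies, WUPS); only the values listed on this page
# are reproduced here.
LATEX_ROWS = r"""
E-VQA & 25.1 \\
E-MN & 27.1 \\
E-SA & 31.8 \\
"""


def parse_rows(latex):
    """Turn 'name & score \\\\' lines into dicts with a float score."""
    rows = []
    for line in latex.splitlines():
        line = line.strip()
        if not line or "&" not in line:
            continue  # skip blank lines and anything that is not a data row
        # Drop the trailing LaTeX row terminator before splitting cells.
        line = re.sub(r"\\\\\s*$", "", line)
        cells = [cell.strip() for cell in line.split("&")]
        rows.append({"subject": cells[0], "overall_accuracy": float(cells[1])})
    return rows


def rank_rows(rows):
    """Rank rows by the primary metric (overall accuracy), highest first."""
    ordered = sorted(rows, key=lambda r: r["overall_accuracy"], reverse=True)
    return [{"rank": i, **row} for i, row in enumerate(ordered, start=1)]


if __name__ == "__main__":
    for row in rank_rows(parse_rows(LATEX_ROWS)):
        print(f'{row["rank"]} {row["subject"]} {row["overall_accuracy"]:.1f}%')
```

Sorting on a single primary metric keeps the ranking unambiguous; the remaining metrics could be carried along as extra keys in each row without affecting the order.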