Clotho-AQA
Clotho-AQA: An audio question answering benchmark, listed in a category that evaluates temporal, video, speech, or audio understanding beyond static text and image inputs.
Rows: 6
Primary metric: accuracy
Sampled: 2026-05-27
Metadata
Metrics
Majority-votes accuracy, Unfiltered accuracy, Unanimous accuracy, Top-1 accuracy, Top-5 accuracy, Top-10 accuracy
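
The Top-1, Top-5, and Top-10 accuracy metrics listed above are usually read as "the reference answer appears among the model's k highest-ranked candidates." The following is a minimal sketch under that assumption. The function name `top_k_accuracy` and the toy data are hypothetical and are not taken from the Clotho-AQA evaluation code.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int,
) -> float:
    """Fraction of examples whose reference answer is among the first k candidates.

    Assumes each inner sequence is sorted from most to least likely.
    """
    if len(ranked_predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        ref in preds[:k] for preds, ref in zip(ranked_predictions, references)
    )
    return hits / len(references)


# Toy data invented for illustration only (not benchmark results).
ranked = [
    ["rain", "wind", "water"],
    ["dog", "cat", "bird"],
    ["car", "train", "bus"],
]
gold = ["rain", "bird", "plane"]

top1 = top_k_accuracy(ranked, gold, k=1)  # only the first example is a hit
top3 = top_k_accuracy(ranked, gold, k=3)  # first and second examples are hits
```

With k=1 this reduces to ordinary accuracy over the single highest-ranked answer, which is the natural reading for the binary (yes/no) rows in the table.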
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Binary classifier (Question only) | 64.4% | — | Imported | 2026-05-27 |
| 2 | Binary classifier (Audio + question) | 63.2% | — | Imported | 2026-05-27 |
| 3 | Binary classifier (Audio only) | 58.2% | — | Imported | 2026-05-27 |
| 4 | Single-word multiclass classifier (Question only) | 55.7% | — | Imported | 2026-05-27 |
| 5 | Single-word multiclass classifier (Audio + question) | 54.2% | — | Imported | 2026-05-27 |
| 6 | Single-word multiclass classifier (Audio only) | 3.2% | — | Imported | 2026-05-27 |