Social IQa
The first large-scale benchmark for commonsense reasoning about social situations. Contains 38,000 multiple-choice questions probing emotional and social intelligence in everyday situations. The questions test commonsense understanding of social interactions and theory-of-mind reasoning about the implied emotions and behavior of others.
9 rows
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics
Score, Normalized Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | Phi-3.5-MoE-instruct | 0.78 | — | Self-reported | 2026-05-06 |
| 2 | Phi-3.5-mini-instruct | 0.75 | — | Self-reported | 2026-05-06 |
| 3 | Phi 4 Mini | 0.72 | — | Self-reported | 2026-05-06 |
| 4 | Gemma 2 27B | 0.54 | Gemma 2 27B (google-gemma-2-27b-it) | Self-reported | 2026-05-06 |
| 5 | Gemma 2 9B | 0.53 | — | Self-reported | 2026-05-06 |
| 6 | Gemma 3n E4B | 0.50 | — | Self-reported | 2026-05-06 |
| 6 | Gemma 3n E4B Instructed LiteRT Preview | 0.50 | Gemma 3n 4B (google-gemma-3n-e4b-it) | Self-reported | 2026-05-06 |
| 8 | Gemma 3n E2B Instructed LiteRT (Preview) | 0.49 | Gemma 3n 2B (google-gemma-3n-e2b-it) | Self-reported | 2026-05-06 |
| 8 | Gemma 3n E2B | 0.49 | — | Self-reported | 2026-05-06 |
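
The Rank column repeats 6 and 8 and skips 7, which is consistent with standard competition ranking: tied scores share a rank and the next distinct score takes its position in the sorted list. The source does not state the ranking rule, so the sketch below is an assumption that happens to reproduce the column. The function name `competition_ranks` is hypothetical, and the scores are copied from the table.

```python
def competition_ranks(scores):
    """Return 1-based ranks, highest score first; ties share the earliest rank."""
    # Indices of the scores, ordered from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        prev = order[position - 2] if position > 1 else None
        if prev is not None and scores[prev] == scores[idx]:
            # Same score as the previous entry: reuse its rank.
            ranks[idx] = ranks[prev]
        else:
            # New score: rank equals the position in the sorted list.
            ranks[idx] = position
    return ranks


# Scores in the order they appear in the table above.
scores = [0.78, 0.75, 0.72, 0.54, 0.53, 0.50, 0.50, 0.49, 0.49]
ranks = competition_ranks(scores)
print(ranks)
```

With this rule, the two models at 0.50 share rank 6 and the two at 0.49 share rank 8, matching the table.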