Social IQa

Social IQa is the first large-scale benchmark for commonsense reasoning about social situations. It contains 38,000 multiple-choice questions that probe emotional and social intelligence in everyday situations, testing commonsense understanding of social interactions and theory-of-mind reasoning about the implied emotions and behavior of others.

Rows: 9
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Normalized Score

Latest Results

Rank | Subject | Score | Model Match | Provenance | Sampled
1 | Phi-3.5-MoE-instruct | 0.78 | | Self-reported | 2026-05-06
2 | Phi-3.5-mini-instruct | 0.75 | | Self-reported | 2026-05-06
3 | Phi 4 Mini | 0.72 | | Self-reported | 2026-05-06
4 | Gemma 2 27B | 0.54 | Gemma 2 27B (google-gemma-2-27b-it) | Self-reported | 2026-05-06
5 | Gemma 2 9B | 0.53 | | Self-reported | 2026-05-06
6 | Gemma 3n E4B | 0.50 | | Self-reported | 2026-05-06
6 | Gemma 3n E4B Instructed LiteRT Preview | 0.50 | Gemma 3n 4B (google-gemma-3n-e4b-it) | Self-reported | 2026-05-06
8 | Gemma 3n E2B Instructed LiteRT (Preview) | 0.49 | Gemma 3n 2B (google-gemma-3n-e2b-it) | Self-reported | 2026-05-06
8 | Gemma 3n E2B | 0.49 | | Self-reported | 2026-05-06
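
The ranks listed above (1, 2, 3, 4, 5, 6, 6, 8, 8) are consistent with standard competition ranking, where tied scores share a rank and the following rank is skipped. The page does not state how ranks are computed, so the sketch below is only an assumption that happens to reproduce the listed ranks from the listed scores; the function name `competition_rank` is hypothetical.

```python
def competition_rank(scores):
    """Return 1-based ranks; tied scores share a rank and the next rank is skipped.

    Assumption: higher scores rank better. This is an illustrative sketch,
    not the leaderboard's documented ranking method.
    """
    ordered = sorted(scores, reverse=True)
    # Record the first (best) position at which each distinct score appears.
    first_position = {}
    for position, score in enumerate(ordered, start=1):
        first_position.setdefault(score, position)
    # Map every input score back to that shared position.
    return [first_position[score] for score in scores]


# Scores copied from the results listed above, in listed order.
scores = [0.78, 0.75, 0.72, 0.54, 0.53, 0.50, 0.50, 0.49, 0.49]
ranks = competition_rank(scores)
print(ranks)
```

Because the two 0.50 entries tie at rank 6, rank 7 is skipped and the two 0.49 entries share rank 8, which matches the listing.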