MultiNRC
MultiNRC benchmarks LLMs on 1,000+ culturally grounded reasoning questions written by native French, Spanish, and Chinese speakers across four reasoning categor...
Rows: 43
Primary metric: score
Sampled: 2026-05-06
Metadata
Metrics: Score, Confidence Interval Upper, Max Score
| Rank | Subject | Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | gpt-5-pro-2025-10-06 | 65.20 | GPT-5 Pro openai-gpt-5-pro | Imported | 2026-05-06 |
| 1 | gemini-3.1-pro-preview | 64.74 | Gemini 3.1 Pro Preview google-gemini-3.1-pro-preview | Imported | 2026-05-06 |
| 1 | gpt-5.4-pro-2026-03-05 | 62.27 | GPT-5.4 Pro openai-gpt-5.4-pro | Imported | 2026-05-06 |
| 2 | Muse Spark | 59.05 | — | Imported | 2026-05-06 |
| 2 | gemini-3-pro-preview | 58.96 | Gemini 3 google-gemini-3 | Imported | 2026-05-06 |
| 3 | gpt-5.4-2026-03-05 (xhigh thinking) | 58.29 | GPT-5.4 openai-gpt-5.4 | Imported | 2026-05-06 |
| 3 | claude-opus-4-6-thinking-max | 57.06 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 6 | gpt-5-2025-08-07 | 52.13 | GPT-5 openai-gpt-5 | Imported | 2026-05-06 |
| 7 | o3-pro-2025-06-10-high | 49 | — | Imported | 2026-05-06 |
| 7 | gpt-5.1-thinking | 49 | GPT-5.1 openai-gpt-5.1 | Imported | 2026-05-06 |
| 7 | claude-opus-4-5-20251101-thinking | 48.63 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 7 | claude-opus-4-6 (Non-Thinking) | 48.34 | Claude Opus 4.6 anthropic-claude-opus-4.6 | Imported | 2026-05-06 |
| 8 | o3-2025-04-16-high | 45.50 | — | Imported | 2026-05-06 |
| 8 | Gemini-2.5-Pro-Preview-06-05 | 45.12 | Gemini 2.5 Pro Preview 06-05 google-gemini-2.5-pro-preview | Imported | 2026-05-06 |
| 8 | o3-2025-04-16-medium | 44.45 | — | Imported | 2026-05-06 |
| 12 | gpt-5.2-2025-12-11 | 42.18 | GPT-5.2 openai-gpt-5.2 | Imported | 2026-05-06 |
| 12 | claude-opus-4-5-20251101 | 41.23 | Claude Opus 4.5 anthropic-claude-opus-4.5 | Imported | 2026-05-06 |
| 15 | claude-opus-4-1-20250805-thinking | 38.39 | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 16 | claude-sonnet-4-5-20250929-thinking | 35.83 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 17 | kimi-k2.5 | 35.17 | MoonshotAI: Kimi K2.5 moonshotai-kimi-k2.5 | Imported | 2026-05-06 |
| 17 | Claude-4-Opus-20250514-thinking | 33.93 | — | Imported | 2026-05-06 |
| 19 | claude-opus-4-1-20250805 | 29.67 | Claude Opus 4.1 anthropic-claude-opus-4.1 | Imported | 2026-05-06 |
| 20 | Claude-4-Opus-20250514 | 29 | — | Imported | 2026-05-06 |
| 21 | claude-sonnet-4-5-20250929 | 28.15 | Claude Sonnet 4.5 anthropic-claude-sonnet-4.5 | Imported | 2026-05-06 |
| 21 | Claude-3.7-Sonnet-thinking | 27.77 | Claude 3.7 Sonnet (thinking) anthropic-claude-3.7-sonnet-thinking | Imported | 2026-05-06 |
| 21 | Deepseek-R1-0528 | 27.58 | R1 0528 deepseek-deepseek-r1-0528 | Imported | 2026-05-06 |
| 21 | Qwen3-235B-A22B-Thinking-2507 | 27.11 | Qwen3 235B A22B Thinking 2507 qwen-qwen3-235b-a22b-thinking-2507 | Imported | 2026-05-06 |
| 21 | gemini-3.1-flash-lite-preview | 25.02 | Gemini 3.1 Flash Lite Preview google-gemini-3.1-flash-lite-preview | Imported | 2026-05-06 |
| 22 | Deepseek-R1 | 24.27 | R1 deepseek-r1 | Imported | 2026-05-06 |
| 22 | gpt-5-mini-2025-08-07 | 23.89 | GPT-5 Mini openai-gpt-5-mini | Imported | 2026-05-06 |
| 23 | DeepSeek-V3.1 | 23.60 | DeepSeek V3.1 Terminus deepseek-deepseek-v3.1-terminus | Imported | 2026-05-06 |
| 26 | o4-mini-high | 22.18 | o4 Mini High openai-o4-mini-high | Imported | 2026-05-06 |
| 27 | GPT-4.1 | 21.23 | GPT-4.1 openai-gpt-4.1 | Imported | 2026-05-06 |
| 31 | kimi-k2-instruct | 18.48 | MoonshotAI: Kimi K2 0711 moonshotai-kimi-k2 | Imported | 2026-05-06 |
| 31 | Claude-4-Sonnet-20250514 | 18.39 | — | Imported | 2026-05-06 |
| 31 | gpt-5.1-instant | 18.29 | GPT-5.1 Chat openai-gpt-5.1-chat | Imported | 2026-05-06 |
| 31 | Qwen3-235B-A22B | 17.63 | Qwen3 235B A22B qwen-qwen3-235b-a22b | Imported | 2026-05-06 |
| 31 | glm-4p5 | 17.44 | GLM 4.5 z-ai-glm-4.5 | Imported | 2026-05-06 |
| 33 | gpt-oss-120b | 15.17 | gpt-oss-120b openai-gpt-oss-120b | Imported | 2026-05-06 |
| 38 | GPT-4o | 12.42 | GPT-4o openai-gpt-4o | Imported | 2026-05-06 |
| 39 | gpt-oss-20b | 10.43 | gpt-oss-20b openai-gpt-oss-20b | Imported | 2026-05-06 |
| 39 | glm-4p5-air | 10.43 | GLM 4.5 Air z-ai-glm-4.5-air | Imported | 2026-05-06 |
| 40 | Llama-4-Maverick | 8.44 | Llama 4 Maverick meta-llama-4-maverick | Imported | 2026-05-06 |