MultiNRC

MultiNRC benchmarks LLMs on 1,000+ culturally grounded reasoning questions written by native French, Spanish, and Chinese speakers across four reasoning categor...

43 rows
Primary metric: score
Sampled: 2026-05-06

Metadata

Metrics

Score, Confidence Interval Upper, Max Score

Latest Results

Rank | Subject | Score | Model Match | Provenance | Sampled
1 | gpt-5-pro-2025-10-06 | 65.20 | GPT-5 Pro [openai-gpt-5-pro] | Imported | 2026-05-06
1 | gemini-3.1-pro-preview | 64.74 | Gemini 3.1 Pro Preview [google-gemini-3.1-pro-preview] | Imported | 2026-05-06
1 | gpt-5.4-pro-2026-03-05 | 62.27 | GPT-5.4 Pro [openai-gpt-5.4-pro] | Imported | 2026-05-06
2 | Muse Spark | 59.05 | - | Imported | 2026-05-06
2 | gemini-3-pro-preview | 58.96 | Gemini 3 [google-gemini-3] | Imported | 2026-05-06
3 | gpt-5.4-2026-03-05 (xhigh thinking) | 58.29 | GPT-5.4 [openai-gpt-5.4] | Imported | 2026-05-06
3 | claude-opus-4-6-thinking-max | 57.06 | Claude Opus 4.6 [anthropic-claude-opus-4.6] | Imported | 2026-05-06
6 | gpt-5-2025-08-07 | 52.13 | GPT-5 [openai-gpt-5] | Imported | 2026-05-06
7 | o3-pro-2025-06-10-high | 49 | - | Imported | 2026-05-06
7 | gpt-5.1-thinking | 49 | GPT-5.1 [openai-gpt-5.1] | Imported | 2026-05-06
7 | claude-opus-4-5-20251101-thinking | 48.63 | Claude Opus 4.5 [anthropic-claude-opus-4.5] | Imported | 2026-05-06
7 | claude-opus-4-6 (Non-Thinking) | 48.34 | Claude Opus 4.6 [anthropic-claude-opus-4.6] | Imported | 2026-05-06
8 | o3-2025-04-16-high | 45.50 | - | Imported | 2026-05-06
8 | Gemini-2.5-Pro-Preview-06-05 | 45.12 | Gemini 2.5 Pro Preview 06-05 [google-gemini-2.5-pro-preview] | Imported | 2026-05-06
8 | o3-2025-04-16-medium | 44.45 | - | Imported | 2026-05-06
12 | gpt-5.2-2025-12-11 | 42.18 | GPT-5.2 [openai-gpt-5.2] | Imported | 2026-05-06
12 | claude-opus-4-5-20251101 | 41.23 | Claude Opus 4.5 [anthropic-claude-opus-4.5] | Imported | 2026-05-06
15 | claude-opus-4-1-20250805-thinking | 38.39 | Claude Opus 4.1 [anthropic-claude-opus-4.1] | Imported | 2026-05-06
16 | claude-sonnet-4-5-20250929-thinking | 35.83 | Claude Sonnet 4.5 [anthropic-claude-sonnet-4.5] | Imported | 2026-05-06
17 | kimi-k2.5 | 35.17 | KIMI MoonshotAI: Kimi K2.5 [moonshotai-kimi-k2.5] | Imported | 2026-05-06
17 | Claude-4-Opus-20250514-thinking | 33.93 | - | Imported | 2026-05-06
19 | claude-opus-4-1-20250805 | 29.67 | Claude Opus 4.1 [anthropic-claude-opus-4.1] | Imported | 2026-05-06
20 | Claude-4-Opus-20250514 | 29 | - | Imported | 2026-05-06
21 | claude-sonnet-4-5-20250929 | 28.15 | Claude Sonnet 4.5 [anthropic-claude-sonnet-4.5] | Imported | 2026-05-06
21 | Claude-3.7-Sonnet-thinking | 27.77 | Claude 3.7 Sonnet (thinking) [anthropic-claude-3.7-sonnet-thinking] | Imported | 2026-05-06
21 | Deepseek-R1-0528 | 27.58 | R1 0528 [deepseek-deepseek-r1-0528] | Imported | 2026-05-06
21 | Qwen3-235B-A22B-Thinking-2507 | 27.11 | Qwen3 235B A22B Thinking 2507 [qwen-qwen3-235b-a22b-thinking-2507] | Imported | 2026-05-06
21 | gemini-3.1-flash-lite-preview | 25.02 | Gemini 3.1 Flash Lite Preview [google-gemini-3.1-flash-lite-preview] | Imported | 2026-05-06
22 | Deepseek-R1 | 24.27 | R1 [deepseek-r1] | Imported | 2026-05-06
22 | gpt-5-mini-2025-08-07 | 23.89 | GPT-5 Mini [openai-gpt-5-mini] | Imported | 2026-05-06
23 | DeepSeek-V3.1 | 23.60 | DeepSeek V3.1 Terminus [deepseek-deepseek-v3.1-terminus] | Imported | 2026-05-06
26 | o4-mini-high | 22.18 | o4 Mini High [openai-o4-mini-high] | Imported | 2026-05-06
27 | GPT-4.1 | 21.23 | GPT-4.1 [openai-gpt-4.1] | Imported | 2026-05-06
31 | kimi-k2-instruct | 18.48 | KIMI MoonshotAI: Kimi K2 0711 [moonshotai-kimi-k2] | Imported | 2026-05-06
31 | Claude-4-Sonnet-20250514 | 18.39 | - | Imported | 2026-05-06
31 | gpt-5.1-instant | 18.29 | GPT-5.1 Chat [openai-gpt-5.1-chat] | Imported | 2026-05-06
31 | Qwen3-235B-A22B | 17.63 | Qwen3 235B A22B [qwen-qwen3-235b-a22b] | Imported | 2026-05-06
31 | glm-4p5 | 17.44 | GLM GLM 4.5 [z-ai-glm-4.5] | Imported | 2026-05-06
33 | gpt-oss-120b | 15.17 | gpt-oss-120b [openai-gpt-oss-120b] | Imported | 2026-05-06
38 | GPT-4o | 12.42 | GPT-4o [openai-gpt-4o] | Imported | 2026-05-06
39 | gpt-oss-20b | 10.43 | gpt-oss-20b [openai-gpt-oss-20b] | Imported | 2026-05-06
39 | glm-4p5-air | 10.43 | GLM GLM 4.5 Air [z-ai-glm-4.5-air] | Imported | 2026-05-06
40 | Llama-4-Maverick | 8.44 | Llama 4 Maverick [meta-llama-4-maverick] | Imported | 2026-05-06