MaCBench | BenchmarkList

Metadata

ID: macbench
Category: Science
Release: 2024-11-25

Metrics

Overall Score, afm-image, chem-lab-basic, chem-lab-comparison, chem-lab-equipments, chirality, cif-atomic-species, cif-crystal-system, cif-density, cif-symmetry, cif-volume, electronic-structure, handdrawn-molecules, isomers, mof-adsorption-strength-comparison, mof-adsorption-strength-order, mof-capacity-comparison, mof-capacity-order, mof-capacity-value, mof-henry-constant-comparison, mof-henry-constant-order, mof-working-capacity-comparison, mof-working-capacity-order, mof-working-capacity-value, org-schema-wo-smiles, org-schema, organic-molecules, spectral-analysis, tables-qa, us-patent-figures, us-patent-plots, xrd-pattern-matching, xrd-pattern-shape, xrd-peak-position, xrd-relative-intensity

Rank	Subject	Overall Score	Model Match	Provenance	Sampled
1	llama-4-maverick-17b-128e-instruct	0.70	Llama 4 Maverick [meta-llama-4-maverick]	Imported	2026-05-06
2	Claude-3.5-Sonnet	0.67	Claude 3.5 Sonnet [anthropic-claude-3.5-sonnet]	Imported	2026-05-06
3	llama-4-scout-17b-16e-instruct	0.63	Llama 4 Scout [meta-llama-llama-4-scout]	Imported	2026-05-06
4	Gemini-1.5-Pro	0.57	—	Imported	2026-05-06
5	mistralai/Pixtral-Large-Instruct	0.57	—	Imported	2026-05-06
6	GPT-4o	0.54	GPT-4o (2024-08-06) [openai-gpt-4o-2024-08-06]	Imported	2026-05-06
7	Mistral-Small-3.1-24B-Instruct	0.53	Mistral: Mistral Small 3.1 24B [mistralai-mistral-small-3.1-24b-instruct]	Imported	2026-05-06
8	grok-2-vision-1212	0.46	—	Imported	2026-05-06
9	Llama 3.2 90B Vision	0.36	—	Imported	2026-05-06
10	llama-3.2-11b-vision-preview	0.32	—	Imported	2026-05-06
11	JanusPro-7B	0.20	—	Imported	2026-05-06
