MaCBench
A multimodal chemistry and materials science benchmark that evaluates vision-language models (VLMs) on laboratory, molecular, crystallography, metal-organic framework (MOF), spectroscopy, patent-figure, table-QA, and X-ray diffraction (XRD) tasks.
Metadata
Metrics
Overall Score, afm-image, chem-lab-basic, chem-lab-comparison, chem-lab-equipments, chirality, cif-atomic-species, cif-crystal-system, cif-density, cif-symmetry, cif-volume, electronic-structure, handdrawn-molecules, isomers, mof-adsorption-strength-comparison, mof-adsorption-strength-order, mof-capacity-comparison, mof-capacity-order, mof-capacity-value, mof-henry-constant-comparison, mof-henry-constant-order, mof-working-capacity-comparison, mof-working-capacity-order, mof-working-capacity-value, org-schema-wo-smiles, org-schema, organic-molecules, spectral-analysis, tables-qa, us-patent-figures, us-patent-plots, xrd-pattern-matching, xrd-pattern-shape, xrd-peak-position, xrd-relative-intensity
| Rank | Subject | Overall Score | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | llama-4-maverick-17b-128e-instruct | 0.70 | Llama 4 Maverick `meta-llama-4-maverick` | Imported | 2026-05-06 |
| 2 | Claude-3.5-Sonnet | 0.67 | Claude 3.5 Sonnet `anthropic-claude-3.5-sonnet` | Imported | 2026-05-06 |
| 3 | llama-4-scout-17b-16e-instruct | 0.63 | Llama 4 Scout `meta-llama-llama-4-scout` | Imported | 2026-05-06 |
| 4 | Gemini-1.5-Pro | 0.57 | — | Imported | 2026-05-06 |
| 5 | mistralai/Pixtral-Large-Instruct | 0.57 | — | Imported | 2026-05-06 |
| 6 | GPT-4o | 0.54 | GPT-4o (2024-08-06) `openai-gpt-4o-2024-08-06` | Imported | 2026-05-06 |
| 7 | Mistral-Small-3.1-24B-Instruct | 0.53 | Mistral: Mistral Small 3.1 24B `mistralai-mistral-small-3.1-24b-instruct` | Imported | 2026-05-06 |
| 8 | grok-2-vision-1212 | 0.46 | — | Imported | 2026-05-06 |
| 9 | Llama 3.2 90B Vision | 0.36 | — | Imported | 2026-05-06 |
| 10 | llama-3.2-11b-vision-preview | 0.32 | — | Imported | 2026-05-06 |
| 11 | JanusPro-7B | 0.20 | — | Imported | 2026-05-06 |