Factorio Learning Environment

Interactive Factorio automation benchmark for LLM agents, tracking production score, milestones, automation milestones, and lab-task success rate.

6 rows
Primary metric: production_score
Sampled: 2026-05-28

Metadata

Metrics

Production Score, Milestones, Automation Milestones, Lab Tasks Success Rate

Latest Results

Rows are imported from the official Factorio Learning Environment static JSON leaderboard. The public page sorts models by productionScore, highest first.

Rank  Subject            Production Score  Model Match       Provenance  Sampled
1     Claude 3.5-Sonnet  293206            production score  Imported    2026-05-28
2     Gemini-2-Flash     115782            production score  Imported    2026-05-28
3     GPT4o              87599             production score  Imported    2026-05-28
4     Llama-3.3-70b      54998             production score  Imported    2026-05-28
5     Deepseek-v3        48585             production score  Imported    2026-05-28
6     GPT4o-Mini         26756             production score  Imported    2026-05-28
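
The ranking in the table can be reproduced by sorting the imported rows on productionScore. The following is a minimal sketch, not the leaderboard's actual import code: it assumes each row is a dictionary, and the key name `model` and the helper `rank_by_production_score` are hypothetical. Only `productionScore` is named by the source, and the scores are copied from the table.

```python
# Rows copied from the table above, deliberately out of order.
# "model" is a hypothetical key name; "productionScore" is the
# sort field named in the text.
rows = [
    {"model": "GPT4o", "productionScore": 87599},
    {"model": "GPT4o-Mini", "productionScore": 26756},
    {"model": "Claude 3.5-Sonnet", "productionScore": 293206},
    {"model": "Deepseek-v3", "productionScore": 48585},
    {"model": "Gemini-2-Flash", "productionScore": 115782},
    {"model": "Llama-3.3-70b", "productionScore": 54998},
]


def rank_by_production_score(rows):
    """Return (rank, model, score) tuples, highest score first."""
    ordered = sorted(rows, key=lambda r: r["productionScore"], reverse=True)
    return [
        (rank, r["model"], r["productionScore"])
        for rank, r in enumerate(ordered, start=1)
    ]


for rank, model, score in rank_by_production_score(rows):
    print(rank, model, score)
```

Sorting with `reverse=True` puts the highest production score at rank 1, matching the order shown in the table. Python's sort is stable, so rows with equal scores would keep their input order.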