OpenClaw Arena Config Leaderboard

This leaderboard measures how different SOUL.md-style personal-agent configurations affect GPT-4.1 performance in OpenClaw Arena.

Rows: 10
Primary metric: average_score
Sampled: 2026-05-06

Metadata

Metrics

Avg Score, Tasks Completed

Latest Results

Rows are parsed from the public OpenClaw Arena static config leaderboard for gpt-4.1. Source config display names are preserved.

Rank  Subject           Avg Score  Model Match  Provenance  Sampled
1     baseline          0.89                    Imported    2026-05-06
2     content-creator   0.87                    Imported    2026-05-06
3     finance-tracker   0.87                    Imported    2026-05-06
4     memory-master     0.86                    Imported    2026-05-06
5     research-pro      0.86                    Imported    2026-05-06
6     health-coach      0.85                    Imported    2026-05-06
7     dev-productivity  0.84                    Imported    2026-05-06
8     daily-organizer   0.82                    Imported    2026-05-06
9     proactive-agent   0.79                    Imported    2026-05-06
10    research-basic    0.76                    Imported    2026-05-06
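
Because the leaderboard is about how each configuration shifts performance, it can help to read the scores as differences from the baseline row. The following is a minimal Python sketch of that arithmetic, using only the average scores from the table above. The function name `deltas_vs_baseline` and the dictionary layout are illustrative assumptions, not part of OpenClaw Arena.

```python
# Average scores copied from the leaderboard table (gpt-4.1, sampled 2026-05-06).
scores = {
    "baseline": 0.89,
    "content-creator": 0.87,
    "finance-tracker": 0.87,
    "memory-master": 0.86,
    "research-pro": 0.86,
    "health-coach": 0.85,
    "dev-productivity": 0.84,
    "daily-organizer": 0.82,
    "proactive-agent": 0.79,
    "research-basic": 0.76,
}


def deltas_vs_baseline(scores, baseline="baseline"):
    """Return each config's average score minus the baseline's (hypothetical helper)."""
    base = scores[baseline]
    # Round to two decimals to match the precision of the published scores.
    return {
        name: round(value - base, 2)
        for name, value in scores.items()
        if name != baseline
    }


deltas = deltas_vs_baseline(scores)

# Print configs from smallest to largest drop relative to baseline.
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18}{delta:+.2f}")
```

On these numbers, every listed configuration scores below the baseline, with gaps ranging from 0.02 (content-creator, finance-tracker) to 0.13 (research-basic). The published scores are already rounded to two decimals, so the differences are approximate.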