SWE-Lancer

Real freelance software engineering tasks from Upwork, scored by end-to-end tests and payout value.

Rows: 2
Primary metric: accuracy
Sampled: 2025-07-17

Metadata

Metrics

Accuracy, Earned USD
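
As a rough illustration of how these two metrics could relate, the sketch below assumes that accuracy is the fraction of tasks whose end-to-end tests pass and that Earned USD is the sum of payouts for those passing tasks. The page does not spell out the exact definitions, so treat both as assumptions. The names TaskResult and score, and every task and payout value, are hypothetical and are not benchmark data.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """One graded task (hypothetical structure, for illustration only)."""
    task_id: str
    payout_usd: float  # payout attached to the task
    passed: bool       # True if the end-to-end tests passed


def score(results):
    """Return (accuracy, earned_usd) under the assumed metric definitions."""
    if not results:
        raise ValueError("need at least one task result")
    passed = [r for r in results if r.passed]
    accuracy = len(passed) / len(results)          # fraction of tasks solved
    earned_usd = sum(r.payout_usd for r in passed)  # payouts of solved tasks only
    return accuracy, earned_usd


# Made-up example data, not taken from the leaderboard.
results = [
    TaskResult("task-a", 250.0, True),
    TaskResult("task-b", 1000.0, False),
    TaskResult("task-c", 500.0, True),
    TaskResult("task-d", 250.0, False),
]

accuracy, earned = score(results)
print(f"accuracy={accuracy:.1%}, earned=${earned:,.2f}")
# prints: accuracy=50.0%, earned=$750.00
```

Under these assumptions the two metrics can rank models differently, because a model that solves a few high-payout tasks may earn more than one that solves many low-payout tasks.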

Latest Results

This leaderboard page was last built on 2025-07-28. According to the OpenAI launch post, the July 17, 2025 update removed the internet-connectivity requirement.

Rank  Subject  Accuracy  Model Match             Provenance  Sampled
1     o1       28.4%     o1 (openai-o1)          Imported    2025-07-17
2     4o       8.1%      GPT-4o (openai-gpt-4o)  Imported    2025-07-17