SWE-Lancer
Real freelance software engineering tasks from Upwork, scored by end-to-end tests and payout value.
Rows: 2
Primary metric: Accuracy
Sampled: 2025-07-17
Metadata
Metrics: Accuracy, Earned USD
| Rank | Subject | Accuracy | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | o1 | 28.4% | o1 (openai-o1) | Imported | 2025-07-17 |
| 2 | 4o | 8.1% | GPT-4o (openai-gpt-4o) | Imported | 2025-07-17 |
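
The two metrics listed above can be illustrated with a minimal sketch. It assumes that Accuracy is the fraction of tasks whose end-to-end tests pass and that Earned USD is the sum of the payouts attached to those passing tasks; this page does not spell out either definition. The `TaskResult` type, the `score` function, and the sample payouts are hypothetical and are not taken from the benchmark data.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of one freelance task (hypothetical structure)."""
    payout_usd: float  # price attached to the task
    passed: bool       # True if the end-to-end tests passed


def score(results: list[TaskResult]) -> tuple[float, float]:
    """Return (accuracy, earned_usd) under the assumptions stated above."""
    if not results:
        return 0.0, 0.0
    solved = [r for r in results if r.passed]
    accuracy = len(solved) / len(results)          # share of tasks solved
    earned_usd = sum(r.payout_usd for r in solved)  # payout of solved tasks only
    return accuracy, earned_usd


# Illustrative values only, not real benchmark results.
sample = [
    TaskResult(payout_usd=250.0, passed=True),
    TaskResult(payout_usd=1000.0, passed=False),
    TaskResult(payout_usd=500.0, passed=True),
    TaskResult(payout_usd=250.0, passed=False),
]
accuracy, earned_usd = score(sample)
print(f"accuracy={accuracy:.1%}, earned=${earned_usd:.2f}")
```

Under these assumptions the two metrics can diverge: a model that solves a few high-value tasks may earn more than one that solves many low-value tasks, which may be why both are reported.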