OpenAI Evals

OpenAI Evals evaluates broad language-model knowledge, reasoning, commonsense, instruction following, or exam-style accuracy.

0 rows
Primary metric: score
sampled

Metadata

Metrics

Score

Latest Results

Rank Subject Score Model Match Provenance Sampled