BTZSC

Benchmark for zero-shot text classification that compares cross-encoders, embedding models, rerankers, and LLMs on 22 English single-label datasets spanning four task families: sentiment, topic, intent, and emotion.

Rows: 35
Primary metric: macro_f1
Sampled: 2026-05-06

Metadata

Metrics

Macro F1, Accuracy, Macro Precision, Macro Recall, Sentiment Macro F1, Emotion Macro F1, Intent Macro F1, Topic Macro F1
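
For readers unfamiliar with the primary metric, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below is a minimal illustration, not the benchmark's own scoring code; the function name `macro_f1` and the choice to average over every label seen in either the gold or predicted list are assumptions made for this example.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (illustrative helper, not from BTZSC).

    Assumption: the label set is every label that appears in y_true or y_pred.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label gets a false positive
            fn[gold] += 1  # gold label gets a false negative

    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)

    return sum(per_class) / len(per_class)


# Toy two-class example with one misclassified "pos" item
score = macro_f1(["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "neg"])
print(round(score, 4))
```

The task-family metrics listed above (Sentiment, Emotion, Intent, and Topic Macro F1) presumably restrict the same kind of score to the datasets in each family; how scores are aggregated across datasets is not stated here.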

Latest Results

Rows are parsed from one public JSON result file per model in btzsc/btzsc-results. Source fractional scores are converted to percentages.
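
The import step described above could look roughly like the following sketch. The JSON layout (a top-level `macro_f1` key holding a fraction between 0 and 1) and the helper names `to_percent` and `build_rows` are assumptions for illustration; the actual schema of the files in btzsc/btzsc-results is not shown on this page.

```python
import json


def to_percent(fraction):
    """Convert a fractional score such as 0.7224 to a percentage such as 72.24."""
    return round(fraction * 100, 2)


def build_rows(result_files):
    """Build ranked leaderboard rows from one JSON document per model.

    result_files: dict mapping a model name to the JSON text of its result file.
    Assumption: each file exposes a top-level "macro_f1" fraction (hypothetical key).
    """
    rows = []
    for model, text in result_files.items():
        data = json.loads(text)
        rows.append({"subject": model, "macro_f1": to_percent(data["macro_f1"])})

    # Highest primary-metric score first, then assign 1-based ranks
    rows.sort(key=lambda row: row["macro_f1"], reverse=True)
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
    return rows


# Hypothetical file contents using two scores from the table on this page
example_files = {
    "Qwen3-8B": '{"macro_f1": 0.6649}',
    "Qwen3-Reranker-8B": '{"macro_f1": 0.7224}',
}
for row in build_rows(example_files):
    print(row["rank"], row["subject"], row["macro_f1"])
```

Rounding to two decimals matches the precision shown in the table; whether the source applies rounding or truncation is not specified.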

| Rank | Subject | Macro F1 (%) | Model Match | Provenance | Sampled |
|------|---------|--------------|-------------|------------|---------|
| 1 | Qwen3-Reranker-8B | 72.24 | | Imported | 2026-05-06 |
| 2 | Mistral-Nemo-Instruct-2407 | 66.97 | Mistral: Mistral Nemo (mistralai-mistral-nemo) | Imported | 2026-05-06 |
| 3 | Qwen3-8B | 66.49 | Qwen3 8B (qwen-qwen3-8b) | Imported | 2026-05-06 |
| 4 | Qwen3-4B | 64.86 | | Imported | 2026-05-06 |
| 5 | gte-large-en-v1.5 | 61.74 | | Imported | 2026-05-06 |
| 6 | Qwen3-Reranker-0.6B | 60.56 | | Imported | 2026-05-06 |
| 7 | e5-large-v2 | 59.74 | | Imported | 2026-05-06 |
| 8 | e5-base-v2 | 59.66 | | Imported | 2026-05-06 |
| 9 | deberta-v3-large-nli-triplet | 59.58 | | Imported | 2026-05-06 |
| 10 | Qwen3-Embedding-8B | 59.13 | | Imported | 2026-05-06 |
| 11 | deberta-v3-large-nli | 59.11 | | Imported | 2026-05-06 |
| 12 | gte-modernbert-base | 58.64 | | Imported | 2026-05-06 |
| 13 | gte-base-en-v1.5 | 58.42 | | Imported | 2026-05-06 |
| 14 | Qwen3-Embedding-0.6B | 57.97 | | Imported | 2026-05-06 |
| 15 | gte-reranker-modernbert-base | 57.85 | | Imported | 2026-05-06 |
| 16 | e5-mistral-7b-instruct | 57.56 | | Imported | 2026-05-06 |
| 17 | bge-base-en-v1.5 | 56.83 | | Imported | 2026-05-06 |
| 18 | bge-large-en-v1.5 | 55.48 | | Imported | 2026-05-06 |
| 19 | modernbert-large-nli | 55.17 | | Imported | 2026-05-06 |
| 20 | deberta-v3-base-nli | 55.04 | | Imported | 2026-05-06 |
| 21 | modernbert-large-nli-triplet | 54.87 | | Imported | 2026-05-06 |
| 22 | bge-reranker-large | 53.48 | | Imported | 2026-05-06 |
| 23 | modernbert-base-nli | 53.43 | | Imported | 2026-05-06 |
| 24 | bert-large-uncased-nli | 53.39 | | Imported | 2026-05-06 |
| 25 | bert-large-uncased-nli-triplet | 52.46 | | Imported | 2026-05-06 |
| 26 | bart-large-mnli | 50.79 | | Imported | 2026-05-06 |
| 27 | nli-roberta-base | 48.85 | | Imported | 2026-05-06 |
| 28 | bert-base-uncased-nli | 48.76 | | Imported | 2026-05-06 |
| 29 | bge-reranker-base | 47.11 | | Imported | 2026-05-06 |
| 30 | Phi-4-mini-instruct | 43.09 | | Imported | 2026-05-06 |
| 31 | Llama-3.2-3B-Instruct | 43.02 | Llama 3.2 3B Instruct (meta-llama-llama-3.2-3b-instruct) | Imported | 2026-05-06 |
| 32 | ms-marco-MiniLM-L6-v2 | 42.18 | | Imported | 2026-05-06 |
| 33 | all-MiniLM-L6-v2 | 36.60 | | Imported | 2026-05-06 |
| 34 | gemma-3-1b-it | 35.91 | | Imported | 2026-05-06 |
| 35 | gemma-3-270m-it | 27.94 | | Imported | 2026-05-06 |