AI-Secure LLM Trustworthy Leaderboard

Trustworthiness benchmark for LLMs covering toxicity, stereotypes, adversarial robustness, out-of-distribution robustness, adversarial demonstrations, privacy, ethics, and fairness.

Rows: 37
Primary metric: trustworthy_average
Sampled: 2026-05-06

Metadata

Metrics

Trustworthy average, toxicity, stereotype, adversarial robustness (adv), out-of-distribution robustness (ood), adversarial demonstrations (adv demo), privacy, ethics, fairness

Latest Results

Rows are parsed from per-model result JSON files; source model names and SHA labels are preserved.
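
The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual importer: the JSON keys `model`, `sha`, and `trustworthy_average`, the one-file-per-model directory layout, and the function names `load_rows` and `rank_rows` are all assumptions made for the example.

```python
import json
from pathlib import Path


def load_rows(results_dir):
    """Read every *.json file in results_dir into a list of row dicts.

    The key names below are hypothetical; the real result files may use
    a different schema.
    """
    rows = []
    for path in sorted(Path(results_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            record = json.load(fh)
        rows.append({
            "subject": record["model"],   # source model name, kept verbatim
            "sha": record.get("sha", ""),  # SHA label, kept verbatim if present
            "trustworthy_average": float(record["trustworthy_average"]),
        })
    return rows


def rank_rows(rows):
    """Sort rows by descending primary metric and attach 1-based ranks.

    sorted() is stable, so rows with equal scores keep their input order.
    """
    ordered = sorted(rows, key=lambda r: r["trustworthy_average"], reverse=True)
    return [dict(row, rank=i) for i, row in enumerate(ordered, start=1)]
```

Because names and SHA labels are copied through unchanged, two result files for the same model (for example, different compressed checkpoints) stay as separate rows rather than being merged.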

Rank | Subject | Trustworthy average | Model Match | Provenance | Sampled
1 | anthropic/claude-2.0 | 0.85 | | Imported | 2026-05-06
2 | openai/gpt-4o-2024-05-13 | 0.83 | GPT-4o (2024-05-13); openai-gpt-4o-2024-05-13 | Imported | 2026-05-06
3 | meta-llama/Meta-Llama-3-8B-Instruct | 0.81 | Llama 3 8B Instruct; meta-llama-llama-3-8b-instruct | Imported | 2026-05-06
4 | vertexai/gemini-pro-1.0 | 0.81 | | Imported | 2026-05-06
5 | openai/gpt-4o-mini-2024-07-18 | 0.76 | GPT-4o-mini (2024-07-18); openai-gpt-4o-mini-2024-07-18 | Imported | 2026-05-06
6 | meta-llama/Llama-2-7b-chat-hf | 0.75 | | Imported | 2026-05-06
7 | openai/gpt-3.5-turbo-0301 | 0.72 | GPT-3.5 Turbo; openai-gpt-3.5-turbo | Imported | 2026-05-06
8 | compressed-llm/llama-2-13b-chat-gptq | 0.72 | | Imported | 2026-05-06
9 | compressed-llm/llama-2-13b-chat-awq | 0.71 | | Imported | 2026-05-06
10 | compressed-llm/llama-2-13b-chat-awq | 0.71 | | Imported | 2026-05-06
11 | compressed-llm/llama-2-13b-chat-awq | 0.70 | | Imported | 2026-05-06
12 | openai/gpt-4-0314 | 0.69 | GPT-4 (older v0314); openai-gpt-4-0314 | Imported | 2026-05-06
13 | google/gemma-2b-it | 0.67 | | Imported | 2026-05-06
14 | google/gemma-7b-it | 0.67 | | Imported | 2026-05-06
15 | allenai/tulu-2-13b | 0.67 | | Imported | 2026-05-06
16 | compressed-llm/vicuna-13b-v1.3_gptq | 0.66 | | Imported | 2026-05-06
17 | allenai/tulu-2-7b | 0.64 | | Imported | 2026-05-06
18 | HuggingFaceH4/zephyr-7b-beta | 0.63 | | Imported | 2026-05-06
19 | compressed-llm/vicuna-13b-v1.3_gptq | 0.63 | | Imported | 2026-05-06
20 | compressed-llm/llama-2-13b-awq | 0.63 | | Imported | 2026-05-06
21 | compressed-llm/llama-2-13b-awq | 0.62 | | Imported | 2026-05-06
22 | compressed-llm/llama-2-13b-gptq | 0.62 | | Imported | 2026-05-06
23 | mosaicml/mpt-7b-chat | 0.62 | | Imported | 2026-05-06
24 | compressed-llm/llama-2-13b-awq | 0.62 | | Imported | 2026-05-06
25 | compressed-llm/vicuna-13b-v1.3-gptq | 0.62 | | Imported | 2026-05-06
26 | compressed-llm/llama-2-13b-gptq | 0.61 | | Imported | 2026-05-06
27 | compressed-llm/llama-2-13b-gptq | 0.61 | | Imported | 2026-05-06
28 | lmsys/vicuna-7b-v1.3 | 0.61 | | Imported | 2026-05-06
29 | compressed-llm/vicuna-13b-v1.3-awq | 0.60 | | Imported | 2026-05-06
30 | compressed-llm/vicuna-13b-v1.3-awq | 0.60 | | Imported | 2026-05-06
31 | tiiuae/falcon-7b-instruct | 0.59 | | Imported | 2026-05-06
32 | compressed-llm/vicuna-13b-v1.3-awq | 0.59 | | Imported | 2026-05-06
33 | Open-Orca/Mistral-7B-OpenOrca | 0.59 | | Imported | 2026-05-06
34 | togethercomputer/RedPajama-INCITE-7B-Instruct | 0.57 | | Imported | 2026-05-06
35 | compressed-llm/llama-2-13b-chat-gptq | 0.55 | | Imported | 2026-05-06
36 | compressed-llm/llama-2-13b-chat-gptq | 0.55 | | Imported | 2026-05-06
37 | chavinlo/alpaca-native | 0.46 | | Imported | 2026-05-06