BrowserART

BrowserART (Browser Agent Red teaming Toolkit) is a benchmark for evaluating whether browser agents pursue harmful web tasks despite the refusal training of their underlying chat models.

Rows: 8
Primary metric: harmful_behavior_pursuit_rate
Sampled: 2026-05-28

Metadata

Metrics

Human Rewrite ASR (lower is better), Direct Ask ASR (lower is better), Prefix ASR (lower is better), GCG ASR (lower is better), Random Search ASR (lower is better), Ensemble ASR (lower is better)

Latest Results

Rows are imported from Table 2 of the public arXiv LaTeX source for BrowserART, which covers OpenHands browser agents. ASR metrics are attack success rates; lower is better for safety.
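To make the metric concrete, here is a minimal Python sketch of an attack success rate, assuming it is simply the percentage of attempts on which the agent pursued the harmful task. The function name `attack_success_rate` and the example outcomes are hypothetical and are not taken from the BrowserART paper or code.

```python
def attack_success_rate(outcomes):
    """Return the percentage of attempts on which the agent pursued the harmful task.

    `outcomes` is a list of booleans (hypothetical format): True means the
    attack succeeded, i.e. the agent attempted the harmful behavior.
    """
    if not outcomes:
        raise ValueError("need at least one outcome")
    successes = sum(1 for outcome in outcomes if outcome)
    return 100.0 * successes / len(outcomes)


# Made-up outcomes for five attempts, for illustration only.
example = [True, False, True, True, False]
print(f"{attack_success_rate(example):.0f}%")  # prints "60%"
```

Under this reading, a lower percentage means the agent pursued fewer harmful tasks, which is why lower is better for safety.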

Rank  Subject                  Human Rewrite ASR  Model Match  Provenance  Sampled
1     OpenHands + Opus-3       40%                             Imported    2026-05-28
2     OpenHands + o1-preview   63%                             Imported    2026-05-28
3     OpenHands + Gemini-1.5   65%                             Imported    2026-05-28
4     OpenHands + Sonnet-3.5   70%                             Imported    2026-05-28
5     OpenHands + Llama-3.1    73%                             Imported    2026-05-28
6     OpenHands + o1-mini      84%                             Imported    2026-05-28
7     OpenHands + GPT-4o       98%                             Imported    2026-05-28
8     OpenHands + GPT-4-turbo  99%                             Imported    2026-05-28
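The ranking in the table follows from sorting by Human Rewrite ASR in ascending order, since lower is better. The Python sketch below reproduces that ordering from the eight rows above. The `rows` list and its tuple layout are an illustrative assumption, not the format used by the leaderboard's import pipeline.

```python
# (subject, Human Rewrite ASR in percent), copied from the table above
# and deliberately listed out of order.
rows = [
    ("OpenHands + GPT-4o", 98),
    ("OpenHands + Opus-3", 40),
    ("OpenHands + o1-mini", 84),
    ("OpenHands + Gemini-1.5", 65),
    ("OpenHands + GPT-4-turbo", 99),
    ("OpenHands + o1-preview", 63),
    ("OpenHands + Llama-3.1", 73),
    ("OpenHands + Sonnet-3.5", 70),
]

# Lower ASR is safer, so rank 1 is the smallest value.
ranked = sorted(rows, key=lambda row: row[1])

for rank, (subject, asr) in enumerate(ranked, start=1):
    print(f"{rank} {subject} {asr}%")
```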