Medical STT Benchmark
Speech-to-text benchmark for long-form medical dialogue, ranking cloud and local transcription systems with Medical Word Error Rate on the PriMock57 dataset.
Rows: 42
Primary metric: medical_wer
Sampled: 2026-04-28
Metadata
Metrics
Medical WER (lower is better), WER (lower is better), Accuracy, Drug Medical WER (lower is better), Avg Speed (lower is better), Best WER (lower is better), Worst WER (lower is better), WER Std (lower is better), Files Evaluated
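
As a rough illustration of the WER-style metrics listed above, the sketch below computes plain word error rate (substitutions, deletions, and insertions divided by the number of reference words) and summarizes it across files. It is not the benchmark's own scoring code: the function names `word_error_rate` and `summarize` are hypothetical, the text normalization is a simple lowercase-and-split, and averaging per-file scores is an assumption. The exact definitions of Medical WER and Drug Medical WER are not given on this page, so they are not reproduced here.

```python
from statistics import mean, pstdev


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Plain WER: word-level edit distance divided by reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def summarize(per_file_wer: list[float]) -> dict[str, float]:
    """Aggregate per-file scores (assumes a simple mean across files)."""
    return {
        "wer": mean(per_file_wer),
        "best_wer": min(per_file_wer),
        "worst_wer": max(per_file_wer),
        "wer_std": pstdev(per_file_wer),
        "files_evaluated": len(per_file_wer),
    }
```

For example, scoring the hypothesis "the patient takes i be profen daily" against the reference "the patient takes ibuprofen daily" counts one substitution and two insertions over five reference words. A single misrecognized drug name can therefore dominate the score of a short utterance.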
| Rank | Subject | Medical WER | Model Match | Provenance | Sampled |
|---|---|---|---|---|---|
| 1 | google-gemini-3-pro-preview | 0.01 | — | Imported | 2026-04-28 |
| 2 | google-gemini-2.5-pro | 0.02 | — | Imported | 2026-04-28 |
| 3 | vibevoice-asr-9b | 0.02 | — | Imported | 2026-04-28 |
| 4 | google-gemini-3-flash-preview | 0.02 | — | Imported | 2026-04-28 |
| 5 | soniox-stt-async-v4 | 0.02 | — | Imported | 2026-04-28 |
| 6 | elevenlabs-scribe_v2 | 0.03 | — | Imported | 2026-04-28 |
| 7 | assemblyai-universal-3-pro-medical | 0.03 | — | Imported | 2026-04-28 |
| 8 | qwen3-asr-1.7b | 0.03 | — | Imported | 2026-04-28 |
| 9 | deepgram-nova-3-medical | 0.03 | — | Imported | 2026-04-28 |
| 10 | mai-transcribe-1 | 0.03 | — | Imported | 2026-04-28 |
| 11 | elevenlabs-scribe_v1 | 0.04 | — | Imported | 2026-04-28 |
| 12 | google-gemini-2.5-flash | 0.04 | — | Imported | 2026-04-28 |
| 13 | openai-gpt-4o-mini-transcribe-2025-12-15 | 0.04 | — | Imported | 2026-04-28 |
| 14 | parakeet-tdt-1.1b | 0.04 | — | Imported | 2026-04-28 |
| 15 | voxtral-mini-transcribe-v1-chat | 0.04 | — | Imported | 2026-04-28 |
| 16 | voxtral-mini-transcribe-v2 | 0.04 | — | Imported | 2026-04-28 |
| 17 | voxtral-mini-4b-realtime | 0.04 | — | Imported | 2026-04-28 |
| 18 | groq-whisper-large-v3-turbo | 0.04 | — | Imported | 2026-04-28 |
| 19 | cohere-transcribe-03-2026 | 0.04 | — | Imported | 2026-04-28 |
| 20 | openai-whisper-1 | 0.05 | — | Imported | 2026-04-28 |
| 21 | canary_1b_flash_lcs | 0.05 | — | Imported | 2026-04-28 |
| 22 | groq-whisper-large-v3 | 0.05 | — | Imported | 2026-04-28 |
| 23 | parakeet-parakeet-tdt-0.6b-v2 | 0.05 | — | Imported | 2026-04-28 |
| 24 | mlx-community_whisper-large-v3-turbo | 0.05 | — | Imported | 2026-04-28 |
| 25 | whisperkit-large-v3-v20240930_turbo | 0.05 | — | Imported | 2026-04-28 |
| 26 | openai-gpt-4o-mini-transcribe | 0.05 | — | Imported | 2026-04-28 |
| 27 | qwen3-asr-0.6b | 0.05 | — | Imported | 2026-04-28 |
| 28 | kyutai-stt-pytorch-stt-2.6b-en | 0.05 | — | Imported | 2026-04-28 |
| 29 | glm-asr-nano-2512 | 0.06 | — | Imported | 2026-04-28 |
| 30 | parakeet-parakeet-tdt-0.6b-v3 | 0.06 | — | Imported | 2026-04-28 |
| 31 | nemotron-speech-streaming-0.6b | 0.07 | — | Imported | 2026-04-28 |
| 32 | openai-gpt-4o-transcribe | 0.08 | — | Imported | 2026-04-28 |
| 33 | gemma-4-e4b-it | 0.08 | — | Imported | 2026-04-28 |
| 34 | nvidia_canary-qwen-2.5b | 0.08 | — | Imported | 2026-04-28 |
| 35 | canary-1b-v2 | 0.09 | — | Imported | 2026-04-28 |
| 36 | granite-speech-3.3-2b | 0.11 | — | Imported | 2026-04-28 |
| 37 | apple-speechanalyzer | 0.12 | — | Imported | 2026-04-28 |
| 38 | gemma-4-e2b-it | 0.12 | — | Imported | 2026-04-28 |
| 39 | azure-foundry-phi4 | 0.14 | — | Imported | 2026-04-28 |
| 40 | kyutai-stt-pytorch-stt-1b-en_fr | 0.20 | — | Imported | 2026-04-28 |
| 41 | google-medasr | 0.26 | — | Imported | 2026-04-28 |
| 42 | mms-1b-all | 0.53 | — | Imported | 2026-04-28 |