Model Evaluation
This page is the hub for model-vs-model comparison reports. Use it to evaluate whether a new training run actually improves scanner transcription quality before promoting it.
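WER: Word Error Rate. Percentage of words the model transcribed incorrectly, measured against the edited ground truth. Lower is better.
CER: Character Error Rate. The same measure at character level, useful when numbers, unit IDs, and short tokens matter. Lower is better.
Delta: Difference between the two models' metrics (Model B minus Model A). Because lower error rates are better, a negative delta means Model B improved over Model A.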
Call grid: Shows where each model performs better call by call, not just in aggregate averages.
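The metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the code that produces the reports: it assumes the conventional edit-distance definition of WER and CER (substitutions, deletions, and insertions divided by reference length, so values can exceed 100 when the hypothesis has many extra words), simple whitespace tokenization, and no text normalization. The function names `edit_distance`, `wer`, `cer`, `delta`, and `per_call_wins`, and the sample transcripts, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: fewest substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate as a percentage of reference words (assumed definition)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate as a percentage of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def delta(metric_a: float, metric_b: float) -> float:
    """Model B minus Model A; negative means B has the lower error rate."""
    return metric_b - metric_a


def per_call_wins(calls: List[Tuple[str, str, str]]) -> Dict[str, int]:
    """Count calls won by each model, given (reference, hyp_a, hyp_b) tuples."""
    wins = {"A": 0, "B": 0, "tie": 0}
    for reference, hyp_a, hyp_b in calls:
        d = delta(wer(reference, hyp_a), wer(reference, hyp_b))
        if d < 0:
            wins["B"] += 1
        elif d > 0:
            wins["A"] += 1
        else:
            wins["tie"] += 1
    return wins


# Hypothetical sample data: Model A mishears a unit number on the first call.
calls = [
    ("engine 42 respond to main street",
     "engine 40 respond to main street",
     "engine 42 respond to main street"),
    ("unit 7 clear", "unit 7 clear", "unit 7 clear"),
]
print(per_call_wins(calls))
```

In the first sample call, one wrong digit costs Model A a full word in WER (1 of 6 words) but only one character in CER (1 of 32), which is why the two metrics are worth reading together when short tokens such as unit IDs matter. The per-call count can also disagree with the aggregate delta: a model can win most calls yet lose on average if it fails badly on a few long ones.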
Select a report to view its summary and call grid below.
Report: whisper_medium_v100_scan_vs_whisper_medium_v101_scan_20260419_111337_training_result_set.html
Scroll horizontally inside the report to see all model columns.