Commit aca3f5e

[training] compute normalised wer
sanchit-gandhi committed
1 parent c2b90bd, commit aca3f5e

1 file changed: +5 -3 lines changed

training/eval.py

Lines changed: 5 additions & 3 deletions
@@ -29,8 +29,10 @@ def wer(asr_model_name_or_path, prompts, audios, device, per_device_eval_batch_s
         batch_size=int(per_device_eval_batch_size),
     )
 
-    word_error = 100 * metric.compute(
-        predictions=[t["text"].lower() for t in transcriptions], references=[t.lower() for t in prompts]
-    )
+    normalizer = asr_pipeline.tokenizer.normalize
+    normalized_predictions = [normalizer(t["text"]) for t in transcriptions]
+    normalized_references = [normalizer(t) for t in prompts]
+
+    word_error = 100 * metric.compute(predictions=normalized_predictions, references=normalized_references)
 
     return word_error, [t["text"] for t in transcriptions]
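
The change above swaps lowercasing for the tokenizer's `normalize` method before scoring. It assumes the ASR tokenizer exposes such a method (Whisper tokenizers in `transformers` do, as far as I know). The sketch below illustrates why that matters for WER. It is not the code from this commit: `simple_normalize` and `word_error_rate` are hypothetical stand-ins written for this example. A real text normaliser typically does more than strip punctuation, and the `metric` object in `eval.py` may aggregate differently from this corpus-level version.

```python
import re
import string


def simple_normalize(text: str) -> str:
    """Hypothetical stand-in for tokenizer.normalize: lowercase, drop ASCII punctuation, squeeze whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def word_error_rate(predictions, references):
    """Corpus-level WER sketch: total word-level edit distance / total reference words."""
    total_edits = 0
    total_words = 0
    for pred, ref in zip(predictions, references):
        hyp, tgt = pred.split(), ref.split()
        # Word-level Levenshtein distance using a single rolling row.
        row = list(range(len(hyp) + 1))
        for i, ref_word in enumerate(tgt, start=1):
            prev_diag, row[0] = row[0], i
            for j, hyp_word in enumerate(hyp, start=1):
                cost = 0 if ref_word == hyp_word else 1
                prev_diag, row[j] = row[j], min(row[j] + 1, row[j - 1] + 1, prev_diag + cost)
        total_edits += row[-1]
        total_words += len(tgt)
    return total_edits / total_words


# Illustrative inputs (made up for this example, not taken from the repository).
prompts = ["Hello, world!"]
transcriptions = [{"text": "hello world"}]

# Old behaviour: lowercase only, so punctuation still counts as a word mismatch.
lowercase_wer = 100 * word_error_rate(
    [t["text"].lower() for t in transcriptions], [p.lower() for p in prompts]
)

# New behaviour: normalise both sides the same way before scoring.
normalized_wer = 100 * word_error_rate(
    [simple_normalize(t["text"]) for t in transcriptions],
    [simple_normalize(p) for p in prompts],
)

print(lowercase_wer, normalized_wer)
```

With lowercasing alone, `hello,` and `world!` in the reference never match `hello` and `world` in the transcription, so a correct transcription is scored as fully wrong. Normalising both sides removes that formatting penalty. Note that the function still returns the raw `t["text"]` transcriptions, so only the metric is affected.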
