When doing grid search over the decoding parameters of a hybrid model, taking the full cartesian product does not make much sense. One should first tune the model-related scales, such as the prior scale and the tdp scale, then the tdp values, and only once the optimal values of these parameters are fixed tune the lm and pronunciation scales. Moreover, we should definitely consider making a high altas and a small beam obligatory for the first two steps; only the lm scale should not be tuned together with altas.
Please also consider that, in our experience, the only tdp values worth tuning are the exit penalties for silence and non-word.
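To make the proposal concrete, here is a minimal sketch of such a staged search, not an implementation against the existing code. All names (`staged_search`, `STAGES`, `FAST`, `FULL`, the parameter keys) and all numeric values are hypothetical placeholders, and `score` stands for whatever runs recognition and returns a WER-like number where lower is better.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
# One stage = (grid of parameters to tune, search settings used while tuning them).
Stage = Tuple[Dict[str, List[float]], Params]

# Hypothetical search settings: high altas + small beam for the cheap early stages,
# no altas for the final lm/pronunciation scale stage. Values are placeholders.
FAST = {"altas": 12.0, "beam": 14.0}
FULL = {"altas": 0.0, "beam": 18.0}

# Hypothetical grids following the order proposed above.
STAGES: List[Stage] = [
    ({"prior_scale": [0.3, 0.5, 0.7], "tdp_scale": [0.1, 0.3, 0.5]}, FAST),
    ({"silence_exit": [0.0, 10.0, 20.0], "nonword_exit": [0.0, 10.0, 20.0]}, FAST),
    ({"lm_scale": [8.0, 10.0, 12.0], "pron_scale": [1.0, 2.0, 3.0]}, FULL),
]

DEFAULTS: Params = {
    "prior_scale": 0.5, "tdp_scale": 0.3,
    "silence_exit": 10.0, "nonword_exit": 10.0,
    "lm_scale": 10.0, "pron_scale": 2.0,
}


def staged_search(
    stages: List[Stage],
    defaults: Params,
    score: Callable[[Params, Params], float],
) -> Tuple[Params, int]:
    """Tune one parameter group per stage, freezing the best values found so far.

    Returns the final parameters and the number of score() calls made.
    """
    best = dict(defaults)
    num_evals = 0
    for grid, search_settings in stages:
        names = list(grid)
        best_score, best_trial = None, None
        # Cartesian product only within the current stage, not across stages.
        for values in product(*(grid[name] for name in names)):
            trial = {**best, **dict(zip(names, values))}
            current = score(trial, search_settings)
            num_evals += 1
            if best_score is None or current < best_score:
                best_score, best_trial = current, trial
        best = best_trial
    return best, num_evals
```

With three values per parameter as in this sketch, the staged search needs 9 + 9 + 9 = 27 recognition runs, whereas the full cartesian product over the same six parameters would need 3^6 = 729.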
Originally posted by @Marvin84 in #110 (comment)