- 
                Notifications
    
You must be signed in to change notification settings  - Fork 16
 
Partial enabling of transition types for different label scorers #148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| INITIAL_LABEL, | ||
| INITIAL_BLANK, | ||
| }; | ||
| break; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's related to this PR #152, but TRANSDUCER and LM need additonally SENTENCE_END transition types.
| case TransitionPresetType::LM: | ||
| enabledTransitionTypes_ = { | ||
| LABEL_TO_LABEL, | ||
| INITIAL_LABEL, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AedTreeBuilder and labelsync searches are still WIP and not merged yet, but at some point we will also need a preset for AED and I guess this will be the same as this one. So will we then just add a preset which is exactly the same, just with a different name? And if yes, should LM or AED be the default preset of the StatefulOnnxLabelScorer? I mean in the end it's the same, but it might become confusing because of the naming.
| : Core::Component(config) {} | ||
| : Core::Component(config), | ||
| enabledTransitionTypes_() { | ||
| enableTransitionTypes(config); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I have to withdraw may approve. I just found a bug.
Here, you are calling enableTransitionTypes(config) and in enableTransitionTypes() the function defaultPreset() is called. As this is done in the constructor of LabelScorer, the defaultPreset() function of LabelScorer will always be called instead of the implementation of the derived classes one is using.
As a fix, I suggest to put enableTransitionTypes(config) to the constructor of every derived class and remove it from the base class constructor here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eugen: yes pelase
| ScoresWithTimes result{{requests.size(), 0.0}, {requests.size(), 0}}; | ||
| ScoresWithTimes result{ | ||
| .scores = std::vector<Score>(requests.size(), 0.0), | ||
| .timeframes{requests.size(), 0}}; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .timeframes{requests.size(), 0}}; | |
| .timeframes{requests.size(), 0} | |
| }; | 
| virtual TransitionPresetType defaultPreset() const { | ||
| return TransitionPresetType::NONE; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to move method definitions out of the class. Can either be inline in the header or just in the .cc file.
| virtual TransitionPresetType defaultPreset() const override { | ||
| return TransitionPresetType::LM; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move definition out of class declaration.
| virtual TransitionPresetType defaultPreset() const override { | ||
| return TransitionPresetType::CTC; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method definition not in class declaration
| virtual TransitionPresetType defaultPreset() const override { | ||
| return TransitionPresetType::CTC; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see others
| virtual TransitionPresetType defaultPreset() const override { | ||
| return TransitionPresetType::ALL; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see others
| virtual TransitionPresetType defaultPreset() const override { | ||
| return TransitionPresetType::ALL; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see others
This adds a parameterto the LabelScorer base class which allows users to specify a list of transition types which will always get assigned score 0 and not affect the ScoringContext. This can be especially helpful when building combined label scorers with multiple sub-scorers of different label topology such as for example CTC + LSTM LM where the LM needs to ignore all blank transitions as well as label loops and CTC on the other hand needs to ignore the sentence-end.ignored-transition-typesDo achieve this,
extendedScoringContext,computeScoreWithTimeandcomputeScoresWithTimesare implemented in the baseLabelScorerand call protected functions{extendedScoringContext|computeScoreWithTime|computeScoresWithTimes}Internalwhich are overridden in child classes.Depends on modifications to the transition type handling from #138.
Edit:
This logic was changed so that instead of a blacklist, the LabelScorer has a parameter
transition-presetwhich enables a preconfigured list of used transition types for specific models and any additional ones can be enabled via another parameterextra-transition-types. The available presets for now are"default","none","ctc","transducer"and"lm". When thedefaultpreset is used, it takes a preset that is specified by each LabelScorer subclass individually and should be the most common use-case for that class. For example forCombineLabelScorerthe default will use preset "all", forNoContextOnnxLabelScorerthe default will use preset "ctc" and so on.