Collaborator

@SimBe195 SimBe195 commented Oct 1, 2025

This adds a parameter ignored-transition-types to the LabelScorer base class. It lets users specify a list of transition types that are always assigned a score of 0 and do not affect the ScoringContext. This is especially helpful when building combined label scorers whose sub-scorers have different label topologies, for example CTC + LSTM LM: the LM needs to ignore all blank transitions as well as label loops, whereas CTC needs to ignore the sentence end.

To achieve this, extendedScoringContext, computeScoreWithTime and computeScoresWithTimes are implemented in the base LabelScorer and call the protected functions extendedScoringContextInternal, computeScoreWithTimeInternal and computeScoresWithTimesInternal, which are overridden in child classes.
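
A minimal sketch of the wrapper pattern described above, under stated assumptions: the reduced TransitionType enum, the ScoringContext, Request and ScoreWithTime stand-ins, and the LmLikeScorer subclass are all hypothetical and much simpler than the real types. Only the split into public functions and protected ...Internal functions comes from the description.

```cpp
#include <set>
#include <utility>

// Hypothetical, reduced list of transition types (illustration only).
enum class TransitionType { LabelToLabel, LabelLoop, LabelToBlank, BlankLoop, SentenceEnd };

using Score = double;  // assumption: the real Score type may differ

// Simplified stand-ins for the real context, request and result types.
struct ScoringContext { int numLabels; };
struct Request { ScoringContext context; TransitionType transitionType; };
struct ScoreWithTime { Score score; int timeframe; };

class LabelScorer {
public:
    explicit LabelScorer(std::set<TransitionType> ignored) : ignored_(std::move(ignored)) {}
    virtual ~LabelScorer() = default;

    // Public entry point: an ignored transition leaves the context unchanged.
    ScoringContext extendedScoringContext(Request const& request) const {
        if (isIgnored(request)) return request.context;
        return extendedScoringContextInternal(request);
    }

    // Public entry point: an ignored transition always gets score 0.
    ScoreWithTime computeScoreWithTime(Request const& request) const {
        if (isIgnored(request)) return ScoreWithTime{0.0, 0};
        return computeScoreWithTimeInternal(request);
    }

protected:
    // Child classes override only these model-specific functions.
    virtual ScoringContext extendedScoringContextInternal(Request const& request) const = 0;
    virtual ScoreWithTime computeScoreWithTimeInternal(Request const& request) const = 0;

private:
    bool isIgnored(Request const& request) const {
        return ignored_.count(request.transitionType) > 0;
    }

    std::set<TransitionType> ignored_;
};

// Hypothetical LM-like scorer: every scored transition costs 1 and adds one label.
class LmLikeScorer : public LabelScorer {
public:
    using LabelScorer::LabelScorer;

protected:
    ScoringContext extendedScoringContextInternal(Request const& request) const override {
        return ScoringContext{request.context.numLabels + 1};
    }
    ScoreWithTime computeScoreWithTimeInternal(Request const&) const override {
        return ScoreWithTime{1.0, 0};
    }
};
```

With blank transitions in the ignored set, an LmLikeScorer returns score 0 and the unchanged context for a BlankLoop request, while a LabelToLabel request reaches the overridden ...Internal functions.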

Depends on modifications to the transition type handling from #138.

Edit:
This logic was changed: instead of a blacklist, the LabelScorer now has a parameter transition-preset that enables a preconfigured list of transition types for specific models, and any additional ones can be enabled via another parameter, extra-transition-types. The available presets for now are "default", "none", "ctc", "transducer" and "lm". The "default" preset resolves to a preset that each LabelScorer subclass specifies individually and that should cover the most common use case for that class. For example, CombineLabelScorer defaults to preset "all", NoContextOnnxLabelScorer defaults to preset "ctc", and so on.
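
The preset resolution described in this edit could look roughly like the sketch below. The function names (parsePreset, presetTypes, enabledTransitionTypes), the reduced TransitionType enum, and above all the contents of the CTC and LM lists are assumptions for illustration. They are not the lists used in the PR, and the transducer preset is left out because its contents are not given here.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, reduced list of transition types (illustration only).
enum class TransitionType {
    LabelToLabel, LabelLoop, LabelToBlank, BlankToLabel,
    BlankLoop, InitialLabel, InitialBlank, SentenceEnd
};

enum class TransitionPreset { Default, None, All, Ctc, Lm };

using TransitionSet = std::set<TransitionType>;

// Map a config string such as "ctc" to a preset.
TransitionPreset parsePreset(std::string const& name) {
    if (name == "default") return TransitionPreset::Default;
    if (name == "none") return TransitionPreset::None;
    if (name == "all") return TransitionPreset::All;
    if (name == "ctc") return TransitionPreset::Ctc;
    if (name == "lm") return TransitionPreset::Lm;
    throw std::invalid_argument("unknown transition preset: " + name);
}

// Illustrative preset contents; the real lists live in the PR.
TransitionSet presetTypes(TransitionPreset preset) {
    using T = TransitionType;
    switch (preset) {
        case TransitionPreset::None:
            return TransitionSet{};
        case TransitionPreset::All:
            return TransitionSet{T::LabelToLabel, T::LabelLoop, T::LabelToBlank, T::BlankToLabel,
                                 T::BlankLoop, T::InitialLabel, T::InitialBlank, T::SentenceEnd};
        case TransitionPreset::Ctc:
            return TransitionSet{T::LabelToLabel, T::LabelLoop, T::LabelToBlank, T::BlankToLabel,
                                 T::BlankLoop, T::InitialLabel, T::InitialBlank};
        case TransitionPreset::Lm:
            return TransitionSet{T::LabelToLabel, T::BlankToLabel, T::InitialLabel, T::SentenceEnd};
        case TransitionPreset::Default:
            break;
    }
    throw std::invalid_argument("\"default\" must be resolved to a concrete preset first");
}

// "default" resolves to the preset chosen by the scorer class; extras are added on top.
TransitionSet enabledTransitionTypes(TransitionPreset requested,
                                     TransitionPreset classDefault,
                                     std::vector<TransitionType> const& extras) {
    TransitionPreset effective = (requested == TransitionPreset::Default) ? classDefault : requested;
    TransitionSet enabled = presetTypes(effective);
    enabled.insert(extras.begin(), extras.end());
    return enabled;
}
```

A scorer whose class default is the LM preset and that is configured with transition-preset "default" plus one extra type ends up with the LM list and that extra type enabled. Every other transition type would be scored as 0 without touching the context.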

Base automatically changed from tdp_label_scorer to master October 8, 2025 12:59
@curufinwe curufinwe changed the title Option to disable transition types in LabelScorer Add option to disable some transition types in Nn::LabelScorer Oct 10, 2025
@SimBe195 SimBe195 changed the title Add option to disable some transition types in Nn::LabelScorer Partial enabling of transition types for different label scorers Oct 10, 2025
INITIAL_LABEL,
INITIAL_BLANK,
};
break;
Contributor

This is related to PR #152, but TRANSDUCER and LM additionally need the SENTENCE_END transition types.

Comment on lines +165 to +168
case TransitionPresetType::LM:
enabledTransitionTypes_ = {
LABEL_TO_LABEL,
INITIAL_LABEL,
Contributor

The AedTreeBuilder and labelsync searches are still WIP and not merged yet, but at some point we will also need a preset for AED and I guess this will be the same as this one. So will we then just add a preset which is exactly the same, just with a different name? And if yes, should LM or AED be the default preset of the StatefulOnnxLabelScorer? I mean in the end it's the same, but it might become confusing because of the naming.

- : Core::Component(config) {}
+ : Core::Component(config),
+ enabledTransitionTypes_() {
+ enableTransitionTypes(config);
Contributor

Sorry, I have to withdraw my approval. I just found a bug.
Here, you are calling enableTransitionTypes(config), and enableTransitionTypes() in turn calls defaultPreset(). Because this happens in the constructor of LabelScorer, the defaultPreset() of LabelScorer will always be called instead of the override in the derived class that is actually being used.
As a fix, I suggest moving the enableTransitionTypes(config) call into the constructor of every derived class and removing it from the base class constructor here.
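
The dispatch behaviour behind this bug can be reproduced with a small standalone sketch. The class names and the use of a string as the preset are hypothetical; only the shape of the problem (a virtual call made from the base constructor) and of the suggested fix is taken from the comment above.

```cpp
#include <string>

// Problem: while the base constructor runs, the object is still a BrokenBase,
// so the virtual call resolves to BrokenBase::defaultPreset().
struct BrokenBase {
    std::string preset;

    BrokenBase() { preset = defaultPreset(); }
    virtual ~BrokenBase() = default;
    virtual std::string defaultPreset() const { return "none"; }
};

struct BrokenCtc : BrokenBase {
    std::string defaultPreset() const override { return "ctc"; }
};

// Suggested fix: the base constructor no longer makes the call. Each derived
// constructor calls the helper itself, at which point its override is active.
struct FixedBase {
    std::string preset;

    virtual ~FixedBase() = default;
    virtual std::string defaultPreset() const { return "none"; }

protected:
    void enableTransitionTypes() { preset = defaultPreset(); }
};

struct FixedCtc : FixedBase {
    FixedCtc() { enableTransitionTypes(); }
    std::string defaultPreset() const override { return "ctc"; }
};
```

Constructing a BrokenCtc leaves preset at "none", whereas a FixedCtc ends up with "ctc". The same caveat applies one level down: if a class derives further from FixedCtc, the call made in the FixedCtc constructor still cannot see the override of that more derived class.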

Contributor

Eugen: yes please

- ScoresWithTimes result{{requests.size(), 0.0}, {requests.size(), 0}};
+ ScoresWithTimes result{
+ .scores = std::vector<Score>(requests.size(), 0.0),
+ .timeframes{requests.size(), 0}};
Contributor

Suggested change
- .timeframes{requests.size(), 0}};
+ .timeframes{requests.size(), 0}
+ };
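
The thread does not say why the diff above switches the scores member to an explicit std::vector<Score>(requests.size(), 0.0). One C++ behaviour that may be relevant is that brace and parenthesis initialization of a std::vector select different constructors. The sketch below assumes Score is double and uses hypothetical helper names.

```cpp
#include <cstddef>
#include <vector>

using Score = double;  // assumption: the real Score type may differ

// Braces prefer the initializer-list constructor:
// this yields the two elements 3.0 and 0.0.
std::vector<Score> braceInitialized() {
    return std::vector<Score>{3, 0.0};
}

// Parentheses select the (count, value) constructor:
// this yields n elements, each equal to 0.0.
std::vector<Score> zeroScores(std::size_t n) {
    return std::vector<Score>(n, 0.0);
}
```

Whether the same concern applies to the timeframes member depends on its type, which is not shown in this conversation.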

Comment on lines +181 to +183
virtual TransitionPresetType defaultPreset() const {
return TransitionPresetType::NONE;
}
Contributor

It would be nice to move method definitions out of the class. Can either be inline in the header or just in the .cc file.
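
A short sketch of the two options named in this comment, with a reduced enum and hypothetical class names. Both definitions appear in one block here so that the example is self-contained. In a real code base the first would sit in the header and the second in the matching .cc file.

```cpp
// Reduced version of the preset enum, for illustration only.
enum class TransitionPresetType { NONE, ALL, CTC, LM };

class LabelScorer {
public:
    virtual ~LabelScorer() = default;

    // The class body only declares the method.
    virtual TransitionPresetType defaultPreset() const;
};

// Option 1: definition in the header, after the class, marked inline.
inline TransitionPresetType LabelScorer::defaultPreset() const {
    return TransitionPresetType::NONE;
}

// Hypothetical derived scorer.
class CtcLikeScorer : public LabelScorer {
public:
    // "override" already implies virtual, so the keyword is not repeated.
    TransitionPresetType defaultPreset() const override;
};

// Option 2: definition as it would appear in the .cc file (no inline needed there).
TransitionPresetType CtcLikeScorer::defaultPreset() const {
    return TransitionPresetType::CTC;
}
```

Either way, the class declaration stays a compact overview of the interface, and virtual dispatch behaves exactly as it does with in-class definitions.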

Comment on lines +89 to +91
virtual TransitionPresetType defaultPreset() const override {
return TransitionPresetType::LM;
}
Contributor

Move definition out of class declaration.

Comment on lines +48 to +50
virtual TransitionPresetType defaultPreset() const override {
return TransitionPresetType::CTC;
}
Contributor

method definition not in class declaration

Comment on lines +64 to +66
virtual TransitionPresetType defaultPreset() const override {
return TransitionPresetType::CTC;
}
Contributor

see others

Comment on lines +71 to +73
virtual TransitionPresetType defaultPreset() const override {
return TransitionPresetType::ALL;
}
Contributor

see others

Comment on lines +76 to +78
virtual TransitionPresetType defaultPreset() const override {
return TransitionPresetType::ALL;
}
Contributor

see others
