Update Tree-/LexiconfreeTimesyncBeamSearch to perform intermediate pruning between multiple LabelScorers#172

Merged
SimBe195 merged 13 commits into master from multiple_label_scorers
Feb 24, 2026

Conversation

Collaborator

@SimBe195 SimBe195 commented Jan 8, 2026

Add the option to configure multiple label scorers with intermediate pruning after each one in LexiconfreeTimesyncBeamSearch and TreeTimesyncBeamSearch. The goal is efficiency: for example, when searching with CTC + LSTM LM, extensions can be pre-pruned based on the CTC scores before the more expensive LSTM computations are run.

Speech::ModelCombination has a new parameter num-label-scorers and builds a list of LabelScorers of that size. In decodeStep, the search algorithms iterate over this list, scoring the extensions with each scorer and applying intermediate pruning afterwards. To this end, the max-beam-size and score-threshold config parameters were changed from ParameterInt and ParameterFloat to ParameterIntVector and ParameterFloatVector, so that each scorer gets its own pruning values. For example, with 3 LabelScorers the order of steps would be:

  1. Build extensions list
  2. Score each extension with scorer-1
  3. Apply score-pruning with first score-threshold to extensions
  4. Apply beam-pruning with first max-beam-size to extensions
  5. Score each extension with scorer-2
  6. Apply score-pruning with second score-threshold to extensions
  7. Apply beam-pruning with second max-beam-size to extensions
  8. Score each extension with scorer-3
  9. Apply score-pruning with third score-threshold to extensions
  10. Create full LabelHypotheses from extensions
  11. Perform recombination of LabelHypotheses
  12. Apply beam-pruning with third max-beam-size to LabelHypotheses
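
The loop behind these steps can be sketched roughly as follows. This is a minimal, simplified illustration and not the code in this PR: `Extension`, `Scorer`, `scorePruning`, `beamPruning` and `decodeStep` are hypothetical stand-ins, scores are assumed to be costs (lower is better), and hypothesis creation and recombination before the final beam pruning are reduced to a comment.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an extension candidate (not the actual search type).
struct Extension {
    int    token;
    double score;  // accumulated cost, lower is better (assumption)
};

// Hypothetical scorer interface: returns the cost one scorer adds for a token.
using Scorer = std::function<double(int)>;

static bool better(const Extension& a, const Extension& b) {
    return a.score < b.score;
}

// Drop all extensions whose score is worse than best + threshold.
void scorePruning(std::vector<Extension>& exts, double threshold) {
    if (exts.empty()) {
        return;
    }
    double limit = std::min_element(exts.begin(), exts.end(), better)->score + threshold;
    exts.erase(std::remove_if(exts.begin(), exts.end(),
                              [limit](const Extension& e) { return e.score > limit; }),
               exts.end());
}

// Keep only the maxBeamSize best extensions.
void beamPruning(std::vector<Extension>& exts, std::size_t maxBeamSize) {
    if (exts.size() <= maxBeamSize) {
        return;
    }
    auto nth = exts.begin() + static_cast<std::ptrdiff_t>(maxBeamSize);
    std::nth_element(exts.begin(), nth, exts.end(), better);
    exts.erase(nth, exts.end());
}

// One decode step: score with each scorer in turn and prune in between,
// so later (more expensive) scorers only see the surviving extensions.
std::vector<Extension> decodeStep(std::vector<Extension>          exts,
                                  const std::vector<Scorer>&      scorers,
                                  const std::vector<double>&      scoreThresholds,
                                  const std::vector<std::size_t>& maxBeamSizes) {
    for (std::size_t i = 0; i < scorers.size(); ++i) {
        for (auto& e : exts) {
            e.score += scorers[i](e.token);
        }
        scorePruning(exts, scoreThresholds[i]);
        // For the last scorer, the steps above create full hypotheses and
        // recombine them at this point; this sketch omits that part.
        beamPruning(exts, maxBeamSizes[i]);
    }
    return exts;
}
```

With two scorers, four initial extensions and a first beam size of 2, the second scorer in this sketch is only called for the two extensions that survive the first pruning pass, which is the saving the PR is aiming for.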

Comment thread src/Search/LexiconfreeTimesyncBeamSearch/LexiconfreeTimesyncBeamSearch.cc Outdated
Comment thread src/Search/TreeTimesyncBeamSearch/TreeTimesyncBeamSearch.cc
Comment thread src/Search/TreeTimesyncBeamSearch/TreeTimesyncBeamSearch.cc
Comment thread src/Search/TreeTimesyncBeamSearch/TreeTimesyncBeamSearch.cc
Base automatically changed from scaled_label_scorer to master January 9, 2026 12:40
@SimBe195 SimBe195 changed the title from "Use multiple LabelScores with intermediate pruning in SearchV2" to "Use multiple LabelScorers with intermediate pruning in SearchV2" Jan 9, 2026
Contributor

@curufinwe curufinwe left a comment

Looks OK to me. I mainly looked at the interfaces and parameter names and did not thoroughly review the changes to the search algorithm. I only have one request for changes (add logging) and one clarification question.

Comment thread src/Search/TreeTimesyncBeamSearch/TreeTimesyncBeamSearch.cc Outdated
Comment thread src/Search/TreeTimesyncBeamSearch/TreeTimesyncBeamSearch.cc Outdated
Comment thread src/Nn/LabelScorer/ScaledLabelScorer.cc
Comment thread src/Nn/LabelScorer/ScaledLabelScorer.hh Outdated
Comment thread src/Nn/LabelScorer/ScaledLabelScorer.cc Outdated
@larissakl
Contributor

Maybe we should also adjust the documentation at the beginning of the search algorithms. At the moment it says "Uses a LabelScorer to context initialization/extension and scoring.". I would mention there that one can use multiple LabelScorers with intermediate pruning.

@curufinwe curufinwe changed the title from "Use multiple LabelScorers with intermediate pruning in SearchV2" to "Update Tree-/LexiconfreeTimesyncBeamSearch to perform intermediate pruning between multiple LabelScorers" Feb 24, 2026
@SimBe195 SimBe195 merged commit 198d582 into master Feb 24, 2026
2 checks passed
@SimBe195 SimBe195 deleted the multiple_label_scorers branch February 24, 2026 14:02
4 participants