Add dense score retrieval for certain ScoreAccessors#223
Conversation
| size_t size() const { | ||
| return terms.empty() ? 0ul : terms.front().scores.size(); | ||
| } |
There was a problem hiding this comment.
I think this might be confusing because one would usually assume to get the size of the vector terms when calling size() for this struct, not size of the scores in terms. Maybe we could rename the function to have it more explicit?
There was a problem hiding this comment.
I don't mind either way, but it is consistent in the view that it returns the maximum index + 1 for the array accessor below. In that sense it is like a vector.
|
|
||
| std::vector<DenseScoreTerm> denseScoreTerms(denseScores->terms.begin(), denseScores->terms.end()); | ||
| for (auto& term : denseScoreTerms) { | ||
| term.scale *= scale_; |
There was a problem hiding this comment.
Because the terms might come from a combined label scorer that does scaling differently for each term. Then the scaled label scorer here should multiply its own scale and not override the scales of the combined label scorer.
| if (denseScores and tokenIdx < denseScores->size()) { | ||
| extScore += (*denseScores)[tokenIdx]; | ||
| } | ||
| else { | ||
| extScore += (*scoreAccessor)->getScore(transitionType, tokenIdx); | ||
| } |
There was a problem hiding this comment.
Why do you use an if-else block here and not the same syntax as below?
| else if (denseScores->size() != denseSize) { | ||
| return std::nullopt; | ||
| } | ||
| denseScoreTerms.insert(denseScoreTerms.end(), denseScores->terms.begin(), denseScores->terms.end()); |
There was a problem hiding this comment.
I don't get the implementation for this ScoreAccessor. Why do you just collect the dense score terms of all subAccessors in one vector? Don't we need to sum them up?
| size_t size() const { | ||
| return terms.empty() ? 0ul : terms.front().scores.size(); | ||
| } |
There was a problem hiding this comment.
I don't mind either way, but it is consistent in the view that it returns the maximum index + 1 for the array accessor below. In that sense it is like a vector.
| std::vector<DenseScoreTerm> terms; | ||
|
|
||
| DenseScoreSpan(std::vector<DenseScoreTerm>&& terms) | ||
| : terms(std::move(terms)) {} |
There was a problem hiding this comment.
Maybe we could add an assertion here to check that all terms are of the same length.
|
|
||
| std::vector<DenseScoreTerm> denseScoreTerms(denseScores->terms.begin(), denseScores->terms.end()); | ||
| for (auto& term : denseScoreTerms) { | ||
| term.scale *= scale_; |
There was a problem hiding this comment.
Because the terms might come from a combined label scorer that does scaling differently for each term. Then the scaled label scorer here should multiply its own scale and not override the scales of the combined label scorer.
| for (auto& term : denseScoreTerms) { | ||
| term.scale *= scale_; | ||
| } | ||
| return DenseScoreSpan(std::move(denseScoreTerms)); |
There was a problem hiding this comment.
Why do we need to create a new DenseScoreSpan here instead of returning the modified one?
| return std::nullopt; | ||
| } | ||
|
|
||
| std::vector<Nn::DenseScoreTerm> denseScoreTerms(denseScores->terms.begin(), denseScores->terms.end()); |
There was a problem hiding this comment.
| std::vector<Nn::DenseScoreTerm> denseScoreTerms(denseScores->terms.begin(), denseScores->terms.end()); | |
| std::vector<Nn::DenseScoreTerm> denseScoreTerms(denseScores->terms.begin(), denseScores->terms.end()); |
But actually: why do we need a copy? We could also just update denseScores, no?
| Core::Ref<ScoreAccessor> scoreAccessor_; | ||
| const bool negateInput_; | ||
| std::shared_ptr<Nn::Prior<Score>> prior_; |
There was a problem hiding this comment.
why this whitespace change?
| denseScoreTerms.push_back(Nn::DenseScoreTerm{denseScores->size() > 0ul ? std::span<Score const>(&prior_->at(0), prior_->size()) : std::span<Score const>(), prior_->scale()}); | ||
| } | ||
|
|
||
| return Nn::DenseScoreSpan(std::move(denseScoreTerms)); |
There was a problem hiding this comment.
Why create a new DenseScoreSpan instead of returning the modified one?
| auto const& scoreAccessor = scoreAccessors[hypIndexToContextIndexMap_[ext.baseHypIndex]]; | ||
|
|
||
| if (scoreAccessor) { | ||
| ext.score += (*scoreAccessor)->getScore(ext.transitionType, ext.nextToken); |
There was a problem hiding this comment.
Shouldn't we also use the dense scores here?
Some ScoreAccessors (e.g.
VectorScoreAccessor) represent a dense span of score values over the vocabulary that don't depend on the transition type. For these ScoreAccessor classes we can implement an optimized function to retrieve the full dense score span directly and thus fetch all scores at once. This avoids repeated virtualscoreAccessor->getScore(transitionType, labelIndex)function calls in the search algorithms for every possible extension. In the future it can also be used to introduce local pruning over the dense score span when it's available.Originally I modeled the DenseScoreSpan just as a
std::span<Score const>but that doesn't support scaling and interpolation as it is used inCombineLabelScorer,PriorLabelScorerandScaledLabelScorer. So now it is instead modeled as a struct containing multiple terms each of which consist of a span + a scale.In a test segment with a speech LLM and lexiconfree labelsync search, the dense score accessors reduce the search time from 13.39s to 12.85s (RTF 0.336 to 0.323).