Fix completion info extraction for offline best-of-n and self-consistency #223
Open
smirnovlad wants to merge 1 commit into
…ency get_completion_info() was receiving List[str] (from detect_steps()) instead of List[StepCandidate], causing the isinstance guard to always return defaults. This meant context_limit_hit_rate and max_steps_hit_rate were always 0. Capture completion info eagerly while StepCandidate objects are still available, store it in trajectory dicts, and propagate through to final results.
Summary
- `get_completion_info()` was receiving `List[str]` (from `detect_steps()`) instead of `List[StepCandidate]` in the offline best-of-n and self-consistency strategies, so the `isinstance` guard always returned the defaults (`None`, `False`, `False`)
- This caused `context_limit_hit_rate` and `max_steps_hit_rate` to always be 0 for these two strategies
- Capture completion info eagerly while `StepCandidate` objects are still available, store it in trajectory dicts, and propagate it through to final results

Test plan
- `metrics.json` reports correct `context_limit_hit_count` / `max_steps_hit_count`
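
The failure mode and the fix described in the summary can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the `StepCandidate` fields, the bodies of `get_completion_info()` and `detect_steps()`, and the trajectory dict keys (`steps`, `completion_info`) are all simplified assumptions made for illustration. Only the default return value (`None`, `False`, `False`) and the overall shape of the bug come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class StepCandidate:
    # Hypothetical stand-in for the real class; field names are assumptions.
    text: str
    reason: Optional[str] = None
    context_limit_hit: bool = False
    max_steps_hit: bool = False


CompletionInfo = Tuple[Optional[str], bool, bool]


def get_completion_info(steps: List[Union[StepCandidate, str]]) -> CompletionInfo:
    # Assumed shape of the guard: anything that is not a StepCandidate
    # falls back to the defaults (None, False, False).
    if not steps or not isinstance(steps[-1], StepCandidate):
        return (None, False, False)
    last = steps[-1]
    return (last.reason, last.context_limit_hit, last.max_steps_hit)


def detect_steps(text: str) -> List[str]:
    # Simplified assumption: one step per non-empty line, returned as plain strings.
    return [line for line in text.split("\n") if line]


candidates = [
    StepCandidate("step 1"),
    StepCandidate("step 2", reason="length", context_limit_hit=True),
]
text = "\n".join(c.text for c in candidates)

# Before: completion info is computed from plain strings, so the guard
# always returns the defaults and the hit is lost.
buggy = get_completion_info(detect_steps(text))

# After: capture the info while StepCandidate objects are still available
# and carry it along in the trajectory dict.
trajectory = {
    "steps": detect_steps(text),
    "completion_info": get_completion_info(candidates),
}
```

In this sketch `buggy` holds only the defaults, while `trajectory["completion_info"]` preserves the context-limit hit, which is what lets downstream metrics count it.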