Fix IFEval correctness bugs in if_functions and IFEvalVerifier #1683
Code Review
This pull request bundles several correctness fixes across the codebase, including logic corrections for CSV header handling, prevention of division-by-zero errors in DPO and IFEval verifiers, and updates to GPT-4o pricing. It also fixes operand direction and tolerance logic in IFEval constraint functions and adds corresponding regression tests. Feedback suggests enhancing the choice validation functions by using regular expressions with word boundaries to avoid false positive matches resulting from simple substring checks.
  def validate_choice(text: str, options: list) -> bool:
      for option in options:
-         if text in option:
+         if option in text:
Similar to the implementation in open_instruct/if_functions.py, using substring matching here can lead to false positives (e.g., matching "B" in "Banana"). Using regular expressions with word boundaries or lookarounds would make the validation more robust.
-         if option in text:
+         if re.search(rf"(?<!\w){re.escape(str(option))}(?!\w)", text):
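To make the reviewer's point concrete, here is a minimal sketch comparing the plain substring check with the lookaround-based pattern from the suggestion. The function names `validate_choice_substring` and `validate_choice_bounded` are hypothetical, chosen only for this illustration; they are not the names used in the repository.

```python
import re


def validate_choice_substring(text: str, options: list) -> bool:
    """Substring check: True if any option occurs anywhere in text."""
    return any(str(option) in text for option in options)


def validate_choice_bounded(text: str, options: list) -> bool:
    """True only if an option appears with no word character on either side."""
    for option in options:
        # re.escape keeps options such as "(a)" or "C++" from being read as regex syntax.
        pattern = rf"(?<!\w){re.escape(str(option))}(?!\w)"
        if re.search(pattern, text):
            return True
    return False


options = ["A", "B"]
print(validate_choice_substring("Banana", options))       # True: "B" is a substring of "Banana"
print(validate_choice_bounded("Banana", options))         # False: "B" is followed by a word character
print(validate_choice_bounded("The answer is B.", options))  # True: "B" stands alone
```

Lookarounds are used instead of `\b` because `\b` only matches next to a word character, so it would behave unexpectedly for options that begin or end with punctuation.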
…terize tests, dedupe if_functions Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ubstring false positives Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…in separate PR) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…word-boundary fix Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bundles a bunch of IFEval correctness fixes:
- `open_instruct/if_functions.py`: fix `validate_choice` operand direction (was `text in option`, now `option in text`); add ±10% tolerance to the `validate_frequency_capital_words` "around" quantifier (supersedes #1615 "Fix validate_choice substring check with swapped operands" and #1646 "fix(if_functions): fix validate_choice operand direction").
- `open_instruct/ground_truth_utils.py`: guard `IFEvalVerifier.__call__` against `ZeroDivisionError` when the constraint's instruction list is empty (supersedes #1655 "fix(ifeval): guard against ZeroDivisionError in IFEvalVerifier when instruction list is empty").
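The two numeric fixes described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `within_tolerance` and `fraction_satisfied` are hypothetical helper names, the tolerance is assumed to be inclusive and relative to the target, and returning `0.0` for an empty instruction list is an assumed fallback that the actual verifier may handle differently.

```python
def within_tolerance(actual: int, target: int, tolerance: float = 0.1) -> bool:
    """Hypothetical "around" check: accept counts within +/-10% of the target."""
    return abs(actual - target) <= tolerance * target


def fraction_satisfied(results: list) -> float:
    """Hypothetical scoring helper: share of instructions that passed.

    Returns 0.0 for an empty list (an assumed fallback) instead of
    dividing by zero.
    """
    if not results:
        return 0.0
    return sum(results) / len(results)


print(within_tolerance(21, 20))           # True: off by 1, allowed up to 2
print(within_tolerance(23, 20))           # False: off by 3
print(fraction_satisfied([True, False]))  # 0.5
print(fraction_satisfied([]))             # 0.0, no ZeroDivisionError
```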