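In continuous_eval/metrics/generation/text/llm_based.py, lines 72-77, the response is split into a score and reasoning. If the split fails (the response contains no newline, so the two-name unpacking raises ValueError), the except branch on line 76 still reads score_txt.lower(). At that point score_txt has not been assigned, so the fallback will typically raise a NameError instead of producing a score. The fallback should check whether "yes" is in response.lower() instead.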
AS-IS:
try:
    score_txt, reasoning = response.split("\n", 1)
    score = float("yes" in score_txt.lower())
except ValueError:
    score = float("yes" in score_txt.lower())
    reasoning = response
TO-BE:
try:
    score_txt, reasoning = response.split("\n", 1)
    score = float("yes" in score_txt.lower())
except ValueError:
    score = float("yes" in response.lower())
    reasoning = response
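A minimal, self-contained sketch to reproduce the problem outside the library. The function names parse_buggy and parse_fixed are hypothetical wrappers written for this report, not part of continuous_eval; they only mirror the two snippets above and assume the code runs inside a function, where score_txt is a local name.

```python
from typing import Tuple


def parse_buggy(response: str) -> Tuple[float, str]:
    """Hypothetical wrapper around the AS-IS snippet."""
    try:
        score_txt, reasoning = response.split("\n", 1)
        score = float("yes" in score_txt.lower())
    except ValueError:
        # score_txt was never bound because the unpacking above failed.
        score = float("yes" in score_txt.lower())
        reasoning = response
    return score, reasoning


def parse_fixed(response: str) -> Tuple[float, str]:
    """Hypothetical wrapper around the TO-BE snippet."""
    try:
        score_txt, reasoning = response.split("\n", 1)
        score = float("yes" in score_txt.lower())
    except ValueError:
        # Fall back to the whole response when there is no newline to split on.
        score = float("yes" in response.lower())
        reasoning = response
    return score, reasoning


if __name__ == "__main__":
    single_line = "Yes"  # no newline, so the split yields only one element

    try:
        parse_buggy(single_line)
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        print("AS-IS fails:", type(exc).__name__)

    print("TO-BE returns:", parse_fixed(single_line))
```

With a single-line response, the AS-IS version fails inside the except branch, while the TO-BE version returns a score and uses the full response as the reasoning.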