Conversation
src/together/cli/api/evaluation.py (Outdated)
```python
        )
    elif type == "compare":
        # Check if either model-a or model-b config/name is provided
        model_a_provided = model_a_field or any(
```
Suggested change:
```diff
-        model_a_provided = model_a_field or any(
+        model_a_provided = model_a_field is not None or any(
```
By the way, should this be `any` or `all`?
If it's `all`, then with an incomplete set of parameters it would fail with the error "model_a and model_b are required for compare evaluation", which is not a correct explanation. At this point the only check is "at least something is present"; more granular validation is added below.
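To make the two-level check concrete, here is a minimal sketch of the `any`-based first pass (the helper name `_require_model_provided` is hypothetical and not part of this PR; it assumes the per-model option values have already been collected into a list):

```python
import click


def _require_model_provided(field, config_params, label):
    """First pass of the check: is *anything* for this model present?

    Deliberately uses any(), not all(), so an incomplete config still counts
    as "provided" here and the more specific completeness error is raised
    later, instead of the misleading top-level one.
    """
    if field is None and not any(p is not None for p in config_params):
        raise click.BadParameter(f"{label} is required for compare evaluation")
```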
```python
        )
    parameters: Union[ClassifyParameters, ScoreParameters, CompareParameters]
    # Build parameters based on type
    if type == "classify":
```
Can we also check that parameters which are not applicable to the evaluation type (e.g., the model-a and model-b parameters for classify and score) are not passed? For instance, we could raise a `ValueError` if the user passes parameters that do not apply to the chosen evaluation type.
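For illustration, a hedged sketch of such a guard (the helper name, its dict argument, and the exact error wording are illustrative, not part of the PR):

```python
def _reject_inapplicable_options(eval_type: str, options: dict) -> None:
    """Raise if compare-only options were passed for a classify or score
    evaluation. `options` maps option names to the values the user passed."""
    if eval_type not in ("classify", "score"):
        return
    offending = [
        name
        for name, value in options.items()
        if name.startswith(("model_a_", "model_b_")) and value is not None
    ]
    if offending:
        raise ValueError(
            f"{', '.join(offending)} are only applicable to compare "
            f"evaluations, not {eval_type}"
        )
```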
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
```python
        if model_b_field is not None:
            # Simple mode: model_b_field is provided
            if any(model_b_config_params):
                raise click.BadParameter(
                    "Cannot specify both --model-b-field and config parameters (--model-b-name, etc.). "
                    "Use either --model-b-field alone if your input file has pre-generated responses, "
                    "or config parameters if responses should be generated on our end"
                )
            model_b_final = model_b_field
        elif any(model_b_config_params):
            # Config mode: config parameters are provided
            if not all(model_b_config_params):
                raise click.BadParameter(
                    "All model config parameters are required when using detailed configuration: "
                    "--model-b-name, --model-b-max-tokens, --model-b-temperature, "
                    "--model-b-system-template, --model-b-input-template"
                )
            model_b_final = {
                "model_name": model_b_name,
                "max_tokens": model_b_max_tokens,
                "temperature": model_b_temperature,
                "system_template": model_b_system_template,
                "input_template": model_b_input_template,
            }
```
Is it possible to move these checks inside `client.evaluation.create`? It looks like right now we do not have strong input validation when people use the Python SDK directly instead of the CLI.
Makes sense. That's what I've done
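For context, a rough sketch of what shared validation in the SDK layer could look like, so that both the CLI and direct `client.evaluation.create()` calls go through the same checks (the function and its signature are illustrative simplifications, not the actual together-python API):

```python
from typing import Any, Dict, Optional, Union


def validate_model_source(
    field: Optional[str], config: Dict[str, Any], label: str
) -> Union[str, Dict[str, Any]]:
    """Resolve a judged model from either a response field or a full model
    config, rejecting ambiguous or incomplete combinations."""
    has_config = any(v is not None for v in config.values())
    if field is not None and has_config:
        raise ValueError(f"Specify either the {label} field or the {label} config, not both")
    if field is not None:
        return field
    if not has_config:
        raise ValueError(f"{label} is required for compare evaluation")
    missing = [k for k, v in config.items() if v is None]
    if missing:
        raise ValueError(f"Incomplete {label} config, missing: {', '.join(missing)}")
    return config
```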
Have you read the Contributing Guidelines?
Sure
Describe your changes
This PR adds support for the following new features: