-
Notifications
You must be signed in to change notification settings - Fork 21
Add support for evals API #339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/together/cli/api/evaluation.py
Outdated
| ) | ||
| elif type == "compare": | ||
| # Check if either model-a or model-b config/name is provided | ||
| model_a_provided = model_a_field or any( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| model_a_provided = model_a_field or any( | |
| model_a_provided = model_a_field is not None or any( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, should this be any or all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's "all" then in case of incomplete set of parameters it will fail with an error "model_an and model_b are required for compare evaluation" which is not a correct error explanation. In this case the only check is "at least something is present", more granularity is added below
| ) | ||
| parameters: Union[ClassifyParameters, ScoreParameters, CompareParameters] | ||
| # Build parameters based on type | ||
| if type == "classify": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also check that the parameters which are not applicable for the evaluation type (e.g., model-a and model-b parameters for classify and score) are not passed? For instance, we could raise a ValueError if the user passed parameters that are not applicable for the evaluation type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
Co-authored-by: Max Ryabinin <[email protected]>
Co-authored-by: Max Ryabinin <[email protected]>
Co-authored-by: Max Ryabinin <[email protected]>
| if model_b_field is not None: | ||
| # Simple mode: model_b_field is provided | ||
| if any(model_b_config_params): | ||
| raise click.BadParameter( | ||
| "Cannot specify both --model-b-field and config parameters (--model-b-name, etc.). " | ||
| "Use either --model-b-field alone if your input file has pre-generated responses,\ | ||
| or config parameters if you generate it on our end" | ||
| ) | ||
| model_b_final = model_b_field | ||
| elif any(model_b_config_params): | ||
| # Config mode: config parameters are provided | ||
| if not all(model_b_config_params): | ||
| raise click.BadParameter( | ||
| "All model config parameters are required when using detailed configuration: " | ||
| "--model-b-name, --model-b-max-tokens, --model-b-temperature, " | ||
| "--model-b-system-template, --model-b-input-template" | ||
| ) | ||
| model_b_final = { | ||
| "model_name": model_b_name, | ||
| "max_tokens": model_b_max_tokens, | ||
| "temperature": model_b_temperature, | ||
| "system_template": model_b_system_template, | ||
| "input_template": model_b_input_template, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to move these checks inside client.evaluation.create? Looks like right now we do not have strong validation for inputs if people use the Python SDK directly instead of the CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. That's what I've done
Have you read the Contributing Guidelines?
Sure
Describe your changes
This PR adds support for the following new features: