[ATH-3800] Create functionality to run the evals in datasets#13
[ATH-3800] Create functionality to run the evals in datasets#13codyathina wants to merge 3 commits intomainfrom
Conversation
WalkthroughThe changes introduce a new static method Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Service as AthinaApiService
participant Dataset as Dataset
participant EvalRunAPI as EvalRun API (/api/v1/eval_run)
participant DatasetEventAPI as DatasetEvent API (/api/v1/dataset_event/{datasetId})
Client->>Service: call run_evals(eval_configs, dataset_id, is_conditional_node)
Service->>EvalRunAPI: POST request to create evaluation run
EvalRunAPI-->>Service: Return response (Status, Data)
alt Success (Status 200/201)
Service->>DatasetEventAPI: POST request to create dataset event
DatasetEventAPI-->>Service: Return dataset event response
Service-->>Client: Return dataset event response
else Failure
Service-->>Client: Raise CustomException
end
Client->>Dataset: call run_evals(dataset_id, eval_configs, is_conditional_node)
Dataset->>DatasetEventAPI: POST request to trigger dataset event
DatasetEventAPI-->>Dataset: Return dataset event response
alt Success
Dataset-->>Client: Return dataset event response
else Failure
Dataset-->>Client: Raise CustomException
end
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
athina_client/services/athina_api_service.py (3)
664-673: Consider adding input validation for parametersThe method correctly sets up the API request, but it could benefit from basic parameter validation.
Consider validating that
eval_configsis not empty before making the API call, as sending an empty list might result in unexpected behavior:def create_eval_run(create_eval_run: bool, eval_configs: List[Dict[str, Any]], dataset_id: Optional[str] = None): """ Wrapper for POST /api/v1/eval_run Parameters: - create_eval_run (bool): Flag indicating whether to create a new eval run. - eval_configs (List[Dict]): Evaluation configurations. - dataset_id (Optional[str]): Optional dataset ID. Returns: - eval_run_id (str): The ID of the created evaluation run. """ try: + if not eval_configs: + raise CustomException("eval_configs cannot be empty", "Please provide at least one evaluation configuration") endpoint = f"{AthinaApiService._base_url()}/api/v1/eval_run"
676-679: Maintain consistent error handling pattern with other methodsFor consistency with other methods in the class, consider handling 401 errors separately to provide the specific "check your API key" message.
- if response.status_code not in [200, 201]: + if response.status_code == 401: + error_message = response_json.get("error", "Unknown Error") + details_message = "please check your athina api key and try again" + raise CustomException(error_message, details_message) + elif response.status_code not in [200, 201]: error_message = response_json.get("error", "Unknown Error") details_message = response_json.get("details", {}).get("message", "No details") raise CustomException(error_message, details_message)Apply the same change to the
create_dataset_eventmethod for consistency.
652-661: Enhance docstring to match style of other methods in the classThe current docstring is minimal compared to other methods in the class.
Consider expanding the docstring to be more descriptive and include information about potential exceptions, similar to other methods:
def create_eval_run(create_eval_run: bool, eval_configs: List[Dict[str, Any]], dataset_id: Optional[str] = None): """ - Wrapper for POST /api/v1/eval_run + Creates an evaluation run by calling the Athina API. Parameters: - create_eval_run (bool): Flag indicating whether to create a new eval run. - eval_configs (List[Dict]): Evaluation configurations. - dataset_id (Optional[str]): Optional dataset ID. Returns: - - eval_run_id (str): The ID of the created evaluation run. + - The ID of the created evaluation run. + + Raises: + - CustomException: If the API call fails or returns an error. """Apply a similar improvement to the
create_dataset_eventmethod's docstring.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
athina_client/services/athina_api_service.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
athina_client/services/athina_api_service.py (1)
athina_client/errors/exceptions.py (1)
CustomException(5-16)
🪛 Ruff (0.8.2)
athina_client/services/athina_api_service.py
650-650: Undefined name Optional
(F821)
683-683: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
685-685: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
689-689: Undefined name Optional
(F821)
722-722: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
724-724: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
| @staticmethod | ||
| @retry(stop_max_attempt_number=2, wait_fixed=1000) | ||
| def create_dataset_event(dataset_id: str, event_type: str, payload: Dict[str, Any], is_conditional_node: Optional[bool] = False): | ||
| """ | ||
| Wrapper for POST /api/v1/dataset_event/{datasetId} | ||
|
|
||
| Parameters: | ||
| - dataset_id (str): The dataset ID. | ||
| - event_type (str): Type of the event. | ||
| - payload (Dict): Payload associated with the event. | ||
| - is_conditional_node (bool, optional): Flag indicating if the node is conditional. | ||
|
|
||
| Returns: | ||
| - Dataset event response data. | ||
| """ | ||
| try: | ||
| endpoint = f"{AthinaApiService._base_url()}/api/v1/dataset_event/{dataset_id}" | ||
| response = requests.post( | ||
| endpoint, | ||
| headers=AthinaApiService._headers(), | ||
| json={ | ||
| "event_type": event_type, | ||
| "payload": payload, | ||
| "isConditionalNode": is_conditional_node | ||
| } | ||
| ) | ||
| response_json = response.json() | ||
|
|
||
| if response.status_code not in [200, 201]: | ||
| error_message = response_json.get("error", "Unknown Error") | ||
| details_message = response_json.get("details", {}).get("message", "No details") | ||
| raise CustomException(error_message, details_message) | ||
|
|
||
| return response_json["data"] | ||
| except requests.RequestException as e: | ||
| raise CustomException("Request failed", str(e)) | ||
| except Exception as e: | ||
| raise CustomException("Unexpected error occurred", str(e)) |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Improve exception handling and maintain consistency in implementation style
The implementation of this method is good, but there are similar issues as in the previous method:
- Exception handling should use
raise ... from errpattern - The docstring could be more descriptive, similar to other methods in the class
- raise CustomException("Request failed", str(e))
+ raise CustomException("Request failed", str(e)) from e
- raise CustomException("Unexpected error occurred", str(e))
+ raise CustomException("Unexpected error occurred", str(e)) from eAlso, consider standardizing how response status codes are checked. Some methods in the class check for 401 separately before checking other error codes, while these new methods use a different pattern.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @staticmethod | |
| @retry(stop_max_attempt_number=2, wait_fixed=1000) | |
| def create_dataset_event(dataset_id: str, event_type: str, payload: Dict[str, Any], is_conditional_node: Optional[bool] = False): | |
| """ | |
| Wrapper for POST /api/v1/dataset_event/{datasetId} | |
| Parameters: | |
| - dataset_id (str): The dataset ID. | |
| - event_type (str): Type of the event. | |
| - payload (Dict): Payload associated with the event. | |
| - is_conditional_node (bool, optional): Flag indicating if the node is conditional. | |
| Returns: | |
| - Dataset event response data. | |
| """ | |
| try: | |
| endpoint = f"{AthinaApiService._base_url()}/api/v1/dataset_event/{dataset_id}" | |
| response = requests.post( | |
| endpoint, | |
| headers=AthinaApiService._headers(), | |
| json={ | |
| "event_type": event_type, | |
| "payload": payload, | |
| "isConditionalNode": is_conditional_node | |
| } | |
| ) | |
| response_json = response.json() | |
| if response.status_code not in [200, 201]: | |
| error_message = response_json.get("error", "Unknown Error") | |
| details_message = response_json.get("details", {}).get("message", "No details") | |
| raise CustomException(error_message, details_message) | |
| return response_json["data"] | |
| except requests.RequestException as e: | |
| raise CustomException("Request failed", str(e)) | |
| except Exception as e: | |
| raise CustomException("Unexpected error occurred", str(e)) | |
| @staticmethod | |
| @retry(stop_max_attempt_number=2, wait_fixed=1000) | |
| def create_dataset_event(dataset_id: str, event_type: str, payload: Dict[str, Any], is_conditional_node: Optional[bool] = False): | |
| """ | |
| Wrapper for POST /api/v1/dataset_event/{datasetId} | |
| Parameters: | |
| - dataset_id (str): The dataset ID. | |
| - event_type (str): Type of the event. | |
| - payload (Dict): Payload associated with the event. | |
| - is_conditional_node (bool, optional): Flag indicating if the node is conditional. | |
| Returns: | |
| - Dataset event response data. | |
| """ | |
| try: | |
| endpoint = f"{AthinaApiService._base_url()}/api/v1/dataset_event/{dataset_id}" | |
| response = requests.post( | |
| endpoint, | |
| headers=AthinaApiService._headers(), | |
| json={ | |
| "event_type": event_type, | |
| "payload": payload, | |
| "isConditionalNode": is_conditional_node | |
| } | |
| ) | |
| response_json = response.json() | |
| if response.status_code not in [200, 201]: | |
| error_message = response_json.get("error", "Unknown Error") | |
| details_message = response_json.get("details", {}).get("message", "No details") | |
| raise CustomException(error_message, details_message) | |
| return response_json["data"] | |
| except requests.RequestException as e: | |
| raise CustomException("Request failed", str(e)) from e | |
| except Exception as e: | |
| raise CustomException("Unexpected error occurred", str(e)) from e |
🧰 Tools
🪛 Ruff (0.8.2)
689-689: Undefined name Optional
(F821)
722-722: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
724-724: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
vivek-athina
left a comment
There was a problem hiding this comment.
We are not supposed to expose the internal concepts to the user in this feature request.
In fact just think how would a user outside Athina understand dataset_event and eval_run.
There was a problem hiding this comment.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
athina_client/datasets/dataset.py (1)
1-5:⚠️ Potential issueMissing import for CustomException.
The
run_evalsmethod raises aCustomException, but this class isn't imported in the file. Make sure to import it to avoid runtime errors.from typing import Any, Dict, List, Optional from dataclasses import dataclass, field from athina_client.services import AthinaApiService from athina_client.constants import MAX_DATASET_ROWS +from athina_client.exceptions import CustomException
🧹 Nitpick comments (3)
examples/datasets.ipynb (2)
212-218: Consider enhancing documentation for eval configurations.The
eval_configsstructure is introduced without explanation of its purpose and structure. Adding a comment explaining what each field represents would improve usability for developers who are new to the API.eval_configs = [ { + # Maps the column "run_prompt" from the dataset to the "text" field in the evaluation "source_column":"run_prompt", "source_column_type":"string", "destination_field":"text" } ]
222-222: Add response structure details.The code prints the raw response without explaining its structure. Consider adding a more detailed handling of the response to help users understand what to expect.
- print("Eval run triggered successfully:", response) + print("Eval run triggered successfully!") + print(f"Evaluation ID: {response.get('id', 'N/A')}") + print(f"Status: {response.get('status', 'N/A')}") + print("Full response:", response)athina_client/datasets/dataset.py (1)
370-397: Add usage example to docstring.The method has a comprehensive docstring, but adding a usage example would improve developer experience, similar to what you've done for the
update_cellsmethod.Returns: Dict[str, Any]: The response from the dataset event trigger. Raises: CustomException: If the evaluation trigger fails. + + Example: + ```python + eval_configs = [ + { + "source_column": "run_prompt", + "source_column_type": "string", + "destination_field": "text" + } + ] + result = Dataset.run_evals("dataset-123", eval_configs) + ``` """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
📒 Files selected for processing (2)
athina_client/datasets/dataset.py(1 hunks)examples/datasets.ipynb(1 hunks)
🔇 Additional comments (2)
examples/datasets.ipynb (1)
221-222: Ensure variable visibility in example.The notebook uses
dataset.idwhich refers to the dataset created in the previous cell. If someone runs this cell independently, they might encounter an error ifdatasetisn't defined.Consider adding a fallback option or clarifying comment:
- response = Dataset.run_evals(eval_configs=eval_configs, dataset_id=dataset.id) + # Make sure dataset.id is defined from a previous cell or use a specific dataset_id + dataset_id = dataset.id if 'dataset' in locals() else "your-dataset-id-here" + response = Dataset.run_evals(eval_configs=eval_configs, dataset_id=dataset_id)athina_client/datasets/dataset.py (1)
391-395: LGTM - Implementation correctly delegates to AthinaApiService.The implementation correctly delegates to the
AthinaApiService.run_evals()method with all the required parameters and proper error handling.
vivek-athina
left a comment
There was a problem hiding this comment.
Let's park this PR for now
Summary by CodeRabbit
Datasetclass with a method to programmatically trigger evaluation runs using specified configurations.