
[ATH-3800] Create functionality to run the evals in datasets#13

Draft
codyathina wants to merge 3 commits into main from ATH-3800-create-functionality-run-evals-in-datasets

Conversation

@codyathina
Contributor

@codyathina codyathina commented Apr 10, 2025

Summary by CodeRabbit

  • New Features
    • Introduced support for initiating evaluation runs with configurable parameters, allowing users to trigger evaluations directly.
    • Added functionality for logging dataset events by capturing event details, enhancing tracking and analysis capabilities.
    • Enhanced the Dataset class with a method to programmatically trigger evaluation runs using specified configurations.
    • Added example code in the Jupyter notebook for users to utilize the new evaluation functionality.

@coderabbitai

coderabbitai bot commented Apr 10, 2025

Walkthrough

The changes introduce a new static method run_evals to both the AthinaApiService and Dataset classes within the athina_client module. This method orchestrates the evaluation process by creating an evaluation run and triggering a dataset event. Each class's implementation includes error handling and raises a CustomException for failures. Additionally, a new code cell has been added to a Jupyter notebook to demonstrate how to use the Dataset.run_evals method with evaluation configurations.
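To make the new surface concrete, here is a minimal sketch of the request body `Dataset.run_evals` might assemble. `build_run_evals_request` is a hypothetical helper; the field names (`source_column`, `destination_field`, `isConditionalNode`) are taken from this PR's notebook example, and the exact wire schema is an assumption.

```python
# Sketch only: field names come from this PR's examples; the exact
# wire schema is an assumption, not a frozen API.
from typing import Any, Dict, List

def build_run_evals_request(
    eval_configs: List[Dict[str, Any]],
    dataset_id: str,
    is_conditional_node: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and assemble the POST body."""
    if not eval_configs:
        raise ValueError("eval_configs cannot be empty")
    return {
        "dataset_id": dataset_id,
        "eval_configs": eval_configs,
        "isConditionalNode": is_conditional_node,
    }

request = build_run_evals_request(
    eval_configs=[{
        "source_column": "run_prompt",
        "source_column_type": "string",
        "destination_field": "text",
    }],
    dataset_id="ds-123",
)
```

Validating `eval_configs` up front (as one of the review comments below also suggests) keeps an empty list from reaching the API at all.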

Changes

| File | Change Summary |
| --- | --- |
| `athina_client/.../athina_api_service.py` | Added a new static method `run_evals` to orchestrate evaluations, along with internal methods `_create_eval_run` (POST to `/api/v1/eval_run`) and `_create_dataset_event` (POST to `/api/v1/dataset_event/{datasetId}`), both with error handling. |
| `athina_client/.../dataset.py` | Added a new static method `run_evals` to trigger evaluation runs for datasets, with error handling. |
| `examples/.../datasets.ipynb` | Added a new code cell demonstrating `Dataset.run_evals` with evaluation configurations and error handling. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant Client as Client
  participant Service as AthinaApiService
  participant Dataset as Dataset
  participant EvalRunAPI as EvalRun API (/api/v1/eval_run)
  participant DatasetEventAPI as DatasetEvent API (/api/v1/dataset_event/{datasetId})

  Client->>Service: call run_evals(eval_configs, dataset_id, is_conditional_node)
  Service->>EvalRunAPI: POST request to create evaluation run
  EvalRunAPI-->>Service: Return response (Status, Data)
  alt Success (Status 200/201)
    Service->>DatasetEventAPI: POST request to create dataset event
    DatasetEventAPI-->>Service: Return dataset event response
    Service-->>Client: Return dataset event response
  else Failure
    Service-->>Client: Raise CustomException
  end

  Client->>Dataset: call run_evals(dataset_id, eval_configs, is_conditional_node)
  Dataset->>DatasetEventAPI: POST request to trigger dataset event
  DatasetEventAPI-->>Dataset: Return dataset event response
  alt Success
    Dataset-->>Client: Return dataset event response
  else Failure
    Dataset-->>Client: Raise CustomException
  end
```
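The service-side flow in the diagram can be sketched in plain Python with an injected `post` callable, so it runs without a live backend. This is a sketch under the assumption that the service first creates the eval run and then posts a dataset event referencing it; the local `CustomException` stand-in, the `"run_evals"` event type, and `fake_post` are illustrative only.

```python
# Illustrative sketch of the two-step orchestration shown in the diagram.
# Endpoint paths come from the PR; everything else is assumed.
from typing import Any, Callable, Dict, List

class CustomException(Exception):
    """Local stand-in for athina_client.errors.CustomException."""
    def __init__(self, error: str, details: str):
        super().__init__(f"{error}: {details}")

def run_evals(
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    eval_configs: List[Dict[str, Any]],
    dataset_id: str,
    is_conditional_node: bool = False,
) -> Dict[str, Any]:
    # Step 1: create the eval run.
    eval_run = post("/api/v1/eval_run", {
        "create_eval_run": True,
        "eval_configs": eval_configs,
        "dataset_id": dataset_id,
    })
    # Step 2: trigger the dataset event referencing the new run.
    return post(f"/api/v1/dataset_event/{dataset_id}", {
        "event_type": "run_evals",  # assumed event name
        "payload": {"eval_run_id": eval_run["id"]},
        "isConditionalNode": is_conditional_node,
    })

# Exercise the flow with an in-memory fake transport.
calls = []
def fake_post(path, body):
    calls.append(path)
    return {"id": "run-1", "status": "queued"}

result = run_evals(fake_post, [{"source_column": "run_prompt"}], "ds-123")
```

Injecting the transport keeps the orchestration testable; the real implementation would use `requests.post` with the service's headers instead.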

Poem

Hopping through lines with joyful might,
I scurry and munch on bugs in sight.
A new method to run and create,
With careful hops, we celebrate fate.
In every call, a dance so bright,
Code and rabbits, a wondrous sight!
🐰✨



@codyathina codyathina requested a review from vivek-athina April 10, 2025 01:06

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
athina_client/services/athina_api_service.py (3)

664-673: Consider adding input validation for parameters

The method correctly sets up the API request, but it could benefit from basic parameter validation.

Consider validating that eval_configs is not empty before making the API call, as sending an empty list might result in unexpected behavior:

```diff
 def create_eval_run(create_eval_run: bool, eval_configs: List[Dict[str, Any]], dataset_id: Optional[str] = None):
     """
     Wrapper for POST /api/v1/eval_run

     Parameters:
     - create_eval_run (bool): Flag indicating whether to create a new eval run.
     - eval_configs (List[Dict]): Evaluation configurations.
     - dataset_id (Optional[str]): Optional dataset ID.

     Returns:
     - eval_run_id (str): The ID of the created evaluation run.
     """
     try:
+        if not eval_configs:
+            raise CustomException("eval_configs cannot be empty", "Please provide at least one evaluation configuration")
         endpoint = f"{AthinaApiService._base_url()}/api/v1/eval_run"
```

676-679: Maintain consistent error handling pattern with other methods

For consistency with other methods in the class, consider handling 401 errors separately to provide the specific "check your API key" message.

```diff
-            if response.status_code not in [200, 201]:
+            if response.status_code == 401:
+                error_message = response_json.get("error", "Unknown Error")
+                details_message = "please check your athina api key and try again"
+                raise CustomException(error_message, details_message)
+            elif response.status_code not in [200, 201]:
                 error_message = response_json.get("error", "Unknown Error")
                 details_message = response_json.get("details", {}).get("message", "No details")
                 raise CustomException(error_message, details_message)
```

Apply the same change to the create_dataset_event method for consistency.


652-661: Enhance docstring to match style of other methods in the class

The current docstring is minimal compared to other methods in the class.

Consider expanding the docstring to be more descriptive and include information about potential exceptions, similar to other methods:

```diff
 def create_eval_run(create_eval_run: bool, eval_configs: List[Dict[str, Any]], dataset_id: Optional[str] = None):
     """
-    Wrapper for POST /api/v1/eval_run
+    Creates an evaluation run by calling the Athina API.

     Parameters:
     - create_eval_run (bool): Flag indicating whether to create a new eval run.
     - eval_configs (List[Dict]): Evaluation configurations.
     - dataset_id (Optional[str]): Optional dataset ID.

     Returns:
-    - eval_run_id (str): The ID of the created evaluation run.
+    - The ID of the created evaluation run.
+
+    Raises:
+    - CustomException: If the API call fails or returns an error.
     """
```

Apply a similar improvement to the create_dataset_event method's docstring.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59ae448 and fca427a.

📒 Files selected for processing (1)
  • athina_client/services/athina_api_service.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
athina_client/services/athina_api_service.py (1)
athina_client/errors/exceptions.py (1)
  • CustomException (5-16)
🪛 Ruff (0.8.2)
athina_client/services/athina_api_service.py

650-650: Undefined name Optional

(F821)


683-683: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


685-685: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


689-689: Undefined name Optional

(F821)


722-722: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


724-724: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Comment on lines 687 to 724
```python
@staticmethod
@retry(stop_max_attempt_number=2, wait_fixed=1000)
def create_dataset_event(dataset_id: str, event_type: str, payload: Dict[str, Any], is_conditional_node: Optional[bool] = False):
    """
    Wrapper for POST /api/v1/dataset_event/{datasetId}

    Parameters:
    - dataset_id (str): The dataset ID.
    - event_type (str): Type of the event.
    - payload (Dict): Payload associated with the event.
    - is_conditional_node (bool, optional): Flag indicating if the node is conditional.

    Returns:
    - Dataset event response data.
    """
    try:
        endpoint = f"{AthinaApiService._base_url()}/api/v1/dataset_event/{dataset_id}"
        response = requests.post(
            endpoint,
            headers=AthinaApiService._headers(),
            json={
                "event_type": event_type,
                "payload": payload,
                "isConditionalNode": is_conditional_node
            }
        )
        response_json = response.json()

        if response.status_code not in [200, 201]:
            error_message = response_json.get("error", "Unknown Error")
            details_message = response_json.get("details", {}).get("message", "No details")
            raise CustomException(error_message, details_message)

        return response_json["data"]
    except requests.RequestException as e:
        raise CustomException("Request failed", str(e))
    except Exception as e:
        raise CustomException("Unexpected error occurred", str(e))
```

🛠️ Refactor suggestion

Improve exception handling and maintain consistency in implementation style

The implementation of this method is good, but there are similar issues as in the previous method:

  1. Exception handling should use raise ... from err pattern
  2. The docstring could be more descriptive, similar to other methods in the class
```diff
-            raise CustomException("Request failed", str(e))
+            raise CustomException("Request failed", str(e)) from e
-            raise CustomException("Unexpected error occurred", str(e))
+            raise CustomException("Unexpected error occurred", str(e)) from e
```

Also, consider standardizing how response status codes are checked. Some methods in the class check for 401 separately before checking other error codes, while these new methods use a different pattern.
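One way to implement that standardization is a single shared status-check helper used by every wrapper. `raise_for_error` and the local `CustomException` stand-in below are hypothetical, sketched only to illustrate the consistent 401 handling the reviewer describes.

```python
# Hypothetical helper sketching the reviewer's suggestion: one shared
# status-code check so every wrapper treats 401 the same way.
from typing import Any, Dict

class CustomException(Exception):
    """Local stand-in for athina_client.errors.CustomException."""
    def __init__(self, error: str, details: str):
        self.error = error
        self.details = details
        super().__init__(f"{error}: {details}")

def raise_for_error(status_code: int, response_json: Dict[str, Any]) -> None:
    """Raise CustomException for non-2xx responses, with a dedicated 401 message."""
    if status_code in (200, 201):
        return
    error_message = response_json.get("error", "Unknown Error")
    if status_code == 401:
        details_message = "please check your athina api key and try again"
    else:
        details_message = response_json.get("details", {}).get("message", "No details")
    raise CustomException(error_message, details_message)

# Each wrapper would then call:
#     raise_for_error(response.status_code, response_json)
```

Centralizing the check means a future change (say, handling 429 retries) lands in one place instead of every method.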

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```python
@staticmethod
@retry(stop_max_attempt_number=2, wait_fixed=1000)
def create_dataset_event(dataset_id: str, event_type: str, payload: Dict[str, Any], is_conditional_node: Optional[bool] = False):
    """
    Wrapper for POST /api/v1/dataset_event/{datasetId}

    Parameters:
    - dataset_id (str): The dataset ID.
    - event_type (str): Type of the event.
    - payload (Dict): Payload associated with the event.
    - is_conditional_node (bool, optional): Flag indicating if the node is conditional.

    Returns:
    - Dataset event response data.
    """
    try:
        endpoint = f"{AthinaApiService._base_url()}/api/v1/dataset_event/{dataset_id}"
        response = requests.post(
            endpoint,
            headers=AthinaApiService._headers(),
            json={
                "event_type": event_type,
                "payload": payload,
                "isConditionalNode": is_conditional_node
            }
        )
        response_json = response.json()

        if response.status_code not in [200, 201]:
            error_message = response_json.get("error", "Unknown Error")
            details_message = response_json.get("details", {}).get("message", "No details")
            raise CustomException(error_message, details_message)

        return response_json["data"]
    except requests.RequestException as e:
        raise CustomException("Request failed", str(e)) from e
    except Exception as e:
        raise CustomException("Unexpected error occurred", str(e)) from e
```

@vivek-athina vivek-athina requested review from Akhilathina and removed request for vivek-athina April 10, 2025 02:35
Contributor

@vivek-athina vivek-athina left a comment


We are not supposed to expose internal concepts to the user in this feature request.

In fact, just think about how a user outside Athina would understand `dataset_event` and `eval_run`.

@codyathina codyathina requested a review from vivek-athina April 11, 2025 14:08

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (1)
athina_client/datasets/dataset.py (1)

1-5: ⚠️ Potential issue

Missing import for CustomException.

The run_evals method raises a CustomException, but this class isn't imported in the file. Make sure to import it to avoid runtime errors.

```diff
 from typing import Any, Dict, List, Optional
 from dataclasses import dataclass, field
 from athina_client.services import AthinaApiService
 from athina_client.constants import MAX_DATASET_ROWS
+from athina_client.exceptions import CustomException
```
🧹 Nitpick comments (3)
examples/datasets.ipynb (2)

212-218: Consider enhancing documentation for eval configurations.

The eval_configs structure is introduced without explanation of its purpose and structure. Adding a comment explaining what each field represents would improve usability for developers who are new to the API.

```diff
 eval_configs = [
     {
+        # Maps the column "run_prompt" from the dataset to the "text" field in the evaluation
         "source_column":"run_prompt",
         "source_column_type":"string",
         "destination_field":"text"
     }
 ]
```

222-222: Add response structure details.

The code prints the raw response without explaining its structure. Consider adding a more detailed handling of the response to help users understand what to expect.

```diff
-    print("Eval run triggered successfully:", response)
+    print("Eval run triggered successfully!")
+    print(f"Evaluation ID: {response.get('id', 'N/A')}")
+    print(f"Status: {response.get('status', 'N/A')}")
+    print("Full response:", response)
```
athina_client/datasets/dataset.py (1)

370-397: Add usage example to docstring.

The method has a comprehensive docstring, but adding a usage example would improve developer experience, similar to what you've done for the update_cells method.

````diff
         Returns:
             Dict[str, Any]: The response from the dataset event trigger.

         Raises:
             CustomException: If the evaluation trigger fails.
+
+        Example:
+            ```python
+            eval_configs = [
+                {
+                    "source_column": "run_prompt",
+                    "source_column_type": "string",
+                    "destination_field": "text"
+                }
+            ]
+            result = Dataset.run_evals("dataset-123", eval_configs)
+            ```
         """
````
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite

📥 Commits

Reviewing files that changed from the base of the PR and between 8f8fe0c and c19562d.

📒 Files selected for processing (2)
  • athina_client/datasets/dataset.py (1 hunks)
  • examples/datasets.ipynb (1 hunks)
🔇 Additional comments (2)
examples/datasets.ipynb (1)

221-222: Ensure variable visibility in example.

The notebook uses dataset.id which refers to the dataset created in the previous cell. If someone runs this cell independently, they might encounter an error if dataset isn't defined.

Consider adding a fallback option or clarifying comment:

```diff
-    response = Dataset.run_evals(eval_configs=eval_configs, dataset_id=dataset.id)
+    # Make sure dataset.id is defined from a previous cell or use a specific dataset_id
+    dataset_id = dataset.id if 'dataset' in locals() else "your-dataset-id-here"
+    response = Dataset.run_evals(eval_configs=eval_configs, dataset_id=dataset_id)
```
athina_client/datasets/dataset.py (1)

391-395: LGTM - Implementation correctly delegates to AthinaApiService.

The implementation correctly delegates to the AthinaApiService.run_evals() method with all the required parameters and proper error handling.

Contributor

@vivek-athina vivek-athina left a comment


Let's park this PR for now

@vivek-athina vivek-athina marked this pull request as draft April 12, 2025 07:10