[Docs] AGE-3418 Docs for evaluation SDK #2943
Merged
…r feedback and LangChain observability. Adjusted paths in existing documentation to reflect new file structure.
…atalayer/results path.
This commit introduces a new guide on managing testsets, including creating, listing, retrieving, and upserting testsets using the Agenta SDK. Additionally, a Jupyter notebook is added to demonstrate these functionalities with practical examples.
…entation and examples related to evaluation processes in Agenta.
This commit introduces a comprehensive guide on creating custom evaluators and utilizing built-in evaluators to assess application outputs. The new documentation covers the structure, inputs, return values, and practical examples for both custom and built-in evaluators, enhancing the overall understanding of evaluation processes in Agenta.
This commit introduces a new guide on defining and configuring applications for evaluation with the Agenta SDK. It covers the basic application structure, input handling, return values, and provides practical examples, enhancing user understanding of application setup and usage.
This commit introduces a new guide on running evaluations using the Agenta SDK. It provides an overview of the process, enhancing user understanding of programmatic evaluation execution.
… API key setup and refining the evaluation process steps. This update improves user experience by ensuring necessary credentials are configured for LLM-based evaluators and clarifies the overall evaluation workflow.
…-judge evaluator and refine testset description. This enhances clarity for users configuring evaluations with the Agenta SDK.
…arameters in the Agenta SDK. This update improves clarity for users running evaluations and ensures proper configuration of evaluation details.
…ameter and updating example outputs. This enhances clarity and consistency in the Jupyter notebook and markdown files related to testset creation.
… Agenta SDK. This commit introduces two new guides: one for managing testsets, detailing creation, listing, and retrieval processes, and another for configuring evaluators, covering both custom and built-in evaluators. These additions enhance user understanding of evaluation workflows and improve the overall documentation structure by replacing the previous running evaluations guide.
… structure. This change enhances readability and maintains focus on the evaluation creation process using the Agenta SDK.
ashrafchowdury
approved these changes
Nov 12, 2025
Labels
documentation
Improvements or additions to documentation
Evaluation
size:XXL
This PR changes 1000+ lines, ignoring generated files.