
feature: Giskard v3 #18

Draft
kevinmessiaen wants to merge 21 commits into main from
feature/giskard-v3

Conversation

@kevinmessiaen
Member

No description provided.

kevinmessiaen and others added 18 commits December 18, 2025 12:00
- Replace manual pydantic object construction (TestCase, InteractionSpec, Scenario) with fluent builder pattern scenario().interact().check()
- Update all OSS documentation files including quickstart, core concepts, single-turn, multi-turn, custom checks, tutorials, and API reference
- Add asyncio.run() disclaimers in quickstart and core concepts sections to guide users on running async code
- Improve code examples consistency across all documentation
- Update all examples to use scenario() as primary API
- Replace trace.interactions[-1] with trace.last in Python code and JSONPath keys
- Standardize JSONPath keys to use trace.last.outputs pattern (all keys start with trace.)
- Fix imports: use 'from giskard.checks import' instead of 'from giskard.checks.builtin import'
- Clarify Scenario vs TestCase: minimize TestCase references (internal detail)
- Update core concepts, quickstart, API reference, and all tutorials
- Ensure consistency across single-turn, multi-turn, and custom checks documentation
Align quickstart examples with scenario terminology and rich result
rendering, and add screenshots for simple and structured outputs.
Reframe core concepts around specs, interactions, and traces, and add a clear execution flow with fluent API mapping.
Co-authored-by: Henrique Chaves <44180294+henchaves@users.noreply.github.com>
Update OpenAI model examples to gpt-5-mini to avoid deprecated gpt-4o usage.
Remove notebook mention and keep script guidance for asyncio.run().
Refresh single-turn examples with high-risk scenarios and clearer guardrail
rationale to emphasize production impact.
Reframe multi-turn examples around higher-risk workflows and add rationale
notes to highlight guardrail failure points across turns.
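For context on the asyncio.run() guidance mentioned in the commits above, here is a minimal sketch of running async code from a plain script. The coroutine `run_checks` is a hypothetical stand-in, not part of the Giskard API; only `asyncio.run()` itself is taken from the commit messages.

```python
import asyncio


async def run_checks() -> dict:
    # Hypothetical stand-in for any awaitable test run; not a Giskard call.
    await asyncio.sleep(0)
    return {"passed": True}


def main() -> dict:
    # asyncio.run() creates an event loop, runs the coroutine to completion,
    # and closes the loop. It raises RuntimeError if a loop is already running.
    return asyncio.run(run_checks())


if __name__ == "__main__":
    print(main())
```

Because `asyncio.run()` cannot be called while another event loop is running in the same thread, this pattern suits scripts rather than environments that already run a loop.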
@kevinmessiaen kevinmessiaen changed the title Giskard v3 feature: Giskard v3 Feb 9, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @kevinmessiaen, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request represents a major update to Giskard, focusing on the introduction and comprehensive documentation of the new 'Giskard Checks' library. It involves upgrading the Python environment, streamlining project dependencies, and completely restructuring the documentation to provide clear, detailed guides for AI testing with the new framework. These changes aim to modernize the project's foundation and improve the user experience for developers working with Giskard's AI testing capabilities.

Highlights

  • Python Version Upgrade: The project's required Python version has been updated from 3.12 to 3.13, reflected in both the .python-version file and pyproject.toml.
  • Dependency Refactoring: Significant changes were made to project dependencies in pyproject.toml, including the removal of several direct dependencies and the introduction of Git-based dependencies for giskard-core, giskard-agents, and giskard-checks. Additionally, PyTorch CPU-specific configuration has been removed.
  • Documentation Overhaul for Giskard Checks: The documentation structure has been completely revamped. Old SDK and notebook documentation sections have been removed, and a comprehensive new section for 'Giskard Checks' has been introduced. This includes new guides for AI testing, API references, getting started, and tutorials, along with custom sidebar configurations and a new Sphinx template function for dynamic table of contents rendering.
  • Pre-commit Hook Update: The detect-secrets pre-commit hook in .pre-commit-config.yaml has been updated to exclude 'your-api-key' from secret scans.
  • CSS Styling Enhancements: New CSS variables and styling rules have been added to source/_static/custom.css to enhance the appearance of the left sidebar in the documentation.


Changelog
  • .pre-commit-config.yaml
    • Updated detect-secrets hook to exclude your-api-key from scans.
  • .python-version
    • Upgraded Python version from 3.12 to 3.13.
  • pyproject.toml
    • Updated requires-python to >=3.13,<4.0.
    • Added Git-based dependencies for giskard-core, giskard-agents, and giskard-checks.
    • Removed various development dependencies including notebook, nbsphinx, pandoc, giskard[llm], ragas, ipywidgets, and torch.
    • Removed tool.uv.index and tool.uv.sources related to PyTorch CPU configuration.
  • source/_static/custom.css
    • Added --border CSS variable.
    • Applied new styling rules for #left-sidebar a.current and #left-sidebar ul ul:is(.dark *)::before.
  • source/_templates/sidebars/sidebar_oss_checks.html
    • Added a new HTML template for the oss/checks sidebar navigation.
  • source/conf.py
    • Removed nbsphinx from Sphinx extensions.
    • Added html_sidebars configuration to use a custom sidebar for oss/checks/** paths.
    • Removed nbsphinx_execute and nbsphinx_prolog configurations.
    • Updated the 'Open Source' navigation link to 'Checks'.
    • Added a setup function with a toctree_from_doc custom template function for dynamic TOC rendering.
  • source/index.rst
    • Removed extraneous blank lines for cleaner formatting.
  • source/oss/checks/ai-testing/core-concepts.rst
    • New file added, detailing core concepts of Giskard Checks (Interaction, InteractionSpec, Trace, Check, Scenario, Fluent API).
  • source/oss/checks/ai-testing/custom-checks.rst
    • New file added, providing guidance on creating custom checks, including function-based, class-based, LLM-based, async, stateful, and composite checks.
  • source/oss/checks/ai-testing/index.rst
    • New file added, serving as the main index for AI Testing with Checks, linking to quickstart, core concepts, single-turn, multi-turn, and custom checks.
  • source/oss/checks/ai-testing/multi-turn.rst
    • New file added, explaining multi-turn scenarios for testing conversational flows, stateful conversations, agent workflows, dynamic interactions, error recovery, conversational RAG, and task completion.
  • source/oss/checks/ai-testing/quickstart.rst
    • New file added, a quickstart guide for Giskard Checks, demonstrating simple and dynamic interactions, and structured output validation.
  • source/oss/checks/ai-testing/single-turn.rst
    • New file added, covering single-turn evaluation for RAG systems, classification, summarization, safety, instruction following, structured output, and batch evaluation.
  • source/oss/checks/api-reference/builtin-checks.rst
    • New file added, documenting built-in checks like from_fn, StringMatchingCheck, EqualityCheck, Groundedness, Conformity, and LLMJudge.
  • source/oss/checks/api-reference/core.rst
    • New file added, detailing core API components like Check, CheckResult, CheckStatus, Interaction, Trace, InteractionSpec, Scenario, TestCase, and Extractors.
  • source/oss/checks/api-reference/index.rst
    • New file added, the main API reference for Giskard Checks, linking to core, built-in checks, scenarios, and testing utilities.
  • source/oss/checks/api-reference/scenarios.rst
    • New file added, documenting scenario-related classes like Scenario, ScenarioResult, TestCase, TestCaseResult, and runners.
  • source/oss/checks/api-reference/testing.rst
    • New file added, documenting testing utilities like WithSpy, debugging helpers, serialization, custom fixtures, and performance tracking.
  • source/oss/checks/getting-started/installation.rst
    • New file added, providing installation and configuration instructions for Giskard Checks.
  • source/oss/checks/index.rst
    • New file added, the main overview for Giskard Checks, detailing features, quick links, and use cases.
  • source/oss/checks/tutorials/chatbot-testing.rst
    • New file added, a tutorial on testing conversational AI systems, covering context handling, tone consistency, and multi-turn dialogue.
  • source/oss/checks/tutorials/index.rst
    • New file added, the main index for Giskard Checks tutorials, linking to RAG evaluation, agent testing, chatbot testing, and content moderation.
  • source/oss/checks/tutorials/rag-evaluation.rst
    • New file added, a tutorial on building a test suite for RAG systems, covering retrieval quality, groundedness, and answer quality.
  • source/oss/checks/tutorials/testing-agents.rst
    • New file added, a tutorial on testing AI agents, including tool selection, reasoning quality, multi-step workflows, error handling, and state management.
  • source/oss/notebooks/index.rst
    • Removed file, indicating a shift away from this notebook structure.
  • source/oss/sdk/business.rst
    • Removed file, as content is likely integrated into the new 'Checks' documentation.
  • source/oss/sdk/index.rst
    • Removed file, indicating a restructuring of the main SDK documentation.
  • source/oss/sdk/legacy.rst
    • Removed file, deprecating legacy documentation.
  • source/oss/sdk/reference/datasets.rst
    • Removed file, indicating a restructuring of the SDK reference.
  • source/oss/sdk/reference/index.rst
    • Removed file, indicating a restructuring of the SDK reference.
  • source/oss/sdk/reference/models.rst
    • Removed file, indicating a restructuring of the SDK reference.
  • source/oss/sdk/reference/raget/evaluation.rst
    • Removed file, indicating a restructuring of the RAGET reference.
  • source/oss/sdk/reference/raget/index.rst
    • Removed file, indicating a restructuring of the RAGET reference.
  • source/oss/sdk/reference/raget/knowledge_base.rst
    • Removed file, indicating a restructuring of the RAGET reference.
  • source/oss/sdk/reference/raget/metrics.rst
    • Removed file, indicating a restructuring of the RAGET reference.
  • source/oss/sdk/reference/raget/question_generation.rst
    • Removed file, indicating a restructuring of the RAGET reference.
  • source/oss/sdk/reference/raget/testset_generation.rst
    • Removed file, indicating a restructuring of the RAGET reference.
  • source/oss/sdk/reference/scan/index.rst
    • Removed file, indicating a restructuring of the LLM Scan reference.
  • source/oss/sdk/reference/scan/llm_detectors.rst
    • Removed file, indicating a restructuring of the LLM Scan reference.
  • source/oss/sdk/reference/scan/report.rst
    • Removed file, indicating a restructuring of the LLM Scan reference.
  • source/oss/sdk/reference/suite.rst
    • Removed file, indicating a restructuring of the SDK reference.
  • source/oss/sdk/reference/tests.rst
    • Removed file, indicating a restructuring of the SDK reference.
  • source/oss/sdk/security.rst
    • Removed file, as content is likely integrated into the new 'Checks' documentation.
  • source/toctree_oss.rst
    • Removed oss/sdk/legacy entry from the table of contents.
Activity
  • The pull request is titled "Giskard v3", suggesting a major version update.
  • The author, kevinmessiaen, has implemented extensive changes across dependency management and documentation structure.
  • The empty description implies that the changes are self-explanatory through the code and file modifications, or that a more detailed explanation will follow in a separate communication.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a major refactoring for "Giskard v3", focusing on a new "Checks" feature for testing AI applications. It adds extensive documentation for this new feature, including quickstarts, tutorials, and API references, while removing a significant amount of old documentation related to the previous SDK, RAGET, and LLM Scan features. The changes are substantial and well-structured. My review focuses on improving code quality, fixing errors in documentation examples, and addressing potential security and accessibility issues in the configuration.

    hooks:
      - id: detect-secrets
-       args: ["--baseline", ".secrets.baseline"]
+       args: ["--baseline", ".secrets.baseline", "--exclude-secrets", "your-api-key"]
Contributor


critical

The placeholder "your-api-key" has been added to the detect-secrets configuration. This is a significant security risk as it might be overlooked and could lead to real secrets being excluded from scans if this pattern is copied. This placeholder should be removed. If a specific secret needs to be excluded, it should be done using its ID or a more specific regex.

        args: ["--baseline", ".secrets.baseline"]

Member Author


"your-api-key" is not a real API key; it is used in example code.
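
To make the trade-off in this thread concrete, the sketch below compares the pattern from the diff with an anchored variant. It assumes the value passed to `--exclude-secrets` is applied as an unanchored regular-expression search against each candidate secret; that behaviour is an assumption here and should be checked against the detect-secrets version in use. The anchored pattern and the sample strings are hypothetical.

```python
import re

# Pattern as passed to --exclude-secrets in this PR.
LOOSE = re.compile(r"your-api-key")
# Hypothetical tighter alternative: matches only the exact placeholder.
ANCHORED = re.compile(r"^your-api-key$")


def is_excluded(pattern: re.Pattern, secret: str) -> bool:
    """Mimic an unanchored regex filter (assumed behaviour, see lead-in)."""
    return pattern.search(secret) is not None


# Made-up sample values, for illustration only.
candidates = ["your-api-key", "your-api-key-7f3a9c", "sk-live-abc123"]
for secret in candidates:
    print(secret, is_excluded(LOOSE, secret), is_excluded(ANCHORED, secret))
```

Under that assumption, the unanchored pattern also skips any string that merely contains the placeholder, while the anchored form excludes the literal placeholder only.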

Comment on lines 136 to 141
docs_version = os.getenv("READTHEDOCS_VERSION", "latest")
if docs_version == "latest" or docs_version == "stable":
    branch = "main"
else:
    branch = docs_version.replace("-", "/")
branch = "main"
Contributor


medium

The branch variable is unconditionally set to "main" on line 141, which makes the preceding if/else block that also sets branch redundant. This dead code should be removed to improve clarity.

# Do not execute the notebooks when building the docs
branch = "main"
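
The effect described in this comment can be reproduced in isolation. The sketch below wraps the quoted logic in two hypothetical helper functions (`resolve_branch` and `effective_branch` are names introduced here, not taken from conf.py) and compares what the if/else computes with what survives the final assignment.

```python
def resolve_branch(docs_version: str) -> str:
    """Branch selection as written in the if/else block."""
    if docs_version in ("latest", "stable"):
        return "main"
    return docs_version.replace("-", "/")


def effective_branch(docs_version: str) -> str:
    """Value left in `branch` once the trailing assignment has run."""
    branch = resolve_branch(docs_version)
    branch = "main"  # unconditional assignment discards the value above
    return branch


for version in ("latest", "stable", "feature-giskard-v3"):
    print(version, resolve_branch(version), effective_branch(version))
```

Every input ends up as "main", so the conditional has no observable effect. Note also that `replace("-", "/")` rewrites every hyphen, so the sample input "feature-giskard-v3" would map to "feature/giskard/v3" rather than "feature/giskard-v3".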

Comment on lines +192 to +193
from sphinx.environment.adapters.toctree import TocTree
from sphinx import addnodes
Contributor


medium

The imports for TocTree and addnodes are inside a nested function toctree_from_doc. According to PEP 8, imports should usually be at the top of the file. While this might be done to avoid polluting the global namespace, it's better to move them to the top of conf.py for consistency and readability.

kevinmessiaen and others added 3 commits February 9, 2026 10:02
Refresh check references, expand comparison checks, and add semantic similarity details.

Co-authored-by: Cursor <cursoragent@cursor.com>