feature: Giskard v3 by kevinmessiaen · Pull Request #18 · Giskard-AI/giskard-docs

kevinmessiaen · 2026-02-09T02:24:26Z

No description provided.

- Replace manual pydantic object construction (TestCase, InteractionSpec, Scenario) with fluent builder pattern scenario().interact().check()
- Update all OSS documentation files including quickstart, core concepts, single-turn, multi-turn, custom checks, tutorials, and API reference
- Add asyncio.run() disclaimers in quickstart and core concepts sections to guide users on running async code
- Improve code examples consistency across all documentation

- Update all examples to use scenario() as primary API
- Replace trace.interactions[-1] with trace.last in Python code and JSONPath keys
- Standardize JSONPath keys to use trace.last.outputs pattern (all keys start with trace.)
- Fix imports: use 'from giskard.checks import' instead of 'from giskard.checks.builtin import'
- Clarify Scenario vs TestCase: minimize TestCase references (internal detail)
- Update core concepts, quickstart, API reference, and all tutorials
- Ensure consistency across single-turn, multi-turn, and custom checks documentation
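
For readers unfamiliar with the pattern named above, the following is a minimal, self-contained sketch of a fluent builder with a `trace.last` shorthand, run from a script with `asyncio.run()`. It is a toy stand-in written for illustration only: the classes `Interaction`, `Trace`, and `Scenario`, the `scenario()` helper, and `echo_model` are all hypothetical and are not the giskard.checks API, whose real signatures may differ.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Interaction:
    inputs: str
    outputs: str


@dataclass
class Trace:
    interactions: list = field(default_factory=list)

    @property
    def last(self) -> Interaction:
        # Shorthand for trace.interactions[-1].
        return self.interactions[-1]


class Scenario:
    """Toy builder: each method records a step and returns self for chaining."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._steps = []

    def interact(self, inputs: str, model: Callable[[str], str]) -> "Scenario":
        self._steps.append(("interact", (inputs, model)))
        return self

    def check(self, predicate: Callable[[Trace], bool]) -> "Scenario":
        self._steps.append(("check", predicate))
        return self

    async def run(self) -> bool:
        # Replay the recorded steps in order, building up the trace.
        trace = Trace()
        passed = True
        for kind, payload in self._steps:
            if kind == "interact":
                inputs, model = payload
                trace.interactions.append(Interaction(inputs, model(inputs)))
            else:
                passed = passed and payload(trace)
        return passed


def scenario(name: str) -> Scenario:
    return Scenario(name)


def echo_model(prompt: str) -> str:
    # Hypothetical system under test.
    return f"echo: {prompt}"


async def main() -> bool:
    return await (
        scenario("greeting")
        .interact("hello", echo_model)
        .check(lambda trace: "hello" in trace.last.outputs)
        .run()
    )


# In a plain script, the coroutine needs an event loop to run in.
result = asyncio.run(main())
print(result)
```

The chain reads top to bottom in execution order, which is the main readability argument for a builder over constructing nested objects by hand.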

Align quickstart examples with scenario terminology and rich result rendering, and add screenshots for simple and structured outputs.

Reframe core concepts around specs, interactions, and traces, and add a clear execution flow with fluent API mapping.

Co-authored-by: Henrique Chaves <44180294+henchaves@users.noreply.github.com>

Update OpenAI model examples to gpt-5-mini to avoid deprecated gpt-4o usage.

Remove notebook mention and keep script guidance for asyncio.run().

Refresh single-turn examples with high-risk scenarios and clearer guardrail rationale to emphasize production impact.

Reframe multi-turn examples around higher-risk workflows and add rationale notes to highlight guardrail failure points across turns.

…s-doc

gemini-code-assist · 2026-02-09T02:25:23Z

Summary of Changes

Hello @kevinmessiaen, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request represents a major update to Giskard, focusing on the introduction and comprehensive documentation of the new 'Giskard Checks' library. It involves upgrading the Python environment, streamlining project dependencies, and completely restructuring the documentation to provide clear, detailed guides for AI testing with the new framework. These changes aim to modernize the project's foundation and improve the user experience for developers working with Giskard's AI testing capabilities.

Highlights

Python Version Upgrade: The project's required Python version has been updated from 3.12 to 3.13, reflected in both the .python-version file and pyproject.toml.
Dependency Refactoring: Significant changes were made to project dependencies in pyproject.toml, including the removal of several direct dependencies and the introduction of Git-based dependencies for giskard-core, giskard-agents, and giskard-checks. Additionally, PyTorch CPU-specific configuration has been removed.
Documentation Overhaul for Giskard Checks: The documentation structure has been completely revamped. Old SDK and notebook documentation sections have been removed, and a comprehensive new section for 'Giskard Checks' has been introduced. This includes new guides for AI testing, API references, getting started, and tutorials, along with custom sidebar configurations and a new Sphinx template function for dynamic table of contents rendering.
Pre-commit Hook Update: The detect-secrets pre-commit hook in .pre-commit-config.yaml has been updated to exclude 'your-api-key' from secret scans.
CSS Styling Enhancements: New CSS variables and styling rules have been added to source/_static/custom.css to enhance the appearance of the left sidebar in the documentation.

Changelog

.pre-commit-config.yaml
- Updated detect-secrets hook to exclude your-api-key from scans.
.python-version
- Upgraded Python version from 3.12 to 3.13.
pyproject.toml
- Updated requires-python to >=3.13,<4.0.
- Added Git-based dependencies for giskard-core, giskard-agents, and giskard-checks.
- Removed various development dependencies including notebook, nbsphinx, pandoc, giskard[llm], ragas, ipywidgets, and torch.
- Removed tool.uv.index and tool.uv.sources related to PyTorch CPU configuration.
source/_static/custom.css
- Added --border CSS variable.
- Applied new styling rules for #left-sidebar a.current and #left-sidebar ul ul:is(.dark *)::before.
source/_templates/sidebars/sidebar_oss_checks.html
- Added a new HTML template for the oss/checks sidebar navigation.
source/conf.py
- Removed nbsphinx from Sphinx extensions.
- Added html_sidebars configuration to use a custom sidebar for oss/checks/** paths.
- Removed nbsphinx_execute and nbsphinx_prolog configurations.
- Updated the 'Open Source' navigation link to 'Checks'.
- Added a setup function with a toctree_from_doc custom template function for dynamic TOC rendering.
source/index.rst
- Removed extraneous blank lines for cleaner formatting.
source/oss/checks/ai-testing/core-concepts.rst
- New file added, detailing core concepts of Giskard Checks (Interaction, InteractionSpec, Trace, Check, Scenario, Fluent API).
source/oss/checks/ai-testing/custom-checks.rst
- New file added, providing guidance on creating custom checks, including function-based, class-based, LLM-based, async, stateful, and composite checks.
source/oss/checks/ai-testing/index.rst
- New file added, serving as the main index for AI Testing with Checks, linking to quickstart, core concepts, single-turn, multi-turn, and custom checks.
source/oss/checks/ai-testing/multi-turn.rst
- New file added, explaining multi-turn scenarios for testing conversational flows, stateful conversations, agent workflows, dynamic interactions, error recovery, conversational RAG, and task completion.
source/oss/checks/ai-testing/quickstart.rst
- New file added, a quickstart guide for Giskard Checks, demonstrating simple and dynamic interactions, and structured output validation.
source/oss/checks/ai-testing/single-turn.rst
- New file added, covering single-turn evaluation for RAG systems, classification, summarization, safety, instruction following, structured output, and batch evaluation.
source/oss/checks/api-reference/builtin-checks.rst
- New file added, documenting built-in checks like from_fn, StringMatchingCheck, EqualityCheck, Groundedness, Conformity, and LLMJudge.
source/oss/checks/api-reference/core.rst
- New file added, detailing core API components like Check, CheckResult, CheckStatus, Interaction, Trace, InteractionSpec, Scenario, TestCase, and Extractors.
source/oss/checks/api-reference/index.rst
- New file added, the main API reference for Giskard Checks, linking to core, built-in checks, scenarios, and testing utilities.
source/oss/checks/api-reference/scenarios.rst
- New file added, documenting scenario-related classes like Scenario, ScenarioResult, TestCase, TestCaseResult, and runners.
source/oss/checks/api-reference/testing.rst
- New file added, documenting testing utilities like WithSpy, debugging helpers, serialization, custom fixtures, and performance tracking.
source/oss/checks/getting-started/installation.rst
- New file added, providing installation and configuration instructions for Giskard Checks.
source/oss/checks/index.rst
- New file added, the main overview for Giskard Checks, detailing features, quick links, and use cases.
source/oss/checks/tutorials/chatbot-testing.rst
- New file added, a tutorial on testing conversational AI systems, covering context handling, tone consistency, and multi-turn dialogue.
source/oss/checks/tutorials/index.rst
- New file added, the main index for Giskard Checks tutorials, linking to RAG evaluation, agent testing, chatbot testing, and content moderation.
source/oss/checks/tutorials/rag-evaluation.rst
- New file added, a tutorial on building a test suite for RAG systems, covering retrieval quality, groundedness, and answer quality.
source/oss/checks/tutorials/testing-agents.rst
- New file added, a tutorial on testing AI agents, including tool selection, reasoning quality, multi-step workflows, error handling, and state management.
source/oss/notebooks/index.rst
- Removed file, indicating a shift away from this notebook structure.
source/oss/sdk/business.rst
- Removed file, as content is likely integrated into the new 'Checks' documentation.
source/oss/sdk/index.rst
- Removed file, indicating a restructuring of the main SDK documentation.
source/oss/sdk/legacy.rst
- Removed file, deprecating legacy documentation.
source/oss/sdk/reference/datasets.rst
- Removed file, indicating a restructuring of the SDK reference.
source/oss/sdk/reference/index.rst
- Removed file, indicating a restructuring of the SDK reference.
source/oss/sdk/reference/models.rst
- Removed file, indicating a restructuring of the SDK reference.
source/oss/sdk/reference/raget/evaluation.rst
- Removed file, indicating a restructuring of the RAGET reference.
source/oss/sdk/reference/raget/index.rst
- Removed file, indicating a restructuring of the RAGET reference.
source/oss/sdk/reference/raget/knowledge_base.rst
- Removed file, indicating a restructuring of the RAGET reference.
source/oss/sdk/reference/raget/metrics.rst
- Removed file, indicating a restructuring of the RAGET reference.
source/oss/sdk/reference/raget/question_generation.rst
- Removed file, indicating a restructuring of the RAGET reference.
source/oss/sdk/reference/raget/testset_generation.rst
- Removed file, indicating a restructuring of the RAGET reference.
source/oss/sdk/reference/scan/index.rst
- Removed file, indicating a restructuring of the LLM Scan reference.
source/oss/sdk/reference/scan/llm_detectors.rst
- Removed file, indicating a restructuring of the LLM Scan reference.
source/oss/sdk/reference/scan/report.rst
- Removed file, indicating a restructuring of the LLM Scan reference.
source/oss/sdk/reference/suite.rst
- Removed file, indicating a restructuring of the SDK reference.
source/oss/sdk/reference/tests.rst
- Removed file, indicating a restructuring of the SDK reference.
source/oss/sdk/security.rst
- Removed file, as content is likely integrated into the new 'Checks' documentation.
source/toctree_oss.rst
- Removed oss/sdk/legacy entry from the table of contents.
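
As a rough illustration of the html_sidebars entry listed for source/conf.py in the changelog above, a fragment along these lines would route pages under oss/checks/ to the new sidebar template. The glob key and the template file name come from the summary; the template path relative to the templates directory and the single-item list are assumptions, not the PR's actual code.

```python
# Sketch of a conf.py fragment (assumed values, see the note above).
# html_sidebars maps document-name glob patterns to lists of sidebar templates.
html_sidebars = {
    # Pages under oss/checks/ get the custom Checks sidebar.
    "oss/checks/**": ["sidebars/sidebar_oss_checks.html"],
}
```

Pages that match no pattern keep the theme's default sidebars.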

Activity

The pull request is titled "Giskard v3", suggesting a major version update.
The author, kevinmessiaen, has implemented extensive changes across dependency management and documentation structure.
The empty description implies that the changes are self-explanatory through the code and file modifications, or that a more detailed explanation will follow in a separate communication.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a major refactoring for "Giskard v3", focusing on a new "Checks" feature for testing AI applications. It adds extensive documentation for this new feature, including quickstarts, tutorials, and API references, while removing a significant amount of old documentation related to the previous SDK, RAGET, and LLM Scan features. The changes are substantial and well-structured. My review focuses on improving code quality, fixing errors in documentation examples, and addressing potential security and accessibility issues in the configuration.

gemini-code-assist · 2026-02-09T02:27:58Z

.pre-commit-config.yaml

    hooks:
      - id: detect-secrets
-        args: ["--baseline", ".secrets.baseline"]
+        args: ["--baseline", ".secrets.baseline", "--exclude-secrets", "your-api-key"]


The placeholder "your-api-key" has been added to the detect-secrets configuration. This is a significant security risk as it might be overlooked and could lead to real secrets being excluded from scans if this pattern is copied. This placeholder should be removed. If a specific secret needs to be excluded, it should be done using its ID or a more specific regex.

args: ["--baseline", ".secrets.baseline"]
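
To make the specificity concern concrete, here is a small sketch using only Python's `re` module. It assumes the value passed to `--exclude-secrets` is treated as a regular expression searched against each candidate secret, which is my understanding of detect-secrets and worth confirming against its documentation. The second candidate string is hypothetical.

```python
import re

# Pattern as written in the hook arguments (unanchored).
loose = re.compile("your-api-key")
# Anchored variant that only matches the literal placeholder.
strict = re.compile(r"^your-api-key$")

candidates = [
    "your-api-key",                # documentation placeholder
    "prod-your-api-key-9f8a7b6c",  # hypothetical string that embeds the placeholder
]

for value in candidates:
    # Show whether each pattern would exclude the candidate.
    print(value, bool(loose.search(value)), bool(strict.search(value)))
```

Under that assumption, the unanchored pattern also excludes any longer string containing the placeholder, while the anchored form limits the exclusion to the exact example value.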

"your-api-key" is not a real api key and is used in example code.

pyproject.toml

source/oss/checks/ai-testing/multi-turn.rst

source/oss/checks/ai-testing/single-turn.rst

gemini-code-assist · 2026-02-09T02:27:58Z

source/conf.py

 docs_version = os.getenv("READTHEDOCS_VERSION", "latest")
 if docs_version == "latest" or docs_version == "stable":
    branch = "main"
 else:
    branch = docs_version.replace("-", "/")
 branch = "main"


The branch variable is unconditionally set to "main" on line 141, which makes the preceding if/else block that also sets branch redundant. This dead code should be removed to improve clarity.

# Do not execute the notebooks when building the docs
branch = "main"
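
The effect of the unconditional assignment can be checked with a short sketch. The two functions below are illustrative wrappers around the excerpt, not code from conf.py, and the version string `release-1.0` is a hypothetical example.

```python
def branch_as_written(docs_version: str) -> str:
    """Mirrors the excerpt: the final assignment always wins."""
    if docs_version == "latest" or docs_version == "stable":
        branch = "main"
    else:
        branch = docs_version.replace("-", "/")
    branch = "main"  # unconditional override makes the block above dead code
    return branch


def branch_without_override(docs_version: str) -> str:
    """Alternative fix: keep the conditional and drop the override."""
    if docs_version in ("latest", "stable"):
        return "main"
    return docs_version.replace("-", "/")


for version in ("latest", "stable", "release-1.0"):
    print(version, branch_as_written(version), branch_without_override(version))
```

The two fixes are not equivalent: deleting the if/else preserves current behavior, whereas deleting the override changes the branch for any version other than latest or stable, so the choice depends on which behavior is intended.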

gemini-code-assist · 2026-02-09T02:27:58Z

source/conf.py

+            from sphinx.environment.adapters.toctree import TocTree
+            from sphinx import addnodes


The imports for TocTree and addnodes are inside a nested function toctree_from_doc. According to PEP 8, imports should usually be at the top of the file. While this might be done to avoid polluting the global namespace, it's better to move them to the top of conf.py for consistency and readability.

source/oss/checks/ai-testing/core-concepts.rst

Refresh check references, expand comparison checks, and add semantic similarity details. Co-authored-by: Cursor <cursoragent@cursor.com>

kevinmessiaen and others added 18 commits December 18, 2025 12:00

chore: remove giskard legacy from doc

9564072

custom sidebar for checks docs

6189a9a

fix context for partial toctree

1f13434

draft checks docs

3b1ea05

small fixes

c6ee339

chore: update dependencies in pyproject.toml and uv.lock

88bf862

docs(checks): refresh quickstart scenario flow

12b0955

Align quickstart examples with scenario terminology and rich result rendering, and add screenshots for simple and structured outputs.

docs(checks): clarify core concepts flow

dbedbaf

Reframe core concepts around specs, interactions, and traces, and add a clear execution flow with fluent API mapping.

Apply suggestions from code review

dce3721

Co-authored-by: Henrique Chaves <44180294+henchaves@users.noreply.github.com>

docs(checks): switch snippets to gpt-5-mini

f7ed564

Update OpenAI model examples to gpt-5-mini to avoid deprecated gpt-4o usage.

Use expected value for GreaterThan

57bcca8

docs(checks): clarify async run note

b366598

Remove notebook mention and keep script guidance for asyncio.run().

docs(checks): sharpen single-turn risk examples

e586f47

Refresh single-turn examples with high-risk scenarios and clearer guardrail rationale to emphasize production impact.

docs(checks): refresh multi-turn risk scenarios

fd69646

Reframe multi-turn examples around higher-risk workflows and add rationale notes to highlight guardrail failure points across turns.

Merge remote-tracking branch 'origin/main' into feature/giskard-check…

00f6eb3

…s-doc

chore: remove link of removed page

b31b849

kevinmessiaen changed the title ~~Giskard v3~~ feature: Giskard v3 Feb 9, 2026

gemini-code-assist bot reviewed Feb 9, 2026

kevinmessiaen and others added 3 commits February 9, 2026 10:02

docs: update check names in docs

7f22fd0

Refresh check references, expand comparison checks, and add semantic similarity details. Co-authored-by: Cursor <cursoragent@cursor.com>

docs: fix sidebar configuration

a8bb09c

docs(oss): Upgrade to main

ba0d398

kevinmessiaen wants to merge 21 commits into main from feature/giskard-v3
