Draft
Changes from 18 commits
Commits
21 commits
9564072
chore: remove giskard legacy from doc
kevinmessiaen Dec 18, 2025
6189a9a
custom sidebar for checks docs
mattbit Dec 23, 2025
1f13434
fix context for partial toctree
mattbit Dec 23, 2025
3b1ea05
draft checks docs
mattbit Dec 23, 2025
c6ee339
small fixes
mattbit Dec 24, 2025
8e63877
docs(oss-checks): update examples to use fluent builder pattern
kevinmessiaen Jan 26, 2026
88bf862
chore: update dependencies in pyproject.toml and uv.lock
kevinmessiaen Jan 26, 2026
0bfd74f
docs(checks): update documentation for scenario-based approach
kevinmessiaen Jan 28, 2026
12b0955
docs(checks): refresh quickstart scenario flow
kevinmessiaen Jan 28, 2026
dbedbaf
docs(checks): clarify core concepts flow
kevinmessiaen Jan 28, 2026
dce3721
Apply suggestions from code review
kevinmessiaen Jan 29, 2026
f7ed564
docs(checks): switch snippets to gpt-5-mini
kevinmessiaen Jan 30, 2026
57bcca8
Use expected value for `GreaterThan`
kevinmessiaen Jan 30, 2026
b366598
docs(checks): clarify async run note
kevinmessiaen Jan 30, 2026
e586f47
docs(checks): sharpen single-turn risk examples
kevinmessiaen Feb 2, 2026
fd69646
docs(checks): refresh multi-turn risk scenarios
kevinmessiaen Feb 2, 2026
00f6eb3
Merge remote-tracking branch 'origin/main' into feature/giskard-check…
kevinmessiaen Feb 9, 2026
b31b849
chore: remove link of removed page
kevinmessiaen Feb 9, 2026
7f22fd0
docs: update check names in docs
kevinmessiaen Feb 9, 2026
a8bb09c
docs: fix sidebar configuration
kevinmessiaen Feb 11, 2026
ba0d398
docs(oss): Upgrade to main
kevinmessiaen Mar 12, 2026
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -18,7 +18,7 @@ repos:
rev: v1.5.0
hooks:
- id: detect-secrets
args: ["--baseline", ".secrets.baseline"]
args: ["--baseline", ".secrets.baseline", "--exclude-secrets", "your-api-key"]
Contributor

critical

The placeholder "your-api-key" has been added to the detect-secrets configuration. This is a significant security risk as it might be overlooked and could lead to real secrets being excluded from scans if this pattern is copied. This placeholder should be removed. If a specific secret needs to be excluded, it should be done using its ID or a more specific regex.

        args: ["--baseline", ".secrets.baseline"]

Member Author

"your-api-key" is not a real api key and is used in example code.

exclude: |
(?x)^(
.*\.lock$|
2 changes: 1 addition & 1 deletion .python-version
@@ -1 +1 @@
3.12
3.13
24 changes: 6 additions & 18 deletions pyproject.toml
@@ -6,29 +6,26 @@ authors = [
{name = "Giskard Team", email = "hello@giskard.ai"}
]
readme = "README.md"
requires-python = ">=3.10,<4.0"
dependencies = []
requires-python = ">=3.13,<4.0"
dependencies = [
"giskard-core @ git+ssh://git@github.com/Giskard-AI/giskard-oss.git@feature/giskard-v3#subdirectory=libs/giskard-core",
"giskard-agents @ git+ssh://git@github.com/Giskard-AI/giskard-oss.git@feature/giskard-v3#subdirectory=libs/giskard-agents",
"giskard-checks @ git+ssh://git@github.com/Giskard-AI/giskard-oss.git@feature/giskard-v3#subdirectory=libs/giskard-checks",
]

[dependency-groups]
dev = [
"sphinxawesome-theme==5.3.2; python_version>='3.12'",
"myst-parser==4.0.1; python_version>='3.12'",
"notebook==7.4.7",
"nbsphinx==0.9.7; python_version>='3.12'",
"sphinx-click==6.1.0; python_version>='3.12'",
"sphinx-autobuild==2025.8.25; python_version>='3.12'",
"sphinx-autodoc-typehints==2.3.0; python_version>='3.12'",
"sphinx-design==0.6.1; python_version>='3.12'",
"sphinx-tabs>=3.4.7; python_version>='3.12'",
"sphinxext-opengraph[social_cards]>=0.12.0; python_version>='3.12'",
"sphinx-notfound-page>=1.1.0; python_version>='3.12'",
"pandoc>=2.4",
"sphinxcontrib-mermaid>=0.9.0; python_version>='3.12'",
"giskard[llm]==2.18.0; python_version>='3.10' and python_version<'3.13'",
"pyarrow<21.0.1; python_version>='3.12'",
"ragas>=0.3.7,<=0.3.7",
"ipywidgets>=8.1.7",
"torch>=2.8.0",
"sphinx-autobuild>=2024.10.3",
"giskard-hub>=2.1.0",
"sphinxext-rediraffe"
@@ -39,12 +36,3 @@ dev = [
Homepage = "https://github.com/Giskard-AI/giskard-hub"
Repository = "https://github.com/Giskard-AI/giskard-hub"
Documentation = "https://docs.giskard.ai/"

[[tool.uv.index]]
name = "pytorch_cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[tool.uv.sources]
# Use CPU-only PyTorch for non-macOS systems, default PyPI for macOS
torch = { index = "pytorch_cpu", marker = "platform_system != 'Darwin'" }
9 changes: 9 additions & 0 deletions source/_static/custom.css
@@ -27,6 +27,7 @@
--sidebar-heading-color: #0f1729;
--non-selected-color: rgba(15, 23, 41, 0.6);
--link-color: inherit;
--border: 0 0% 100% / 0.10;
}

.dark {
@@ -420,6 +421,14 @@ header nav a:not(.text-foreground):hover {
color: rgba(198, 255, 255, 0.8) !important;
}

#left-sidebar a.current {
border: none;
}

#left-sidebar ul ul:is(.dark *)::before {
background-color: hsl(var(--border));
}

/* Recently selected navbar item styling */
header nav a.recently-selected,
html[data-content_root="./"] header nav a.recently-selected,
3 changes: 3 additions & 0 deletions source/_templates/sidebars/sidebar_oss_checks.html
@@ -0,0 +1,3 @@
<nav class="table w-full min-w-full my-6 lg:my-8">
{{ toctree_from_doc('oss/checks/index', collapse=False, maxdepth=20, includehidden=True, titles_only=False) }}
</nav>
68 changes: 51 additions & 17 deletions source/conf.py
@@ -70,7 +70,6 @@ def update_sidebar_templates():

extensions = [
"myst_parser",
"nbsphinx",
"sphinx_design",
"sphinx.ext.todo",
"sphinx.ext.napoleon",
@@ -126,28 +125,20 @@ def update_sidebar_templates():
html_js_files = ["custom.js"]
html_favicon = "_static/favicon.ico"

html_sidebars = {
"oss/checks/**": [
"sidebar_main_nav_links.html",
"sidebars/sidebar_oss_checks.html",
],
}

# Do not execute the notebooks when building the docs
docs_version = os.getenv("READTHEDOCS_VERSION", "latest")
if docs_version == "latest" or docs_version == "stable":
branch = "main"
else:
branch = docs_version.replace("-", "/")
branch = "main"
Comment on lines 129 to 134
Contributor

medium

The branch variable is unconditionally set to "main" on line 141, which makes the preceding if/else block that also sets branch redundant. This dead code should be removed to improve clarity.

# Do not execute the notebooks when building the docs
branch = "main"


# -- Options for nbsphinx ----------------------------------------------------
nbsphinx_execute = "never"
# fmt: off
nbsphinx_prolog = """
.. raw:: html

<div class="open-in-colab__wrapper">
<a href="https://colab.research.google.com/github/Giskard-AI/giskard-hub/blob/""" + branch + """/script-docs/{{ env.doc2path(env.docname, base=None) }}" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" style="display: inline; margin: 0" alt="Open In Colab"/></a>
<a href="https://github.com/Giskard-AI/giskard-hub/tree/""" + branch + """/script-docs/{{ env.doc2path(env.docname, base=None) }}" target="_blank"><img src="https://img.shields.io/badge/github-view%20source-black.svg" style="display: inline; margin: 0" alt="View Notebook on GitHub"/></a>
</div>
"""
# fmt: on


theme_options = ThemeOptions(
show_prev_next=True,
show_scrolltop=True,
@@ -158,7 +149,7 @@ def update_sidebar_templates():
"Overview": "/index",
"Hub UI": "/hub/ui/index",
"Hub SDK": "/hub/sdk/index",
"Open Source": "/oss/sdk/index",
"Checks": "/oss/checks/index",
},
)
html_theme_options = asdict(theme_options)
@@ -193,6 +184,49 @@ def update_sidebar_templates():
ogp_image = "https://docs.giskard.ai/_static/open-graph-image.png"


# Add custom template function to render toctree from a specific document
def setup(app):
    def html_page_context(app, pagename, templatename, context, doctree):
        def toctree_from_doc(docname, **kwargs):
            """Render toctree starting from a specific document"""
            from sphinx.environment.adapters.toctree import TocTree
            from sphinx import addnodes
Comment on lines +190 to +191
Contributor

medium

The imports for TocTree and addnodes are inside a nested function toctree_from_doc. According to PEP 8, imports should usually be at the top of the file. While this might be done to avoid polluting the global namespace, it's better to move them to the top of conf.py for consistency and readability.

            source_doctree = app.env.get_doctree(docname)
            toctrees = list(source_doctree.findall(addnodes.toctree))

            if not toctrees:
                return ""

            toctree_adapter = TocTree(app.env)
            resolved = [
                toctree_adapter.resolve(
                    pagename,  # Use current page context, not the toctree source
                    app.builder,
                    toctree,
                    prune=False,
                    maxdepth=kwargs.get("maxdepth", -1),
                    titles_only=kwargs.get("titles_only", False),
                    collapse=kwargs.get("collapse", False),
                    includehidden=kwargs.get("includehidden", False),
                )
                for toctree in toctrees
            ]

            resolved = [r for r in resolved if r is not None]
            if not resolved:
                return ""

            result = resolved[0]
            for toctree in resolved[1:]:
                result.extend(toctree.children)

            return app.builder.render_partial(result)["fragment"]

        context["toctree_from_doc"] = toctree_from_doc

    app.connect("html-page-context", html_page_context)


# make github links resolve
def linkcode_resolve(domain, info):
if domain != "py":
4 changes: 0 additions & 4 deletions source/index.rst
@@ -46,7 +46,6 @@ Giskard Hub
Ready to unlock the full potential of enterprise-grade AI testing? Try **Giskard Hub** with a free trial and discover advanced team collaboration, continuous red teaming, and enterprise security features.

:doc:`Start your free enterprise trial </start/enterprise-trial>` and see how Giskard Hub can transform your AI testing workflow.

Open source
-----------

@@ -74,7 +73,6 @@ The library provides a set of tools for testing and evaluating LLMs, including:
**⚖️ Unsure about the difference between Open Source and Hub?**

Check out our :doc:`/start/comparison` guide to learn more about the different features.

Open research
-------------

@@ -107,8 +105,6 @@ Some work has been funded by the `the European Commission <https://commission.eu
.. tip::

Are you interested in supporting our research? Check out our `Open Collective funding page for Phare <https://opencollective.com/phare-llm-benchmark>`_.


.. include:: toctree.rst
.. include:: toctree_hub_ui.rst
.. include:: toctree_hub_sdk.rst
179 changes: 179 additions & 0 deletions source/oss/checks/ai-testing/core-concepts.rst
@@ -0,0 +1,179 @@
=============
Core Concepts
=============

Understanding the key concepts in Giskard Checks will help you write effective tests for your AI applications.


Overview
--------

Giskard Checks is built around a few core primitives that work together:

* **Interaction**: A single turn of data exchange (inputs and outputs)
* **InteractionSpec**: A specification for generating interactions dynamically
* **Trace**: An immutable snapshot of all interactions in a scenario
* **Check**: A validation that runs on a trace and returns a result
* **Scenario**: A list of steps (interactions and checks) executed sequentially

At runtime, the flow looks like this:

1. A Scenario is created with a sequence of steps.

2. For each step in order:

a. If the step is an InteractionSpec, it is resolved into a concrete Interaction.
b. The resolved Interaction is appended to the Trace.
c. If the step is a Check, it runs against the current Trace.

3. Results are returned as a ScenarioResult.
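
The runtime flow above can be sketched in plain Python. This is only an illustration of the control flow, not the library's implementation: ``SketchSpec``, ``SketchCheck``, ``SketchInteraction``, and ``run_steps`` are hypothetical stand-ins that do not exist in ``giskard.checks``, and checks are reduced to simple predicates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SketchInteraction:
    """Hypothetical stand-in for a resolved interaction."""
    inputs: str
    outputs: str


@dataclass(frozen=True)
class SketchSpec:
    """Hypothetical stand-in for a spec: static inputs plus a callable that answers."""
    inputs: str
    respond: Callable[[str], str]


@dataclass(frozen=True)
class SketchCheck:
    """Hypothetical stand-in for a check: a named predicate over the trace."""
    name: str
    predicate: Callable[[tuple], bool]


def run_steps(steps: list) -> list[tuple[str, bool]]:
    """Walk the steps in order, growing the trace and running checks on it."""
    trace: tuple = ()
    results = []
    for step in steps:
        if isinstance(step, SketchSpec):
            # Resolve the spec into a concrete interaction, then extend the trace.
            interaction = SketchInteraction(step.inputs, step.respond(step.inputs))
            trace = trace + (interaction,)
        else:
            # A check only sees the interactions recorded so far.
            results.append((step.name, step.predicate(trace)))
    return results


steps = [
    SketchSpec("Hello", str.upper),
    SketchCheck("shouts_back", lambda trace: trace[-1].outputs == "HELLO"),
    SketchSpec("Bye", lambda text: "See you"),
    SketchCheck("two_turns", lambda trace: len(trace) == 2),
]
results = run_steps(steps)
```

Because each check runs at its own position in the sequence, the first check sees a one-interaction trace and the second sees both.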

Interaction
-----------

An ``Interaction`` represents a single turn of data exchange with the system under test.
Interactions are produced at execution time: each ``InteractionSpec`` is resolved into an ``Interaction``, which is then appended to the trace.

**Properties:**

* ``inputs``: The input to your system (string, dict, Pydantic model, etc.)
* ``outputs``: The output from your system (any serializable type)
* ``metadata``: Optional dictionary for additional context (timings, model info, etc.)

Interactions are **immutable**, as they represent something that has already happened.
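
To illustrate what immutability means in practice, here is a minimal sketch using a frozen dataclass. ``FrozenInteraction`` is a hypothetical stand-in with the three properties listed above; how ``giskard.checks`` actually enforces immutability is not shown in this page and may differ.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any


@dataclass(frozen=True)
class FrozenInteraction:
    """Hypothetical stand-in, not the giskard.checks Interaction class."""
    inputs: Any
    outputs: Any
    metadata: dict = field(default_factory=dict)


record = FrozenInteraction(
    inputs="Hello",
    outputs="Hi there!",
    metadata={"latency_ms": 120},
)

try:
    # Reassigning a field on a frozen dataclass raises FrozenInstanceError.
    record.outputs = "edited"
    was_mutated = True
except FrozenInstanceError:
    was_mutated = False
```

A recorded exchange that cannot be edited after the fact keeps every later check working from the same history.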


InteractionSpec
---------------

An ``InteractionSpec`` describes *how* to generate an interaction. Specs are the building blocks from which a scenario is defined.
When you call ``.interact(...)`` in the fluent API, it adds an ``InteractionSpec`` to the scenario sequence.
Inputs and outputs can be static values or dynamic callables, and you can mix both.

.. code-block:: python

    from giskard.checks import InteractionSpec
    from openai import OpenAI
    import random

    def generate_random_question() -> str:
        return f"What is 2 + {random.randint(0, 10)}?"

    def generate_answer(inputs: str) -> str:
        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-5-mini",
            messages=[{"role": "user", "content": inputs}],
        )
        return response.choices[0].message.content

    spec = InteractionSpec(
        inputs=generate_random_question,
        outputs=generate_answer,
        metadata={
            "category": "math",
            "difficulty": "easy"
        }
    )

Specs are resolved into interactions during scenario execution. Dynamic specs are especially useful in multi-turn scenarios, where inputs and outputs depend on previous interactions. See :doc:`multi-turn` for practical examples.
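
As a rough sketch of how static values and callables can be mixed, the helper below returns a static value unchanged and calls anything callable. ``resolve``, ``generate_question``, and ``echo_answer`` are hypothetical names for illustration only; the real resolution logic, including what arguments the library passes to callables, may differ. The model call is replaced by a local function so the sketch runs offline.

```python
import random

# Seeded generator so the dynamic input is reproducible across runs.
rng = random.Random(0)


def resolve(value, *args):
    """Return static values as-is; call callables with the given arguments."""
    return value(*args) if callable(value) else value


def generate_question() -> str:
    return f"What is 2 + {rng.randint(0, 10)}?"


def echo_answer(inputs: str) -> str:
    # Stand-in for a model call: derives the output from the resolved inputs.
    return f"You asked: {inputs}"


dynamic_inputs = resolve(generate_question)
dynamic_outputs = resolve(echo_answer, dynamic_inputs)
static_outputs = resolve("fixed answer", dynamic_inputs)
```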

Trace
-----

A ``Trace`` is an immutable snapshot of all data exchanged with the system under test. In its simplest form, it is a list of interactions.

.. code-block:: python

    from giskard.checks import Trace, Interaction

    trace = Trace(interactions=[
        Interaction(inputs="Hello", outputs="Hi there!"),
        Interaction(inputs="How are you?", outputs="I'm doing well, thanks!")
    ])

Traces are typically created during scenario execution by resolving each ``InteractionSpec`` into a frozen interaction.


Checks
------

A ``Check`` validates something about a trace and returns a ``CheckResult``. There's a library of built-in checks, but you can also create your own.

When referencing values in a trace, use JSONPath expressions that start with ``trace.``. The ``last`` property is a shortcut for ``interactions[-1]`` and can be used in both JSONPath keys and Python code.

.. code-block:: python

    from giskard.checks import Groundedness, Trace

    check = Groundedness(
        answer_key="trace.last.outputs",
        context="Giskard Checks is a testing framework for AI systems."
    )
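
To make the ``trace.last`` shortcut concrete, the following sketch resolves a dotted key against a small stand-in trace. ``MiniTrace``, ``MiniInteraction``, and ``resolve_key`` are hypothetical and handle only plain attribute access; the library itself uses JSONPath expressions, which support more than this.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class MiniInteraction:
    """Hypothetical stand-in for an interaction."""
    inputs: Any
    outputs: Any


@dataclass(frozen=True)
class MiniTrace:
    """Hypothetical stand-in for a trace."""
    interactions: tuple

    @property
    def last(self) -> MiniInteraction:
        # Shortcut for interactions[-1], mirroring the property described above.
        return self.interactions[-1]


def resolve_key(trace: MiniTrace, key: str) -> Any:
    """Follow a dotted key such as 'trace.last.outputs' one attribute at a time."""
    root, *parts = key.split(".")
    if root != "trace":
        raise ValueError("key must start with 'trace.'")
    value: Any = trace
    for part in parts:
        value = getattr(value, part)
    return value


mini_trace = MiniTrace(
    interactions=(
        MiniInteraction("Hello", "Hi there!"),
        MiniInteraction("How are you?", "I'm doing well, thanks!"),
    )
)
last_output = resolve_key(mini_trace, "trace.last.outputs")
```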


Scenario
--------

A ``Scenario`` is a list of steps (interactions and checks) that are executed sequentially with a shared trace. Scenarios work for both single-turn and multi-turn tests.

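.. code-block:: python

    from giskard.checks import scenario

    # check1 and check2 stand for any Check instances defined elsewhere
    test_scenario = (
        scenario("test_with_checks")
        .interact(inputs="test input", outputs="test output")
        .check(check1)
        .check(check2)
    )

    # run() is asynchronous, so this line must execute inside an async context
    result = await test_scenario.run()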

.. note::
    The ``run()`` method is asynchronous. When running in a script, use ``asyncio.run()``:

    .. code-block:: python

        import asyncio

        async def main():
            result = await test_scenario.run()
            return result

        result = asyncio.run(main())

    In async contexts (like pytest with ``@pytest.mark.asyncio``), you can use ``await`` directly.

``run()`` returns a ``ScenarioResult`` that holds the results of the checks.


Fluent API Mapping
------------------

The fluent API is the preferred user-facing entry point and maps directly to the core primitives above:

* ``scenario(name)`` creates a ``Scenario`` builder.
* ``.interact(...)`` adds an ``InteractionSpec`` to the scenario sequence.
* ``.check(...)`` adds a ``Check`` to the scenario sequence.
* ``.run()`` resolves specs to interactions, builds the ``Trace``, runs checks, and returns a ``ScenarioResult``.
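
The chaining behaviour behind this mapping can be sketched with a small builder whose methods record a step and return the builder itself. ``MiniScenarioBuilder`` and ``mini_scenario`` are hypothetical names, checks are represented by plain strings, and nothing is executed here; the sketch only shows that steps accumulate in call order until something like ``run()`` consumes them.

```python
class MiniScenarioBuilder:
    """Hypothetical stand-in for a fluent scenario builder."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.steps: list[tuple[str, object]] = []

    def interact(self, inputs, outputs) -> "MiniScenarioBuilder":
        # Record the spec; nothing is resolved at this point.
        self.steps.append(("interact", (inputs, outputs)))
        return self  # returning self is what allows method chaining

    def check(self, check) -> "MiniScenarioBuilder":
        self.steps.append(("check", check))
        return self


def mini_scenario(name: str) -> MiniScenarioBuilder:
    return MiniScenarioBuilder(name)


builder = (
    mini_scenario("conversation_flow")
    .interact("Hello", "Hi there!")
    .check("is_greeting")
    .interact("Who invented HTML?", "Tim Berners-Lee")
    .check("mentions_inventor")
)
step_kinds = [kind for kind, _ in builder.steps]
```

Keeping construction separate from execution means the same scenario definition can be built once and run later.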

For example, we can test a simple conversation flow with two turns:

.. code-block:: python

    from giskard.checks import scenario, Conformity

    # generate_answer is the callable defined in the InteractionSpec example
    test_scenario = (
        scenario("conversation_flow")
        .interact(inputs="Hello", outputs=generate_answer)
        .check(Conformity(key="trace.last.outputs", rule="response should be a friendly greeting"))
        .interact(inputs="Who invented HTML?", outputs=generate_answer)
        .check(Conformity(key="trace.last.outputs", rule="response should mention Tim Berners-Lee as the inventor of HTML"))
    )

    # In an async context (notebook, async test), await the coroutine directly
    result = await test_scenario.run()

    # In a plain script, drive it with asyncio instead:
    # import asyncio
    # result = asyncio.run(test_scenario.run())

For a practical introduction to the fluent API, see :doc:`quickstart`.