Add hallucination check #2369

Description

Summary

Add a built-in LLM-based check that detects fabricated facts, invented details, or false claims in model outputs that are not supported by any provided context.

Motivation

Hallucination detection is a core AI safety concern. DeepEval ships a HallucinationMetric, and Patronus AI built Lynx, a specialized model that outperforms GPT-4 on hallucination detection. This is distinct from Groundedness: Groundedness checks whether claims are supported by the provided context, while Hallucination specifically detects fabricated information.

Implementation Guide

Steps

  1. Create template: src/giskard/checks/prompts/judges/hallucination.j2
    • Given an answer and optional context, detect fabricated facts
    • Identify specific hallucinated claims with explanations
    • Consider: invented statistics, fake citations, non-existent entities, fabricated details
  2. Create check: src/giskard/checks/judges/hallucination.py
    • Subclass BaseLLMCheck, register as "hallucination"
    • Support:
      • answer_key: JSONPathStr — JSONPath for answer (default: trace.last.outputs)
      • context: str | list[str] | None = None — reference context
      • context_key: JSONPathStr | None = None — JSONPath for context
  3. Add tests

Distinction from Groundedness

  • Groundedness: "Is the answer supported by the context?" — checks grounding
  • Hallucination: "Does the answer contain fabricated information?" — checks fabrication, can work with or without context

Example usage

from giskard.checks import Hallucination, Scenario

scenario = (
    Scenario(name="no_hallucination")
    .interact(
        inputs="What year was Python created?",
        outputs="Python was created in 1991 by Guido van Rossum.",
        metadata={"context": "Python was first released in 1991."}
    )
    .check(Hallucination(context_key="trace.last.metadata.context"))
)
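In the example above, `context_key="trace.last.metadata.context"` points into the interaction's metadata. A minimal sketch of how such a dotted-path lookup could resolve against a trace-like structure (the real check presumably uses a proper JSONPath implementation; this resolver is illustrative only):

```python
from typing import Any

def resolve_path(obj: dict[str, Any], path: str) -> Any:
    """Walk a dotted path like 'trace.last.metadata.context' through nested dicts."""
    node: Any = obj
    for part in path.split("."):
        node = node[part]
    return node

# Shape mirrors the scenario above: one interaction with outputs and metadata.
trace = {
    "trace": {
        "last": {
            "outputs": "Python was created in 1991 by Guido van Rossum.",
            "metadata": {"context": "Python was first released in 1991."},
        }
    }
}

print(resolve_path(trace, "trace.last.metadata.context"))
# → Python was first released in 1991.
```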

Acceptance Criteria

  • Detects fabricated facts and invented details
  • Works with and without provided context
  • Identifies specific hallucinated claims in the reason
  • Distinct behavior from Groundedness
  • Tests cover: factual answer passes, hallucinated answer fails, no context mode
