
hallucination-audit

Deterministic, domain-weighted evidence alignment engine for text evaluation.

hallucination-audit is a configurable CLI tool that evaluates text against user-defined trusted domains and produces claim-level evidence alignment reports.

The system is fully deterministic and does not rely on generative AI models for evaluation. It is designed for reproducibility, transparency, and governance-oriented workflows.


Motivation

As large language models and automated content systems become more common, evaluation pipelines often rely on other generative models acting as “judges.”
This can introduce instability and reduce reproducibility.

This project explores an alternative approach based on:

  • Deterministic scoring
  • Explicit trust configuration
  • Domain-weighted retrieval
  • Transparent claim-level reporting

Rather than labeling content as “true” or “false,” the system reports how well claims align with selected reference domains.


Core Features

  • Sentence-level claim extraction
  • BM25-based lexical retrieval
  • Domain-weighted evidence scoring
  • User-defined trust profiles
  • Separation of trusted and untrusted evidence
  • Deterministic contradiction heuristics
  • Structured terminal and JSON output

Installation

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\Activate
pip install -e .

Example Usage

hallucination-audit data/sample.jsonl \
  --sources data/sample_sources \
  --trust-profile data/trust_profile.json \
  --format terminal

Evaluation is controlled by a trust profile:

```json
{
  "trusted_domains": ["reuters.com", "bbc.com"],
  "excluded_domains": ["randomblog.com"],
  "domain_weights": {
    "reuters.com": 1.0,
    "bbc.com": 0.9,
    "*": 0.35
  }
}
```

Trusted domains influence verdict determination.

Untrusted domains may be displayed but do not drive support decisions.
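A claim-level entry in the JSON report might look like the following. This is purely an illustrative sketch; the field names (`claim`, `verdict`, `evidence`, `weight`, `overlap`) are assumptions, not the tool's actual output schema:

```json
{
  "claim": "The company reported record revenue in 2023.",
  "verdict": "supported",
  "evidence": {
    "domain": "reuters.com",
    "sentence": "The company reported record revenue in 2023.",
    "weight": 1.0,
    "overlap": 0.83
  }
}
```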

Methodology

1. Extract candidate claims at the sentence level.

2. Retrieve top-k source chunks using BM25.

3. Apply domain weights from the trust profile.

4. Identify the most relevant sentence within each chunk.

5. Compute lexical overlap ratios.

6. Apply deterministic support / contradiction heuristics.

The system intentionally avoids semantic inference models in order to maintain reproducibility and auditability.
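The pipeline above can be sketched in Python. This is a minimal illustrative approximation, not the project's actual code: the function names, the `support_threshold`, and the chunk representation are assumptions; only the trust-profile keys mirror the README's example.

```python
# Illustrative sketch of the methodology steps; not the real implementation.
import math
from collections import Counter

def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Plain BM25 over pre-tokenized docs; returns one score per doc."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def domain_weight(domain, profile):
    """Excluded domains score zero; unknown domains fall back to the '*' weight."""
    if domain in profile.get("excluded_domains", []):
        return 0.0
    return profile["domain_weights"].get(domain, profile["domain_weights"].get("*", 0.0))

def overlap_ratio(claim_tokens, sentence_tokens):
    """Fraction of distinct claim tokens that appear in the sentence."""
    claim_set = set(claim_tokens)
    if not claim_set:
        return 0.0
    return len(claim_set & set(sentence_tokens)) / len(claim_set)

def score_claim(claim, chunks, profile, top_k=3, support_threshold=0.6):
    """chunks: list of (domain, text) pairs. Returns (verdict, best_evidence)."""
    tokens = claim.lower().split()
    docs = [text.lower().split() for _, text in chunks]
    raw = bm25_scores(tokens, docs)
    # Step 3: weight retrieval scores by domain trust, keep the top-k chunks.
    weighted = sorted(
        ((raw[i] * domain_weight(chunks[i][0], profile), i) for i in range(len(chunks))),
        reverse=True,
    )[:top_k]
    best, evidence = 0.0, None
    for w, i in weighted:
        if w <= 0:
            continue
        # Step 4: pick the most relevant sentence inside the chunk.
        for sent in chunks[i][1].split("."):
            r = overlap_ratio(tokens, sent.lower().split())
            if r > best:
                best, evidence = r, (chunks[i][0], sent.strip())
    # Step 6: deterministic support decision via a fixed overlap threshold.
    verdict = "supported" if best >= support_threshold else "unverified"
    return verdict, evidence
```

Because every step is a pure function of the inputs, running the same claim against the same corpus and profile always yields the same verdict, which is the reproducibility property the section describes.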

Design Rationale

  • Deterministic evaluation improves auditability.
  • Trust profiles decouple domain preference from scoring logic.
  • Domain weighting enables configurable epistemic modeling.
  • Sentence-level contradiction checks reduce chunk-level false positives.
  • Local source indexing allows evaluation independent of external APIs.

Limitations

  • Lexical overlap does not equal semantic entailment.
  • No cross-document reasoning.
  • No entity disambiguation or knowledge graph integration.
  • Retrieval quality depends on source corpus coverage.
  • Domain trust does not imply epistemic correctness.
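The first limitation is easy to demonstrate with a toy example (a sketch, not project code): negation flips a claim's meaning while leaving its token overlap with the evidence unchanged.

```python
# Toy illustration: lexical overlap cannot detect negation.
def overlap(claim, evidence):
    claim_tokens = set(claim.lower().split())
    return len(claim_tokens & set(evidence.lower().split())) / len(claim_tokens)

ratio = overlap("the launch was delayed", "the launch was not delayed")
print(ratio)  # → 1.0: perfect overlap, yet the evidence contradicts the claim
```

This is why the methodology pairs overlap ratios with separate contradiction heuristics rather than treating overlap alone as support.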

Future Directions

  • Synonym normalization and canonicalization
  • Entity-aware scoring
  • Persistent indexing layer
  • Trust graph modeling
  • Adversarial robustness testing
  • API / service layer for enterprise integration

License

MIT License.

About

Configurable evidence-alignment engine for AI and news evaluation using user-defined trusted sources.
