A production-ready framework for validating AI/LLM outputs using user-defined assertions, confidence scoring, and edge-case testing.
```bash
pip install llm-validator
```

Or install from source:

```bash
git clone https://github.com/SoulSniper-V2/lintai.git
cd lintai
pip install -e .
```

- Assertion-Based Validation - Define expected behavior with simple rules
- Confidence Scoring - Get quantified trust metrics for outputs
- Edge Case Testing - Systematically test boundary conditions
- Multi-Model Support - Works with OpenAI, Anthropic, Gemini, local LLMs
- Regression Tracking - Track validation scores over time
- CI/CD Integration - Run validations in GitHub Actions pipelines
- Auto-Release to PyPI - Tags automatically publish to PyPI
LintAI is also available as a GitHub Marketplace Action for CI/CD pipelines:
```yaml
name: Validate AI Output
on: [push]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Validate LLM Output
        uses: SoulSniper-V2/lintai@v0.1
        with:
          prompt: "Summarize this document"
          output: "${{ steps.generate.outputs.result }}"
          assertions-config: "./assertions.json"
          pass-threshold: 80
```

Example `assertions.json`:

```json
{
  "assertions": [
    {
      "name": "max_length",
      "type": "MAX_LENGTH",
      "params": { "max_chars": 1000 },
      "weight": 0.3
    },
    {
      "name": "contains_steps",
      "type": "CONTAINS_TEXT",
      "params": { "text": "step 1" },
      "weight": 0.5
    },
    {
      "name": "no_profanity",
      "type": "NO_PATTERN",
      "params": { "pattern": "badword|offensive" },
      "weight": 0.2
    }
  ]
}
```

| Input | Required | Default | Description |
|---|---|---|---|
| `prompt` | Yes | - | Original prompt sent to LLM |
| `output` | Yes | - | LLM output to validate |
| `assertions-config` | Yes | - | Path to JSON assertions config |
| `pass-threshold` | No | 70 | Minimum score to pass (0-100) |
| `fail-on-warning` | No | false | Fail if any warnings |
| Output | Description |
|---|---|
| `passed` | Whether validation passed (`true`/`false`) |
| `score` | Confidence score (0-100) |
| `failed-assertions` | Number of failed assertions |
| `warnings-count` | Number of warnings |
```bash
# Initialize a validation config
lintai init-config

# Validate with a config file
lintai validate --config validators/my_config.yaml

# Batch validation from JSONL
lintai batch --input test_cases.jsonl --output results.jsonl
```

```python
from llm_validator import LLMValidator, Assertion, AssertionType

# Initialize validator
validator = LLMValidator(
    model="gpt-4",
    api_key="your-key"
)

# Define assertions
assertions = [
    Assertion(
        name="max_length",
        type=AssertionType.MAX_LENGTH,
        params={"max_tokens": 500},
        weight=0.3
    ),
    Assertion(
        name="no_profanity",
        type=AssertionType.NO_PATTERN,
        params={"pattern": r"(?i)badword|offensive"},
        weight=0.5
    ),
    Assertion(
        name="contains_action_plan",
        type=AssertionType.CONTAINS_TEXT,
        params={"text": "step 1", "count": 1},
        weight=0.2
    )
]

# Validate output
result = validator.validate(
    prompt="Create a plan to increase sales",
    output="Here is a step by step plan...",
    assertions=assertions
)

print(f"Confidence Score: {result.score}/100")
print(f"Passed: {result.passed}")
print(f"Failed: {result.failed_assertions}")
```

```bash
# Run validation from config
llm-validate --config validators/sales_plan.yaml

# Quick test
llm-validate --prompt "Summarize this" --output "The text says..." --rules "max_tokens:100"

# Batch validation
llm-validate --input test_cases.jsonl --output results.jsonl
```

To run the frontend:

```bash
cd frontend
npm install
npm run dev
```

Access at http://localhost:5173
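The batch commands above read test cases from a JSONL file, one JSON object per line. The exact schema isn't documented in this README, so treat the field names below (`prompt`, `output`) as assumptions; a plausible `test_cases.jsonl` could be generated like this:

```python
import json

# Hypothetical test cases: the field names ("prompt", "output")
# are assumptions, not confirmed LintAI schema.
cases = [
    {"prompt": "Summarize this doc", "output": "Step 1: read..."},
    {"prompt": "Plan a launch", "output": "Step 1: set a date..."},
]

# Write one JSON object per line (the JSONL convention)
with open("test_cases.jsonl", "w") as f:
    for case in cases:
        f.write(json.dumps(case) + "\n")

# Each line round-trips back to a dict
lines = open("test_cases.jsonl").read().splitlines()
print(len(lines))  # 2
```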
```
llm-validator/
├── llm_validator/
│   ├── __init__.py
│   ├── core.py          # Main validation logic
│   ├── assertions.py    # Assertion types
│   ├── models.py        # Data models
│   └── providers.py     # LLM provider integration
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   └── components/
│   ├── package.json
│   └── vite.config.js
├── tests/
│   ├── test_core.py
│   └── test_assertions.py
├── validators/          # Example validation configs
├── README.md
└── requirements.txt
```
| Type | Description | Example |
|---|---|---|
| `MAX_LENGTH` | Output within token/char limit | `max_tokens: 1000` |
| `MIN_LENGTH` | Output meets minimum length | `min_words: 50` |
| `CONTAINS_TEXT` | Output has required text | `text: "step 1"` |
| `NO_PATTERN` | Output doesn't match pattern | `pattern: "error\|fail"` |
| `REGEX_MATCH` | Output matches regex | `pattern: r"^\d+\."` |
| `SENTIMENT` | Output sentiment check | `min_positive: 0.6` |
| `JSON_VALID` | Output is valid JSON | `schema: ./schema.json` |
| `KEYWORD_COUNT` | Keywords present | `keywords: ["AI", "ML"]` |
| `CUSTOM` | Python function validation | `function: my_validator.py` |
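The `CUSTOM` type points at a Python file, but this README doesn't document the expected function signature. As a hypothetical sketch (the name and signature below are assumptions, not LintAI API), a custom validator might take the output string and return pass/fail:

```python
# my_validator.py -- hypothetical CUSTOM assertion
import re

def my_validator(output: str) -> bool:
    """Pass only if the output contains a numbered step list.
    (Signature is an assumption; check the LintAI docs.)"""
    return bool(re.search(r"(?mi)^\s*step\s+\d+", output))

print(my_validator("Step 1: audit leads\nStep 2: follow up"))  # True
```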
The validator calculates a weighted confidence score:
`Confidence Score = Σ(passed_weight) / Σ(total_weight) × 100`

Individual assertion results:
- PASS: Assertion met
- FAIL: Assertion not met
- WARN: Assertion partially met (with penalty)
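The formula above can be sketched in a few lines of Python. This is a simplified model of the scoring, it ignores the WARN penalty the real implementation applies:

```python
def confidence_score(results):
    """Weighted confidence score: sum of the weights of passed
    assertions over the total weight, scaled to 0-100."""
    total = sum(weight for _, weight in results)
    passed = sum(weight for ok, weight in results if ok)
    return (passed / total) * 100 if total else 0.0

# (passed?, weight) pairs for three assertions; the 0.5-weight
# assertion fails, so half the total weight is lost
results = [(True, 0.3), (False, 0.5), (True, 0.2)]
print(confidence_score(results))
```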
```yaml
name: code_review
model: gpt-4
assertions:
  - name: has_tests
    type: CONTAINS_TEXT
    params: { text: "test" }
    weight: 0.3
  - name: no_hardcoded_secrets
    type: NO_PATTERN
    params: { pattern: "api_key|password|secret" }
    weight: 0.4
  - name: reasonable_length
    type: MAX_LENGTH
    params: { max_tokens: 2000 }
    weight: 0.2
  - name: has_error_handling
    type: REGEX_MATCH
    params: { pattern: "except|try|catch" }
    weight: 0.1
```

```yaml
name: customer_email
model: claude-3-opus
assertions:
  - name: professional_tone
    type: SENTIMENT
    params: { min_positive: 0.3, max_negative: 0.2 }
    weight: 0.3
  - name: has_greeting
    type: CONTAINS_TEXT
    params: { text: "Dear|Hello|Hi" }
    weight: 0.1
  - name: has_signature
    type: CONTAINS_TEXT
    params: { text: "Sincerely|Best|Thanks" }
    weight: 0.1
  - name: no_pii
    type: NO_PATTERN
    params: { pattern: "\\d{3}-\\d{2}-\\d{4}" }  # SSN pattern
    weight: 0.5
```

Set provider API keys as environment variables:

```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...
```

```python
from llm_validator.providers import OpenAIProvider, AnthropicProvider, LocalProvider

# OpenAI
validator = LLMValidator(provider=OpenAIProvider(model="gpt-4"))

# Anthropic
validator = LLMValidator(provider=AnthropicProvider(model="claude-3-opus"))

# Local/Ollama
validator = LLMValidator(provider=LocalProvider(model="llama2"))
```

```bash
# Run all tests
pytest tests/

# Run with coverage
pytest --cov=llm_validator tests/

# Run specific test
pytest tests/test_core.py -v
```

```yaml
name: Validate AI Outputs
on: [push]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - name: Install
        run: pip install llm-validator
      - name: Run Validation
        run: |
          llm-validate \
            --config validators/code_review.yaml \
            --output validation_results.json
      - name: Check Score
        run: |
          if [ $(jq '.score' validation_results.json) -lt 80 ]; then
            echo "Score below threshold!"
            exit 1
          fi
```

- Production AI Safety: Validate outputs before showing to users
- Code Review Automation: Check AI-generated code for quality
- Content Moderation: Ensure outputs meet guidelines
- Customer Support: Validate response quality
- RAG Evaluation: Test retrieval-augmented generation accuracy
- Model Comparison: Compare output quality across models
- Fork the repo
- Create a feature branch
- Add your assertion type
- Submit a PR
This project uses GitHub Actions for CI/CD:
| Workflow | Description |
|---|---|
| Test | Runs pytest on every push/PR |
| Build | Builds PyPI package on every push |
| Publish | Auto-publishes to PyPI when a git tag is pushed |
```bash
# Make changes, commit
git add -A
git commit -m "Description of changes"

# Create a version tag (follows semver)
git tag v0.1.1

# Push tag to trigger PyPI release
git push origin main
git push origin v0.1.1
```

The CI workflow will:
- Run tests
- Build the package
- Publish to PyPI automatically
Note: Requires PYPI_API_TOKEN secret in GitHub repo settings.
MIT License - Build, validate, ship with confidence!
Never deploy AI without validation.