LintAI - AI Output Testing & Validation Framework


A production-ready framework for validating AI/LLM outputs with user-defined assertions, confidence scoring, and edge-case testing.

📦 Installation

From PyPI (Recommended)

pip install llm-validator

From Source

git clone https://github.com/SoulSniper-V2/lintai.git
cd lintai
pip install -e .

🎯 Features

  • ✅ Assertion-Based Validation - Define expected behavior with simple rules
  • 📊 Confidence Scoring - Get quantified trust metrics for outputs
  • 🧪 Edge Case Testing - Systematically test boundary conditions
  • 🤖 Multi-Model Support - Works with OpenAI, Anthropic, Gemini, local LLMs
  • 📈 Regression Tracking - Track validation scores over time
  • 🔄 CI/CD Integration - Run validations in GitHub Actions pipelines
  • 🚀 Auto-Release to PyPI - Tags automatically publish to PyPI

🚀 GitHub Action

LintAI is also available as a GitHub Marketplace Action for CI/CD pipelines:

name: Validate AI Output
on: [push]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      
      # A previous step with id "generate" is assumed to produce the LLM output
      - name: Validate LLM Output
        uses: SoulSniper-V2/lintai@v0.1
        with:
          prompt: "Summarize this document"
          output: "${{ steps.generate.outputs.result }}"
          assertions-config: "./assertions.json"
          pass-threshold: 80

Example Assertions Config (assertions.json)

{
  "assertions": [
    {
      "name": "max_length",
      "type": "MAX_LENGTH",
      "params": { "max_chars": 1000 },
      "weight": 0.3
    },
    {
      "name": "contains_steps",
      "type": "CONTAINS_TEXT",
      "params": { "text": "step 1" },
      "weight": 0.5
    },
    {
      "name": "no_profanity",
      "type": "NO_PATTERN",
      "params": { "pattern": "badword|offensive" },
      "weight": 0.2
    }
  ]
}

Action Inputs

| Input | Required | Default | Description |
|---|---|---|---|
| `prompt` | Yes | - | Original prompt sent to the LLM |
| `output` | Yes | - | LLM output to validate |
| `assertions-config` | Yes | - | Path to JSON assertions config |
| `pass-threshold` | No | `70` | Minimum score to pass (0-100) |
| `fail-on-warning` | No | `false` | Fail if any warnings are raised |

Action Outputs

| Output | Description |
|---|---|
| `passed` | Whether validation passed (`true`/`false`) |
| `score` | Confidence score (0-100) |
| `failed-assertions` | Number of failed assertions |
| `warnings-count` | Number of warnings |
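Downstream steps can read these outputs with the usual `steps.<id>.outputs` expressions; a sketch (the step id `lintai` is an assumption):

```yaml
      - name: Validate LLM Output
        id: lintai
        uses: SoulSniper-V2/lintai@v0.1
        with:
          prompt: "Summarize this document"
          output: "${{ steps.generate.outputs.result }}"
          assertions-config: "./assertions.json"

      - name: Report failure
        if: steps.lintai.outputs.passed == 'false'
        run: echo "Validation failed with score ${{ steps.lintai.outputs.score }}"
```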

🚀 Quick Start

CLI Usage

# Initialize a validation config
lintai init-config

# Validate with a config file
lintai validate --config validators/my_config.yaml

# Batch validation from JSONL
lintai batch --input test_cases.jsonl --output results.jsonl

Python API

from llm_validator import LLMValidator, Assertion, AssertionType

# Initialize validator
validator = LLMValidator(
    model="gpt-4",
    api_key="your-key"
)

# Define assertions
assertions = [
    Assertion(
        name="max_length",
        type=AssertionType.MAX_LENGTH,
        params={"max_tokens": 500},
        weight=0.3
    ),
    Assertion(
        name="no_profanity",
        type=AssertionType.NO_PATTERN,
        params={"pattern": r"(?i)badword|offensive"},
        weight=0.5
    ),
    Assertion(
        name="contains_action_plan",
        type=AssertionType.CONTAINS_TEXT,
        params={"text": "step 1", "count": 1},
        weight=0.2
    )
]

# Validate output
result = validator.validate(
    prompt="Create a plan to increase sales",
    output="Here is a step by step plan...",
    assertions=assertions
)

print(f"Confidence Score: {result.score}/100")
print(f"Passed: {result.passed}")
print(f"Failed: {result.failed_assertions}")

CLI Usage

# Run validation from config
llm-validate --config validators/sales_plan.yaml

# Quick test
llm-validate --prompt "Summarize this" --output "The text says..." --rules "max_tokens:100"

# Batch validation
llm-validate --input test_cases.jsonl --output results.jsonl
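The batch input holds one JSON object per line. The exact field names aren't documented here, so the shape below mirrors the CLI flags and is an assumption:

```python
import json

# Hypothetical JSONL test cases; field names mirror the CLI flags,
# not a documented schema.
cases = [
    {"prompt": "Summarize this", "output": "The text says...", "rules": "max_tokens:100"},
    {"prompt": "Create a plan", "output": "Step 1: ...", "rules": "contains:step 1"},
]

with open("test_cases.jsonl", "w") as f:
    for case in cases:
        f.write(json.dumps(case) + "\n")

# Reading back: one test case per line
with open("test_cases.jsonl") as f:
    loaded = [json.loads(line) for line in f]
```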

Web Dashboard

cd frontend
npm install
npm run dev

Access at http://localhost:5173

📁 Project Structure

llm-validator/
├── llm_validator/
│   ├── __init__.py
│   ├── core.py           # Main validation logic
│   ├── assertions.py     # Assertion types
│   ├── models.py         # Data models
│   └── providers.py      # LLM provider integration
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   └── components/
│   ├── package.json
│   └── vite.config.js
├── tests/
│   ├── test_core.py
│   └── test_assertions.py
├── validators/           # Example validation configs
├── README.md
└── requirements.txt

🛠️ Assertion Types

| Type | Description | Example |
|---|---|---|
| `MAX_LENGTH` | Output within token/char limit | `max_tokens: 1000` |
| `MIN_LENGTH` | Output meets minimum length | `min_words: 50` |
| `CONTAINS_TEXT` | Output has required text | `text: "step 1"` |
| `NO_PATTERN` | Output doesn't match pattern | `pattern: "error\|fail"` |
| `REGEX_MATCH` | Output matches regex | `pattern: "^\d+\."` |
| `SENTIMENT` | Output sentiment check | `min_positive: 0.6` |
| `JSON_VALID` | Output is valid JSON | `schema: ./schema.json` |
| `KEYWORD_COUNT` | Keywords present | `keywords: ["AI", "ML"]` |
| `CUSTOM` | Python function validation | `function: my_validator.py` |
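The CUSTOM type delegates to a Python function. Its exact signature isn't shown above, so this sketch assumes the function receives the output string and returns a pass/fail flag plus a message:

```python
# my_validator.py -- hypothetical custom assertion function.
# Assumed contract: take the LLM output, return (passed, message).

def my_validator(output: str) -> tuple[bool, str]:
    """Pass only if the output contains numbered steps with descriptions."""
    lines = [l for l in output.splitlines() if l.strip()]
    steps = [l for l in lines if l.lower().lstrip().startswith("step")]
    if not steps:
        return False, "no numbered steps found"
    too_short = [l for l in steps if len(l.split()) < 3]
    if too_short:
        return False, f"{len(too_short)} step(s) have no description"
    return True, "ok"
```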

📊 Confidence Scoring

The validator calculates a weighted confidence score:

Confidence Score = Σ(passed_weight) / Σ(total_weight) × 100

Individual assertion results:

  • ✅ PASS: Assertion met
  • ❌ FAIL: Assertion not met
  • ⚠️ WARN: Assertion partially met (with penalty)
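The formula above can be sketched directly; treating a WARN as half credit is an assumption for illustration (the exact penalty isn't documented):

```python
def confidence_score(results):
    """results: list of (status, weight) pairs, status in {"PASS", "WARN", "FAIL"}."""
    credit = {"PASS": 1.0, "WARN": 0.5, "FAIL": 0.0}  # WARN penalty assumed
    total = sum(w for _, w in results)
    earned = sum(credit[s] * w for s, w in results)
    return 100.0 * earned / total if total else 0.0
```

For example, with weights 0.3 and 0.5 passing and 0.2 failing, the score is 80.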

🎨 Example Validators

Code Review Validator

name: code_review
model: gpt-4
assertions:
  - name: has_tests
    type: CONTAINS_TEXT
    params: { text: "test" }
    weight: 0.3
  
  - name: no_hardcoded_secrets
    type: NO_PATTERN
    params: { pattern: "api_key|password|secret" }
    weight: 0.4
  
  - name: reasonable_length
    type: MAX_LENGTH
    params: { max_tokens: 2000 }
    weight: 0.2
  
  - name: has_error_handling
    type: REGEX_MATCH
    params: { pattern: "except|try|catch" }
    weight: 0.1

Customer Email Validator

name: customer_email
model: claude-3-opus
assertions:
  - name: professional_tone
    type: SENTIMENT
    params: { min_positive: 0.3, max_negative: 0.2 }
    weight: 0.3
  
  - name: has_greeting
    type: REGEX_MATCH
    params: { pattern: "Dear|Hello|Hi" }
    weight: 0.1
  
  - name: has_signature
    type: REGEX_MATCH
    params: { pattern: "Sincerely|Best|Thanks" }
    weight: 0.1
  
  - name: no_pii
    type: NO_PATTERN
    params: { pattern: "\\d{3}-\\d{2}-\\d{4}" }  # SSN pattern
    weight: 0.5

🔧 Configuration

Environment Variables

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...

Provider Selection

from llm_validator.providers import OpenAIProvider, AnthropicProvider, LocalProvider

# OpenAI
validator = LLMValidator(provider=OpenAIProvider(model="gpt-4"))

# Anthropic
validator = LLMValidator(provider=AnthropicProvider(model="claude-3-opus"))

# Local/Ollama
validator = LLMValidator(provider=LocalProvider(model="llama2"))

🧪 Testing

# Run all tests
pytest tests/

# Run with coverage
pytest --cov=llm_validator tests/

# Run specific test
pytest tests/test_core.py -v

📈 CI/CD Integration

GitHub Actions

name: Validate AI Outputs
on: [push]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - name: Install
        run: pip install llm-validator
      - name: Run Validation
        run: |
          llm-validate \
            --config validators/code_review.yaml \
            --output validation_results.json
      - name: Check Score
        run: |
          # jq -e exits 0 when the expression is true (handles non-integer scores)
          if jq -e '.score < 80' validation_results.json > /dev/null; then
            echo "Score below threshold!"
            exit 1
          fi

🎯 Use Cases

  1. Production AI Safety: Validate outputs before showing to users
  2. Code Review Automation: Check AI-generated code for quality
  3. Content Moderation: Ensure outputs meet guidelines
  4. Customer Support: Validate response quality
  5. RAG Evaluation: Test retrieval-augmented generation accuracy
  6. Model Comparison: Compare output quality across models

🤝 Contributing

  1. Fork the repo
  2. Create a feature branch
  3. Add your assertion type
  4. Submit a PR

🔄 Automated Releases

This project uses GitHub Actions for CI/CD:

| Workflow | Description |
|---|---|
| Test | Runs pytest on every push/PR |
| Build | Builds the PyPI package on every push |
| Publish | Auto-publishes to PyPI when a git tag is pushed |

How to Release

# Make changes, commit
git add -A
git commit -m "Description of changes"

# Create a version tag (follows semver)
git tag v0.1.1

# Push tag to trigger PyPI release
git push origin main
git push origin v0.1.1

The CI workflow will:

  1. Run tests
  2. Build the package
  3. Publish to PyPI automatically

Note: Requires PYPI_API_TOKEN secret in GitHub repo settings.

📄 License

MIT License - Build, validate, ship with confidence!


Never deploy AI without validation. 🛡️

LintAI Validation is not certified by GitHub. It is provided by a third-party and is governed by separate terms of service, privacy policy, and support documentation.
