LintAI - AI Output Testing & Validation Framework


A production-ready framework for validating AI/LLM outputs with user-defined assertions, confidence scoring, and edge-case testing.

πŸ“¦ Installation

From PyPI (Recommended)

pip install llm-validator

From Source

git clone https://github.com/SoulSniper-V2/lintai.git
cd lintai
pip install -e .

🎯 Features

  • βœ… Assertion-Based Validation - Define expected behavior with simple rules
  • πŸ“Š Confidence Scoring - Get quantified trust metrics for outputs
  • πŸ§ͺ Edge Case Testing - Systematically test boundary conditions
  • πŸ€– Multi-Model Support - Works with OpenAI, Anthropic, Gemini, local LLMs
  • πŸ“ˆ Regression Tracking - Track validation scores over time
  • πŸ”„ CI/CD Integration - Run validations in GitHub Actions pipelines
  • πŸš€ Auto-Release to PyPI - Tags automatically publish to PyPI

πŸš€ GitHub Action

LintAI is also available as a GitHub Marketplace Action for CI/CD pipelines:

name: Validate AI Output
on: [push]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      
      - name: Validate LLM Output
        uses: SoulSniper-V2/lintai@v0.1
        with:
          prompt: "Summarize this document"
          output: "${{ steps.generate.outputs.result }}"
          assertions-config: "./assertions.json"
          pass-threshold: 80

Example Assertions Config (assertions.json)

{
  "assertions": [
    {
      "name": "max_length",
      "type": "MAX_LENGTH",
      "params": { "max_chars": 1000 },
      "weight": 0.3
    },
    {
      "name": "contains_steps",
      "type": "CONTAINS_TEXT",
      "params": { "text": "step 1" },
      "weight": 0.5
    },
    {
      "name": "no_profanity",
      "type": "NO_PATTERN",
      "params": { "pattern": "badword|offensive" },
      "weight": 0.2
    }
  ]
}
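A config like this can be sanity-checked before use. The sketch below is illustrative and independent of LintAI's own loader; the `load_assertions` helper and the weight-sum check are assumptions, not part of the library:

```python
import json

def load_assertions(path):
    """Load an assertions config and perform basic structural checks."""
    with open(path) as f:
        config = json.load(f)
    assertions = config.get("assertions", [])
    for a in assertions:
        # Every assertion needs a name, a type, params, and a weight.
        for key in ("name", "type", "params", "weight"):
            if key not in a:
                raise ValueError(f"assertion missing '{key}': {a}")
    total = sum(a["weight"] for a in assertions)
    # The examples use weights that sum to 1.0; treat other totals as a
    # warning rather than an error, since normalization is a convention.
    if abs(total - 1.0) > 1e-9:
        print(f"note: weights sum to {total}, not 1.0")
    return assertions
```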

Action Inputs

Input              Required  Default  Description
prompt             Yes       -        Original prompt sent to LLM
output             Yes       -        LLM output to validate
assertions-config  Yes       -        Path to JSON assertions config
pass-threshold     No        70       Minimum score to pass (0-100)
fail-on-warning    No        false    Fail if any warnings

Action Outputs

Output             Description
passed             Whether validation passed (true/false)
score              Confidence score (0-100)
failed-assertions  Number of failed assertions
warnings-count     Number of warnings

πŸš€ Quick Start

CLI Usage

# Initialize a validation config
lintai init-config

# Validate with a config file
lintai validate --config validators/my_config.yaml

# Batch validation from JSONL
lintai batch --input test_cases.jsonl --output results.jsonl

Python API

from llm_validator import LLMValidator, Assertion, AssertionType

# Initialize validator
validator = LLMValidator(
    model="gpt-4",
    api_key="your-key"
)

# Define assertions
assertions = [
    Assertion(
        name="max_length",
        type=AssertionType.MAX_LENGTH,
        params={"max_tokens": 500},
        weight=0.3
    ),
    Assertion(
        name="no_profanity",
        type=AssertionType.NO_PATTERN,
        params={"pattern": r"(?i)badword|offensive"},
        weight=0.5
    ),
    Assertion(
        name="contains_action_plan",
        type=AssertionType.CONTAINS_TEXT,
        params={"text": "step 1", "count": 1},
        weight=0.2
    )
]

# Validate output
result = validator.validate(
    prompt="Create a plan to increase sales",
    output="Here is a step by step plan...",
    assertions=assertions
)

print(f"Confidence Score: {result.score}/100")
print(f"Passed: {result.passed}")
print(f"Failed: {result.failed_assertions}")

More CLI Examples

# Run validation from config
llm-validate --config validators/sales_plan.yaml

# Quick test
llm-validate --prompt "Summarize this" --output "The text says..." --rules "max_tokens:100"

# Batch validation
llm-validate --input test_cases.jsonl --output results.jsonl
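The JSONL input format for batch mode is not documented above; the sketch below assumes the common one-JSON-object-per-line layout with `prompt` and `output` fields, which may differ from what the batch command actually expects:

```python
import json

# Hypothetical JSONL layout for batch validation: one test case per line.
# The field names ("prompt", "output") are assumptions, not a documented schema.
cases = [
    {"prompt": "Summarize this document", "output": "Step 1: read. Step 2: condense."},
    {"prompt": "Create a plan", "output": "step 1: audit current sales"},
]

with open("test_cases.jsonl", "w") as f:
    for case in cases:
        f.write(json.dumps(case) + "\n")

# Results files follow the same one-JSON-object-per-line rule.
with open("test_cases.jsonl") as f:
    loaded = [json.loads(line) for line in f]
```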

Web Dashboard

cd frontend
npm install
npm run dev

Access at http://localhost:5173

πŸ“ Project Structure

llm-validator/
β”œβ”€β”€ llm_validator/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ core.py           # Main validation logic
β”‚   β”œβ”€β”€ assertions.py     # Assertion types
β”‚   β”œβ”€β”€ models.py         # Data models
β”‚   └── providers.py      # LLM provider integration
β”œβ”€β”€ frontend/
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ App.jsx
β”‚   β”‚   └── components/
β”‚   β”œβ”€β”€ package.json
β”‚   └── vite.config.js
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ test_core.py
β”‚   └── test_assertions.py
β”œβ”€β”€ validators/           # Example validation configs
β”œβ”€β”€ README.md
└── requirements.txt

πŸ› οΈ Assertion Types

Type           Description                      Example
MAX_LENGTH     Output within token/char limit   max_tokens: 1000
MIN_LENGTH     Output meets minimum length      min_words: 50
CONTAINS_TEXT  Output has required text         text: "step 1"
NO_PATTERN     Output doesn't match pattern     pattern: "error|fail"
REGEX_MATCH    Output matches regex             pattern: r"^\d+\."
SENTIMENT      Output sentiment check           min_positive: 0.6
JSON_VALID     Output is valid JSON             schema: ./schema.json
KEYWORD_COUNT  Keywords present                 keywords: ["AI", "ML"]
CUSTOM         Python function validation       function: my_validator.py
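For illustration, a few of these checks can be written as plain predicates. These are sketches of the general technique, not the library's actual implementations:

```python
import re

def check_max_length(output: str, max_chars: int) -> bool:
    """MAX_LENGTH: output stays within a character limit."""
    return len(output) <= max_chars

def check_contains_text(output: str, text: str) -> bool:
    """CONTAINS_TEXT: required text appears (case-insensitive here)."""
    return text.lower() in output.lower()

def check_no_pattern(output: str, pattern: str) -> bool:
    """NO_PATTERN: output must NOT match the forbidden regex."""
    return re.search(pattern, output) is None
```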

πŸ“Š Confidence Scoring

The validator calculates a weighted confidence score:

Confidence Score = Ξ£(passed_weight) / Ξ£(total_weight) Γ— 100

Individual assertion results:

  • βœ… PASS: Assertion met
  • ❌ FAIL: Assertion not met
  • ⚠️ WARN: Assertion partially met (with penalty)
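The weighted-score formula can be sketched directly in Python. This is an illustrative computation (WARN penalties are omitted), not LintAI's internal code:

```python
def confidence_score(results):
    """Weighted confidence score: passed weight over total weight, scaled to 100.

    `results` maps assertion name -> (passed: bool, weight: float).
    """
    total = sum(w for _, w in results.values())
    passed = sum(w for ok, w in results.values() if ok)
    return 100.0 * passed / total if total else 0.0

# Example: two of three assertions pass.
score = confidence_score({
    "max_length": (True, 0.3),
    "contains_steps": (True, 0.5),
    "no_profanity": (False, 0.2),
})
print(round(score))  # 80
```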

🎨 Example Validators

Code Review Validator

name: code_review
model: gpt-4
assertions:
  - name: has_tests
    type: CONTAINS_TEXT
    params: { text: "test" }
    weight: 0.3
  
  - name: no_hardcoded_secrets
    type: NO_PATTERN
    params: { pattern: "api_key|password|secret" }
    weight: 0.4
  
  - name: reasonable_length
    type: MAX_LENGTH
    params: { max_tokens: 2000 }
    weight: 0.2
  
  - name: has_error_handling
    type: REGEX_MATCH
    params: { pattern: "except|try|catch" }
    weight: 0.1

Customer Email Validator

name: customer_email
model: claude-3-opus
assertions:
  - name: professional_tone
    type: SENTIMENT
    params: { min_positive: 0.3, max_negative: 0.2 }
    weight: 0.3
  
  - name: has_greeting
    type: CONTAINS_TEXT
    params: { text: "Dear|Hello|Hi" }
    weight: 0.1
  
  - name: has_signature
    type: CONTAINS_TEXT
    params: { text: "Sincerely|Best|Thanks" }
    weight: 0.1
  
  - name: no_pii
    type: NO_PATTERN
    params: { pattern: "\\d{3}-\\d{2}-\\d{4}" }  # SSN pattern
    weight: 0.5

πŸ”§ Configuration

Environment Variables

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...

Provider Selection

from llm_validator.providers import OpenAIProvider, AnthropicProvider, LocalProvider

# OpenAI
validator = LLMValidator(provider=OpenAIProvider(model="gpt-4"))

# Anthropic
validator = LLMValidator(provider=AnthropicProvider(model="claude-3-opus"))

# Local/Ollama
validator = LLMValidator(provider=LocalProvider(model="llama2"))

πŸ§ͺ Testing

# Run all tests
pytest tests/

# Run with coverage
pytest --cov=llm_validator tests/

# Run specific test
pytest tests/test_core.py -v

πŸ“ˆ CI/CD Integration

GitHub Actions

name: Validate AI Outputs
on: [push]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - name: Install
        run: pip install llm-validator
      - name: Run Validation
        run: |
          llm-validate \
            --config validators/code_review.yaml \
            --output validation_results.json
      - name: Check Score
        run: |
          # jq -e exits non-zero when the expression is false or null,
          # which also handles fractional scores (unlike [ ... -lt 80 ]).
          jq -e '.score >= 80' validation_results.json > /dev/null \
            || { echo "Score below threshold!"; exit 1; }

🎯 Use Cases

  1. Production AI Safety: Validate outputs before showing to users
  2. Code Review Automation: Check AI-generated code for quality
  3. Content Moderation: Ensure outputs meet guidelines
  4. Customer Support: Validate response quality
  5. RAG Evaluation: Test retrieval-augmented generation accuracy
  6. Model Comparison: Compare output quality across models

🀝 Contributing

  1. Fork the repo
  2. Create a feature branch
  3. Add your assertion type
  4. Submit a PR

πŸ”„ Automated Releases

This project uses GitHub Actions for CI/CD:

Workflow  Description
Test      Runs pytest on every push/PR
Build     Builds the PyPI package on every push
Publish   Auto-publishes to PyPI when a git tag is pushed

How to Release

# Make changes, commit
git add -A
git commit -m "Description of changes"

# Create a version tag (follows semver)
git tag v0.1.1

# Push tag to trigger PyPI release
git push origin main
git push origin v0.1.1

The CI workflow will:

  1. Run tests
  2. Build the package
  3. Publish to PyPI automatically

Note: Requires PYPI_API_TOKEN secret in GitHub repo settings.

πŸ“„ License

MIT License - Build, validate, ship with confidence!


Never deploy AI without validation. πŸ›‘οΈ
