LLM Safety Testing Tool v2: Enterprise-grade framework for red-teaming LLMs. Features scenario-based testing, session management, web UI, multilingual reports, API logging, and extensible adapters. Supports Anthropic Claude, OpenAI GPT, and custom models. Built for researchers and security professionals evaluating AI safety.

LLM Safety Testing Tool v2

License: MIT Python: 3.8+

A framework for conducting systematic safety evaluations of large language models through scenario-based testing, automated conversation management, and detailed reporting. Features include multi-model support, a web-based log viewer, multilingual error reports, API request/response logging, and an extensible adapter architecture for custom implementations.

Features

  • Modular architecture (adapter pattern)
  • Support for multiple LLM providers (extensible)
  • Support for multiple database backends (extensible)
  • Scenario-based test management
  • Detailed logging and history management
  • Session tracking for comparing multiple test runs
  • Security-conscious design
  • API request/response logging and storage
  • Multi-language error report support (English/Japanese)
  • Comprehensive error analysis with API data
Installation

Install from source

# Clone the repository
git clone https://github.com/techs-targe/llm-safety-testing-tool-v2.git
cd llm-safety-testing-tool-v2

# Install the package
pip install -e .  # For development
# or
pip install .    # For regular installation

Install directly from GitHub

pip install git+https://github.com/techs-targe/llm-safety-testing-tool-v2.git

Development environment setup

# Install development dependencies
pip install -e .[dev]

# Install pre-commit hooks
pre-commit install

Quick Start

See QUICK_START.md for a detailed walkthrough.

Important: First Time Setup

# 1. Clone and enter the repository
git clone https://github.com/techs-targe/llm-safety-testing-tool-v2.git
cd llm-safety-testing-tool-v2

# 2. Install the tool
pip install .

# 3. Set your API key
export ANTHROPIC_API_KEY="your-api-key"

# 4. Now you're ready to start!

Basic Usage - Copy and Paste Example

# The database is created automatically on first run
# (stored at ~/.safety_tool/safety_tool.db by default; see Database Location below)

# Create and run a test scenario
safety scenario-create TEST-001 "Basic test"
safety system-set TEST-001 "You are a helpful assistant."
safety message-add TEST-001 "Hello"
safety message-add TEST-001 "What's the weather today?"
safety scenario-run TEST-001

# View the results
safety logs-show TEST-001

Database Location

By default, the database is stored in ~/.safety_tool/safety_tool.db. This ensures:

  • Consistent location for all tools (CLI and Web viewer)
  • Data persistence across different project directories
  • Shared database between different clones

You can customize the database location using the SAFETY_TOOL_DB_PATH environment variable:

# Use custom path
export SAFETY_TOOL_DB_PATH=/path/to/my/database.db

# Use project-local database (for isolated testing)
export SAFETY_TOOL_DB_PATH=./safety_tool.db
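Scripts that read the same database can resolve the path with the precedence described above (environment variable first, then the documented default). A sketch; this is not the tool's internal code:

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_db_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """SAFETY_TOOL_DB_PATH wins if set; otherwise use the documented default."""
    env = os.environ if env is None else env
    custom = env.get("SAFETY_TOOL_DB_PATH")
    if custom:
        # expanduser() lets values like ~/dbs/test.db work too
        return Path(custom).expanduser()
    return Path.home() / ".safety_tool" / "safety_tool.db"
```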

Troubleshooting Common Errors

"UNIQUE constraint failed: scenarios.id"

  • This means TEST-001 already exists
  • Solution: Use a different ID (TEST-002) or delete the old one:
    safety scenario-delete TEST-001
    # or, if using a project-local database (SAFETY_TOOL_DB_PATH=./safety_tool.db),
    # start fresh by removing it
    rm -f safety_tool.db
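This error comes straight from SQLite's primary-key check. The self-contained snippet below reproduces it (the table here is illustrative, not the tool's real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scenarios (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO scenarios VALUES ('TEST-001', 'Basic test')")
try:
    # Re-running scenario-create amounts to a second INSERT with the same key
    conn.execute("INSERT INTO scenarios VALUES ('TEST-001', 'Duplicate')")
except sqlite3.IntegrityError as exc:
    print(exc)  # prints: UNIQUE constraint failed: scenarios.id
```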

Multiple duplicate messages

  • This happens when commands are run multiple times
  • Solution: delete the affected scenario with safety scenario-delete and recreate it, or start fresh by removing the database file (rm -f safety_tool.db when using a project-local database)

Web Dashboard

An advanced browser-based interface is included for managing scenarios, viewing logs, and configuring the tool:

# Start the new web dashboard (recommended)
./web-viewer-v3

# Open http://localhost:8080 in your browser

The web dashboard provides:

Scenario Management

  • Create, duplicate, edit, and delete test scenarios
  • Edit system messages and prompts
  • Import/export scenarios as JSON
  • Run scenarios directly from the web interface
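A scenario serializes naturally to JSON for the import/export feature. The field names below are illustrative of the idea; the tool's actual export schema may differ:

```python
import json

# Hypothetical export shape for the scenario built in the Quick Start
scenario = {
    "id": "TEST-001",
    "name": "Basic test",
    "system_message": "You are a helpful assistant.",
    "messages": ["Hello", "What's the weather today?"],
}

exported = json.dumps(scenario, indent=2, ensure_ascii=False)
restored = json.loads(exported)
assert restored == scenario  # round-trips without loss
```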

Results Viewing

  • Separate tabs for error reports and logs
  • Test ID and sequence/session filtering
  • Date range filtering
  • Detailed view for individual reports and logs
  • API request/response data viewing
  • Multilingual report support

Configuration

  • Select and configure LLM models
  • Manage API keys securely
  • Configure database settings

For legacy users, the original viewer is still available via ./web-viewer-v2.

Note: The web dashboard automatically creates database tables if they don't exist, so it works even with a fresh installation.
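Idempotent initialization like this is typically done with CREATE TABLE IF NOT EXISTS. A minimal sketch using an in-memory SQLite database (table and columns are illustrative, not the tool's real schema):

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Safe to call on every startup: creates the table only if missing."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scenarios (
               id TEXT PRIMARY KEY,
               name TEXT NOT NULL
           )"""
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
init_db(conn)
init_db(conn)  # second call is a no-op, not an error
```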

For more detailed usage, see QUICK_START.md and TROUBLESHOOTING.md.

Architecture

safety_tool/
├── adapters/          # LLM/DB adapters
│   ├── llm/
│   │   ├── base.py
│   │   └── anthropic.py
│   └── db/
│       ├── base.py
│       └── sqlite.py
├── cli/              # CLI interface
│   └── main.py
├── core/             # Core logic
│   ├── scenario_manager.py
│   ├── runner.py
│   └── models.py
└── config.toml       # Configuration file

Development

Testing

# Run tests
make test

# Generate coverage report
make coverage

# Run linters
make lint

# Format code
make format

Build and Deploy

# Build
make build

# Upload to PyPI
make upload

License

This project is licensed under the MIT License.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

Support

If you have any issues or questions, please check:

  1. TROUBLESHOOTING.md for common issues
  2. GitHub Issues for bug reports and feature requests
