safety-harness

A closed-loop safety harness for agentic LLMs — find failures, lock them in as regressions, gate releases on them, and replay real incidents. Each stage is a self-contained module; together they form the loop:

 ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
 │   stress-    │──▶│  regression- │──▶│   release-   │──▶│  incident-   │
 │   testing    │   │    suite     │   │    gate      │   │     lab      │
 └──────────────┘   └──────────────┘   └──────────────┘   └──────────────┘
   surface slow-      pin failures as     block a release    replay & root-
   burn failures      regression tests    that regresses     cause incidents
        ▲                                                          │
        └──────────────────  feeds new cases back  ◀───────────────┘

…all exercised against the simulator (a controllable agent under test), and driven end-to-end by the demo orchestrator.

Why It Matters

Agent safety failures rarely stay in one neat box. A red-team finding needs to become a regression test; a regression needs to block release; an incident needs to add new scenarios. safety-harness keeps those steps connected so safety work does not die as a one-off report.
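
As an illustration of the first link in that chain, the sketch below pins a red-team finding as a deterministic regression case. The names `Finding`, `RegressionCase`, and `pin_finding` are hypothetical and not taken from this repository; the sketch only assumes that a finding records the conversation turns and the policy that was violated.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A red-team finding (hypothetical shape, not the repository's schema)."""
    turns: tuple       # conversation turns that triggered the failure
    policy: str        # name of the policy the agent violated


@dataclass(frozen=True)
class RegressionCase:
    """A pinned, replayable test case (hypothetical shape)."""
    case_id: str
    turns: tuple
    policy: str


def pin_finding(finding: Finding) -> RegressionCase:
    """Turn a finding into a regression case with a content-derived ID.

    Hashing the content means pinning the same finding twice yields the
    same case ID, so duplicates are easy to detect.
    """
    payload = json.dumps(
        {"turns": list(finding.turns), "policy": finding.policy},
        sort_keys=True,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return RegressionCase(
        case_id=f"reg-{digest}",
        turns=finding.turns,
        policy=finding.policy,
    )
```

Deriving the ID from the content rather than from a counter or timestamp is one way to keep the suite deterministic; the real `regression-suite/` module may do this differently.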

Use it when you want a runnable skeleton for:

- finding slow-burn agent failures
- turning failures into regression cases
- blocking releases when safety metrics regress
- replaying incidents into root-cause graphs
- showing the whole loop in a demo
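
To make "root-cause graphs" concrete, here is a minimal sketch of one way to walk a causal graph backwards from a failure to the events that have no recorded cause. The function `root_causes` and the event names in the usage example are invented for illustration and do not reflect the actual `incident-lab/` API.

```python
from collections import deque


def root_causes(caused_by, failure):
    """Return the events upstream of `failure` that have no recorded cause.

    `caused_by` maps an event name to the list of events that caused it
    (a hypothetical representation of an incident trace).
    """
    seen = {failure}
    queue = deque([failure])
    roots = []
    while queue:
        event = queue.popleft()
        parents = caused_by.get(event, [])
        if not parents:
            roots.append(event)      # nothing upstream: treat as a root cause
        for parent in parents:
            if parent not in seen:   # guard against cycles and repeated visits
                seen.add(parent)
                queue.append(parent)
    return sorted(roots)


# Usage with made-up event names:
trace = {
    "unsafe_tool_call": ["poisoned_memory", "weak_plan_check"],
    "poisoned_memory": ["untrusted_web_page"],
}
print(root_causes(trace, "unsafe_tool_call"))
# ['untrusted_web_page', 'weak_plan_check']
```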

Stages

| Stage | What it does |
| --- | --- |
| `stress-testing/` | Static + adaptive red-teaming that surfaces delayed (slow-burn) safety failures, with attack mutators, a template catalog, and statistical power analysis. |
| `regression-suite/` | Turns discovered failures into a deterministic regression suite via pluggable eval adapters (misuse, red-team, traffic). |
| `release-gate/` | A production-style evaluation pipeline that computes a safety budget and blocks a release when safety metrics regress. |
| `incident-lab/` | Reproducible incident replay + causal-graph root-cause analysis, with adapters that integrate every other stage. |
| `simulator/` | A controllable agent (planner / memory / tools / executor) that serves as the system under test. |
| `demo/` | One end-to-end run of the full loop: stress → regression → release gate → incident replay. |
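
The release-gate idea can be sketched in a few lines. This is a simplified illustration, not the `release-gate/` implementation: it assumes each metric is a failure rate (lower is better) and that the "safety budget" is the largest increase over the baseline a release may spend per metric. `GateDecision` and `evaluate_gate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    regressions: dict   # metric name -> increase over the baseline rate


def evaluate_gate(baseline, candidate, budget):
    """Block the release if any failure rate rises by more than `budget`.

    `baseline` and `candidate` map metric names to failure rates in [0, 1].
    A metric missing from `candidate` is treated as a worst-case rate of 1.0,
    so an eval that was silently skipped cannot pass the gate.
    """
    regressions = {}
    for metric, base_rate in baseline.items():
        increase = candidate.get(metric, 1.0) - base_rate
        if increase > budget:
            regressions[metric] = increase
    return GateDecision(allowed=not regressions, regressions=regressions)


# Usage with made-up rates:
decision = evaluate_gate(
    baseline={"misuse": 0.02, "redteam": 0.10},
    candidate={"misuse": 0.05, "redteam": 0.10},
    budget=0.01,
)
print(decision.allowed)             # False
print(sorted(decision.regressions)) # ['misuse']
```

Failing closed on missing metrics is a design choice made for this sketch; a real gate might also account for sampling noise before declaring a regression.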

Run it

Each stage is independently runnable and tested. From a stage directory:

    cd stress-testing
    pip install -r requirements.txt   # if present
    PYTHONPATH=. python -m pytest -q  # run that stage's tests

The demo/ stage orchestrates the whole pipeline end-to-end.

Quick Start

    git clone https://github.com/yingchen-coding/safety-harness
    cd safety-harness/demo
    pip install -r requirements.txt
    make demo

License

CC BY-NC 4.0 — see LICENSE.
