Skip to content
#

agent-debugging

Here are 34 public repositories matching this topic...

ChainWatch is a flight data recorder for multi-step AI systems. It's a CLI-based tool that records every step in an AI decision chain, links them together in order, prevents tampering, and allows you to verify the chain's integrity and replay the full decision flow.

  • Updated Jan 22, 2026
  • Python
agenticstash

Deterministic record and replay for agent runs. A zero-runtime-dependency TypeScript library and CLI that captures every source of non-determinism an agent touches (model output, tools, MCP, clock, randomness) and replays it exactly, with fork, diff, a tamper-evident seal, and redaction.

  • Updated Jun 19, 2026
  • HTML

AI agents fail like junior teammates, looping on bad ideas, ignoring feedback, and escalating commitment. vstack ports 34 of the most-cited organizational-behavior frameworks so you can diagnose your agents the same way you'd diagnose your team.

  • Updated Jun 23, 2026
  • Python

Improve this page

Add a description, image, and links to the agent-debugging topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the agent-debugging topic, visit your repo's landing page and select "manage topics."

Learn more