Babysitter

Enforce obedience in agentic workforces. Manage extremely complex workflows through deterministic, hallucination-free self-orchestration.

Getting Started | Documentation | Community

What is Babysitter?

Babysitter enforces obedience in agentic workforces, enabling them to manage extremely complex tasks and workflows through deterministic, hallucination-free self-orchestration. Define your workflow in code: Babysitter enforces every step, ensures quality gates pass before progression, requires human approval at breakpoints, and records every decision in an immutable journal. Your agents do exactly what the process permits, nothing more.

Prerequisites

Node.js: Version 20.0.0+ (22.x LTS recommended)
Claude Code: Latest version (docs)
Git: For cloning (optional)
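
To confirm the Node.js requirement before installing, a short check like the one below can help. It is a minimal sketch: `meetsMinimum` is a hypothetical helper written for this example, not something Babysitter ships, and it compares only the major, minor, and patch numbers.

```javascript
// Hypothetical helper (not part of Babysitter): compare dotted version strings.
function meetsMinimum(version, minimum) {
  const parse = (v) => v.replace(/^v/, "").split(".").map(Number);
  const actual = parse(version);
  const required = parse(minimum);
  for (let i = 0; i < 3; i++) {
    const a = actual[i] || 0;
    const r = required[i] || 0;
    if (a > r) return true;
    if (a < r) return false;
  }
  return true; // all three parts are equal
}

// process.versions.node is the running Node.js version without the leading "v".
const current = process.versions.node;
const ok = meetsMinimum(current, "20.0.0");
console.log(`Node ${current}: ${ok ? "OK" : "too old, 20.0.0+ required"}`);
```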

Installation

1. Install the Plugin

claude plugin marketplace add a5c-ai/babysitter
claude plugin install --scope user babysitter@a5c.ai

Then restart Claude Code.

2. Verify Installation

Type /skills in Claude Code to verify "babysit" appears.

Codex CLI Integration (babysitter-codex)

Codex support is available as a dedicated plugin bundle in:

plugins/babysitter-codex

It includes Codex hook wiring, slash command dispatch, and orchestration harness scripts compatible with the Babysitter SDK.

First Steps

After installation, set up your environment:

1. Configure Your Profile (One-Time)

/babysitter:user-install

This creates your personal profile with:

Breakpoint preferences (how much oversight you want)
Tool preferences and communication style
Expertise areas for better process matching

2. Set Up Your Project

/babysitter:project-install

This analyzes your codebase and configures:

Project-specific workflows
Test frameworks and CI/CD integration
Tech stack preferences

3. Verify Setup

/babysitter:doctor

Run diagnostics to confirm everything is working.

Quick Start

claude "/babysitter:call implement user authentication with TDD"

Or in natural language:

Use the babysitter skill to implement user authentication with TDD

Claude will create an orchestration run, execute tasks step-by-step, handle quality checks and approvals, and continue until completion.

Choose Your Mode

| Mode | Command | When to Use |
| --- | --- | --- |
| Interactive | `/babysitter:call` | Learning, critical workflows - pauses for approval |
| Autonomous | `/babysitter:yolo` | Trusted tasks - full auto, no breakpoints |
| Planning | `/babysitter:plan` | Review process before executing |
| Continuous | `/babysitter:forever` | Monitoring, periodic tasks - runs indefinitely |

Utility Commands

| Command | Purpose |
| --- | --- |
| `/babysitter:doctor` | Diagnose run health and issues |
| `/babysitter:observe` | Launch real-time monitoring dashboard |
| `/babysitter:resume` | Continue an interrupted run |
| `/babysitter:help` | Documentation and usage help |

How It Works

+=============================================================================+
|                         /babysitter:call                                    |
+=============================================================================+
|                                                                             |
|   YOUR PROCESS (JavaScript)                   This is the AUTHORITY         |
|   +----------------------------------------+                                |
|   | async function process(inputs, ctx) {  |  Real code, not config.       |
|   |                                        |  The orchestrator can ONLY    |
|   |   await ctx.task(plan, { ... });       |  do what this code permits.   |
|   |                                        |                                |
|   |   await ctx.breakpoint({               |  Breakpoints = human gates    |
|   |     question: 'Approve plan?'          |  (enforced, not optional)     |
|   |   });                                  |                                |
|   |                                        |                                |
|   |   await ctx.task(implement, { ... });  |  Tasks = executable work      |
|   |                                        |                                |
|   |   const score = await ctx.task(verify);|  Quality gates = code logic   |
|   |   if (score < 80)                      |  (not config, real checks)    |
|   |     await ctx.task(refine, { ... });   |                                |
|   | }                                      |                                |
|   +-------------------+--------------------+                                |
|                       |                                                     |
|                       | governs                                             |
|                       v                                                     |
|   +---------------------------------------------------------------------+   |
|   |                      ENFORCEMENT MECHANISM                          |   |
|   |                                                                     |   |
|   |   +-------------+     +------------------+     +-----------------+  |   |
|   |   | MANDATORY   |---->| PROCESS CHECK    |---->| DECISION        |  |   |
|   |   | STOP        |     | What does the    |     |                 |  |   |
|   |   | (enforced   |     | process permit   |     | Permitted: next |  |   |
|   |   |  by hook)   |     | next?            |     | task assigned   |  |   |
|   |   +-------------+     +------------------+     |                 |  |   |
|   |                              |                 | Blocked: halt   |  |   |
|   |                              v                 | until gate      |  |   |
|   |                       +--------------+        | passes          |  |   |
|   |                       | Gate/task    |        +-----------------+  |   |
|   |                       | from code    |                              |   |
|   |                       +--------------+                              |   |
|   +---------------------------------------------------------------------+   |
|                       |                                                     |
|                       | records every decision                              |
|                       v                                                     |
|   +---------------------------------------------------------------------+   |
|   |   JOURNAL: Every task, gate, decision - immutable, replayable       |   |
|   +---------------------------------------------------------------------+   |
|                                                                             |
+=============================================================================+

The difference from simple iteration:

Process as Code: Your workflow is JavaScript - the orchestrator can ONLY do what this code permits
Mandatory Stop: Claude cannot "keep running" - every step ends with a forced stop, then the process decides what's next
Enforcement, not Assistance: Gates block progression until satisfied - they're not suggestions
Event-Sourced Journal: All state in .a5c/runs/ - deterministic replay and resume from any point
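
The points above can be tried out without the SDK by running a process function against a stand-in context. This is only a sketch under stated assumptions: `ctx.task` and `ctx.breakpoint` follow the diagram, while `createMockContext`, the task objects, the score values, and the bounded refinement loop (the diagram shows a single `if`) are hypothetical and may not match the real SDK.

```javascript
// Hypothetical stand-in for the context the orchestrator passes to a process.
// It only records calls; the real SDK's signatures and behavior may differ.
function createMockContext({ approve = true, scores = [70, 90] } = {}) {
  const journal = [];
  let verifyCalls = 0;
  return {
    journal,
    async task(taskDef, args = {}) {
      journal.push({ type: "task", name: taskDef.name });
      // Pretend the "verify" task returns a quality score.
      if (taskDef.name === "verify") {
        return scores[Math.min(verifyCalls++, scores.length - 1)];
      }
      return undefined;
    },
    async breakpoint({ question }) {
      journal.push({ type: "breakpoint", question, approved: approve });
      if (!approve) throw new Error(`Rejected at breakpoint: ${question}`);
    },
  };
}

// Hypothetical task definitions, named after the identifiers in the diagram.
const plan = { name: "plan" };
const implement = { name: "implement" };
const verify = { name: "verify" };
const refine = { name: "refine" };

async function authProcess(inputs, ctx) {
  await ctx.task(plan, { feature: inputs.feature });
  await ctx.breakpoint({ question: "Approve plan?" }); // human gate
  await ctx.task(implement, { feature: inputs.feature });

  // Quality gate as plain code, bounded so the loop always terminates.
  let score = await ctx.task(verify);
  let rounds = 0;
  while (score < 80 && rounds < inputs.maxRefinements) {
    await ctx.task(refine, { score });
    score = await ctx.task(verify);
    rounds++;
  }
  return { score, passed: score >= 80 };
}

const ctx = createMockContext({ scores: [70, 90] });
authProcess({ feature: "user authentication", maxRefinements: 3 }, ctx)
  .then((result) => console.log(result, ctx.journal.length, "journal entries"));
```

With `approve: false`, the mock breakpoint throws and `implement` never runs, which mirrors the idea that a gate blocks progression instead of advising.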

Why Babysitter?

| Traditional Approach | Babysitter |
| --- | --- |
| Run script once, hope it works | Process enforces quality gates before completion |
| Manual approval via chat | Structured breakpoints with context |
| State lost on session end | Event-sourced, fully resumable |
| Single task execution | Parallel execution, dependencies |
| No audit trail | Complete journal of all events |
| Ad-hoc workflow | Deterministic, code-defined processes |
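
The "parallel execution, dependencies" row can be pictured with plain JavaScript promises. This is a sketch, not the SDK's documented API: it assumes `ctx.task` returns a promise (as the `await ctx.task(...)` calls in the diagram suggest), uses standard `Promise.all` where the SDK may offer its own parallel primitive, and the stub `ctx`, task names, and `buildProcess` are hypothetical.

```javascript
// Hypothetical stub: each task resolves after a short delay and echoes its name.
const ctx = {
  async task(name, args = {}) {
    await new Promise((resolve) => setTimeout(resolve, 10));
    return { name, ...args };
  },
};

async function buildProcess(inputs, ctx) {
  // Independent tasks start together and are awaited as a group.
  const [lint, unit, types] = await Promise.all([
    ctx.task("lint"),
    ctx.task("unit-tests"),
    ctx.task("type-check"),
  ]);

  // A dependent task starts only after everything it needs has resolved.
  return ctx.task("report", { inputs: [lint.name, unit.name, types.name] });
}

buildProcess({}, ctx).then((report) => console.log(report));
```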

Key differentiators: Process enforcement, deterministic replay, quality convergence, human-in-the-loop breakpoints, and parallel execution.
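
Deterministic replay follows from event sourcing: if state is only ever derived by folding over an append-only list of events, the same journal always yields the same state. The sketch below illustrates that idea in general terms. The event shapes, `replay`, and `nextStep` are hypothetical, and the actual journal format under .a5c/runs/ may look quite different.

```javascript
// Hypothetical event shapes; the real journal format may differ.
const journal = [
  { seq: 1, type: "task_completed", task: "plan" },
  { seq: 2, type: "breakpoint_approved", question: "Approve plan?" },
  { seq: 3, type: "task_completed", task: "implement" },
];

// Rebuild run state by folding over events in order. No clocks, randomness,
// or I/O are involved, so replaying the same journal gives the same state.
function replay(events) {
  const state = { completedTasks: [], approvals: [], lastSeq: 0 };
  for (const event of events) {
    if (event.seq !== state.lastSeq + 1) {
      throw new Error(`Gap in journal at seq ${event.seq}`);
    }
    if (event.type === "task_completed") state.completedTasks.push(event.task);
    if (event.type === "breakpoint_approved") state.approvals.push(event.question);
    state.lastSeq = event.seq;
  }
  return state;
}

// Resuming: pick the first step the journal does not record as done.
function nextStep(steps, state) {
  return steps.find((step) => !state.completedTasks.includes(step)) ?? null;
}

const state = replay(journal);
console.log(nextStep(["plan", "implement", "verify"], state)); // prints "verify"
```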

Documentation

Getting Started

Features

Process Library - 2,000+ pre-built processes
Process Definitions
Quality Convergence
Run Resumption
Journal System
Best Practices
Architecture Overview

Reference

Contributing

We welcome contributions! Here's how you can help:

Report bugs: GitHub Issues
Suggest features: Share your ideas for improvements
Submit pull requests: Fix bugs or add features
Improve documentation: Help make docs clearer

See CONTRIBUTING.md for detailed guidelines.

Community and Support

Discord: Join our community (GitHub invite link)
GitHub Issues: Report bugs or request features
GitHub Discussions: Ask questions and share ideas
npm: @a5c-ai/babysitter-sdk

Community Tools

| Tool | Description |
| --- | --- |
| Observer Dashboard | Real-time monitoring UI for parallel runs |
| Telegram Bot | Control sessions remotely |
| vibe-kanban | Parallel process management |

License

This project is licensed under the MIT License. See LICENSE.md for details.

Compression

Babysitter includes a 4-layer token compression subsystem (built into packages/sdk/) that reduces context window usage by 50–67% on real sessions while maintaining 99% fact retention.

The babysitter plugin registers all compression hooks automatically, so no manual settings.json configuration is needed. Install the plugin and compression is active.

How It Works

| Layer | Hook | Engine | Content | Reduction |
| --- | --- | --- | --- | --- |
| 1a | userPromptHook | density-filter | User prompts | ~29% |
| 1b | commandOutputHook | command-compressor | Bash/shell output | ~47% avg |
| 2 | sdkContextHook | sentence-extractor | Agent/task context | ~87% |
| 3 | processLibraryCache | sentence-extractor | Library files (pre-cached) | ~94% |
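
The table names the engines without describing their internals. To give a feel for what extractive compression means in general, here is a toy sentence extractor: it scores each sentence by the average frequency of its words and keeps the top fraction in original order. This is not Babysitter's algorithm. `extractSentences` is hypothetical, and `keepRatio` merely borrows the name of a setting from the config file.

```javascript
// Toy extractive compression (hypothetical, not the sentence-extractor engine).
// Text after the last sentence terminator is ignored by this simple splitter.
function extractSentences(text, keepRatio) {
  const sentences = (text.match(/[^.!?]+[.!?]+/g) ?? []).map((s) => s.trim());
  if (sentences.length === 0) return text;

  const words = (s) => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

  // Count how often each word occurs across the whole text.
  const freq = new Map();
  for (const w of words(text)) freq.set(w, (freq.get(w) ?? 0) + 1);

  // Score a sentence by the mean frequency of its words.
  const scored = sentences.map((sentence, index) => {
    const ws = words(sentence);
    const total = ws.reduce((sum, w) => sum + freq.get(w), 0);
    return { sentence, index, score: total / Math.max(ws.length, 1) };
  });

  const keep = Math.max(1, Math.round(sentences.length * keepRatio));
  return scored
    .sort((a, b) => b.score - a.score || a.index - b.index) // best first
    .slice(0, keep)
    .sort((a, b) => a.index - b.index) // restore original order
    .map((entry) => entry.sentence)
    .join(" ");
}

const log = "Tests failed in auth module. The weather is nice. Auth tests need the auth token.";
console.log(extractSentences(log, 0.67));
```

In this example the off-topic middle sentence scores lowest and is dropped, because its words are rare in the rest of the text.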

Quick Toggle

# Disable all compression
export BABYSITTER_COMPRESSION_ENABLED=false

# Disable a single layer
babysitter compression:toggle sdkContextHook off

# Show current effective config
babysitter compression:config

Config File

Edit .a5c/compression.config.json to persist settings (env vars always take priority):

{
  "enabled": true,
  "layers": {
    "userPromptHook":    { "enabled": true, "threshold": 500, "keepRatio": 0.78 },
    "commandOutputHook": { "enabled": true, "excludeCommands": ["jq", "curl", "docker"] },
    "sdkContextHook":    { "enabled": true, "targetReduction": 0.15, "minCompressionTokens": 150 },
    "processLibraryCache": { "enabled": true, "targetReduction": 0.35, "ttlHours": 24 }
  }
}
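
The precedence rule (environment variables over the config file) can be sketched as a small resolver. The variable name `BABYSITTER_COMPRESSION_ENABLED` and the `enabled` and `layers` keys come from this section; `resolveConfig`, the defaults, and the rule that only the string "false" disables compression are assumptions for illustration, not the SDK's documented behavior.

```javascript
// Hypothetical resolver: defaults, then file contents, then environment.
function resolveConfig(fileText, env) {
  // fileText is the raw contents of .a5c/compression.config.json, or null if absent.
  const fromFile = fileText ? JSON.parse(fileText) : {};
  const config = { enabled: true, layers: {}, ...fromFile };

  // Assumption: any value other than "false" leaves compression enabled.
  const flag = env.BABYSITTER_COMPRESSION_ENABLED;
  if (flag !== undefined) {
    config.enabled = flag.trim().toLowerCase() !== "false";
  }
  return config;
}

const fileText = '{ "enabled": true, "layers": { "sdkContextHook": { "enabled": true } } }';

// The environment variable wins over the file.
console.log(resolveConfig(fileText, { BABYSITTER_COMPRESSION_ENABLED: "false" }).enabled); // prints false

// Without the variable, the file value is used.
console.log(resolveConfig(fileText, {}).enabled); // prints true
```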

Toggle any layer with babysitter compression:toggle <layer> <on|off> or set individual values with babysitter compression:set <key> <value>.

Built with Claude by A5C AI
