Troubleshooting Guide

Version: 1.0 | Last Updated: 2026-01-25 | Category: Reference

This guide provides systematic troubleshooting procedures for common Babysitter issues. Each section includes symptoms, diagnosis steps, solutions, and prevention tips.

Installation Issues
Plugin Issues
Run Execution Issues
Quality Convergence Issues
Resumption Issues
Performance Issues
Journal and State Issues
Diagnostic Commands Reference
When to Contact Support

Installation Issues

Node.js Not Found

Symptoms:

command not found: npm
command not found: node

Diagnosis:

which node
which npm
node --version

Solutions:

Install Node.js via nvm (recommended):

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
source ~/.bashrc  # or ~/.zshrc
nvm install 22
nvm use 22

Direct installation:
- Download from nodejs.org
- Install the LTS version (22.x)

Verify PATH:
```
echo $PATH | grep -E "(node|npm)"
```

Prevention: Use nvm to manage Node.js versions.

Permission Denied During npm Install

Symptoms:

npm ERR! EACCES: permission denied, mkdir '/usr/local/lib/node_modules'

Diagnosis:

npm config get prefix
ls -la $(npm config get prefix)/lib/node_modules/

Solutions:

Fix npm permissions (recommended):

mkdir -p ~/.npm-global
npm config set prefix '~/.npm-global'
echo 'export PATH=~/.npm-global/bin:$PATH' >> ~/.zshrc  # or ~/.bashrc
source ~/.zshrc  # or ~/.bashrc

Reinstall packages:

npm install -g @a5c-ai/babysitter-sdk@latest

Prevention: Never use sudo npm install -g. Configure npm for user installs.

SDK Module Not Found

Symptoms:

Error: Cannot find module '@a5c-ai/babysitter-sdk'

Diagnosis:

npm list -g @a5c-ai/babysitter-sdk
which babysitter
echo $PATH

Solutions:

Install globally:

npm install -g @a5c-ai/babysitter-sdk@latest

Use npx (always works):

npx -y @a5c-ai/babysitter-sdk@latest --version

Check PATH includes npm global bin:

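# npm 9 and later removed `npm bin`; derive the global bin directory from the prefix
echo "$(npm config get prefix)/bin"
# Add this directory to your PATH if it is not already included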

Prevention: Verify installation with npm list -g after installing.

Version Mismatch Errors

Symptoms:

Error: Incompatible version: sdk@0.0.120 requires babysitter@^0.0.120

Diagnosis:

npm list -g @a5c-ai/babysitter-sdk

Solutions:

Update all packages to the latest versions:

npm install -g @a5c-ai/babysitter-sdk@latest

Prevention: Update all packages together, not individually.
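
The error above is easier to read once you know how npm treats a caret range on a 0.0.x version: `^0.0.120` matches only 0.0.120, so any patch difference counts as incompatible. The sketch below illustrates that rule with a simplified check. `satisfiesCaret` is a hypothetical helper written for this guide, not part of the SDK, and it ignores pre-release and build tags.

```javascript
// Simplified caret-range check following npm's semver rules.
// Hypothetical helper for illustration; ignores pre-release and build tags.
function parseVersion(text) {
  return text.split('.').map((part) => Number.parseInt(part, 10));
}

// Returns -1, 0, or 1 when a is lower than, equal to, or higher than b.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i += 1) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

function satisfiesCaret(version, range) {
  const min = parseVersion(range.replace(/^\^/, ''));
  const [major, minor, patch] = min;
  // A caret allows changes that leave the left-most non-zero part untouched.
  let upper;
  if (major > 0) upper = [major + 1, 0, 0];
  else if (minor > 0) upper = [0, minor + 1, 0];
  else upper = [0, 0, patch + 1];
  const v = parseVersion(version);
  return compareVersions(v, min) >= 0 && compareVersions(v, upper) < 0;
}

console.log(satisfiesCaret('0.0.120', '^0.0.120')); // true
console.log(satisfiesCaret('0.0.121', '^0.0.120')); // false
console.log(satisfiesCaret('1.4.0', '^1.2.3'));     // true
```

In practice this means two 0.0.x packages that depend on each other with a caret range need exactly matching versions, which is why updating them together avoids the error.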

jq Not Found

Symptoms:

command not found: jq
jq: command not found

Many Babysitter diagnostic commands use jq for JSON processing. Without it, commands like journal analysis and status inspection will fail.

Diagnosis:

which jq
jq --version

Solutions:

Install jq for your platform:

# macOS
brew install jq

# Ubuntu/Debian
sudo apt-get install jq

# Fedora/RHEL/CentOS
sudo dnf install jq

# Arch Linux
sudo pacman -S jq

# Windows (Chocolatey)
choco install jq

# Windows (Scoop)
scoop install jq

# Windows (WSL) - use Ubuntu instructions
sudo apt-get install jq

Verify installation:

jq --version
# Expected: jq-1.6 or higher

Prevention: Install jq as part of your initial setup. It's a standard tool for working with JSON on the command line.

Plugin Issues

Plugin Not Appearing in /skills

Symptoms:

/skills command doesn't show "babysit"
"Skill not found: babysit" error

Diagnosis:

claude plugin list
claude plugin list | grep babysitter

Step-by-Step Solution:

Add marketplace (if not added):

claude plugin marketplace add a5c-ai/babysitter

Install plugin:

claude plugin install --scope user babysitter@a5c.ai

Enable plugin:

claude plugin enable --scope user babysitter@a5c.ai

Restart Claude Code completely:
- Close all Claude Code windows
- Wait a few seconds
- Reopen Claude Code

Verify:
```
/skills
# Should show "babysit"
```

Plugin Install Fails

Symptoms:

Error: Plugin 'babysitter@a5c.ai' not found

Diagnosis:

claude plugin marketplace list

Solutions:

Add marketplace first:

claude plugin marketplace add a5c-ai/babysitter

Check network connectivity:

curl -I https://github.com/a5c-ai/babysitter

Try updating marketplace:
```
claude plugin marketplace update a5c.ai
```

Plugin Conflicts

Symptoms:

Claude Code behaves unexpectedly
Other plugins stop working after Babysitter install

Diagnosis:

claude plugin list --all

Solutions:

Disable and re-enable:

claude plugin disable babysitter@a5c.ai
claude plugin enable babysitter@a5c.ai

Check for conflicts:
- Temporarily disable other plugins
- Test Babysitter alone
- Re-enable plugins one by one

Run Execution Issues

Run Encountered an Error

Symptoms:

Error: Run encountered an error
  at processIteration (process.js:123)

Diagnosis:

# Check journal for error details
babysitter run:events <runId> --filter-type RUN_FAILED --json

# View last events
babysitter run:events <runId> --limit 10 --reverse --json

Solutions:

Identify the cause:

jq '.data.error' .a5c/runs/<runId>/journal/$(ls -t .a5c/runs/<runId>/journal/ | head -1)

If journal conflict:
- Ensure no other sessions are running the same workflow
- Delete state cache and rebuild:
```
rm .a5c/runs/<runId>/state/state.json
babysitter run:status <runId>
```

If task failure:
- Fix the underlying issue (missing deps, file not found, etc.)
- Resume:
```
/babysitter:call resume --run-id <runId>
```

Ask Claude to analyze:

Analyze the babysitter run error for <runId> and try to recover

Task Execution Failed: ENOENT

Symptoms:

Task failed: test-task-001
Error: ENOENT: no such file or directory

Diagnosis:

# Check task details
cat .a5c/runs/<runId>/tasks/<effectId>/task.json | jq .

# Check working directory
pwd
ls -la

Solutions:

Verify file paths:
- Check if referenced files exist
- Ensure paths are correct for your system

Install dependencies:
```
npm install
```

Check working directory:
- Tasks run relative to the run directory
- Verify paths in task definition

Agent Task Timeout

Symptoms:

Task timeout: agent-task-001
Execution exceeded 120s

Diagnosis:

# Check task that timed out
cat .a5c/runs/<runId>/tasks/<effectId>/task.json | jq .

Solutions:

Reduce task scope:
- Analyze specific files instead of entire codebase
- Break large tasks into smaller ones

Increase timeout (if necessary):

await ctx.task(analyzeTask, { /* args */ }, {
  timeout: 300000  // 5 minutes
});

Check API status:
- Verify Claude API is accessible
- Check for rate limiting

Session Ended Unexpectedly

Symptoms:

Claude Code session terminated
Run ID: 01KFFTSF8TK8C9GT3YM9QYQ6WG
Status: interrupted

Diagnosis:

babysitter run:status <runId> --json

Solutions:

Resume the run:

/babysitter:call resume --run-id 01KFFTSF8TK8C9GT3YM9QYQ6WG

Babysitter is designed to be resumable. All progress is preserved.

Quality Convergence Issues

Quality Score Not Improving

Symptoms:

Iteration 1: Quality 65/100
Iteration 2: Quality 66/100
Iteration 3: Quality 65/100
...
Iteration 5: Quality 66/100
Target not met: 85/100

Diagnosis:

# View effect resolutions
babysitter run:events <runId> --filter-type EFFECT_RESOLVED --json

# Check recommendations
jq '.recommendations' .a5c/runs/<runId>/tasks/*/result.json

Solutions:

Review feedback:
- Check what's blocking improvement
- Look for recurring issues

Lower target:
```
Use babysitter with 75% quality target
```

Increase iterations:
```
Use babysitter with max 10 iterations
```

Fix blocking issues:
- Run tests manually: npm test
- Check lint: npm run lint
- Fix any consistent failures

Review scoring weights:
- Some criteria may be weighted too heavily
- Adjust in process definition if needed
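
To check whether one criterion dominates the total, it can help to recompute the score by hand with different weights. The sketch below assumes the overall score is a weighted average of per-criterion scores; the criterion names (`tests`, `lint`, `coverage`) and the weights are hypothetical, and your process definition may combine scores differently.

```javascript
// Weighted average of per-criterion scores (each on a 0-100 scale).
// Criterion names and weights are hypothetical examples.
function weightedScore(scores, weights) {
  let total = 0;
  let weightSum = 0;
  for (const [name, weight] of Object.entries(weights)) {
    if (!(name in scores)) throw new Error(`Missing score for "${name}"`);
    total += scores[name] * weight;
    weightSum += weight;
  }
  if (weightSum === 0) throw new Error('Weights must not sum to zero');
  return total / weightSum;
}

const scores = { tests: 90, lint: 95, coverage: 40 };

// Coverage weighted heavily: the low coverage score drags the total down.
console.log(weightedScore(scores, { tests: 1, lint: 1, coverage: 3 })); // 61
// Equal weights: the same inputs score noticeably higher.
console.log(weightedScore(scores, { tests: 1, lint: 1, coverage: 1 })); // 75
```

If the total moves this much when only the weights change, the weighting rather than the code may be what keeps the run below target.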

Iterations Never Converge

Symptoms:

Quality score oscillates without improvement
Maximum iterations reached without meeting target

Diagnosis:

# List resolved effects and their status (this query does not extract quality scores)
jq -s '[.[] | select(.type == "EFFECT_RESOLVED")] | map({effectId: .data.effectId, status: .data.status})' \
  .a5c/runs/<runId>/journal/*.json

Solutions:

Implement early exit on plateau:

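// Place inside the iteration loop, after each result is appended to iterationResults
const recentScores = iterationResults.slice(-3).map(r => r.quality);
// Require three scores first; with fewer, max - min is trivially small and the loop would stop at once
if (recentScores.length === 3 && Math.max(...recentScores) - Math.min(...recentScores) < 2) {
  ctx.log('Quality plateaued, stopping early');
  break;
}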

Lower target to achievable level
Review if target is realistic for the codebase
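
The plateau check can also be tried outside a process definition. The following is a minimal, self-contained sketch of a loop that stops on a plateau; `hasPlateaued`, `runUntilPlateau`, and the `scoreIteration` callback are hypothetical names used only for illustration, and the window of 3 and tolerance of 2 mirror the values used in this section.

```javascript
// Returns true when the last `window` scores differ by less than `tolerance`.
function hasPlateaued(scores, window = 3, tolerance = 2) {
  if (scores.length < window) return false; // not enough history yet
  const recent = scores.slice(-window);
  return Math.max(...recent) - Math.min(...recent) < tolerance;
}

// Simulated iteration loop; scoreIteration stands in for a real scoring task.
function runUntilPlateau(scoreIteration, maxIterations, target) {
  const scores = [];
  for (let i = 1; i <= maxIterations; i += 1) {
    scores.push(scoreIteration(i));
    if (scores[scores.length - 1] >= target) return { reason: 'target-met', scores };
    if (hasPlateaued(scores)) return { reason: 'plateau', scores };
  }
  return { reason: 'max-iterations', scores };
}

const sample = [65, 66, 65, 66, 66];
const result = runUntilPlateau((i) => sample[i - 1], 5, 85);
console.log(result.reason, result.scores); // plateau [ 65, 66, 65 ]
```

Returning a reason alongside the scores makes it easy to tell a plateau apart from a run that simply used up its iterations.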

Inconsistent Quality Scores

Symptoms:

Quality scores vary significantly between iterations
No clear trend

Diagnosis:

# Count resolved effects (a starting point only; this query does not compute score variance)
jq -s '[.[] | select(.type == "EFFECT_RESOLVED")] | length' \
  .a5c/runs/<runId>/journal/*.json

Solutions:

Use more deterministic scoring criteria
Ensure tests are stable (no flaky tests)
Review agent scoring prompts for consistency
Use ctx.now() instead of Date.now() for timestamps
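
To put a number on "vary significantly", you can collect the scores by hand and compute their spread. This is a small sketch that assumes you have already gathered the per-iteration scores into an array; `scoreStats` is a hypothetical helper, not an SDK function.

```javascript
// Mean and population standard deviation of a list of quality scores.
function scoreStats(scores) {
  if (scores.length === 0) throw new Error('Need at least one score');
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const variance = scores.reduce((sum, s) => sum + (s - mean) ** 2, 0) / scores.length;
  return {
    mean,
    stdDev: Math.sqrt(variance),
    min: Math.min(...scores),
    max: Math.max(...scores),
  };
}

// Example with made-up scores from five iterations.
console.log(scoreStats([70, 85, 62, 88, 66]));
```

A standard deviation that is large relative to the gap between the mean and the target suggests the scoring itself is noisy, which points toward the determinism fixes listed above rather than more iterations.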

Resumption Issues

Cannot Resume Completed Run

Symptoms:

Resume has no effect
Run state shows "completed"

Diagnosis:

babysitter run:status <runId> --json | jq '.state'

Solutions:

A completed run cannot be resumed because it has already finished. Create a new run instead.

State Corruption After Manual Edits

Symptoms:

Run behaves unexpectedly
Events don't match expected behavior

Diagnosis:

# Check journal integrity
jq empty .a5c/runs/<runId>/journal/*.json 2>&1

Solutions:

Never edit journal files manually

If state cache corrupted:

rm .a5c/runs/<runId>/state/state.json
babysitter run:status <runId>  # Rebuilds state

If journal corrupted:
- May need to start a new run
- Backup journal first for analysis

Missing Run Directory

Symptoms:

Error: Run not found: 01KFFTSF8TK8C9GT3YM9QYQ6WG

Diagnosis:

ls -la .a5c/runs/ | grep <runId>

Solutions:

- Check that you are in the correct working directory; runs are looked up under .a5c/runs/
- If the run directory was deleted or never created, create a new run instead

Performance Issues

Slow Iteration Execution

Symptoms:

Each iteration takes several minutes
Tasks run sequentially when they could be parallel

Diagnosis:

# Find slow tasks
jq -s '[.[] | select(.type == "EFFECT_RESOLVED")] | map({effectId: .data.effectId, duration: ((.data.finishedAt | fromdateiso8601) - (.data.startedAt | fromdateiso8601))}) | sort_by(.duration) | reverse | .[0:5]' \
  .a5c/runs/<runId>/journal/*.json
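
If the jq query fails on your journal (jq's fromdateiso8601 can reject timestamps with fractional seconds), the same ranking can be sketched in JavaScript. The field names below (`type`, `data.effectId`, `data.startedAt`, `data.finishedAt`) are taken from the query above; the sample events and the `slowestEffects` helper are hypothetical, and loading the journal files into an array is left to you.

```javascript
// Rank resolved effects by duration, slowest first.
function slowestEffects(events, limit = 5) {
  return events
    .filter((e) => e.type === 'EFFECT_RESOLVED')
    .map((e) => ({
      effectId: e.data?.effectId,
      // Date.parse accepts ISO 8601 with or without fractional seconds.
      seconds: (Date.parse(e.data?.finishedAt) - Date.parse(e.data?.startedAt)) / 1000,
    }))
    .filter((e) => Number.isFinite(e.seconds)) // skip events with missing timestamps
    .sort((a, b) => b.seconds - a.seconds)
    .slice(0, limit);
}

// Made-up sample events for illustration.
const events = [
  { type: 'EFFECT_RESOLVED', data: { effectId: 'lint', startedAt: '2026-01-25T10:00:00.000Z', finishedAt: '2026-01-25T10:00:12.500Z' } },
  { type: 'EFFECT_RESOLVED', data: { effectId: 'tests', startedAt: '2026-01-25T10:00:00Z', finishedAt: '2026-01-25T10:03:00Z' } },
  { type: 'RUN_FAILED', data: {} },
];
console.log(slowestEffects(events));
```

The tasks at the top of this list are the first candidates for scope reduction or for running in parallel with independent work.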

Solutions:

Use parallel execution:

const [coverage, lint, security] = await ctx.parallel.all([
  () => ctx.task(coverageTask, {}),
  () => ctx.task(lintTask, {}),
  () => ctx.task(securityTask, {})
]);

Reduce agent task scope
Set lower iteration limits
Cache research results between iterations

High Disk Usage

Symptoms:

.a5c/ directory growing large
Disk space warnings

Diagnosis:

du -sh .a5c/
du -h .a5c/runs/* | sort -h | tail -10

Solutions:

Clean old runs:

# List runs by size
du -h .a5c/runs/* | sort -h

# Delete old runs
rm -rf .a5c/runs/<old-run-id>

Archive completed runs:

tar -czf run-archive.tar.gz .a5c/runs/<runId>
rm -rf .a5c/runs/<runId>

Add cleanup to workflow:
- Periodically clean old runs
- Set up automated cleanup
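
For automated cleanup, one common policy is to keep the most recent few runs and delete the rest. The sketch below only decides which runs to delete and leaves the actual deletion to you; the `{ id, modifiedMs }` shape and the `selectRunsToDelete` helper are hypothetical, and you would build the input list yourself (for example from the modification times of the directories under .a5c/runs/).

```javascript
// Choose which runs to delete: keep the `keep` most recently modified runs.
// The { id, modifiedMs } shape is a hypothetical input format.
function selectRunsToDelete(runs, keep) {
  if (!Number.isInteger(keep) || keep < 0) {
    throw new Error('keep must be a non-negative integer');
  }
  return [...runs]
    .sort((a, b) => b.modifiedMs - a.modifiedMs) // newest first
    .slice(keep)
    .map((run) => run.id);
}

const runs = [
  { id: 'run-a', modifiedMs: 1000 },
  { id: 'run-b', modifiedMs: 3000 },
  { id: 'run-c', modifiedMs: 2000 },
];
console.log(selectRunsToDelete(runs, 2)); // [ 'run-a' ]
```

Review the returned list before deleting anything, and archive runs you may still need for analysis.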

Memory Issues

Symptoms:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

Solutions:

Increase Node.js memory:

export NODE_OPTIONS="--max-old-space-size=4096"

Reduce concurrent tasks
Process smaller batches of files
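
Processing smaller batches usually comes down to splitting the file list before handing it to tasks. A minimal sketch follows; `chunk` is a hypothetical helper and the file names are made up.

```javascript
// Split a list into batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const files = ['a.js', 'b.js', 'c.js', 'd.js', 'e.js'];
for (const batch of chunk(files, 2)) {
  // In a process definition, each batch would go to its own task call.
  console.log(batch);
}
```

Running the batches one after another, rather than all at once, keeps only one batch's data in memory at a time.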

Journal and State Issues

Journal Conflict

Symptoms:

Error: Journal conflict detected

Diagnosis:

# Check if multiple processes running
ps aux | grep babysitter

Solutions:

Ensure single writer:
- Only one session should run a workflow at a time

Wait and retry:
- If previous session crashed, wait a moment
- Resume the run

Rebuild state:

rm .a5c/runs/<runId>/state/state.json
babysitter run:status <runId>

Events Out of Order

Symptoms:

Events appear in unexpected order
State doesn't match expectations

Diagnosis:

# Check sequence numbers
jq '.seq' .a5c/runs/<runId>/journal/*.json | sort -n

Solutions:

Always use CLI or sort by seq field:

jq -s 'sort_by(.seq)' .a5c/runs/<runId>/journal/*.json

File system listing may not be sorted correctly
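
When you process journal events in your own scripts, sorting by `seq` and checking for duplicates or gaps gives a quick sanity check. The sketch below assumes `seq` is an integer that increases by 1 per event; that is an assumption made for illustration and is not confirmed by this guide. `checkSequence` is a hypothetical helper.

```javascript
// Sort events by seq and report duplicate or missing sequence numbers.
// Assumes seq is an integer that increases by 1 per event (an assumption).
function checkSequence(events) {
  const sorted = [...events].sort((a, b) => a.seq - b.seq);
  const duplicates = [];
  const gaps = [];
  for (let i = 1; i < sorted.length; i += 1) {
    const prev = sorted[i - 1].seq;
    const curr = sorted[i].seq;
    if (curr === prev) duplicates.push(curr);
    else if (curr > prev + 1) gaps.push({ after: prev, before: curr });
  }
  return { sorted, duplicates, gaps };
}

// Made-up events, deliberately out of order with one gap and one duplicate.
const report = checkSequence([{ seq: 3 }, { seq: 1 }, { seq: 2 }, { seq: 6 }, { seq: 6 }]);
console.log(report.duplicates, report.gaps);
```

An empty `duplicates` and `gaps` result means the unexpected order came from how the files were listed, not from the journal itself.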

State Cache Missing

Symptoms:

CLI commands are slow on first access
State needs rebuilding

Diagnosis:

ls .a5c/runs/<runId>/state/

Solutions:

This is normal. The state cache is derived from the journal:

# Trigger rebuild
babysitter run:status <runId>

The cache will be created automatically.

Diagnostic Commands Reference

Run Status and Events

# Check run status
babysitter run:status <runId> --json

# View all events
babysitter run:events <runId> --json

# View last N events
babysitter run:events <runId> --limit 10 --reverse --json

# Filter by event type
babysitter run:events <runId> --filter-type EFFECT_RESOLVED --json

# Find failures
babysitter run:events <runId> --filter-type RUN_FAILED --json

Task Information

# List pending tasks
babysitter task:list <runId> --pending --json

# View task details
cat .a5c/runs/<runId>/tasks/<effectId>/task.json | jq .

# View task result
cat .a5c/runs/<runId>/tasks/<effectId>/result.json | jq .

System Checks

# Check SDK version
npx -y @a5c-ai/babysitter-sdk@latest --version

# Check installed packages
npm list -g @a5c-ai/babysitter @a5c-ai/babysitter-sdk

# Check plugin status
claude plugin list | grep babysitter

Journal Analysis

# Count events by type
jq -s 'group_by(.type) | map({type: .[0].type, count: length})' \
  .a5c/runs/<runId>/journal/*.json

# Find failed tasks
jq 'select(.type == "EFFECT_RESOLVED" and .data.status == "error")' \
  .a5c/runs/<runId>/journal/*.json

# Check effect resolutions
jq 'select(.type == "EFFECT_RESOLVED") | {effectId: .data.effectId, status: .data.status}' \
  .a5c/runs/<runId>/journal/*.json

When to Contact Support

Contact support if you experience:

Persistent crashes that cannot be resolved with troubleshooting
Data corruption that affects the journal or state
Security concerns about the tool behavior
Bugs that are reproducible with clear steps

How to Report Issues

Gather information:
- OS and version
- Node.js version: node --version
- Claude Code version: claude --version
- Babysitter SDK version: npx -y @a5c-ai/babysitter-sdk@latest --version
- Full error message and stack trace
- Steps to reproduce

Include diagnostic output:

# Write each command's output to its own file so both remain valid JSON
babysitter run:status <runId> --json > diagnostic-status.json
babysitter run:events <runId> --limit 50 --reverse --json > diagnostic-events.json

Create issue at: GitHub Issues
For discussions: GitHub Discussions
