Phase 1: OpenClaw operational controls + AO health polling #386

@AgentWrapper

Context

Phase 0 shipped webhook-first escalation from AO to OpenClaw (notifier-openclaw -> /hooks/agent).

Phase 1 should make OpenClaw more useful for day-to-day operations without introducing new protocol abstractions.

Goal

Enable practical day-2 operations for AO + OpenClaw on single-host localhost setups:

  • fast operator commands
  • periodic AO health visibility
  • safer escalation handling under load

Scope (Phase 1)

1) OpenClaw auto-reply commands (no AI turn)

Add lightweight OpenClaw command handlers for high-frequency ops:

  • /ao status <sessionId?>
  • /ao sessions
  • /ao retry <sessionId>
  • /ao kill <sessionId>

These should map to AO CLI calls (ao session ls, ao send, ao kill) and return compact confirmations.
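
A minimal sketch of the command-to-CLI mapping described above, so the parsing rules can be unit tested on their own. The issue only names the AO calls (ao session ls, ao send, ao kill); the argument order passed to them, the RETRY_MESSAGE text, and the AoCommand / parse_ao_command names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical text sent to a session on /ao retry; the real payload is not
# specified in this issue.
RETRY_MESSAGE = "retry"


@dataclass
class AoCommand:
    name: str                  # "status", "sessions", "retry" or "kill"
    session_id: Optional[str]  # None when the command takes no session
    argv: List[str]            # AO CLI invocation (argument shape is assumed)


def parse_ao_command(text: str) -> Optional[AoCommand]:
    """Parse a chat message into an AO CLI call, or None if it is not valid."""
    parts = text.strip().split()
    if len(parts) < 2 or parts[0] != "/ao":
        return None
    name, args = parts[1], parts[2:]
    session_id = args[0] if args else None

    if name == "sessions" and not args:
        return AoCommand(name, None, ["ao", "session", "ls"])
    if name == "status" and len(args) <= 1:
        # Assumption: status reuses `ao session ls`; the handler would filter
        # the listing by session_id when one is given.
        return AoCommand(name, session_id, ["ao", "session", "ls"])
    if name == "retry" and len(args) == 1:
        return AoCommand(name, session_id, ["ao", "send", args[0], RETRY_MESSAGE])
    if name == "kill" and len(args) == 1:
        return AoCommand(name, session_id, ["ao", "kill", args[0]])
    return None
```

Returning None for anything malformed (missing or extra arguments, unknown subcommand) keeps the handler deterministic: it can reply with a fixed usage string instead of falling through to an AI turn.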

2) Background AO health polling service

Add an OpenClaw background service that polls AO state periodically and surfaces health summaries:

  • active/degraded/dead session counts
  • stale session detection (no output for N minutes)
  • failed spawn/send command counts

No new AO transport required for Phase 1; use existing CLI + files.
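
As a rough illustration of what one polling tick could compute, the sketch below folds a list of session snapshots into state counts and a stale list. It covers only the first two bullets. The SessionSnapshot fields, the state labels, and the summarize name are assumptions; how the snapshots are obtained from the AO CLI or files is left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class SessionSnapshot:
    session_id: str
    state: str             # assumed labels: "active", "degraded", "dead"
    last_output_ts: float  # epoch seconds of the last observed output


def summarize(
    sessions: List[SessionSnapshot], now: float, stale_after_s: float
) -> Dict[str, Union[int, List[str]]]:
    """Build one health summary from a single poll of AO state."""
    counts = Counter(s.state for s in sessions)
    # A session is stale when it is not dead but has been silent for at
    # least stale_after_s seconds.
    stale = sorted(
        s.session_id
        for s in sessions
        if s.state != "dead" and now - s.last_output_ts >= stale_after_s
    )
    return {
        "active": counts.get("active", 0),
        "degraded": counts.get("degraded", 0),
        "dead": counts.get("dead", 0),
        "stale": stale,
    }
```

Passing `now` in explicitly, rather than reading the clock inside the function, makes the stale threshold easy to test without sleeping.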

3) Controlled notification behavior

Implement simple anti-spam guardrails on the OpenClaw side:

  • debounce repeated escalations for the same session + reason within a time window
  • send one batched summary when multiple sessions escalate within a short interval
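
One possible shape for these guardrails, sketched under assumptions: escalations are keyed by (session, reason), a repeat inside the debounce window is dropped, and a timer owned by the caller flushes whatever is pending as a single message. The EscalationGate name, its methods, and the message format are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class EscalationGate:
    """Debounce per (session, reason) and batch what remains."""

    def __init__(self, debounce_s: float) -> None:
        self.debounce_s = debounce_s
        self._last_accepted: Dict[Tuple[str, str], float] = {}
        self._pending: List[Tuple[str, str]] = []

    def offer(self, session_id: str, reason: str, now: float) -> bool:
        """Queue an escalation; return False if it was debounced."""
        key = (session_id, reason)
        last = self._last_accepted.get(key)
        if last is not None and now - last < self.debounce_s:
            return False
        self._last_accepted[key] = now
        self._pending.append(key)
        return True

    def flush(self) -> Optional[str]:
        """Return one notification for everything pending, or None if empty.

        Assumption: the caller invokes this once per batch interval.
        """
        if not self._pending:
            return None
        items, self._pending = self._pending, []
        if len(items) == 1:
            session_id, reason = items[0]
            return f"{session_id}: {reason}"
        listed = ", ".join(f"{sid} ({reason})" for sid, reason in items)
        return f"{len(items)} escalations: {listed}"
```

For example, two escalations from different sessions within one interval produce a single summary line, while an identical repeat from the same session is silently dropped until the debounce window has passed.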

4) Operational docs + runbook

Document:

  • command behaviors and response format
  • expected AO/OpenClaw permissions
  • troubleshooting checklist (hooks token, queueing, missing AO binary)

Out of scope

  • cross-network/multi-tenant architecture
  • peer protocol / HMAC / RBAC
  • new AO REST/WS reverse API

Acceptance Criteria

  • /ao status, /ao sessions, /ao retry, /ao kill work from chat and return deterministic outcomes.
  • The OpenClaw health service publishes a periodic AO health summary without manual polling.
  • repeated identical escalations are visibly reduced (debounce/batch behavior verified).
  • end-to-end test evidence for at least:
    • command success path
    • AO CLI unavailable path
    • burst escalation path with reduced notification noise

Test Plan

  • unit tests for command parsing + command execution adapters
  • integration tests with mocked AO CLI responses
  • manual smoke test on a real OpenClaw + AO runtime
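
To make the mocked-CLI tests and the "AO CLI unavailable" path concrete, here is a sketch of an execution adapter with an injectable runner. The run_ao name, its return shape, and the error strings are assumptions; only the standard-library subprocess calls are real.

```python
import subprocess
from typing import Callable, Sequence, Tuple


def run_ao(
    argv: Sequence[str],
    runner: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
    timeout_s: float = 10.0,
) -> Tuple[bool, str]:
    """Run one AO CLI call and return (ok, compact message)."""
    try:
        proc = runner(list(argv), capture_output=True, text=True, timeout=timeout_s)
    except FileNotFoundError:
        # Raised by subprocess.run when the executable cannot be found.
        return False, "AO CLI not found on PATH"
    except subprocess.TimeoutExpired:
        return False, f"AO CLI timed out after {timeout_s:g}s"
    if proc.returncode != 0:
        detail = (proc.stderr or "").strip()
        return False, detail or f"AO CLI exited with {proc.returncode}"
    return True, (proc.stdout or "").strip()
```

In tests, `runner` can be replaced with a function that returns a prepared `subprocess.CompletedProcess` or raises `FileNotFoundError`, so the success path and the missing-binary path are covered without an AO install.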

Dependencies

Suggested rollout

  1. Land commands first (/ao status, /ao sessions)
  2. Add /ao retry and /ao kill
  3. Add health polling service
  4. Add debounce/batch behavior and docs
