This SOP defines operational procedures for three security-critical materials:
- Registry signature trust roots (S61)
- Secrets-at-rest encryption key (S57)
- Bridge device tokens (S58)
Use this runbook for routine rotation, emergency revocation, and disaster recovery.
Scope note:
- If the optional S11 1Password provider is enabled, provider API keys can be sourced from local 1Password instead of `secrets.enc.json`. In that mode, this SOP still governs the local encryption key lifecycle for any secrets that remain in the server-side store.
All paths are relative to OPENCLAW_STATE_DIR (legacy fallback: MOLTBOT_STATE_DIR):
| Material | File(s) | Purpose |
|---|---|---|
| Registry trust roots | `registry/trust/trust_roots.json` | Key IDs, fingerprints, validity windows, revocation state |
| Secrets-at-rest key | `secrets.key` | Envelope encryption key for `secrets.enc.json` |
| Encrypted secret store | `secrets.enc.json` | Encrypted provider secrets |
| Bridge token registry | `bridge_tokens.json` | Device token lifecycle state and audit trail |
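As a pre-flight check before any lifecycle operation, the path layout above can be sketched as a small helper that resolves the state directory and reports which material files are present. This is a minimal sketch: the function names are hypothetical, and treating an empty `OPENCLAW_STATE_DIR` as unset is an assumption about the fallback behavior, not something this SOP specifies.

```python
import os
from pathlib import Path

# Relative paths copied from the table above.
MATERIAL_FILES = {
    "registry_trust_roots": "registry/trust/trust_roots.json",
    "secrets_key": "secrets.key",
    "encrypted_secret_store": "secrets.enc.json",
    "bridge_token_registry": "bridge_tokens.json",
}


def resolve_state_dir(env=None):
    """Return the state directory, preferring OPENCLAW_STATE_DIR over the legacy name."""
    env = os.environ if env is None else env
    for name in ("OPENCLAW_STATE_DIR", "MOLTBOT_STATE_DIR"):
        value = env.get(name)
        if value:  # assumption: an empty value counts as unset
            return Path(value)
    raise RuntimeError("neither OPENCLAW_STATE_DIR nor MOLTBOT_STATE_DIR is set")


def material_status(state_dir):
    """Map each material name to (full path, whether the file exists)."""
    state_dir = Path(state_dir)
    return {
        name: (state_dir / rel, (state_dir / rel).is_file())
        for name, rel in MATERIAL_FILES.items()
    }
```

Calling `material_status(resolve_state_dir())` before a change window gives a quick view of which files need a backup.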
Optional startup log hygiene for incident drills:
- Set `OPENCLAW_LOG_TRUNCATE_ON_START=1` before restart when you need a clean `openclaw.log` timeline.
Optional S11 local secret-manager settings (if used):
- `OPENCLAW_1PASSWORD_ENABLED=1`
- `OPENCLAW_1PASSWORD_ALLOWED_COMMANDS=<allowlisted executables>`
- `OPENCLAW_1PASSWORD_CMD=op`
- `OPENCLAW_1PASSWORD_VAULT=<vault>`
- Always take a timestamped backup before any lifecycle operation.
- Never rotate all three materials in one change window.
- Require two-person review for revoke or disaster-recovery actions.
- Record exact commands and outputs in the implementation record/change ticket.
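The timestamped-backup rule above could be implemented along these lines. This is a sketch under assumptions: the helper name and the `<name>.<UTC timestamp>.bak` naming scheme are illustrative choices, not a convention defined by OpenClaw.

```python
import shutil
import time
from pathlib import Path


def backup_with_timestamp(path, backup_dir=None, now=None):
    """Copy `path` to `<name>.<UTC timestamp>.bak` and return the backup path."""
    path = Path(path)
    # `now` is seconds since the epoch; None means the current time.
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime(now))
    target_dir = Path(backup_dir) if backup_dir is not None else path.parent
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{path.name}.{stamp}.bak"
    if target.exists():
        # Never silently clobber an earlier backup.
        raise FileExistsError(f"refusing to overwrite existing backup: {target}")
    shutil.copy2(path, target)  # copy2 also preserves mtime and permission bits
    return target
```

Refusing to overwrite an existing backup keeps every earlier restore point intact, which matters when a rotation is retried within the same change window.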
- Add the new signer key as an additional trust root.
- Start signing new artifacts with the new key while old key remains active.
- Verify both key paths pass signature verification.
- Revoke old key after rollout completion.
Example add/revoke flow:
```bash
python - <<'PY'
import os, time
from services.registry_quarantine import TrustRoot, TrustRootStore

state_dir = os.environ["OPENCLAW_STATE_DIR"]
store = TrustRootStore(state_dir)

# Add new root (replace key_id/public_key_pem)
store.add_root(TrustRoot(
    key_id="k-2026q1",
    public_key_pem="-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----",
    valid_from=time.time(),
))

# Revoke old root after cutover
store.revoke_root("k-2025q4", reason="planned rotation complete")
print("active_roots=", [r.key_id for r in store.get_active_roots()])
PY
```
- Revoke the compromised `key_id` immediately.
- Switch registry policy to strict mode if not already strict (`OPENCLAW_REGISTRY_POLICY=strict`).
- Reject/quarantine artifacts signed only by the revoked key.
- Restore `registry/trust/trust_roots.json` from the last known good backup.
- Verify expected fingerprints before enabling registry sync.
- Run `tests.test_s61_registry_signature` before production rollout.
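The fingerprint check in the recovery steps above can be sketched with the standard library alone. Two assumptions are baked in and should be confirmed against the actual trust store: that a fingerprint is the SHA-256 hex digest of the DER bytes inside the PEM block, and that the restored keys and the expected fingerprints are available as plain `key_id` mappings. Both function names are hypothetical.

```python
import base64
import hashlib


def pem_fingerprint(public_key_pem):
    """SHA-256 hex digest of the DER bytes inside a PEM block (assumed fingerprint format)."""
    lines = (line.strip() for line in public_key_pem.strip().splitlines())
    body = "".join(line for line in lines if line and not line.startswith("-----"))
    return hashlib.sha256(base64.b64decode(body)).hexdigest()


def fingerprint_mismatches(restored_pems, expected):
    """Return sorted key_ids that are unexpected, altered, or missing after a restore."""
    bad = [k for k, pem in restored_pems.items() if expected.get(k) != pem_fingerprint(pem)]
    missing = [k for k in expected if k not in restored_pems]
    return sorted(bad + missing)
```

An empty result means every restored key matches its recorded fingerprint; anything else should block registry sync until it is explained.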
- Back up `secrets.key` and `secrets.enc.json`.
- Decrypt the existing envelope with the old key.
- Generate a new key and re-encrypt the same secret payload.
- Restart service and verify secret reads.
Reference script (run in a maintenance window; requires the `cryptography` package):
```bash
python - <<'PY'
import json
import os
import shutil
from pathlib import Path

from cryptography.fernet import Fernet
from services import secrets_encryption as enc

state_dir = Path(os.environ["OPENCLAW_STATE_DIR"])
key_path = state_dir / "secrets.key"
store_path = state_dir / "secrets.enc.json"

# Back up both files before touching either.
shutil.copy2(key_path, state_dir / "secrets.key.bak")
shutil.copy2(store_path, state_dir / "secrets.enc.json.bak")

# Decrypt the existing envelope with the old key.
envelope = enc.EncryptedEnvelope.from_dict(json.loads(store_path.read_text(encoding="utf-8")))
old_key = key_path.read_bytes().strip()
secrets = enc.decrypt_secrets(envelope, old_key)

# Re-encrypt in memory first, so a failure here leaves the old key and store untouched.
new_key = Fernet.generate_key()
new_envelope = enc.encrypt_secrets(secrets, new_key)

key_path.write_bytes(new_key)
store_path.write_text(json.dumps(new_envelope.to_dict(), indent=2), encoding="utf-8")
print("rotated secrets.key and re-encrypted secrets.enc.json")
PY
```
- If key exposure is suspected, rotate `secrets.key` immediately.
- Re-issue high-risk credentials upstream (provider/API tokens), then update the store.
- If `secrets.key` is lost and no backup exists, encrypted secrets are unrecoverable by design.
- Re-provision all provider secrets from source systems.
- Delete the stale `secrets.enc.json`, then repopulate secrets through approved flows.
- Issue/rotate per device with bounded overlap (max 30 minutes).
- Verify new token works before overlap expiry.
- Revoke old token after cutover.
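The 30-minute ceiling on the overlap window can be enforced before any rotation call is made. The guard below is a minimal sketch; `checked_overlap` is a hypothetical helper, and rejecting negative values is an added assumption rather than a documented rule.

```python
MAX_OVERLAP_SEC = 30 * 60  # upper bound stated in this SOP


def checked_overlap(overlap_sec):
    """Return overlap_sec unchanged if it respects the SOP bound, else raise ValueError."""
    if overlap_sec < 0:
        # Assumption: a negative overlap is never meaningful.
        raise ValueError(f"overlap_sec must not be negative, got {overlap_sec}")
    if overlap_sec > MAX_OVERLAP_SEC:
        raise ValueError(
            f"overlap_sec {overlap_sec} exceeds the {MAX_OVERLAP_SEC}s maximum"
        )
    return overlap_sec
```

Passing `checked_overlap(300)` as the `overlap_sec` argument of a rotation call makes an out-of-policy value fail before any token state changes.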
Example rotation flow:
```bash
python - <<'PY'
import os
from services.bridge_token_lifecycle import get_token_store

store = get_token_store(os.environ["OPENCLAW_STATE_DIR"])
active = store.list_tokens(device_id="device-a", active_only=True)
if not active:
    raise SystemExit("no active token found for device-a")

# Rotate the first active token with a 5-minute overlap and a 1-hour TTL.
old = active[0]
new_token, old_token = store.rotate_token(old.token_id, overlap_sec=300, ttl_sec=3600)
print("new_token_id=", new_token.token_id)
print("old_overlap_until=", old_token.overlap_until)
PY
```
- Revoke compromised token IDs immediately (`revoke_token`).
- If blast radius is unclear, disable bridge ingress (`OPENCLAW_BRIDGE_ENABLED=0`) until re-issued tokens are deployed.
- Validate that the audit trail contains revoke events.
- Restore `bridge_tokens.json` from backup if corruption is detected.
- If integrity is uncertain, revoke all active tokens and re-issue per device.
- Validate with `tests.test_s58_bridge_token_lifecycle` and `tests.test_s58_bridge_auth_integration`.
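The "revoke all active tokens" step could look roughly like the helper below. It is a sketch, not the token store's documented API: it assumes `list_tokens` can be called without a `device_id`, that `revoke_token` accepts a token ID as its only argument, and that each token record exposes a `device_id` attribute. Check all three against `services.bridge_token_lifecycle` before use.

```python
def revoke_all_active(store):
    """Revoke every active token and return the device IDs that need re-issue."""
    devices = set()
    # Assumption: omitting device_id lists active tokens across all devices.
    for token in store.list_tokens(active_only=True):
        store.revoke_token(token.token_id)  # assumed signature: revoke_token(token_id)
        devices.add(token.device_id)        # assumed attribute on token records
    return sorted(devices)
```

The returned device list doubles as the re-issue worklist for the change ticket.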
In addition to the manual procedures above, OpenClaw provides a local/CI-safe drill runner that simulates lifecycle incidents and emits machine-readable evidence.
Script:
scripts/run_crypto_lifecycle_drills.py
Supported scenarios:
- `planned_rotation`
- `emergency_revoke`
- `key_loss_recovery`
- `token_compromise`
Example commands:
```bash
python scripts/run_crypto_lifecycle_drills.py --pretty
python scripts/run_crypto_lifecycle_drills.py --scenarios planned_rotation,emergency_revoke --output .planning/logs/crypto_drills.json --pretty
```
Evidence bundle contract (JSON):
- top-level fields include `schema_version`, `bundle`, `state_dir`, and `drills`
- each drill record includes:
  - `operation`
  - `scenario`
  - `precheck`
  - `result`
  - `rollback_status`
  - `artifacts`
  - `decision_codes`
  - `fail_closed_assertions`
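A consumer of the evidence bundle might check the field contract above before attaching the file to a change ticket. The sketch below only checks that the listed fields are present; it assumes `drills` is a list of JSON objects, and it makes no claim about the types or values of individual fields, which this SOP does not specify.

```python
TOP_LEVEL_FIELDS = ("schema_version", "bundle", "state_dir", "drills")
DRILL_FIELDS = (
    "operation", "scenario", "precheck", "result",
    "rollback_status", "artifacts", "decision_codes", "fail_closed_assertions",
)


def missing_fields(bundle):
    """Return human-readable problems; an empty list means the contract fields are present."""
    problems = [f"missing top-level field: {f}" for f in TOP_LEVEL_FIELDS if f not in bundle]
    # Assumption: `drills` is a list of per-drill dictionaries.
    for i, drill in enumerate(bundle.get("drills", [])):
        problems += [f"drills[{i}] missing field: {f}" for f in DRILL_FIELDS if f not in drill]
    return problems
```

Load the file written by `--output` with `json.load` and pass the resulting dictionary to `missing_fields`.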
Operational notes:
- This drill runner is for verification/training/evidence collection and does not replace maintenance-window production rotation procedures.
- Use an isolated or temporary state directory unless you intentionally want artifacts written to a specific test state path.
- Store drill evidence alongside change tickets or implementation records when lifecycle readiness is part of acceptance criteria.
Run at minimum:
```bash
python scripts/run_unittests.py --module tests.test_s58_bridge_token_lifecycle
python scripts/run_unittests.py --module tests.test_s61_registry_signature
python scripts/run_unittests.py --module tests.test_s57_secrets_encryption
python scripts/run_unittests.py --module tests.test_s60_routes_startup_gate
python scripts/run_unittests.py --module tests.security.test_endpoint_drift
```
Then run the full gate from `tests/TEST_SOP.md` before rollout.
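The minimum module list can also be driven from a small wrapper that stops at the first failure. This is a sketch under assumptions: it expects to run from the repository root and assumes `scripts/run_unittests.py` signals failure through a non-zero exit code. Both helper names are hypothetical.

```python
import subprocess
import sys

MINIMUM_MODULES = [
    "tests.test_s58_bridge_token_lifecycle",
    "tests.test_s61_registry_signature",
    "tests.test_s57_secrets_encryption",
    "tests.test_s60_routes_startup_gate",
    "tests.security.test_endpoint_drift",
]


def build_command(module, python=sys.executable):
    """Build the argv list for one module, mirroring the commands listed above."""
    return [python, "scripts/run_unittests.py", "--module", module]


def run_minimum_gate(runner=subprocess.run):
    """Run each module in order; return the first failing module, or None if all pass."""
    for module in MINIMUM_MODULES:
        result = runner(build_command(module))
        if result.returncode != 0:  # assumption: non-zero exit code means failure
            return module
    return None
```

Injecting `runner` keeps the wrapper testable without launching real subprocesses.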