[Feature]: safe /restart slash command for in-chat gateway restarts #2894

@din0s

Problem or Use Case

Hermes gateway already exposes operational slash commands like /status, /update, and /stop, but it does not expose a safe /restart command.

Today, users or agents may try to restart the gateway from inside chat by approving or running commands such as:

  • systemctl --user restart hermes-gateway.service
  • hermes gateway restart

On systemd-managed installs, those commands run from inside the current gateway service process itself, so the restart can kill its own caller before it completes:

  • the restart caller lives inside the current gateway service cgroup
  • systemctl --user restart ... stops the service that is currently executing the command
  • the restart caller gets killed along with the service
  • depending on timing, the gateway can end up stopped instead of cleanly restarted
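The failure mode above depends on the caller living inside the service's own cgroup. A minimal sketch of how a process could check that on Linux by reading /proc/self/cgroup follows. The helper names are hypothetical and not part of Hermes, and the sketch assumes the unit name appears as a path component of the cgroup path.

```python
from pathlib import Path

# Unit name taken from the issue text.
SERVICE_UNIT = "hermes-gateway.service"


def cgroup_contains_unit(cgroup_text: str, unit: str = SERVICE_UNIT) -> bool:
    """Return True if any cgroup path in the text has `unit` as a component.

    Each line of /proc/<pid>/cgroup has the form
    hierarchy-ID:controller-list:cgroup-path.
    """
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and unit in parts[2].split("/"):
            return True
    return False


def running_inside_gateway_service(proc_file: str = "/proc/self/cgroup") -> bool:
    """Hypothetical helper: best-effort check for the current process."""
    try:
        return cgroup_contains_unit(Path(proc_file).read_text())
    except OSError:
        # Not Linux, or /proc is unavailable: assume we cannot tell.
        return False
```

A check like this could be used to refuse an inline restart early, before any systemctl call is attempted.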

This is especially likely in Telegram because users naturally try to control the gateway from the gateway itself.

Proposed Solution

Add a first-class gateway slash command:

  • /restart

Implementation direction:

  • add _handle_restart_command(event) in gateway/run.py
  • route canonical == "restart" to that handler
  • use systemd-run --user --scope ... systemctl --user restart hermes-gateway.service rather than executing restart inline via the terminal tool
  • write a pending restart marker (for example .restart_pending.json) so the next gateway instance can send a success/failure follow-up message after startup
  • if systemd-run is unavailable, fail safely and tell the user to restart externally instead of attempting an inline self-killing restart
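The implementation direction above could be sketched roughly as follows. This is not the actual gateway/run.py code: the function names, the marker fields (chat_id, requested_at), and the reply strings are assumptions made for illustration, and any systemd-run options elided by the "..." in the issue are left out rather than guessed.

```python
import json
import os
import shutil
import subprocess
import time
from pathlib import Path

SERVICE_UNIT = "hermes-gateway.service"


def build_restart_command(unit: str = SERVICE_UNIT) -> list:
    # Run systemctl inside a transient scope so the restart caller is not
    # part of the gateway service cgroup. Options the issue elides with
    # "..." are intentionally not filled in here.
    return ["systemd-run", "--user", "--scope",
            "systemctl", "--user", "restart", unit]


def write_pending_marker(marker_path: Path, chat_id: str) -> dict:
    # Write to a temp file first, then rename, so the next gateway
    # instance never reads a half-written marker.
    payload = {"chat_id": chat_id, "requested_at": time.time()}
    tmp = marker_path.with_name(marker_path.name + ".tmp")
    tmp.write_text(json.dumps(payload))
    os.replace(tmp, marker_path)
    return payload


def request_restart(chat_id: str, marker_path: Path,
                    which=shutil.which, spawn=subprocess.Popen) -> str:
    """Hypothetical core of _handle_restart_command; returns the chat reply."""
    if which("systemd-run") is None:
        # Fail safely: never fall back to an inline self-killing restart.
        return "systemd-run is unavailable; please restart the gateway externally."
    write_pending_marker(marker_path, chat_id)
    spawn(build_restart_command())
    return "Restarting gateway..."
```

The `which` and `spawn` parameters exist only so the logic can be exercised without touching systemd; a real handler would presumably also send the acknowledgement before spawning the restart.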

Desired UX:

  1. gateway acknowledges restart is starting
  2. restart is delegated outside the current service cgroup
  3. new gateway process starts cleanly
  4. optional post-restart confirmation is sent back to the originating chat
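Step 4 of the flow above implies that the new gateway instance reads the pending marker on startup. A minimal sketch of that side, assuming the marker is a small JSON file with a hypothetical chat_id field (the issue only suggests the file name .restart_pending.json, not its contents):

```python
import json
from pathlib import Path
from typing import Optional


def consume_pending_marker(marker_path: Path) -> Optional[dict]:
    """Read and delete the restart marker; return its payload, if any.

    The marker is removed before parsing so a corrupt file cannot
    trigger a follow-up message on every subsequent startup.
    """
    try:
        raw = marker_path.read_text()
    except FileNotFoundError:
        return None
    marker_path.unlink(missing_ok=True)
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return payload if isinstance(payload, dict) else None


def confirmation_for(payload: dict) -> tuple:
    # Hypothetical: pair the originating chat with the follow-up text.
    return payload.get("chat_id"), "Gateway restarted successfully."
```

On startup the gateway could call `consume_pending_marker`, and if it returns a payload, send the confirmation to the originating chat.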

This should be a dedicated slash command rather than going through generic dangerous-command approval plus inline terminal execution.

Alternatives Considered

  1. Keep restart as a generic dangerous command approved through /approve and executed inline via the terminal tool. This is the current execution model, and it is wrong because the gateway process tries to restart itself from inside its own service context.
  2. Require users to restart externally every time. That is safer than inline restart, but it leaves an obvious operational command unsupported in chat even though adjacent runtime slash commands already exist.

Related context:

Feature Type

Gateway / messaging improvement

Scope

Medium (few files, < 300 lines)

Contribution

  • I'd like to implement this myself and submit a PR
