Instructions for AI coding assistants (GitHub Copilot, Cursor, etc.) and human developers.
Hermes Agent is an AI agent harness with tool-calling capabilities, interactive CLI, messaging integrations, and scheduled tasks.
IMPORTANT: Always use the virtual environment if it exists:
```bash
source venv/bin/activate  # Before running any Python commands
```

Repository layout:

```
hermes-agent/
├── agent/                      # Agent internals (extracted from run_agent.py)
│   ├── model_metadata.py       # Model context lengths, token estimation
│   ├── context_compressor.py   # Auto context compression
│   ├── prompt_caching.py       # Anthropic prompt caching
│   ├── prompt_builder.py       # System prompt assembly (identity, skills index, context files)
│   ├── display.py              # KawaiiSpinner, tool preview formatting
│   └── trajectory.py           # Trajectory saving helpers
├── hermes_cli/                 # CLI implementation
│   ├── main.py                 # Entry point, command dispatcher
│   ├── banner.py               # Welcome banner, ASCII art, skills summary
│   ├── commands.py             # Slash command definitions + autocomplete
│   ├── callbacks.py            # Interactive prompt callbacks (clarify, sudo, approval)
│   ├── setup.py                # Interactive setup wizard
│   ├── config.py               # Config management & migration
│   ├── status.py               # Status display
│   ├── doctor.py               # Diagnostics
│   ├── gateway.py              # Gateway management
│   ├── uninstall.py            # Uninstaller
│   ├── cron.py                 # Cron job management
│   └── skills_hub.py           # Skills Hub CLI + /skills slash command
├── tools/                      # Tool implementations
│   ├── registry.py             # Central tool registry (schemas, handlers, dispatch)
│   ├── approval.py             # Dangerous command detection + per-session approval
│   ├── environments/           # Terminal execution backends
│   │   ├── base.py             # BaseEnvironment ABC
│   │   ├── local.py            # Local execution with interrupt support
│   │   ├── docker.py           # Docker container execution
│   │   ├── ssh.py              # SSH remote execution
│   │   ├── singularity.py      # Singularity/Apptainer + SIF management
│   │   └── modal.py            # Modal cloud execution
│   ├── terminal_tool.py        # Terminal orchestration (sudo, lifecycle, factory)
│   ├── todo_tool.py            # Planning & task management
│   ├── process_registry.py     # Background process management
│   └── ...                     # Other tool files
├── gateway/                    # Messaging platform adapters
│   ├── platforms/              # Platform-specific adapters (telegram, discord, slack, whatsapp)
│   └── ...
├── cron/                       # Scheduler implementation
├── environments/               # RL training environments (Atropos integration)
├── skills/                     # Bundled skill sources
├── cli.py                      # Interactive CLI orchestrator (HermesCLI class)
├── run_agent.py                # AIAgent class (core conversation loop)
├── model_tools.py              # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py                 # Tool groupings
├── toolset_distributions.py    # Probability-based tool selection
└── batch_runner.py             # Parallel batch processing
```
User Configuration (stored in `~/.hermes/`):

- `~/.hermes/config.yaml` - Settings (model, terminal, toolsets, etc.)
- `~/.hermes/.env` - API keys and secrets
- `~/.hermes/pairing/` - DM pairing data
- `~/.hermes/hooks/` - Custom event hooks
- `~/.hermes/image_cache/` - Cached user images
- `~/.hermes/audio_cache/` - Cached user voice messages
- `~/.hermes/sticker_cache.json` - Telegram sticker descriptions
```
tools/registry.py   (no deps — imported by all tool files)
        ↑
tools/*.py          (each calls registry.register() at import time)
        ↑
model_tools.py      (imports tools/registry + triggers tool discovery)
        ↑
run_agent.py, cli.py, batch_runner.py, environments/
```
Each tool file co-locates its schema, handler, and registration. model_tools.py is a thin orchestration layer.
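A minimal sketch of what such a registry might look like (hypothetical names and signatures; the real `tools/registry.py` is more featureful):

```python
import json

class ToolRegistry:
    """Central registry: tool files register themselves at import time."""

    def __init__(self):
        self._tools = {}  # name -> {"schema": ..., "handler": ..., "check_fn": ...}

    def register(self, name, schema, handler, check_fn=None, **meta):
        self._tools[name] = {"schema": schema, "handler": handler,
                             "check_fn": check_fn, **meta}

    def schemas(self):
        # Only expose tools whose availability check passes
        return [t["schema"] for t in self._tools.values()
                if t["check_fn"] is None or t["check_fn"]()]

    def dispatch(self, name, args, **kw):
        # Wrap all failures in a JSON error payload
        try:
            return self._tools[name]["handler"](args, **kw)
        except Exception as e:
            return json.dumps({"error": str(e)})

registry = ToolRegistry()
```

Because each tool file calls `registry.register()` at import time, importing the tool modules is all the discovery step has to do.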
The main agent is implemented in run_agent.py:
```python
class AIAgent:
    def __init__(
        self,
        model: str = "anthropic/claude-sonnet-4",
        api_key: str = None,
        base_url: str = "https://openrouter.ai/api/v1",
        max_iterations: int = 60,                 # Max tool-calling loops
        enabled_toolsets: list = None,
        disabled_toolsets: list = None,
        verbose_logging: bool = False,
        quiet_mode: bool = False,                 # Suppress progress output
        tool_progress_callback: callable = None,  # Called on each tool use
    ):
        # Initialize OpenAI client, load tools based on toolsets
        ...

    def chat(self, user_message: str, task_id: str = None) -> str:
        # Main entry point - runs the agent loop
        ...
```

The core loop in `_run_agent_loop()`:
1. Add user message to conversation
2. Call LLM with tools
3. If LLM returns tool calls:
- Execute each tool
- Add tool results to conversation
- Go to step 2
4. If LLM returns text response:
- Return response to user
```python
while turns < max_turns:
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tool_schemas,
    )
    message = response.choices[0].message
    if message.tool_calls:
        for tool_call in message.tool_calls:
            result = await execute_tool(tool_call)
            messages.append(tool_result_message(result))
        turns += 1
    else:
        return message.content
```

Messages are stored as a list of dicts following OpenAI format:
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant..."},
    {"role": "user", "content": "Search for Python tutorials"},
    {"role": "assistant", "content": None, "tool_calls": [...]},
    {"role": "tool", "tool_call_id": "...", "content": "..."},
    {"role": "assistant", "content": "Here's what I found..."},
]
```

For models that support chain-of-thought reasoning:
- Extract `reasoning_content` from API responses
- Store in `assistant_msg["reasoning"]` for trajectory export
- Pass back via the `reasoning_content` field on subsequent turns
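A hedged sketch of that round-trip (field names taken from the notes above; the actual handling in run_agent.py may differ):

```python
def store_reasoning(api_message: dict, assistant_msg: dict) -> dict:
    """Copy reasoning_content from an API response message into the
    stored assistant message for trajectory export."""
    reasoning = api_message.get("reasoning_content")
    if reasoning:
        assistant_msg["reasoning"] = reasoning
    return assistant_msg

def to_api_message(assistant_msg: dict) -> dict:
    """Rebuild the message for the next turn, passing reasoning back
    via the reasoning_content field."""
    out = {k: v for k, v in assistant_msg.items() if k != "reasoning"}
    if "reasoning" in assistant_msg:
        out["reasoning_content"] = assistant_msg["reasoning"]
    return out
```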
The interactive CLI uses:

- Rich - For the welcome banner and styled panels
- prompt_toolkit - For the fixed input area with history, `patch_stdout`, slash command autocomplete, and floating completion menus
- KawaiiSpinner (in run_agent.py) - Animated kawaii faces during API calls; clean `┊` activity feed for tool execution results
Key components:

- `HermesCLI` class - Main CLI controller with commands and conversation loop
- `SlashCommandCompleter` - Autocomplete dropdown for `/` commands (type `/` to see all)
- `agent/skill_commands.py` - Scans skills and builds invocation messages (shared with gateway)
- `load_cli_config()` - Loads config, sets environment variables for terminal
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
CLI UX notes:

- Thinking spinner (during LLM API call) shows an animated kawaii face + verb (`(⌐■_■) deliberating...`)
- When the LLM returns tool calls, the spinner clears silently (no "got it!" noise)
- Tool execution results appear as a clean activity feed: `┊ {emoji} {verb} {detail} {duration}`
- "got it!" only appears when the LLM returns a final text response (`⚕ ready`)
- The prompt shows `⚕ ❯` when the agent is working, `❯` when idle
- Pasting 5+ lines auto-saves to `~/.hermes/pastes/` and collapses to a reference
- Multi-line input via Alt+Enter or Ctrl+J
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
- `/skill-name` - Invoke installed skills directly (e.g., `/axolotl`, `/gif-search`)
CLI uses quiet_mode=True when creating AIAgent to suppress verbose logging.
Every installed skill in ~/.hermes/skills/ is automatically registered as a slash command.
The skill name (from frontmatter or folder name) becomes the command: axolotl → /axolotl.
Implementation (agent/skill_commands.py, shared between CLI and gateway):
- `scan_skill_commands()` scans all SKILL.md files at startup
- `build_skill_invocation_message()` loads the SKILL.md content and builds a user-turn message
- The message includes the full skill content, a list of supporting files (not loaded), and the user's instruction
- Supporting files can be loaded on demand via the `skill_view` tool
- Injected as a user message (not system prompt) to preserve prompt caching
Adding a new built-in slash command:

- Add to the `COMMANDS` dict with a description
- Add a handler in the `process_command()` method
- For persistent settings, use `save_config_value()` to update config
The unified hermes command provides all functionality:
| Command | Description |
|---|---|
| `hermes` | Interactive chat (default) |
| `hermes chat -q "..."` | Single query mode |
| `hermes setup` | Configure API keys and settings |
| `hermes config` | View current configuration |
| `hermes config edit` | Open config in editor |
| `hermes config set KEY VAL` | Set a specific value |
| `hermes config check` | Check for missing config |
| `hermes config migrate` | Prompt for missing config interactively |
| `hermes status` | Show configuration status |
| `hermes doctor` | Diagnose issues |
| `hermes update` | Update to latest (checks for new config) |
| `hermes uninstall` | Uninstall (can keep configs for reinstall) |
| `hermes gateway` | Start gateway (messaging + cron scheduler) |
| `hermes gateway install` | Install gateway as system service |
| `hermes cron list` | View scheduled jobs |
| `hermes cron status` | Check if cron scheduler is running |
| `hermes version` | Show version info |
| `hermes pairing list/approve/revoke` | Manage DM pairing codes |
The gateway connects Hermes to Telegram, Discord, and WhatsApp.
```bash
# Telegram
TELEGRAM_BOT_TOKEN=123456:ABC-DEF...     # From @BotFather
TELEGRAM_ALLOWED_USERS=123456789,987654  # Comma-separated user IDs (from @userinfobot)

# Discord
DISCORD_BOT_TOKEN=MTIz...                # From Developer Portal
DISCORD_ALLOWED_USERS=123456789012345678 # Comma-separated user IDs

# Agent Behavior
HERMES_MAX_ITERATIONS=60                 # Max tool-calling iterations
MESSAGING_CWD=/home/myuser               # Terminal working directory for messaging
# Tool progress is configured in config.yaml (display.tool_progress: off|new|all|verbose)
```

Working directory by channel:

- CLI (`hermes` command): Uses the current directory (`.` → `os.getcwd()`)
- Messaging (Telegram/Discord): Uses `MESSAGING_CWD` (default: home directory)
This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
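This policy can be sketched as a small helper (hypothetical function, not an actual Hermes API):

```python
import os

def resolve_cwd(channel: str, configured: str = ".") -> str:
    """Pick the terminal working directory based on the channel."""
    if channel == "cli":
        # CLI: "." resolves to wherever the user ran `hermes`
        return os.getcwd() if configured == "." else configured
    # Messaging: consistent starting location, MESSAGING_CWD or home
    return os.environ.get("MESSAGING_CWD", os.path.expanduser("~"))
```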
IMPORTANT: By default, the gateway denies all users who are not in an allowlist or paired via DM.
The gateway checks `{PLATFORM}_ALLOWED_USERS` environment variables:

- If set: Only listed user IDs can interact with the bot
- If unset: All users are denied unless `GATEWAY_ALLOW_ALL_USERS=true` is set
Users can find their IDs:
- Telegram: Message @userinfobot
- Discord: Enable Developer Mode, right-click name → Copy ID
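The allowlist check can be sketched as follows (a simplified stand-in for the real gateway code; it omits the DM-pairing path, which also authorizes users):

```python
import os

def is_user_allowed(platform: str, user_id: str) -> bool:
    """Deny-by-default gate based on {PLATFORM}_ALLOWED_USERS."""
    allowed = os.environ.get(f"{platform.upper()}_ALLOWED_USERS", "")
    if allowed.strip():
        return user_id in {u.strip() for u in allowed.split(",")}
    # No allowlist set: deny everyone unless explicitly opened up
    return os.environ.get("GATEWAY_ALLOW_ALL_USERS", "").lower() == "true"
```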
Instead of static allowlists, users can pair via one-time codes:

- Unknown user DMs the bot → receives a pairing code
- Owner runs `hermes pairing approve <platform> <code>`
- User is permanently authorized
Security: 8-char codes, 1-hour expiry, rate-limited (1/10min/user), max 3 pending per platform, lockout after 5 failed attempts, chmod 0600 on data files.
Files: gateway/pairing.py, hermes_cli/pairing.py
Hooks fire at lifecycle points. Place hook directories in `~/.hermes/hooks/`:

```
~/.hermes/hooks/my-hook/
├── HOOK.yaml     # name, description, events list
└── handler.py    # async def handle(event_type, context): ...
```
Events: gateway:startup, session:start, session:reset, agent:start, agent:step, agent:end, command:*
The agent:step event fires each iteration of the tool-calling loop with tool names and results.
Files: gateway/hooks.py
When tool_progress is enabled in config.yaml, the bot sends status messages as it works:

- 💻 `ls -la` (terminal commands show the actual command)
- 🔍 `web_search`...
- 📄 `web_extract`...
- 🐍 `execute_code`... (programmatic tool calling sandbox)
- 🔀 `delegate_task`... (subagent delegation)
- ❓ `clarify`... (user question, CLI-only)

Modes:

- `new`: Only when switching to a different tool (less spam)
- `all`: Every single tool call
The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
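The keepalive pattern can be sketched with asyncio; `send_typing` here is a stand-in for a platform API call, not a real gateway function:

```python
import asyncio

async def typing_keepalive(send_typing, interval: float = 4.0) -> None:
    """Re-send the typing action every `interval` seconds until cancelled."""
    try:
        while True:
            await send_typing()
            await asyncio.sleep(interval)
    except asyncio.CancelledError:
        pass  # Normal shutdown when the agent finishes

async def with_typing(send_typing, work_coro, interval: float = 4.0):
    """Run work_coro while keeping the typing indicator alive."""
    task = asyncio.create_task(typing_keepalive(send_typing, interval))
    try:
        return await work_coro
    finally:
        task.cancel()
```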
Each platform has a dedicated toolset in toolsets.py:

- `hermes-telegram`: Full tools including terminal (with safety checks)
- `hermes-discord`: Full tools including terminal
- `hermes-whatsapp`: Full tools including terminal
Configuration files are stored in `~/.hermes/` for easy user access:

- `~/.hermes/config.yaml` - All settings (model, terminal, compression, etc.)
- `~/.hermes/.env` - API keys and secrets
When adding new configuration variables, you MUST follow this process:

- Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
- CRITICAL: Bump `_config_version` in `DEFAULT_CONFIG` when adding required fields
- This triggers migration prompts for existing users on next `hermes update` or `hermes setup`
Example:

```python
DEFAULT_CONFIG = {
    # ... existing config ...
    "new_feature": {
        "enabled": True,
        "option": "default_value",
    },
    # BUMP THIS when adding required fields
    "_config_version": 2,  # Was 1, now 2
}
```

- Add to `REQUIRED_ENV_VARS` or `OPTIONAL_ENV_VARS` in `hermes_cli/config.py`
- Include metadata for the migration system:

```python
OPTIONAL_ENV_VARS = {
    # ... existing vars ...
    "NEW_API_KEY": {
        "description": "What this key is for",
        "prompt": "Display name in prompts",
        "url": "https://where-to-get-it.com/",
        "tools": ["tools_it_enables"],  # What tools need this
        "password": True,               # Mask input
    },
}
```

- `hermes_cli/setup.py` - Add prompts in the setup wizard
- `cli-config.yaml.example` - Add an example with comments
- Update README.md if user-facing
The system uses `_config_version` to detect outdated configs:

- `check_for_missing_config()` compares the user config to `DEFAULT_CONFIG`
- `migrate_config()` interactively prompts for missing values
- Called automatically by `hermes update` and optionally by `hermes setup`
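The comparison step can be sketched as a recursive dict diff (a hypothetical helper; the real `check_for_missing_config()` may differ):

```python
def find_missing_keys(user_cfg: dict, default_cfg: dict, prefix: str = "") -> list:
    """Return dotted paths present in default_cfg but absent from user_cfg."""
    missing = []
    for key, default_val in default_cfg.items():
        path = f"{prefix}{key}"
        if key not in user_cfg:
            missing.append(path)
        elif isinstance(default_val, dict) and isinstance(user_cfg[key], dict):
            # Recurse into nested sections like "new_feature"
            missing.extend(find_missing_keys(user_cfg[key], default_val, path + "."))
    return missing
```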
API keys are loaded from `~/.hermes/.env`:

- `OPENROUTER_API_KEY` - Main LLM API access (primary provider)
- `FIRECRAWL_API_KEY` - Web search/extract tools
- `BROWSERBASE_API_KEY` / `BROWSERBASE_PROJECT_ID` - Browser automation
- `FAL_KEY` - Image generation (FLUX model)
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
Terminal tool configuration (in `~/.hermes/config.yaml`):

- `terminal.backend` - Backend: local, docker, singularity, modal, or ssh
- `terminal.cwd` - Working directory ("." = host CWD for local only; for remote backends set an absolute path inside the target, or omit to use the backend's default)
- `terminal.docker_image` - Image for Docker backend
- `terminal.singularity_image` - Image for Singularity backend
- `terminal.modal_image` - Image for Modal backend
- SSH: `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` in .env
Agent behavior (in `~/.hermes/.env`):

- `HERMES_MAX_ITERATIONS` - Max tool-calling iterations (default: 60)
- `MESSAGING_CWD` - Working directory for messaging platforms (default: ~)
- `display.tool_progress` in config.yaml - Tool progress: `off`, `new`, `all`, `verbose`
- `OPENAI_API_KEY` - Voice transcription (Whisper STT)
- `SLACK_BOT_TOKEN` / `SLACK_APP_TOKEN` - Slack integration (Socket Mode)
- `SLACK_ALLOWED_USERS` - Comma-separated Slack user IDs
- `HERMES_HUMAN_DELAY_MODE` - Response pacing: off/natural/custom
- `HERMES_HUMAN_DELAY_MIN_MS` / `HERMES_HUMAN_DELAY_MAX_MS` - Custom delay range
The terminal tool includes safety checks for potentially destructive commands (e.g., rm -rf, DROP TABLE, chmod 777, etc.):
Behavior by Backend:
- Docker/Singularity/Modal: Commands run unrestricted (isolated containers)
- Local/SSH: Dangerous commands trigger approval flow
Approval Flow (CLI):

```
⚠️ Potentially dangerous command detected: recursive delete
rm -rf /tmp/test
[o]nce | [s]ession | [a]lways | [d]eny
Choice [o/s/a/D]:
```
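An illustrative shape of the detection step (the real patterns and flow in `tools/approval.py` are certainly more thorough; these regexes are examples only):

```python
import re

# (pattern, human-readable reason) pairs — illustrative, not exhaustive
DANGEROUS_PATTERNS = [
    (r"\brm\s+(-[a-z]*r[a-z]*f|-[a-z]*f[a-z]*r)\b", "recursive delete"),
    (r"\bDROP\s+TABLE\b", "SQL table drop"),
    (r"\bchmod\s+777\b", "world-writable permissions"),
]

def detect_danger(command: str):
    """Return a reason string if the command looks dangerous, else None."""
    for pattern, reason in DANGEROUS_PATTERNS:
        if re.search(pattern, command, re.IGNORECASE):
            return reason
    return None
```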
Approval Flow (Messaging):

- Command is blocked with an explanation
- Agent explains the command was blocked for safety
- User must add the pattern to their allowlist via `hermes config edit` or run the command directly on their machine

Configuration:

- `command_allowlist` in `~/.hermes/config.yaml` stores permanently allowed patterns
- Add patterns via "always" approval or edit directly

Sudo Handling (Messaging):

- If sudo fails over messaging, the output includes a tip to add `SUDO_PASSWORD` to `~/.hermes/.env`
The process tool works alongside terminal for managing long-running background processes:
Starting a background process:

```
terminal(command="pytest -v tests/", background=true)
# Returns: {"session_id": "proc_abc123", "pid": 12345, ...}
```

Managing it with the process tool:

- `process(action="list")` -- show all running/recent processes
- `process(action="poll", session_id="proc_abc123")` -- check status + new output
- `process(action="log", session_id="proc_abc123")` -- full output with pagination
- `process(action="wait", session_id="proc_abc123", timeout=600)` -- block until done
- `process(action="kill", session_id="proc_abc123")` -- terminate
- `process(action="write", session_id="proc_abc123", data="y")` -- send stdin
- `process(action="submit", session_id="proc_abc123", data="yes")` -- send + Enter
Key behaviors:

- Background processes execute through the configured terminal backend (local/Docker/Modal/SSH/Singularity) -- never directly on the host unless `TERMINAL_ENV=local`
- The `wait` action blocks the tool call until the process finishes, times out, or is interrupted by a new user message
- PTY mode (`pty=true` on terminal) enables interactive CLI tools (Codex, Claude Code)
- In RL training, background processes are auto-killed when the episode ends (`tool_context.cleanup()`)
- In the gateway, sessions with active background processes are exempt from idle reset
- The process registry checkpoints to `~/.hermes/processes.json` for crash recovery
Files: tools/process_registry.py (registry + handler), tools/terminal_tool.py (spawn integration)
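The checkpoint/recover cycle can be sketched as follows (field names and the atomic-write strategy are assumptions, not the actual `tools/process_registry.py` code):

```python
import json
import os
import tempfile
from pathlib import Path

def checkpoint_processes(processes: dict, path: str) -> None:
    """Atomically write the registry so a crash never leaves a torn file."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=p.parent)
    with os.fdopen(fd, "w") as f:
        json.dump(processes, f)
    os.replace(tmp, p)  # Atomic rename on POSIX

def load_processes(path: str) -> dict:
    """Recover the registry after a restart; tolerate missing/corrupt files."""
    try:
        return json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}  # Fresh start if missing or corrupted
```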
Adding a tool requires changes in three files: the tool file itself, toolsets.py, and a discovery import in model_tools.py:
1. Create `tools/your_tool.py` with handler, schema, check function, and registry call:

```python
# tools/example_tool.py
import json
import os
from tools.registry import registry

def check_example_requirements() -> bool:
    """Check if required API keys/dependencies are available."""
    return bool(os.getenv("EXAMPLE_API_KEY"))

def example_tool(param: str, task_id: str = None) -> str:
    """Execute the tool and return a JSON string result."""
    try:
        result = {"success": True, "data": "..."}
        return json.dumps(result, ensure_ascii=False)
    except Exception as e:
        return json.dumps({"error": str(e)}, ensure_ascii=False)

EXAMPLE_SCHEMA = {
    "name": "example_tool",
    "description": "Does something useful.",
    "parameters": {
        "type": "object",
        "properties": {
            "param": {"type": "string", "description": "The parameter"}
        },
        "required": ["param"]
    }
}

registry.register(
    name="example_tool",
    toolset="example",
    schema=EXAMPLE_SCHEMA,
    handler=lambda args, **kw: example_tool(
        param=args.get("param", ""), task_id=kw.get("task_id")),
    check_fn=check_example_requirements,
    requires_env=["EXAMPLE_API_KEY"],
)
```

2. Add to `toolsets.py`: Add `"example_tool"` to `_HERMES_CORE_TOOLS` if it should be in all platform toolsets, or create a new toolset entry.

3. Add a discovery import in `model_tools.py`'s `_discover_tools()` list: `"tools.example_tool"`.
That's it. The registry handles schema collection, dispatch, availability checking, and error wrapping automatically. No edits to TOOLSET_REQUIREMENTS, handle_function_call(), get_all_tool_names(), or any other data structure.
Optional: Add to OPTIONAL_ENV_VARS in hermes_cli/config.py for the setup wizard, and to toolset_distributions.py for batch processing.
Special case: tools that need agent-level state (like todo, memory):
These are intercepted by run_agent.py's tool dispatch loop before handle_function_call(). The registry still holds their schemas, but dispatch returns a stub error as a safety fallback. See todo_tool.py for the pattern.
All tool handlers MUST return a JSON string. The registry's dispatch() wraps all exceptions in {"error": "..."} automatically.
Tools declare their requirements at registration time via check_fn and requires_env. The registry checks check_fn() when building tool definitions -- tools whose check fails are silently excluded.
Tools that maintain state (terminal, browser) require:

- A `task_id` parameter for session isolation between concurrent tasks
- A `cleanup_*()` function to release resources
- Cleanup is called automatically in run_agent.py after the conversation completes
Conversations are saved in ShareGPT format for training:
```
{"from": "system", "value": "System prompt with <tools>...</tools>"}
{"from": "human", "value": "User message"}
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
{"from": "gpt", "value": "Final response"}
```

Tool calls use `<tool_call>` XML tags, responses use `<tool_response>` tags, and reasoning uses `<think>` tags.
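A hedged sketch of converting OpenAI-format messages to these ShareGPT turns (the real trajectory exporter may handle reasoning and edge cases differently):

```python
import json

ROLE_MAP = {"system": "system", "user": "human", "assistant": "gpt", "tool": "tool"}

def to_sharegpt(messages: list) -> list:
    """Convert OpenAI-style message dicts to ShareGPT turns."""
    turns = []
    for msg in messages:
        value = msg.get("content") or ""
        if msg["role"] == "assistant" and msg.get("tool_calls"):
            # Serialize each tool call into <tool_call> tags
            calls = "\n".join(
                f"<tool_call>{json.dumps(tc['function'])}</tool_call>"
                for tc in msg["tool_calls"])
            value = f"{value}\n{calls}".strip()
        elif msg["role"] == "tool":
            value = f"<tool_response>{value}</tool_response>"
        turns.append({"from": ROLE_MAP[msg["role"]], "value": value})
    return turns
```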
```python
agent = AIAgent(save_trajectories=True)
agent.chat("Do something")
# Saves to trajectories/*.jsonl in ShareGPT format
```

For processing multiple prompts:
- Parallel execution with multiprocessing
- Content-based resume for fault tolerance (matches on prompt text, not indices)
- Toolset distributions control probabilistic tool availability per prompt
- Output: `data/<run_name>/trajectories.jsonl` (combined) + individual batch files
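Content-based resume can be sketched as keying completed work by a hash of the prompt text, so reordering or editing the dataset does not misalign resume state (hypothetical helpers, not batch_runner.py's actual code):

```python
import hashlib

def prompt_key(prompt: str) -> str:
    """Stable identity for a prompt, independent of its dataset index."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

def pending_prompts(all_prompts: list, completed: list) -> list:
    """Return prompts whose content has not yet been processed."""
    done = {prompt_key(p) for p in completed}
    return [p for p in all_prompts if prompt_key(p) not in done]
```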
```bash
python batch_runner.py \
    --dataset_file=prompts.jsonl \
    --batch_size=20 \
    --num_workers=4 \
    --run_name=my_run
```

Skills are on-demand knowledge documents the agent can load. Compatible with the agentskills.io open standard.
```
skills/
├── mlops/                  # Category folder
│   ├── axolotl/            # Skill folder
│   │   ├── SKILL.md        # Main instructions (required)
│   │   ├── references/     # Additional docs, API specs
│   │   ├── templates/      # Output formats, configs
│   │   └── assets/         # Supplementary files (agentskills.io)
│   └── vllm/
│       └── SKILL.md
├── .hub/                   # Skills Hub state (gitignored)
│   ├── lock.json           # Installed skill provenance
│   ├── quarantine/         # Pending security review
│   ├── audit.log           # Security scan history
│   ├── taps.json           # Custom source repos
│   └── index-cache/        # Cached remote indexes
```
Progressive disclosure (token-efficient):

- `skills_categories()` - List category names (~50 tokens)
- `skills_list(category)` - Name + description per skill (~3k tokens)
- `skill_view(name)` - Full content + tags + linked files
SKILL.md files use YAML frontmatter (agentskills.io format):
```yaml
---
name: skill-name
description: Brief description for listing
version: 1.0.0
metadata:
  hermes:
    tags: [tag1, tag2]
    related_skills: [other-skill]
---
# Skill Content...
```

Skills Hub — user-driven skill search/install from online registries (GitHub, ClawHub, Claude marketplaces, LobeHub). Not exposed as an agent tool — the model cannot search for or install skills. Users manage skills via `hermes skills ...` CLI commands or the `/skills` slash command in chat.
Key files:

- `tools/skills_tool.py` — Agent-facing skill list/view (progressive disclosure)
- `tools/skills_guard.py` — Security scanner (regex + LLM audit, trust-aware install policy)
- `tools/skills_hub.py` — Source adapters (GitHub, ClawHub, Claude marketplace, LobeHub), lock file, auth
- `hermes_cli/skills_hub.py` — CLI subcommands + `/skills` slash command handler
After making changes:

- Run `hermes doctor` to check setup
- Run `hermes config check` to verify config
- Test with `hermes chat -q "test message"`
- For new config options, test a fresh install: `rm -rf ~/.hermes && hermes setup`