Commit 2d0e71d

Cli/breaking flag cleanup (#167)
* feat(cli)!: rename --show-reasoning, add --very-verbose + --base-dir, fix verbose tool-output gate

  BREAKING CHANGE: --show-reasoning / HYPERAGENT_SHOW_REASONING removed. Use --reasoning-effort / HYPERAGENT_REASONING_EFFORT instead. Same accepted levels (low / medium / high / xhigh, default 'high' when the flag is given without a value), same wiring into the Copilot SDK session -- the old name is gone with no silent fallback.

  Added:
  - --very-verbose / -vv / HYPERAGENT_VERY_VERBOSE: extends --verbose so the full result body prints for *every* tool (audit progress, plugin enable/disable, module registration, intent reports, handler registration, etc.), not just sandbox tools. Plain --verbose keeps the leaner default (sandbox tool bodies only; one-line ✅ Done for everything else).
  - --base-dir <path> / HYPERAGENT_BASE_DIR: auto-enables both the fs-read and fs-write plugins at startup with the supplied directory as their baseDir. Directory is created if missing; symlinks are still rejected. Independent of --auto-approve.

  Fixed:
  - --verbose was silently dropping non-sandbox tool result bodies. The event handler returned early for anything other than execute_javascript / execute_bash, so plugin_info, module_info, report_intent, register_handler, etc. always rendered as a terse ✅ Done even in verbose mode. The early-return is gone; --verbose now prints sandbox tool bodies and --very-verbose prints every tool body.

  Tests: new tests/cli-parser.test.ts covers all new flags plus rejection of the removed --show-reasoning. just check is green (45 files, 2491 tests pass).

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

* fixup! feat(cli)!: rename --show-reasoning, add --very-verbose + --base-dir, fix verbose tool-output gate

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

* fixup! feat(cli)!: rename --show-reasoning, add --very-verbose + --base-dir, fix verbose tool-output gate

  Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>

---------

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
1 parent 8d3b3c6 commit 2d0e71d
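
The verbosity rule described in the commit message can be sketched as a single predicate. This is only an illustration: the real event handler is not part of this diff, so `SANDBOX_TOOLS`, `VerbosityFlags`, and `shouldPrintToolBody` are hypothetical names, and the assumption is that the handler decides per tool call whether to print the full result body or the one-line `✅ Done`.

```typescript
// Hypothetical sketch of the verbose / very-verbose gate; names are not from the repo.
const SANDBOX_TOOLS: ReadonlySet<string> = new Set([
  "execute_javascript",
  "execute_bash",
]);

interface VerbosityFlags {
  verbose: boolean;
  veryVerbose: boolean;
}

/** Returns true when the full tool result body should be printed. */
function shouldPrintToolBody(toolName: string, flags: VerbosityFlags): boolean {
  // Without --verbose nothing is expanded, regardless of the tool.
  if (!flags.verbose) return false;
  // --very-verbose expands every tool, sandbox or not.
  if (flags.veryVerbose) return true;
  // Plain --verbose expands sandbox tools only.
  return SANDBOX_TOOLS.has(toolName);
}

const plain: VerbosityFlags = { verbose: true, veryVerbose: false };
const full: VerbosityFlags = { verbose: true, veryVerbose: true };
console.log(shouldPrintToolBody("execute_bash", plain)); // true
console.log(shouldPrintToolBody("plugin_info", plain)); // false
console.log(shouldPrintToolBody("plugin_info", full)); // true
```

Expressing the rule as a predicate rather than an early return keeps the non-sandbox path reachable, which is the failure mode the fix addresses.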

8 files changed

Lines changed: 576 additions & 51 deletions

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -6,6 +6,19 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 
 ## [Unreleased]
 
+### Breaking Changes
+
+- **`--show-reasoning` / `HYPERAGENT_SHOW_REASONING` removed** — the flag was misnamed (it controls reasoning effort, not the display of reasoning). Use `--reasoning-effort [level]` / `HYPERAGENT_REASONING_EFFORT` instead. Same accepted levels (`low` / `medium` / `high` / `xhigh`, default `high`), same wiring into the Copilot SDK session — the old name is gone, no silent fallback.
+
+### Added
+
+- **`--very-verbose` / `-vv` / `HYPERAGENT_VERY_VERBOSE`** — extends `--verbose` so the full result body is printed for **every** tool (audit progress, plugin enable/disable, module registration, intent reports, handler registration, …), not just the sandbox tools. `--verbose` on its own keeps the leaner default (sandbox tool bodies only; one-line `✅ Done` for everything else).
+- **`--base-dir <path>` / `HYPERAGENT_BASE_DIR`** — auto-enables both the `fs-read` and `fs-write` plugins at startup with the supplied directory as their `baseDir`. The directory is created if missing and symlinks are still rejected. Independent of `--auto-approve` — the flag itself is the approval signal for the two first-party path-jailed plugins.
+
+### Fixed
+
+- **`--verbose` ignored non-sandbox tool result bodies** — the event handler returned early for anything other than `execute_javascript` / `execute_bash`, so `plugin_info`, `module_info`, `report_intent`, `register_handler`, and friends always rendered as a terse `✅ Done` even in verbose mode. The early-return is gone; `--verbose` now prints sandbox tool bodies and `--very-verbose` prints every tool body.
+
 ## [v0.6.1] - 2026-05-15
 
 ### Fixed

docs/USAGE.md

Lines changed: 30 additions & 26 deletions
@@ -6,31 +6,33 @@ Complete reference for HyperAgent configuration, features, and commands.
 
 | Flag | Description |
 | -------------------------- | -------------------------------------------------------------- |
-| `--model <name>` | LLM model (default: `claude-opus-4.6`) |
-| `--cpu-timeout <ms>` | CPU time limit per JS execution (default: 1000) |
-| `--wall-timeout <ms>` | Wall-clock backstop per execution (default: 5000) |
-| `--send-timeout <ms>` | Agent inactivity timeout (default: 300000) |
-| `--heap-size <MB>` | Guest heap size (default: 16) |
-| `--scratch-size <MB>` | Guest scratch size, includes stack (default: 16) |
-| `--profile <name>` | Apply resource profile at startup (stackable) |
-| `--skill <name>` | Invoke skill(s) before the prompt |
-| `--auto-approve` | Auto-approve all interactive prompts |
-| `--prompt "<text>"` | Non-interactive: send prompt, wait for completion, exit |
-| `--prompt-file <path>` | Read the non-interactive prompt from a file |
-| `--show-code` | Log generated JS to a timestamped file |
-| `--show-timing` | Log timing breakdown to a timestamped file |
-| `--show-reasoning [level]` | Set reasoning effort (low\|medium\|high\|xhigh, default: high) |
-| `--verbose` | Verbose output mode (scrolling reasoning, turn details) |
-| `--transcript` | Record session transcript to `~/.hyperagent/logs/` |
-| `--tune` | Capture LLM decision/reasoning logs to JSONL |
-| `--plugins-dir <path>` | Custom plugins directory (default: `./plugins`) |
-| `--list-models` | List available models and exit |
-| `--resume [id]` | Resume a previous session (latest if no ID given) |
-| `--skip-suggest` | Skip mandatory suggest_approach/API-discovery enforcement |
-| `--output-threshold <n>` | Large output threshold in bytes (default: 20480) |
-| `--debug` | Enable debug event/lifecycle logging |
-| `--version` | Show version and exit |
-| `--help` | Show help message |
+| `--model <name>` | LLM model (default: `claude-opus-4.6`) |
+| `--cpu-timeout <ms>` | CPU time limit per JS execution (default: 1000) |
+| `--wall-timeout <ms>` | Wall-clock backstop per execution (default: 5000) |
+| `--send-timeout <ms>` | Agent inactivity timeout (default: 300000) |
+| `--heap-size <MB>` | Guest heap size (default: 16) |
+| `--scratch-size <MB>` | Guest scratch size, includes stack (default: 16) |
+| `--profile <name>` | Apply resource profile at startup (stackable) |
+| `--skill <name>` | Invoke skill(s) before the prompt |
+| `--auto-approve` / `--yolo` | Auto-approve all interactive prompts |
+| `--base-dir <path>` | Auto-enable fs-read + fs-write with this directory as their base |
+| `--prompt "<text>"` | Non-interactive: send prompt, wait for completion, exit |
+| `--prompt-file <path>` | Read the non-interactive prompt from a file |
+| `--show-code` | Log generated JS to a timestamped file |
+| `--show-timing` | Log timing breakdown to a timestamped file |
+| `--reasoning-effort [level]` | Set reasoning effort (low\|medium\|high\|xhigh, default: high) |
+| `--verbose` | Stream reasoning + show sandbox tool result bodies |
+| `--very-verbose` / `-vv` | Like `--verbose` plus full result bodies for **every** tool (audit/registration/…) |
+| `--transcript` | Record session transcript to `~/.hyperagent/logs/` |
+| `--tune` | Capture LLM decision/reasoning logs to JSONL |
+| `--plugins-dir <path>` | Custom plugins directory (default: `./plugins`) |
+| `--list-models` | List available models and exit |
+| `--resume [id]` | Resume a previous session (latest if no ID given) |
+| `--skip-suggest` | Skip mandatory suggest_approach/API-discovery enforcement |
+| `--output-threshold <n>` | Large output threshold in bytes (default: 20480) |
+| `--debug` | Enable debug event/lifecycle logging |
+| `--version` | Show version and exit |
+| `--help` | Show help message |
 
 ## Environment Variables
 
@@ -52,8 +54,10 @@ All configuration is also available via environment variables (overridden by CLI
 | `HYPERAGENT_PROMPT_FILE` | _(none)_ | File containing the non-interactive prompt |
 | `HYPERAGENT_SKILL` | _(none)_ | Skill name(s) to invoke |
 | `HYPERAGENT_TUNE` | _(none)_ | Set to `1` to capture LLM decision logs |
-| `HYPERAGENT_SHOW_REASONING` | _(none)_ | Reasoning effort level (low/medium/high/xhigh) |
+| `HYPERAGENT_REASONING_EFFORT` | _(none)_ | Reasoning effort level (low/medium/high/xhigh) |
 | `HYPERAGENT_VERBOSE` | _(none)_ | Set to `1` for verbose output mode |
+| `HYPERAGENT_VERY_VERBOSE` | _(none)_ | Set to `1` for very-verbose output (full body for **every** tool) |
+| `HYPERAGENT_BASE_DIR` | _(none)_ | Base directory for fs-read/fs-write plugins (auto-enables both) |
 | `HYPERAGENT_LIST_MODELS` | _(none)_ | Set to `1` to list models and exit |
 | `HYPERAGENT_RESUME_SESSION` | _(none)_ | Session ID to resume, or `__last__` for latest |
 | `HYPERAGENT_PLUGINS_DIR` | _(none)_ | Custom plugins directory path |
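
The documented precedence for the reasoning-effort setting (the environment variable supplies a default, the CLI flag overrides it, and a bare flag means `high`) can be illustrated with a trimmed-down parser. This is a sketch, not the project's `parseCliArgs`: `resolveReasoningEffort` is a hypothetical helper written for this example, and only the validation list and normalisation logic mirror what the commit adds to `src/agent/cli-parser.ts`.

```typescript
// Accepted levels, as listed in the flag table above.
const VALID_REASONING_EFFORTS = ["low", "medium", "high", "xhigh"] as const;

// Lowercase the input and keep it only if it is a known level; "" means unset.
function normaliseReasoningEffort(raw: string | undefined): string {
  if (!raw) return "";
  const lower = raw.toLowerCase();
  return (VALID_REASONING_EFFORTS as readonly string[]).includes(lower)
    ? lower
    : "";
}

// Hypothetical helper: resolves the effective level from env plus argv.
function resolveReasoningEffort(
  argv: readonly string[],
  env: Record<string, string | undefined>,
): string {
  // The environment variable provides the starting value.
  let effort = normaliseReasoningEffort(env.HYPERAGENT_REASONING_EFFORT);
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] !== "--reasoning-effort") continue;
    const next = argv[i + 1]?.toLowerCase();
    if (
      next !== undefined &&
      (VALID_REASONING_EFFORTS as readonly string[]).includes(next)
    ) {
      effort = next; // explicit level: consume the following argument
      i++;
    } else {
      effort = "high"; // bare flag, or next token is not a level
    }
  }
  return effort;
}

console.log(resolveReasoningEffort([], { HYPERAGENT_REASONING_EFFORT: "LOW" })); // "low"
console.log(resolveReasoningEffort(["--reasoning-effort"], {})); // "high"
console.log(resolveReasoningEffort(["--reasoning-effort", "--verbose"], {})); // "high"
```

Because the level is optional, the parser only consumes the next token when it is a valid level, so a following flag such as `--verbose` is left for normal processing.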

src/agent/cli-parser.ts

Lines changed: 113 additions & 12 deletions
@@ -10,6 +10,38 @@
 import { readFileSync } from "node:fs";
 import type { MCPSetupCommand } from "./mcp/setup-commands.js";
 
+/**
+ * Reasoning-effort levels accepted by the Copilot SDK session config.
+ * The CLI flag handler and the env-var initialiser both validate against
+ * this list so an unexpected value (e.g. `HYPERAGENT_REASONING_EFFORT=potato`)
+ * is rejected before it can reach the SDK and cause a runtime error.
+ */
+const VALID_REASONING_EFFORTS = ["low", "medium", "high", "xhigh"] as const;
+
+/**
+ * Normalise a reasoning-effort string from env or CLI: lowercase, then
+ * accept only values in {@link VALID_REASONING_EFFORTS}. Anything else
+ * (including `undefined`/empty) maps to `""` which the agent treats as
+ * "unset" — the SDK falls back to the model's default.
+ */
+function normaliseReasoningEffort(raw: string | undefined): string {
+  if (!raw) return "";
+  const lower = raw.toLowerCase();
+  return (VALID_REASONING_EFFORTS as readonly string[]).includes(lower)
+    ? lower
+    : "";
+}
+
+/**
+ * Trim a base-dir string from env or CLI. A value of `" "` (whitespace-only)
+ * becomes `""`, which the agent treats as "unset" and skips the fs-read /
+ * fs-write auto-enable block — avoiding the surprising failure mode where
+ * `resolve("")` returns `process.cwd()` and silently makes CWD the sandbox.
+ */
+function normaliseBaseDir(raw: string | undefined): string {
+  return raw?.trim() ?? "";
+}
+
 export interface CliConfig {
   model: string;
   cpuTimeout: string;
@@ -19,8 +51,23 @@ export interface CliConfig {
   scratchSize: string;
   showCode: boolean;
   showTiming: boolean;
-  showReasoning: string;
+  /**
+   * Reasoning effort level requested for the session, one of
+   * "low" | "medium" | "high" | "xhigh" — or "" when unset.
+   * Wired to `state.reasoningEffort` at startup. CLI: `--reasoning-effort <level>`
+   * (default: "high" when flag given without value). Env: HYPERAGENT_REASONING_EFFORT.
+   */
+  reasoningEffort: string;
   verbose: boolean;
+  /**
+   * Very-verbose output: show full result bodies for *all* tools, including
+   * non-sandbox protocol tools (plugin_info, module_info, register_handler,
+   * suggest_approach, etc.). Plain `--verbose` only shows full bodies for
+   * `execute_javascript` / `execute_bash`; `--very-verbose` adds the rest.
+   * CLI: `--very-verbose` or `-vv`. Env: HYPERAGENT_VERY_VERBOSE.
+   * Implies `--verbose`; standalone `-vv` enables both.
+   */
+  veryVerbose: boolean;
   /** Render LLM markdown output with ANSI formatting (headings, code, lists). */
   markdown: boolean;
   transcript: boolean;
@@ -40,6 +87,15 @@ export interface CliConfig {
    * audit approvals, module registration). YOLO mode. 🎸
    */
   autoApprove: boolean;
+  /**
+   * Base directory for the fs-read and fs-write plugins. When set, both
+   * plugins are auto-enabled at startup with this directory as their
+   * `baseDir` config, replacing the default "unique-temp-dir" sandbox.
+   * CLI: `--base-dir <path>`. Env: HYPERAGENT_BASE_DIR.
+   * Independent of `--auto-approve` / `--yolo` (works on its own).
+   * Path is resolved relative to cwd; created if missing; symlinks rejected.
+   */
+  baseDir: string;
   /**
    * Non-interactive prompt — send this message, wait for completion, exit.
    * Combines with --auto-approve for fully autonomous operation.
@@ -102,8 +158,12 @@ Options:
   --scratch-size <MB>        Guest scratch size (default: ${defaults.scratchSize})
   --show-code                Log generated JS to ~/.hyperagent/logs/
   --show-timing              Log timing breakdown to ~/.hyperagent/logs/
-  --show-reasoning [level]   Set reasoning effort (low|medium|high|xhigh, default: high)
-  --verbose                  Verbose output mode (scrolling reasoning, turn details)
+  --reasoning-effort [level] Set reasoning effort (low|medium|high|xhigh, default: high)
+                             Env: HYPERAGENT_REASONING_EFFORT
+  --verbose                  Stream reasoning + show sandbox tool result bodies
+  --very-verbose, -vv        Like --verbose, plus show full bodies for ALL tools
+                             (including plugin_info, module_info, register_handler, etc.)
+                             Env: HYPERAGENT_VERY_VERBOSE
   --[no-]markdown            Toggle markdown rendering (default: on, env: HYPERAGENT_MARKDOWN)
                              Aliases: --md, --no-md
   --transcript               Record session transcript to ~/.hyperagent/logs/
@@ -115,7 +175,10 @@ Options:
   --profile <name>           Apply resource profile at startup (limits only)
                              Stack: --profile "web-research heavy-compute"
                              Profiles: default, file-builder, web-research, heavy-compute, mcp-network
-  --auto-approve             Auto-approve all interactive prompts (YOLO mode)
+  --auto-approve, --yolo     Auto-approve all interactive prompts (YOLO mode)
+  --base-dir <path>          Base dir for fs-read + fs-write (auto-enables both plugins,
+                             created if missing, symlinks rejected)
+                             Env: HYPERAGENT_BASE_DIR
   --prompt "<text>"          Send a prompt non-interactively and exit after completion
   --prompt-file <path>       Read prompt from a file (avoids shell quoting issues)
   --skill <name>             Invoke skill(s) before the prompt (e.g. --skill pptx-expert)
@@ -151,7 +214,10 @@ Environment variables (overridden by CLI flags):
   HYPERLIGHT_HEAP_SIZE_MB       Heap size (megabytes)
   HYPERLIGHT_SCRATCH_SIZE_MB    Scratch size (megabytes)
   HYPERAGENT_DEBUG              Set to '1' for debug logging
+  HYPERAGENT_REASONING_EFFORT   Reasoning effort level (low/medium/high/xhigh)
   HYPERAGENT_VERBOSE            Set to '1' for verbose output mode
+  HYPERAGENT_VERY_VERBOSE       Set to '1' for very-verbose output (all tool bodies)
+  HYPERAGENT_BASE_DIR           Base dir for fs-read + fs-write plugins
   HYPERAGENT_PROFILE            Profile name(s) to apply at startup
   HYPERAGENT_PROMPT             Non-interactive prompt text
   HYPERAGENT_PROMPT_FILE        Path to file containing prompt text
@@ -180,8 +246,18 @@ export function parseCliArgs(
     scratchSize: process.env.HYPERLIGHT_SCRATCH_SIZE_MB || "16",
     showCode: false,
     showTiming: false,
-    showReasoning: process.env.HYPERAGENT_SHOW_REASONING || "",
-    verbose: process.env.HYPERAGENT_VERBOSE === "1",
+    reasoningEffort: normaliseReasoningEffort(
+      process.env.HYPERAGENT_REASONING_EFFORT,
+    ),
+    // HYPERAGENT_VERY_VERBOSE implies HYPERAGENT_VERBOSE — keeps the env-var
+    // path symmetric with the CLI flag (--very-verbose implies --verbose).
+    // Without this, env-var-only --very-verbose would set `veryVerbose=true`
+    // but `verbose=false`, and the event-handler gate (`verboseOutput && ...`)
+    // would silently suppress all tool bodies.
+    verbose:
+      process.env.HYPERAGENT_VERBOSE === "1" ||
+      process.env.HYPERAGENT_VERY_VERBOSE === "1",
+    veryVerbose: process.env.HYPERAGENT_VERY_VERBOSE === "1",
     markdown: process.env.HYPERAGENT_MARKDOWN !== "0",
     transcript: process.env.HYPERAGENT_TRANSCRIPT === "1",
     listModels: process.env.HYPERAGENT_LIST_MODELS === "1",
@@ -191,6 +267,7 @@ export function parseCliArgs(
     tune: process.env.HYPERAGENT_TUNE === "1",
     profile: process.env.HYPERAGENT_PROFILE || "",
     autoApprove: process.env.HYPERAGENT_AUTO_APPROVE === "1",
+    baseDir: normaliseBaseDir(process.env.HYPERAGENT_BASE_DIR),
     prompt: process.env.HYPERAGENT_PROMPT || "",
     promptFile: process.env.HYPERAGENT_PROMPT_FILE || "",
     skill: process.env.HYPERAGENT_SKILL || "",
@@ -251,22 +328,34 @@ export function parseCliArgs(
       case "--show-timing":
         config.showTiming = true;
         break;
-      case "--show-reasoning": {
-        // --show-reasoning can optionally take an effort level argument
+      case "--reasoning-effort": {
+        // --reasoning-effort can optionally take an effort level argument
         const nextArg = argv[i + 1];
-        const validEfforts = ["low", "medium", "high", "xhigh"];
-        if (nextArg && validEfforts.includes(nextArg.toLowerCase())) {
-          config.showReasoning = nextArg.toLowerCase();
+        if (
+          nextArg &&
+          (VALID_REASONING_EFFORTS as readonly string[]).includes(
+            nextArg.toLowerCase(),
+          )
+        ) {
+          config.reasoningEffort = nextArg.toLowerCase();
           i++;
         } else {
           // No argument or invalid → default to "high"
-          config.showReasoning = "high";
+          config.reasoningEffort = "high";
         }
         break;
       }
       case "--verbose":
         config.verbose = true;
         break;
+      case "--very-verbose":
+      case "-vv":
+        // --very-verbose implies --verbose: standalone -vv enables both.
+        // Plain --verbose without --very-verbose only shows full bodies for
+        // sandbox tools (execute_javascript / execute_bash).
+        config.verbose = true;
+        config.veryVerbose = true;
+        break;
       case "--no-markdown":
       case "--no-md":
         config.markdown = false;
@@ -316,6 +405,18 @@ export function parseCliArgs(
       case "--yolo":
         config.autoApprove = true;
         break;
+      case "--base-dir": {
+        // Trim at parse-time: --base-dir " " should be rejected, not
+        // silently resolved to process.cwd() later via resolve("").
+        const raw = argv[++i] ?? "";
+        const trimmed = raw.trim();
+        if (!trimmed) {
+          console.error("--base-dir requires a non-empty path");
+          process.exit(1);
+        }
+        config.baseDir = trimmed;
+        break;
+      }
       case "--prompt":
         config.prompt = argv[++i] ?? "";
         if (!config.prompt) {
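
The reason the base-dir value is trimmed before use can be seen in a small standalone sketch. Node's `path.resolve("")` returns the current working directory, so an empty or whitespace-only value that reached `resolve` would quietly turn the working directory into the plugins' base. `normaliseBaseDir` below mirrors the helper added in this diff; `resolveBaseDir` is a hypothetical wrapper invented for this example, and the diff does not show how the agent actually resolves the path.

```typescript
import { resolve } from "node:path";

// Mirrors the helper in this commit: whitespace-only input becomes "".
function normaliseBaseDir(raw: string | undefined): string {
  return raw?.trim() ?? "";
}

// Hypothetical wrapper: only resolve when a real path was supplied.
function resolveBaseDir(raw: string | undefined): string | undefined {
  const trimmed = normaliseBaseDir(raw);
  // Returning undefined signals "unset", so the caller can skip the
  // fs-read / fs-write auto-enable step entirely.
  return trimmed ? resolve(trimmed) : undefined;
}

// The pitfall being guarded against: an empty string resolves to the cwd.
console.log(resolve("") === process.cwd()); // true

console.log(resolveBaseDir("   ")); // undefined
console.log(resolveBaseDir(undefined)); // undefined
console.log(resolveBaseDir("  ./work  ") === resolve("work")); // true
```

The CLI path in the diff goes one step further than the environment path: an empty `--base-dir` value is treated as an error and exits, while an empty `HYPERAGENT_BASE_DIR` is simply treated as unset.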

src/agent/commands.ts

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ const COMMANDS: readonly CommandEntry[] = Object.freeze([
       "\n" +
       "Levels: low | medium | high | xhigh\n" +
       "\n" +
-      "Also: --show-reasoning [level] CLI flag.",
+      "Also: --reasoning-effort [level] CLI flag.",
   },
   {
     completion: "/reasoning audit ",
