Allow nelson to act on cost-saving instructions#23
Allow nelson to act on cost-saving instructions#23LannyRipple wants to merge 1 commit intoharrymunro:mainfrom
Conversation
LannyRipple
commented
Mar 4, 2026
- Add cost-savings primer in SKILL.md referencing model-selection
- Add haiku only blocks in crew-briefing
- Add cost-weight value in crew-roles and royal-marines. Weights and adjustments are explained in model-selection
|
Will collide with PR #22 because of formatting changes. Easy fix and I'll rebase when #22 goes in. Had Claude do most of the heavy lifting and I like what we came up with. Here's an overview of what was decided and some of the motivation. Cost-Savings ModeWhat Was AddedNelson can now operate in cost-savings mode. When the sailing orders express cost concern ("keep costs low", "be aggressive with cost savings", etc.), Nelson loads Five files were changed or created:
Design DecisionsTriggering: Natural Language Inference OnlyCost-savings mode activates through the admiral's reading of the sailing orders, not through an explicit flag or parameter. This keeps the interface simple and consistent with how Nelson handles other constraints (budget, scope, compliance rules) — they are part of the sailing orders, not separate controls. The intensity of the language matters: modest language ("stay within budget") produces modest weight pressure; emphatic language ("be aggressive") licenses the admiral to push more roles below the haiku threshold. Weight System: Hybrid Static/DynamicEach crew role and marine specialisation carries a default
This avoids a purely static table (which would be wrong for too many edge cases) while also avoiding a fully dynamic system (which would be inconsistent and hard to reason about). The defaults are an anchor, not a ceiling. Binary Model Selection: Haiku or Admiral's ModelThe threshold is simple: weight ≤ 4 → haiku; weight ≥ 5 → admiral's model. No middle tier. The key reason: the binary approach has a forcing function built in. When a task sits in a gray zone — too complex for haiku, but not clearly worth the admiral's model — the correct response is to decompose the task further, not to reach for an intermediate model. Decomposition produces better parallelism and clearer deliverables. A middle tier would paper over under-decomposed tasks rather than surface them. A secondary reason: the admiral's model is already chosen by the user as the right capability level for the mission. Anything that doesn't qualify for haiku should match that ceiling. The "admiral sets the ceiling" pattern also scales cleanly — a user running sonnet as their admiral model gets haiku vs sonnet automatically; a user running opus gets haiku vs opus. Nelson doesn't need to reason about the relative capabilities of specific model versions. No Version StringsAll agents at weight ≥ 5 inherit the admiral's model via the Briefing Enhancements Are ConditionalThe three haiku-specific blocks (identity anchor, output format, task decomposition prompt) are only added when haiku is assigned. The motivation: less capable models benefit from explicit grounding — confirming they are operating as a real agent, not in a roleplay context — and from tighter output contracts. These additions would be noise in a full-capability briefing and are omitted there. Transparency: No AnnouncementNelson does not announce that cost-savings mode is active. Users who specify cost savings understand the implication. Adding a banner or summary would add ceremony without value. Future ConsiderationsIntroducing a Mid-Tier ModelThe current binary threshold works well when the weight distribution is bimodal — most tasks are clearly haiku-worthy or clearly not. If a user running opus notices a large class of weight-5–7 tasks that feel overserved by opus but underserved by haiku, a The sonnet/opus performance relationship is not stable across task types or model generations, which is an additional reason to keep any mid-tier designation opt-in rather than baked into the default weight logic. Per-Mission Weight OverridesThe weight table provides role-level defaults. A future extension could allow the sailing orders to override specific role weights directly (e.g., "treat PWOs as weight 3 for this mission"). This would let the admiral dial in cost pressure at the role level without editing the reference files. Cost Reporting in the Captain's LogThe captain's log currently records decisions, diffs, and validation evidence. Adding a cost section — estimated or actual token usage per agent, and total mission cost — would close the feedback loop for users trying to calibrate cost-savings settings across missions. Explicit Cost-Savings FlagNatural language inference is the current trigger. An explicit sailing-orders field (e.g., |
* Add cost-savings primer in SKILL.md referencing model-selection * Add haiku only blocks in crew-briefing * Add cost-weight value in crew-roles and royal-marines. Weights and adjustments are explained in model-selection
1140750 to
ec8f157
Compare
|
Hey @LannyRipple thanks for this. All sounds good. Please fix the conflict with the main branch. By the way if you haven't already you may wish to disable Claude Code author commits - it means that on merging these PRs to main Claude gets noted as the contributor and not yourself. See: https://www.reddit.com/r/ClaudeCode/comments/1qndhpd/how_to_remove_claude_as_coauthor_from_commits/ |