Commit 0de0198

feat(rules): add context window budget configuration for local/GPU systems
Adds opt-in AIWG_CONTEXT_WINDOW directive to guide parallel subagent limits and compaction aggressiveness for systems with smaller context windows (e.g., local GPU inference with 64k-256k tokens).

New files:
- agentic/code/addons/aiwg-utils/rules/context-budget.md
  Full rule with lookup table, formula, and platform guidance
- agentic/code/addons/aiwg-utils/rules/subagent-scoping.md
  Moved from .claude/rules/ to addon source for deployment

Modified:
- CLAUDE.md: Added commented-out AIWG_CONTEXT_WINDOW directive in team directives section

Key features:
- Opt-in only (no behavior change when unset)
- Limits concurrent parallel subagents, not total count
- Formula: max_parallel = max(1, floor(context_window / 50000))
- Deploys to all 8 platforms via aiwg-utils addon
- Guides compaction aggressiveness (aggressive/moderate/standard)

Parallel limits table:
- ≤64k: 1-2 parallel (aggressive compaction)
- 65k-128k: 2-4 parallel (moderate compaction)
- 129k-256k: 4-8 parallel (standard)
- 257k-512k: 8-12 parallel (relaxed)
- >512k: 12-20 parallel (normal)
- Unset: no limit (platform decides)

Closes #332
1 parent 2f806a6 commit 0de0198

3 files changed: +747 -0 lines changed

CLAUDE.md

Lines changed: 9 additions & 0 deletions
@@ -344,6 +344,15 @@ aiwg doctor
 
 <!-- TEAM DIRECTIVES: Add project-specific guidance below this line -->
 
+## Context Window Configuration (Optional)
+
+<!-- Uncomment and set if running on a local/GPU system with limited context.
+This guides parallel subagent limits and compaction aggressiveness.
+Leave commented out for Anthropic cloud systems (1M+ context).
+See @.claude/rules/context-budget.md for the full lookup table. -->
+
+<!-- AIWG_CONTEXT_WINDOW: 100000 -->
+
 ## What AIWG Is
 
 **AI Writing Guide (AIWG)** is a framework that provides AI coding assistants with structured workflows, specialized agents, and artifact management. It's not just documentation - it's an operational system that:

agentic/code/addons/aiwg-utils/rules/context-budget.md

Lines changed: 230 additions & 0 deletions

@@ -0,0 +1,230 @@

# Context Window Budget Rules

**Enforcement Level**: MEDIUM
**Scope**: All parallel subagent spawning and task scheduling
**Addon**: aiwg-utils (core, universal)

## Overview

When `AIWG_CONTEXT_WINDOW` is set in CLAUDE.md (or equivalent platform context file), agents MUST respect the declared context budget when deciding how many parallel subagents to spawn and how aggressively to compact context.

This directive is **opt-in**. When unset, behavior is unchanged and the platform decides parallelism limits. When set, it provides guidance for systems with smaller context windows (e.g., local GPU inference with 100k-256k context) where unconstrained parallel spawning causes client crashes.

## How to Configure

Users add the following directive to their CLAUDE.md team directives section (or equivalent for their platform):

```markdown
<!-- AIWG_CONTEXT_WINDOW: 100000 -->
```

The value is the total context window size in tokens. Common values:

| System | Typical Context Window (tokens) |
|--------|----------------------|
| Local GPU (small) | 32,000-64,000 |
| Local GPU (medium) | 100,000-128,000 |
| Local GPU (large) | 200,000-256,000 |
| Cloud API (standard) | 200,000-500,000 |
| Anthropic cloud | 1,000,000+ |

## Mandatory Rules

### Rule 1: Check for AIWG_CONTEXT_WINDOW Before Parallel Spawning

Before spawning multiple subagents in parallel, check whether `AIWG_CONTEXT_WINDOW` is declared in the project's context file.

**If unset**: No change to default behavior. Spawn as many parallel subagents as the task decomposition requires.

**If set**: Use the lookup table below to determine the maximum number of concurrent parallel subagents.

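The check in Rule 1 could be sketched as below. This is an illustration only: Rule 5 states there is no runtime enforcement, and `read_context_window` is a hypothetical helper, not part of AIWG. The source does not spell out how an active directive differs from the commented-out placeholder, so the sketch assumes that the HTML-comment form shown under "How to Configure" counts as set.

```python
import re
from typing import Optional

# Assumption: the active directive looks exactly like the form shown in
# "How to Configure", e.g. <!-- AIWG_CONTEXT_WINDOW: 100000 -->
_DIRECTIVE = re.compile(r"<!--\s*AIWG_CONTEXT_WINDOW:\s*(\d+)\s*-->")


def read_context_window(context_text: str) -> Optional[int]:
    """Return the declared context window in tokens, or None when unset."""
    match = _DIRECTIVE.search(context_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = "## Team directives\n<!-- AIWG_CONTEXT_WINDOW: 100000 -->\n"
    print(read_context_window(sample))          # 100000
    print(read_context_window("# no directive"))  # None
```

Returning `None` for the unset case keeps the "no change to default behavior" branch explicit for the caller.
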
### Rule 2: Parallel Subagent Limits

When `AIWG_CONTEXT_WINDOW` is set, respect these limits:

| Context Window (tokens) | Max Parallel Subagents | Compaction Behavior |
|----------------|----------------------|---------------------|
| Unset (default) | No limit (platform decides) | Normal |
| ≤64k | 1-2 | Aggressive: prefer sequential |
| 65k-128k | 2-4 | Moderate: batch in groups of 2-3 |
| 129k-256k | 4-8 | Standard |
| 257k-512k | 8-12 | Relaxed |
| >512k | 12-20 | Normal (cloud default) |

**Formula**: `max_parallel = min(20, max(1, floor(context_window / 50000)))`, that is, one parallel slot per 50,000 tokens, with a floor of 1 and a cap of 20. The formula never exceeds the table's upper bound, but near the bottom of a tier it can fall below the table's range (65,000 tokens gives 1, while the table lists 2-4). Examples 1 and 3 use the formula.

**Important**: This limits how many subagents run *concurrently*, not the total number of subagents an orchestrator can spawn. If a task decomposes into 20 atomic subtasks but the budget allows only 3 parallel, run them in waves of 3.

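As a minimal sketch, the Rule 2 formula can be written as a small function. The function name and the keyword parameters `cap` and `tokens_per_slot` are illustrative choices, not AIWG identifiers; the constants 20 and 50000 come from the formula above.

```python
def max_parallel(context_window: int, cap: int = 20, tokens_per_slot: int = 50_000) -> int:
    """Concurrent-subagent limit: max(1, floor(context_window / 50000)), capped at 20."""
    return min(cap, max(1, context_window // tokens_per_slot))


if __name__ == "__main__":
    # 32000 and 100000 are the values used in Examples 3 and 1 of this file.
    for window in (32_000, 100_000, 256_000, 2_000_000):
        print(window, max_parallel(window))
    # 32000 1
    # 100000 2
    # 256000 5
    # 2000000 20
```
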
### Rule 3: Prefer Sequential Batching When Budget Is Tight

When context budget is ≤128k tokens:

**FORBIDDEN**:
```
Task decomposes into 10 subtasks.
Spawn all 10 subagents simultaneously.
→ Client crashes or context overflow
```

**REQUIRED**:
```
Task decomposes into 10 subtasks.
Context budget allows 2-3 parallel.

Wave 1: Subagents 1, 2, 3 (parallel)
→ Wait for completion
Wave 2: Subagents 4, 5, 6 (parallel)
→ Wait for completion
Wave 3: Subagents 7, 8, 9 (parallel)
→ Wait for completion
Wave 4: Subagent 10
→ Done
```

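The wave plan in the REQUIRED block amounts to chunking the subtask list by the parallel limit. A minimal sketch, where `plan_waves` is a hypothetical helper name used only for illustration:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def plan_waves(subtasks: Sequence[T], max_parallel: int) -> List[List[T]]:
    """Split subtasks into consecutive waves of at most max_parallel items."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    return [
        list(subtasks[start:start + max_parallel])
        for start in range(0, len(subtasks), max_parallel)
    ]


if __name__ == "__main__":
    # 10 subtasks with 3 parallel slots, as in the REQUIRED example.
    print(plan_waves(list(range(1, 11)), 3))
    # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```

All subtasks still appear exactly once; only the grouping changes with the budget.
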
### Rule 4: Keep Individual Subagent Tasks Smaller Under Tight Budgets

When context budget is constrained, each subagent should produce less output to avoid accumulating too much context in the parent agent's window.

| Context Window (tokens) | Target Output Per Subagent |
|----------------|--------------------------|
| ≤64k | <3k tokens |
| 65k-128k | <5k tokens |
| 129k-256k | <10k tokens |
| >256k | Normal (no specific limit) |

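The Rule 4 table can be read as a threshold lookup. The sketch below assumes "k" means 1,000 tokens (so 64k is 64,000) and treats each target as an exclusive upper bound; `output_target` is an illustrative name, not an AIWG function.

```python
from typing import Optional

# (upper bound of the tier in tokens, output target per subagent in tokens)
_OUTPUT_TARGETS = [
    (64_000, 3_000),
    (128_000, 5_000),
    (256_000, 10_000),
]


def output_target(context_window: int) -> Optional[int]:
    """Return the per-subagent output target in tokens, or None above 256k."""
    for upper_bound, target in _OUTPUT_TARGETS:
        if context_window <= upper_bound:
            return target
    return None  # >256k: no specific limit


if __name__ == "__main__":
    print(output_target(32_000))   # 3000
    print(output_target(100_000))  # 5000
    print(output_target(500_000))  # None
```
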
### Rule 5: This Is Guidance, Not Programmatic Enforcement

These rules operate at the prompt/convention level. There is no runtime system that enforces them. Agents are expected to read the directive and adjust their behavior accordingly. The rules provide clear lookup tables so the agent can make the right decision without complex reasoning.

## Compaction Guidance

### Aggressive Compaction (≤64k)

- Run subagents sequentially or in pairs
- Summarize subagent results before spawning the next batch
- Prefer returning concise structured data (JSON, tables) over prose
- Each subagent task should be self-contained with minimal context

### Moderate Compaction (65k-128k)

- Batch subagents in groups of 2-3
- Wait for each batch to complete before starting the next
- Keep subagent output focused: specify word/token limits in the task description
- Summarize intermediate results when aggregating across batches

### Standard Operation (129k-256k)

- Parallel fan-out up to 4-8 subagents
- Normal output expectations
- Standard context management

### Relaxed/Normal (>256k)

- Parallel fan-out up to the Rule 2 limit (8-12 for 257k-512k, 12-20 above 512k)
- No special compaction needed
- Otherwise standard behavior, as if the directive were unset

## Integration

### With Subagent Scoping Rules

This rule extends `@.claude/rules/subagent-scoping.md` Rule 7 (Context Budget Estimation). When `AIWG_CONTEXT_WINDOW` is set, the context budget formula uses the declared value instead of assuming a default window size.

### With Ralph Loops

Ralph loops that spawn parallel subagents (e.g., via `rlm-batch` or `parallel-dispatch`) should batch subagent waves according to the context budget table. A Ralph loop on a 100k system should run 2-3 subagents per wave, not 10.

### With Orchestrator Fan-Out

Orchestrator patterns that decompose tickets into N subtasks should cap concurrent fan-out to the budget limit. All N subtasks still execute, just in sequential waves rather than all at once.

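For an orchestrator written in code, capped fan-out in sequential waves might look like the sketch below. It uses Python's `asyncio` purely as an illustration; `run_in_waves` and `fake_subagent` are hypothetical names, and `fake_subagent` stands in for whatever call actually launches a subagent on a given platform.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_in_waves(
    subtasks: Sequence[str],
    run_subagent: Callable[[str], Awaitable[str]],
    max_parallel: int,
) -> List[str]:
    """Run every subtask, at most max_parallel at a time, one wave after another."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    results: List[str] = []
    for start in range(0, len(subtasks), max_parallel):
        wave = subtasks[start:start + max_parallel]
        # Subagents within a wave run concurrently; the next wave
        # starts only after every subagent in this wave has finished.
        results.extend(await asyncio.gather(*(run_subagent(task) for task in wave)))
    return results


async def fake_subagent(task: str) -> str:
    """Placeholder for a real subagent call (assumption for illustration)."""
    await asyncio.sleep(0)
    return f"done: {task}"


if __name__ == "__main__":
    tasks = [f"subtask-{i}" for i in range(1, 9)]
    print(asyncio.run(run_in_waves(tasks, fake_subagent, max_parallel=2)))
```

Results come back in the original subtask order, because `asyncio.gather` preserves the order of its arguments within each wave.
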
## Examples

### Example 1: Local GPU System (100k Context)

```markdown
<!-- In CLAUDE.md team directives -->
<!-- AIWG_CONTEXT_WINDOW: 100000 -->
```

Agent receives task requiring 8 parallel subagents:

```
Context budget: 100k → max_parallel = max(1, floor(100000/50000)) = 2
Compaction: Moderate

Wave 1: Subagents 1, 2 (parallel)
Wave 2: Subagents 3, 4 (parallel)
Wave 3: Subagents 5, 6 (parallel)
Wave 4: Subagents 7, 8 (parallel)

Each subagent targets <5k output tokens.
```

### Example 2: Cloud System (No Directive)

```markdown
<!-- No AIWG_CONTEXT_WINDOW set -->
```

Agent receives task requiring 8 parallel subagents:

```
No budget constraint. Spawn all 8 in parallel.
Normal output expectations.
```

### Example 3: Small Local Model (32k Context)

```markdown
<!-- AIWG_CONTEXT_WINDOW: 32000 -->
```

Agent receives task requiring 5 parallel subagents:

```
Context budget: 32k → max_parallel = max(1, floor(32000/50000)) = 1
Compaction: Aggressive, fully sequential

Run subagents 1 through 5 sequentially.
Each subagent targets <3k output tokens.
Summarize each result before proceeding to next.
```

## Platform Applicability

This directive works across all AIWG-supported platforms:

| Platform | Where to Set |
|----------|-------------|
| Claude Code | `CLAUDE.md` team directives |
| GitHub Copilot | `.github/copilot-instructions.md` |
| OpenAI Codex | `~/.codex/instructions.md` or `AGENTS.md` |
| Cursor | `.cursor/rules/` |
| Warp | `WARP.md` |
| Factory AI | `.factory/` context |
| OpenCode | `.opencode/` context |
| Windsurf | `AGENTS.md` or `.windsurf/rules/` |

The `aiwg use` and `aiwg regenerate` commands will include the directive (commented out) in generated context files when scaffolding new projects.

## Checklist

Before spawning parallel subagents:

- [ ] Check if `AIWG_CONTEXT_WINDOW` is declared in the project context
- [ ] If set, look up the max parallel count from the table
- [ ] If the number of subtasks exceeds the max parallel count, batch them into sequential waves
- [ ] Adjust subagent output expectations based on budget tier
- [ ] If unset, proceed with normal parallel behavior

## References

- @.claude/rules/subagent-scoping.md - Subagent scoping and delegation rules
- @.claude/rules/subagent-scoping.md#rule-7 - Context budget estimation
- @.claude/rules/instruction-comprehension.md - Understanding user constraints

---

**Rule Status**: ACTIVE
**Last Updated**: 2026-02-09
