Commit 8228514

Merge pull request #193 from Opencode-DCP/refactor/simplify-extract-distillation
Simplify extract distillation to array format
2 parents 276ad40 + 92bff5e commit 8228514

7 files changed: +47 -88 lines changed

lib/messages/prune.ts

Lines changed: 3 additions & 2 deletions
@@ -6,7 +6,8 @@ import { extractParameterKey, buildToolIdList } from "./utils"
 import { getLastAssistantMessage, getLastUserMessage, isMessageCompacted } from "../shared-utils"
 import { AssistantMessage, UserMessage } from "@opencode-ai/sdk"
 
-const PRUNED_TOOL_INPUT_REPLACEMENT = "[Input removed to save context]"
+const PRUNED_TOOL_INPUT_REPLACEMENT =
+    "[content removed to save context, this is not what was written to the file, but a placeholder]"
 const PRUNED_TOOL_OUTPUT_REPLACEMENT =
     "[Output removed to save context - information superseded or no longer needed]"
 const getNudgeString = (config: PluginConfig): string => {
@@ -50,7 +51,7 @@ const SYNTHETIC_MESSAGE_ID = "msg_01234567890123456789012345"
 const SYNTHETIC_PART_ID = "prt_01234567890123456789012345"
 const SYNTHETIC_USER_MESSAGE_ID = "msg_01234567890123456789012346"
 const SYNTHETIC_USER_PART_ID = "prt_01234567890123456789012346"
-const REASONING_MODEL_USER_MESSAGE_CONTENT = "<system-context-injection/>"
+const REASONING_MODEL_USER_MESSAGE_CONTENT = "[internal: context sync - no response needed]"
 
 const buildPrunableToolsList = (
     state: SessionState,
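
The new `REASONING_MODEL_USER_MESSAGE_CONTENT` value is plain text that the system prompts in this commit tell the model to ignore. As a rough illustration of what "a user message containing only" the marker could mean in code, here is a minimal sketch. The `TextMessage` shape and the `isContextSyncMarker` helper are hypothetical and do not appear in the commit; only the marker string is taken from the diff.

```typescript
// Marker string copied from the diff; everything else below is a hypothetical illustration.
const REASONING_MODEL_USER_MESSAGE_CONTENT = "[internal: context sync - no response needed]"

// Hypothetical minimal message shape; the real SDK message types are richer.
interface TextMessage {
    role: "user" | "assistant"
    text: string
}

// Returns true only for a user message whose entire text is the marker
// (surrounding whitespace is ignored).
function isContextSyncMarker(message: TextMessage): boolean {
    return message.role === "user" && message.text.trim() === REASONING_MODEL_USER_MESSAGE_CONTENT
}
```

A message that merely quotes the marker inside other text is treated as ordinary input under this sketch.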

lib/prompts/discard-tool-spec.txt

Lines changed: 11 additions & 26 deletions
@@ -3,34 +3,29 @@ Discards tool outputs from context to manage conversation size and reduce noise.
 ## IMPORTANT: The Prunable List
 A `<prunable-tools>` list is provided to you showing available tool outputs you can discard when there are tools available for pruning. Each line has the format `ID: tool, parameter` (e.g., `20: read, /path/to/file.ts`). You MUST only use numeric IDs that appear in this list to select which tools to discard.
 
-**Note:** For `write` and `edit` tools, discarding removes the input content (the code being written/edited) while preserving the output confirmation. This is useful after completing a file modification when you no longer need the raw content in context.
-
 ## When to Use This Tool
 
-Use `discard` for removing tool outputs that are no longer needed **without preserving their content**:
-
-### 1. Task Completion (Clean Up)
-**When:** You have successfully completed a specific unit of work (e.g., fixed a bug, wrote a file, answered a question).
-**Action:** Discard the tools used for that task with reason `completion`.
+Use `discard` for removing tool content that is no longer needed
 
-### 2. Removing Noise (Garbage Collection)
-**When:** You have read files or run commands that turned out to be irrelevant, unhelpful, or outdated (meaning later tools have provided fresher, more valid information).
-**Action:** Discard these specific tool outputs immediately with reason `noise`.
+- **Noise:** Irrelevant, unhelpful, or superseded outputs that provide no value.
+- **Task Completion:** Work is complete and there's no valuable information worth preserving.
 
 ## When NOT to Use This Tool
 
-- **If you need to preserve information:** Keep the raw output in context rather than discarding it.
-- **If you'll need the output later:** Don't discard files you plan to edit, or context you'll need for implementation.
+- **If the output contains useful information:** Use `extract` instead to preserve key findings.
+- **If you'll need the output later:** Don't discard files you plan to edit or context you'll need for implementation.
 
 ## Best Practices
 - **Strategic Batching:** Don't discard single small tool outputs (like short bash commands) unless they are pure noise. Wait until you have several items to perform high-impact discards.
 - **Think ahead:** Before discarding, ask: "Will I need this output for an upcoming task?" If yes, keep it.
 
 ## Format
-The `ids` parameter is an array where the first element is the reason, followed by numeric IDs:
-`ids: ["reason", "id1", "id2", ...]`
 
-## Examples
+- `ids`: Array where the first element is the reason, followed by numeric IDs from the `<prunable-tools>` list
+
+Reasons: `noise` | `completion`
+
+## Example
 
 <example_noise>
 Assistant: [Reads 'wrong_file.ts']
@@ -40,17 +35,7 @@ This file isn't relevant to the auth system. I'll remove it to clear the context
 
 <example_completion>
 Assistant: [Runs tests, they pass]
-The tests passed. I'll clean up now.
+The tests passed and I don't need to preserve any details. I'll clean up now.
 [Uses discard with ids: ["completion", "20", "21"]]
 </example_completion>
 
-<example_keep>
-Assistant: [Reads 'auth.ts' to understand the login flow]
-I've understood the auth flow. I'll need to modify this file to add the new validation, so I'm keeping this read in context rather than discarding.
-</example_keep>
-
-<example_edit_completion>
-Assistant: [Edits 'auth.ts' to add validation]
-The edit was successful. I no longer need the raw edit content in context.
-[Uses discard with ids: ["completion", "15"]]
-</example_edit_completion>
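
The `discard` spec above packs the reason and the numeric IDs into a single `ids` array, with the reason first. A minimal sketch of how such an array might be split apart follows. The `parseDiscardIds` helper, its types, and its error messages are hypothetical and are not taken from the plugin source; only the array layout and the two reasons come from the spec.

```typescript
// The two reasons named in the spec.
type DiscardReason = "noise" | "completion"

// Hypothetical result shape for illustration only.
interface ParsedDiscard {
    reason: DiscardReason
    ids: number[]
}

// Type guard so the compiler can narrow a plain string to a DiscardReason.
function isDiscardReason(value: string | undefined): value is DiscardReason {
    return value === "noise" || value === "completion"
}

// Splits ["reason", "id1", "id2", ...] into a reason and a list of numeric IDs.
// Throws if the first element is not a known reason or any ID is not numeric.
function parseDiscardIds(ids: string[]): ParsedDiscard {
    const [reason, ...rest] = ids
    if (!isDiscardReason(reason)) {
        throw new Error(`First element must be a reason, got: ${String(reason)}`)
    }
    const numericIds = rest.map((id) => {
        if (!/^\d+$/.test(id)) {
            throw new Error(`Not a numeric ID: ${id}`)
        }
        return Number(id)
    })
    return { reason, ids: numericIds }
}
```

With the array from the completion example, `parseDiscardIds(["completion", "20", "21"])` yields the reason `completion` and the IDs 20 and 21.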

lib/prompts/extract-tool-spec.txt

Lines changed: 18 additions & 50 deletions
@@ -7,70 +7,38 @@ A `<prunable-tools>` list is provided to you showing available tool outputs you
 
 Use `extract` when you have gathered useful information that you want to **preserve in distilled form** before removing the raw outputs:
 
-### 1. Task Completion
-**When:** You have completed a unit of work and want to preserve key findings.
-**Action:** Extract with distillation scaled to the value of the content. High-value insights require comprehensive capture; routine completions can use lighter distillation.
-
-### 2. Knowledge Preservation
-**When:** You have read files, run commands, or gathered context that contains valuable information you'll need to reference later, but the full raw output is too large to keep.
-**Action:** Convert raw data into distilled knowledge. This allows you to remove large outputs (like full file reads) while keeping only the specific parts you need (like a single function signature or constant).
-
-## CRITICAL: Distillation Requirements
-
-You MUST provide distilled findings in the `distillation` parameter. This is not optional.
-
-- **Comprehensive Capture:** Distillation is not just a summary. It must be a high-fidelity representation of the technical details. If you read a file, the distillation should include function signatures, specific logic flows, constant values, and any constraints or edge cases discovered.
-- **Task-Relevant Verbosity:** Be as verbose as necessary to ensure that the "distilled" version is a complete substitute for the raw output for the task at hand. If you will need to reference a specific algorithm or interface later, include it in its entirety within the distillation.
-- **Extract Per-ID:** When extracting from multiple tools, your `distillation` object MUST contain a corresponding entry for EVERY ID being extracted. You must capture high-fidelity findings for each tool individually to ensure no signal is lost.
-- **Structure:** Map EVERY `ID` from the `ids` array to its specific distilled findings.
-Example: `{ "20": { ... }, "21": { ... } }`
-- Capture all relevant details (function names, logic, constraints) to ensure no signal is lost.
-- Prioritize information that is essential for the immediate next steps of your plan.
+- **Task Completion:** You completed a unit of work and want to preserve key findings.
+- **Knowledge Preservation:** You have context that contains valuable information, but also a lot of unnecessary detail - you only need to preserve some specifics.
 
 ## When NOT to Use This Tool
 
-- **If you need precise syntax:** If you'll need to edit a file, grep for exact strings, or reference precise syntax, keep the raw output. Distillation works for understanding; implementation often requires the original.
-- **If uncertain:** Prefer keeping over re-fetching. The cost of retaining context is lower than the cost of redundant tool calls.
+- **If you need precise syntax:** If you'll edit a file or grep for exact strings, keep the raw output.
+- **If uncertain:** Prefer keeping over re-fetching.
+
 
 ## Best Practices
-- **Technical Fidelity:** Ensure that types, parameters, and return values are preserved if they are relevant to upcoming implementation steps.
 - **Strategic Batching:** Wait until you have several items or a few large outputs to extract, rather than doing tiny, frequent extractions. Aim for high-impact extractions that significantly reduce context size.
 - **Think ahead:** Before extracting, ask: "Will I need the raw output for an upcoming task?" If you researched a file you'll later edit, do NOT extract it.
 
 ## Format
-The `ids` parameter is an array of numeric IDs as strings:
-`ids: ["id1", "id2", ...]`
 
-The `distillation` parameter is an object mapping each ID to its distilled findings:
-`distillation: { "id1": { ...findings... }, "id2": { ...findings... } }`
+- `ids`: Array of numeric IDs as strings from the `<prunable-tools>` list
+- `distillation`: Array of strings, one per ID (positional: distillation[0] is for ids[0], etc.)
+
+Each distillation string should capture the essential information you need to preserve - function signatures, logic, constraints, values, etc. Be as detailed as needed for your task.
 
 ## Example
 
 <example_extraction>
-Assistant: [Reads service implementation, types, and config]
-I'll preserve the full technical specification and implementation logic before extracting.
-[Uses extract with ids: ["10", "11", "12"], distillation: {
-  "10": {
-    "file": "src/services/auth.ts",
-    "signatures": [
-      "async function validateToken(token: string): Promise<User | null>",
-      "function hashPassword(password: string): string"
-    ],
-    "logic": "The validateToken function first checks the local cache before calling the external OIDC provider. It uses a 5-minute TTL for cached tokens.",
-    "dependencies": ["import { cache } from '../utils/cache'", "import { oidc } from '../config'"],
-    "constraints": "Tokens must be at least 128 chars long. hashPassword uses bcrypt with 12 rounds."
-  },
-  "11": {
-    "file": "src/types/user.ts",
-    "interface": "interface User { id: string; email: string; permissions: ('read' | 'write' | 'admin')[]; status: 'active' | 'suspended'; }",
-    "context": "The permissions array is strictly typed and used by the RBAC middleware."
-  },
-  "12": {
-    "file": "config/default.json",
-    "values": { "PORT": 3000, "RETRY_STRATEGY": "exponential", "MAX_ATTEMPTS": 5 },
-    "impact": "The retry strategy affects all outgoing HTTP clients in the core module."
-  }
-}]
+Assistant: [Reads auth service and user types]
+I'll preserve the key details before extracting.
+[Uses extract with:
+  ids: ["10", "11"],
+  distillation: [
+    "auth.ts: validateToken(token: string) -> User|null checks cache first (5min TTL) then OIDC. hashPassword uses bcrypt 12 rounds. Tokens must be 128+ chars.",
+    "user.ts: interface User { id: string; email: string; permissions: ('read'|'write'|'admin')[]; status: 'active'|'suspended' }"
+  ]
+]
 </example_extraction>
 
 <example_keep>

lib/prompts/system/system-prompt-both.txt

Lines changed: 5 additions & 4 deletions
@@ -9,10 +9,9 @@ TWO TOOLS FOR CONTEXT MANAGEMENT
 - `extract`: Extract key findings into distilled knowledge before removing raw outputs. Use when you need to preserve information.
 
 CHOOSING THE RIGHT TOOL
-Ask: "Do I need to preserve any information from this output?"
-- **No** → `discard` (default for cleanup)
-- **Yes** → `extract` (preserves distilled knowledge)
-- **Uncertain** → `extract` (safer, preserves signal)
+Ask: "Is this output clearly noise or irrelevant?"
+- **Yes** → `discard` (pure cleanup, no preservation)
+- **No** → `extract` (default - preserves key findings)
 
 Common scenarios:
 - Task complete, no valuable context → `discard`
@@ -39,5 +38,7 @@ When in doubt, keep it. Batch your actions and aim for high-impact prunes that s
 FAILURE TO PRUNE will result in context leakage and DEGRADED PERFORMANCES.
 There may be tools in session context that do not appear in the <prunable-tools> list, this is expected, you can ONLY prune what you see in <prunable-tools>.
 
+If you see a user message containing only `[internal: context sync - no response needed]`, this is an internal system marker used for context injection - it is NOT user input. Do not acknowledge it, do not respond to it, and do not mention it. Simply continue with your current task or wait for actual user input.
+
 </instruction>
 </system-reminder>

lib/prompts/system/system-prompt-discard.txt

Lines changed: 2 additions & 0 deletions
@@ -30,5 +30,7 @@ When in doubt, keep it. Batch your actions and aim for high-impact discards that
 FAILURE TO DISCARD will result in context leakage and DEGRADED PERFORMANCES.
 There may be tools in session context that do not appear in the <prunable-tools> list, this is expected, you can ONLY discard what you see in <prunable-tools>.
 
+If you see a user message containing only `[internal: context sync - no response needed]`, this is an internal system marker used for context injection - it is NOT user input. Do not acknowledge it, do not respond to it, and do not mention it. Simply continue with your current task or wait for actual user input.
+
 </instruction>
 </system-reminder>

lib/prompts/system/system-prompt-extract.txt

Lines changed: 2 additions & 0 deletions
@@ -30,5 +30,7 @@ When in doubt, keep it. Batch your actions and aim for high-impact extractions t
 FAILURE TO EXTRACT will result in context leakage and DEGRADED PERFORMANCES.
 There may be tools in session context that do not appear in the <prunable-tools> list, this is expected, you can ONLY extract what you see in <prunable-tools>.
 
+If you see a user message containing only `[internal: context sync - no response needed]`, this is an internal system marker used for context injection - it is NOT user input. Do not acknowledge it, do not respond to it, and do not mention it. Simply continue with your current task or wait for actual user input.
+
 </instruction>
 </system-reminder>

lib/strategies/tools.ts

Lines changed: 6 additions & 6 deletions
@@ -32,13 +32,13 @@ async function executePruneOperation(
     ids: string[],
     reason: PruneReason,
     toolName: string,
-    distillation?: Record<string, any>,
+    distillation?: string[],
 ): Promise<string> {
     const { client, state, logger, config, workingDirectory } = ctx
     const sessionId = toolCtx.sessionID
 
     logger.info(`${toolName} tool invoked`)
-    logger.info(JSON.stringify({ ids, reason }))
+    logger.info(JSON.stringify(reason ? { ids, reason } : { ids }))
 
     if (!ids || ids.length === 0) {
         logger.debug(`${toolName} tool called but ids is empty or undefined`)
@@ -171,17 +171,17 @@ export function createExtractTool(ctx: PruneToolContext): ReturnType<typeof tool
                 .array(tool.schema.string())
                 .describe("Numeric IDs as strings to extract from the <prunable-tools> list"),
             distillation: tool.schema
-                .record(tool.schema.string(), tool.schema.any())
+                .array(tool.schema.string())
                 .describe(
-                    "REQUIRED. An object mapping each ID to its distilled findings. Must contain an entry for every ID being pruned.",
+                    "REQUIRED. Array of strings, one per ID (positional: distillation[0] is for ids[0], etc.)",
                 ),
         },
         async execute(args, toolCtx) {
-            if (!args.distillation || Object.keys(args.distillation).length === 0) {
+            if (!args.distillation || args.distillation.length === 0) {
                 ctx.logger.debug(
                     "Extract tool called without distillation: " + JSON.stringify(args),
                 )
-                return 'Missing distillation. You must provide distillation data when using extract. Format: distillation: { "id": { ...findings... } }'
+                return "Missing distillation. You must provide a distillation string for each ID."
             }
 
             // Log the distillation for debugging/analysis
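
The new schema makes `distillation` positional: `distillation[0]` belongs to `ids[0]`, and so on. The hunk above only checks that the array is non-empty. The sketch below shows one way the pairing could be made explicit, with a length check added as an assumption of this example rather than something the commit does. `pairDistillations` is a hypothetical helper name.

```typescript
// Pairs each ID with the distillation string at the same index.
// The length check is an assumption of this sketch; the commit itself only
// rejects a missing or empty distillation array.
function pairDistillations(ids: string[], distillation: string[]): Map<string, string> {
    if (ids.length !== distillation.length) {
        throw new Error(
            `Expected ${ids.length} distillation strings, got ${distillation.length}`,
        )
    }
    const paired = new Map<string, string>()
    ids.forEach((id, index) => {
        const text = distillation[index]
        // Unreachable after the length check; kept so the lookup stays type-safe
        // under stricter compiler settings.
        if (text === undefined) {
            throw new Error(`No distillation for ID ${id}`)
        }
        paired.set(id, text)
    })
    return paired
}

// Usage with the IDs from the extract example in this commit (strings abbreviated).
const paired = pairDistillations(["10", "11"], ["auth.ts: ...", "user.ts: ..."])
console.log(paired.get("11"))
```

Compared with the previous `Record<string, any>` shape, a positional array keeps the schema to plain strings, but it relies on the caller keeping both arrays in the same order.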
