feat(client): add hard compaction option preserving system segment by HUQIANTAO · Pull Request #2522 · Hmbown/CodeWhale

HUQIANTAO · 2026-06-01T16:24:07Z

Summary

Add a hard compaction mode that replaces the middle conversation history with a single summary message while preserving the system prompt segment and the N most recent messages (default 8). This is opt-in via the hard_enabled configuration flag.

Background

The existing compaction system uses a soft-seam approach: it appends <archived_context> summary blocks to the message array without removing old messages. This preserves all verbatim history but keeps growing the message array. When the context window is under pressure, the soft-seam approach may not free enough budget.

Hard compaction is more aggressive: it replaces the middle history entirely with a summary message. The trade-off is that the verbatim text of old messages is lost and only the summary survives, but the system prompt segment remains byte-identical, so the prefix cache stays hot after compaction.

Design

Message structure after hard compaction

[summary_message, recent_message_1, recent_message_2, ..., recent_message_8]

The system prompt (stored separately in session.system_prompt) is never touched. The summary message replaces the middle history, and the recent tail is preserved verbatim.
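
The layout above can be sketched as follows. This is a minimal illustration rather than the PR's code: the `Message` struct is a simplified stand-in for the crate's type, `assemble_compacted` is a hypothetical helper name, and the summary is given the `user` role as described in the review discussion.

```rust
/// Simplified stand-in for the crate's `Message` type (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Hypothetical helper: build the post-compaction array, which is one
/// summary message followed by the last `keep_recent` messages, verbatim.
pub fn assemble_compacted(
    messages: &[Message],
    summary: &str,
    keep_recent: usize,
) -> Vec<Message> {
    // Everything before `split` is represented only by the summary.
    let split = messages.len().saturating_sub(keep_recent);
    let mut out = Vec::with_capacity(messages.len() - split + 1);
    out.push(Message {
        role: "user".to_string(),
        content: summary.to_string(),
    });
    // The recent tail is copied unchanged.
    out.extend_from_slice(&messages[split..]);
    out
}
```

The system prompt does not appear anywhere in this function, which mirrors the design point that it lives outside the message array and is left alone.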

Configuration

Two new fields in CompactionConfig:

hard_enabled: bool (default: false) — opt-in flag
hard_keep_recent: usize (default: 8) — number of recent messages to preserve
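
As a rough sketch of how these two fields and their defaults could be expressed, assuming a reduced `CompactionConfig` that contains only the new fields (the real struct has others, such as the `cache_summary` flag mentioned in the review) and a hypothetical `hard_compaction_enabled` helper:

```rust
/// Reduced stand-in for the crate's `CompactionConfig`, showing only the
/// two fields this PR adds (assumption: the real struct has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct CompactionConfig {
    /// Opt-in flag for hard compaction.
    pub hard_enabled: bool,
    /// Number of most recent messages kept verbatim.
    pub hard_keep_recent: usize,
}

impl Default for CompactionConfig {
    fn default() -> Self {
        Self {
            hard_enabled: false,
            hard_keep_recent: 8,
        }
    }
}

/// Hypothetical helper: opt in to hard compaction, keeping other defaults.
pub fn hard_compaction_enabled() -> CompactionConfig {
    CompactionConfig {
        hard_enabled: true,
        ..CompactionConfig::default()
    }
}
```

Using struct update syntax for the opt-in case keeps callers insulated from fields added later.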

API

plan_hard_compaction(messages, keep_recent) -> Option<HardCompactionPlan>: Plan which messages to summarize and which to keep.
compact_hard_safe(client, messages, config, workspace) -> Result<HardCompactionResult>: Execute the hard compaction using the existing LLM summarization infrastructure.

Summary format

The summary is wrapped in <hard_compaction_summary> tags so the model can distinguish it from verbatim history:

<hard_compaction_summary>
[LLM-generated summary of the middle history]
</hard_compaction_summary>
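
A small sketch of the wrapping step, assuming a hypothetical helper named `wrap_hard_summary`. Trimming surrounding whitespace from the LLM output is an assumption made here for tidiness and is not stated in the PR.

```rust
/// Hypothetical helper: wrap an LLM-generated summary in the
/// `<hard_compaction_summary>` tags described above.
pub fn wrap_hard_summary(summary: &str) -> String {
    // Assumption: surrounding whitespace is trimmed before wrapping.
    format!(
        "<hard_compaction_summary>\n{}\n</hard_compaction_summary>",
        summary.trim()
    )
}
```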

Why this matters

Hard compaction is the highest-leverage tool for freeing context budget when the window is under pressure. Unlike soft seams (which keep growing the message array), hard compaction replaces the middle history entirely. The system prompt segment (which is the largest stable block) is never touched, so the prefix cache remains hot after compaction.

This is especially valuable for long sessions where the context window is exhausted: instead of starting a new cycle (which resets the prefix cache), hard compaction can free enough budget to continue the current cycle with the prefix cache intact.

Testing

3 new unit tests:

plan_hard_compaction_returns_none_when_too_few_messages: Verify graceful handling of insufficient messages.
plan_hard_compaction_preserves_recent_tail: Verify the recent tail is preserved correctly.
hard_compaction_config_defaults: Verify the default configuration values.

All 59 compaction tests pass.
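
For illustration, the two planning tests might look roughly like the sketch below. It is not the PR's test code: the planner is made generic so it does not depend on the crate's `Message` type, and `MIN_SUMMARIZE_MESSAGES = 4` is an assumed value, since the real constant is not shown in this description.

```rust
/// Assumed value for illustration; the crate's real constant is not shown.
const MIN_SUMMARIZE_MESSAGES: usize = 4;

#[derive(Debug, PartialEq)]
pub struct HardCompactionPlan {
    pub summarize_indices: Vec<usize>,
    pub recent_indices: Vec<usize>,
}

/// Same splitting logic as the PR's planner, made generic over the
/// element type so this sketch stays self-contained.
pub fn plan_hard_compaction<T>(
    messages: &[T],
    keep_recent: usize,
) -> Option<HardCompactionPlan> {
    let len = messages.len();
    if len < keep_recent + MIN_SUMMARIZE_MESSAGES {
        return None;
    }
    let summarize_end = len.saturating_sub(keep_recent);
    Some(HardCompactionPlan {
        summarize_indices: (0..summarize_end).collect(),
        recent_indices: (summarize_end..len).collect(),
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn plan_hard_compaction_returns_none_when_too_few_messages() {
        let messages = vec!["m"; 5];
        assert!(plan_hard_compaction(&messages, 8).is_none());
    }

    #[test]
    fn plan_hard_compaction_preserves_recent_tail() {
        let messages = vec!["m"; 12];
        let plan = plan_hard_compaction(&messages, 8).expect("enough messages");
        assert_eq!(plan.summarize_indices, vec![0usize, 1, 2, 3]);
        assert_eq!(plan.recent_indices, (4..12).collect::<Vec<usize>>());
    }
}
```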

Files changed

crates/tui/src/compaction.rs: Add hard compaction types, functions, and tests (223 lines added)

gemini-code-assist

Code Review

This pull request introduces a hard compaction mode to crates/tui/src/compaction.rs that replaces the middle conversation history with a single summary message while preserving the system prompt and recent messages. The review feedback identifies three issues: first, plan_hard_compaction can split tool calls and their corresponding results, which violates API contracts and causes errors; second, generating a summary_prompt in compact_hard_safe defeats prefix cache stability and causes redundancy; and third, the workspace parameter is unused and should be prefixed with an underscore to avoid compiler warnings.

gemini-code-assist · 2026-06-01T16:25:07Z

+pub fn plan_hard_compaction(
+    messages: &[Message],
+    keep_recent: usize,
+) -> Option<HardCompactionPlan> {
+    let len = messages.len();
+    if len < keep_recent + MIN_SUMMARIZE_MESSAGES {
+        return None;
+    }
+
+    let summarize_end = len.saturating_sub(keep_recent);
+    let summarize_indices: Vec<usize> = (0..summarize_end).collect();
+    let recent_indices: Vec<usize> = (summarize_end..len).collect();
+
+    Some(HardCompactionPlan {
+        summarize_indices,
+        recent_indices,
+    })
+}


The current implementation of plan_hard_compaction splits the message history at an arbitrary index (len - keep_recent). This can easily split a tool call and its corresponding tool result (e.g., the tool call is summarized, but the tool result is kept in the recent tail, or vice versa).

The DeepSeek and OpenAI APIs require every tool-role message to be preceded by an assistant message containing the matching tool_calls ID. If a tool result is orphaned in the active messages array, the API will reject the request with a 400 error.

To prevent this, we should adjust the summarize_end boundary backwards if any message at or after summarize_end is a ToolResult whose corresponding ToolUse is before summarize_end.
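
One way the suggested boundary adjustment could be sketched is shown below. This is an illustration under stated assumptions, not code from the PR: the `Message` enum is a simplified model in which each tool call and tool result is its own message, and `adjust_summarize_end` is a hypothetical helper name.

```rust
use std::collections::HashMap;

/// Simplified message model (assumption): the crate's real `Message`
/// carries tool calls and results inside richer content structures.
#[derive(Debug, Clone)]
pub enum Message {
    Text { role: String, content: String },
    ToolUse { id: String },
    ToolResult { tool_use_id: String },
}

/// Hypothetical helper: move `summarize_end` backwards until no tool
/// result in the kept tail refers to a tool call in the summarized part.
pub fn adjust_summarize_end(messages: &[Message], summarize_end: usize) -> usize {
    let mut end = summarize_end.min(messages.len());

    // Map each tool-call ID to the index of the message that issued it.
    let mut call_index: HashMap<&str, usize> = HashMap::new();
    for (i, m) in messages.iter().enumerate() {
        if let Message::ToolUse { id } = m {
            call_index.insert(id.as_str(), i);
        }
    }

    // Scan the tail from the newest message down to the boundary. When a
    // kept tool result points at a call before the boundary, pull the
    // boundary back to that call; the scan then continues over the
    // messages that just joined the tail.
    let mut i = messages.len();
    while i > end {
        i -= 1;
        if let Message::ToolResult { tool_use_id } = &messages[i] {
            if let Some(&pos) = call_index.get(tool_use_id.as_str()) {
                if pos < end {
                    end = pos;
                }
            }
        }
    }
    end
}
```

Because the boundary only ever moves backwards, the adjusted plan may summarize fewer messages than originally intended, so a caller would probably want to re-check the minimum-messages threshold after adjusting.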

gemini-code-assist · 2026-06-01T16:25:07Z

+                let summary_prompt = if config.cache_summary {
+                    Some(SystemPrompt::Text(format!(
+                        "Previous conversation summary:\n{summary_text}"
+                    )))
+                } else {
+                    None
+                };
+
+                return Ok(HardCompactionResult {
+                    messages: new_messages,
+                    summary_prompt,
+                    removed_messages: plan.summarize_indices,
+                });


In compact_hard_safe, if config.cache_summary is enabled, a summary_prompt is generated and returned in HardCompactionResult.

If the caller merges this summary_prompt into the system prompt (using merge_system_prompts), the system prompt will be modified. This completely defeats the primary design goal of hard compaction, which is to keep the system prompt byte-identical so that the prefix cache remains hot.

Furthermore, since the summary is already injected directly into the messages array as summary_msg (role: user), adding it to the system prompt is redundant and will cause the model to see the summary twice.

We should set summary_prompt to None for hard compaction to guarantee prefix cache stability.

return Ok(HardCompactionResult {
    messages: new_messages,
    summary_prompt: None,
    removed_messages: plan.summarize_indices,
});

gemini-code-assist · 2026-06-01T16:25:07Z

+pub async fn compact_hard_safe(
+    client: &DeepSeekClient,
+    messages: &[Message],
+    config: &CompactionConfig,
+    workspace: Option<&Path>,
+) -> Result<HardCompactionResult> {


The workspace parameter is passed to compact_hard_safe but is never used within the function body. This will trigger an unused_variables compiler warning in strict builds.

If this parameter is kept for signature compatibility with compact_messages_safe, please prefix it with an underscore (_workspace).

Suggested change

 pub async fn compact_hard_safe(
     client: &DeepSeekClient,
     messages: &[Message],
     config: &CompactionConfig,
-    workspace: Option<&Path>,
+    _workspace: Option<&Path>,
 ) -> Result<HardCompactionResult> {

Add hard compaction mode that replaces the middle conversation history with a single summary message while preserving the system prompt segment and the last N recent messages (default 8). This is opt-in via the hard_enabled configuration flag.

Hard compaction differs from the existing soft-seam approach:
- Soft seams append <archived_context> blocks without removing messages
- Hard compaction replaces the middle messages with a summary

Both approaches preserve prefix cache stability because the system prompt segment (stored separately in session.system_prompt) is never touched. The summary message replaces the middle history, and the recent tail is preserved verbatim.

Why hard compaction matters: When the context window is under pressure, the existing soft-seam approach keeps growing the message array (appending summary blocks without removing old messages). Hard compaction is more aggressive: it replaces the middle history entirely, freeing more budget. The trade-off is that old messages are lost (they're summarized), but the system prompt segment remains byte-identical, so the prefix cache stays hot after compaction.

Changes:
- Add HardCompactionPlan and HardCompactionResult types
- Add plan_hard_compaction() function that plans which messages to summarize and which to keep
- Add compact_hard_safe() async function that executes the hard compaction using the existing LLM summarization infrastructure
- Add hard_enabled and hard_keep_recent fields to CompactionConfig (defaults: false and 8)
- Add 3 unit tests for the new functionality

HUQIANTAO force-pushed the feat/hard-compact-system-preserve branch from 75a02bf to be5745d Compare June 1, 2026 16:40

HUQIANTAO force-pushed the feat/hard-compact-system-preserve branch from be5745d to 7321528 Compare June 3, 2026 12:23

Hmbown mentioned this pull request Jun 3, 2026

v0.9.0 Open PR harvest: merge, supersede, or close long-lived branches #2722

Open

HUQIANTAO wants to merge 1 commit into Hmbown:main from HUQIANTAO:feat/hard-compact-system-preserve
