Skip to content

refactor(client): relocate turn_meta to tail of user message for prefix stability#2517

Open
HUQIANTAO wants to merge 1 commit into
Hmbown:mainfrom
HUQIANTAO:feat/turn-meta-tail-relocate
Open

refactor(client): relocate turn_meta to tail of user message for prefix stability#2517
HUQIANTAO wants to merge 1 commit into
Hmbown:mainfrom
HUQIANTAO:feat/turn-meta-tail-relocate

Conversation

@HUQIANTAO
Copy link
Copy Markdown
Contributor

Summary

Move the <turn_meta> content block from position 0 to the last position in each user message's content array. This ensures the leading bytes of every user message stay stable across date boundaries, model route changes, and working set mutations, which directly improves DeepSeek's KV prefix cache hit rate.

Background

The <turn_meta> block contains volatile per-turn data:

  • Current local date (chrono::Local::now().format("%Y-%m-%d"))
  • Auto model route (when auto_model is enabled)
  • Auto reasoning effort (when auto-selected)
  • Working set summary (file paths observed this session)

When this block sits at content position 0, the leading bytes of every user message change at every date boundary. DeepSeek's KV prefix cache matches byte sequences from the start of each message; a change at position 0 invalidates the cache for the entire user message prefix, forcing a full re-prefill on the next turn.

What changed

The reorder is a single swap in user_text_message_with_turn_metadata_for_route:

Before:

content: vec![
    turn_metadata_block(...),  // volatile: date, model route, working set
    ContentBlock::Text { text, .. },  // stable: user input
]

After:

content: vec![
    ContentBlock::Text { text, .. },  // stable: user input (now at position 0)
    turn_metadata_block(...),  // volatile: date, model route, working set (now at tail)
]

Why this is safe

  • The model does not depend on the position of metadata within a single user message. V4 processes all content blocks in a user message sequentially regardless of order.
  • The messages_with_turn_metadata() function returns stored messages unchanged (it clones self.session.messages). The metadata is baked into messages at append time, not injected at request time. The reorder only affects the initial storage layout.
  • Tool result messages are unaffected; they carry a single ToolResult content block and are never mixed with turn_meta.

Testing

  • Updated 4 existing tests that previously asserted content.first() was the turn_meta block to use content.last() instead.
  • Added new test user_message_turn_meta_is_appended_not_prepended that explicitly verifies the user text is at position 0 and turn_meta is at position 1.
  • All 8 turn_metadata-related tests pass.

Cache impact

For sessions that span midnight or change model routes mid-conversation, the user message prefix bytes now remain stable. This means:

  • Cross-date sessions no longer bust the user message prefix cache at the date boundary.
  • Model route changes (auto_model switching between v4-pro and v4-flash) no longer invalidate the user message prefix.
  • Working set mutations (new files observed) only affect the trailing metadata block, not the leading user input.

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request swaps the order of content blocks in user messages so that the user text is placed first and the turn metadata block is placed last. This change prevents DeepSeek's KV prefix cache from being invalidated by volatile metadata changes (like dates). While the tests in the PR were updated to reflect this change, a review comment correctly identifies that an existing test (post_edit_hook_injects_diagnostics_message_before_next_request) was missed and will now fail because it still expects the metadata block to be at index 0.

meta.contains("Current local date:"),
"turn_meta must contain the date"
);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

An existing test post_edit_hook_injects_diagnostics_message_before_next_request (around line 2956-2960 in crates/tui/src/core/engine/tests.rs) is not modified in this PR but will now fail because of this change.\n\nIn that test, it asserts that the first content block of the appended message starts with <turn_meta>\\n:\n```rust\n let last = engine.session.messages.last().expect("message appended");\n assert_eq!(last.role, "user");\n let meta = match &last.content[0] {\n crate::models::ContentBlock::Text { text, .. } => text.clone(),\n other => panic!("expected text block, got {other:?}"),\n };\n assert!(meta.starts_with("<turn_meta>\n"));\n```\nSince this PR swaps the order of the content blocks, last.content[0] is now the diagnostics text block, and last.content[1] (or last.content.last()) is the turn_meta block. This assertion will fail.\n\nPlease update that test to check last.content.last() instead of last.content[0] for the turn_meta block.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

An existing test post_edit_hook_injects_diagnostics_message_before_next_request (around line 2956-2960 in crates/tui/src/core/engine/tests.rs) is not modified in this PR but will now fail because of this change.\n\nIn that test, it asserts that the first content block of the appended message starts with <turn_meta>\\n:\n```rust\n let last = engine.session.messages.last().expect("message appended");\n assert_eq!(last.role, "user");\n let meta = match &last.content[0] {\n crate::models::ContentBlock::Text { text, .. } => text.clone(),\n other => panic!("expected text block, got {other:?}"),\n };\n assert!(meta.starts_with("<turn_meta>\n"));\n```\nSince this PR swaps the order of the content blocks, last.content[0] is now the diagnostics text block, and last.content[1] (or last.content.last()) is the turn_meta block. This assertion will fail.\n\nPlease update that test to check last.content.last() instead of last.content[0] for the turn_meta block.

Copy link
Copy Markdown

@carloslindhstrom-code carloslindhstrom-code left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😃

…ix stability

Move the <turn_meta> content block from position 0 to the last position
in each user message's content array. The turn_meta block contains the
current local date (chrono::Local::now()), model route info, and working
set summary, all of which change across date boundaries or session state
transitions.

When turn_meta sits at position 0, the leading bytes of every user
message change at every date boundary, which invalidates DeepSeek's KV
prefix cache for the entire user message prefix. By moving the volatile
metadata block to the tail, the user input text occupies position 0 and
its bytes remain stable across date changes, model route updates, and
working set mutations.

This is a pure positional reorder with zero logic changes. The
turn_metadata_block() function, the metadata content, and the
messages_with_turn_metadata() passthrough are all unchanged.

Changes:
- Swap content vec order in user_text_message_with_turn_metadata_for_route
- Update 4 existing tests to check content.last() instead of content.first()
- Add new test user_message_turn_meta_is_appended_not_prepended
@HUQIANTAO HUQIANTAO force-pushed the feat/turn-meta-tail-relocate branch from 6201aad to 07c82a0 Compare June 1, 2026 16:28
Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants