refactor(client): relocate turn_meta to tail of user message for prefix stability#2517
refactor(client): relocate turn_meta to tail of user message for prefix stability#2517HUQIANTAO wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
There was a problem hiding this comment.
Code Review
This pull request swaps the order of content blocks in user messages so that the user text is placed first and the turn metadata block is placed last. This change prevents DeepSeek's KV prefix cache from being invalidated by volatile metadata changes (like dates). While the tests in the PR were updated to reflect this change, a review comment correctly identifies that an existing test (post_edit_hook_injects_diagnostics_message_before_next_request) was missed and will now fail because it still expects the metadata block to be at index 0.
| meta.contains("Current local date:"), | ||
| "turn_meta must contain the date" | ||
| ); | ||
| } |
There was a problem hiding this comment.
An existing test post_edit_hook_injects_diagnostics_message_before_next_request (around line 2956-2960 in crates/tui/src/core/engine/tests.rs) is not modified in this PR but will now fail because of this change.\n\nIn that test, it asserts that the first content block of the appended message starts with <turn_meta>\\n:\n```rust\n let last = engine.session.messages.last().expect("message appended");\n assert_eq!(last.role, "user");\n let meta = match &last.content[0] {\n crate::models::ContentBlock::Text { text, .. } => text.clone(),\n other => panic!("expected text block, got {other:?}"),\n };\n assert!(meta.starts_with("<turn_meta>\n"));\n```\nSince this PR swaps the order of the content blocks, last.content[0] is now the diagnostics text block, and last.content[1] (or last.content.last()) is the turn_meta block. This assertion will fail.\n\nPlease update that test to check last.content.last() instead of last.content[0] for the turn_meta block.
There was a problem hiding this comment.
An existing test
post_edit_hook_injects_diagnostics_message_before_next_request(around line 2956-2960 incrates/tui/src/core/engine/tests.rs) is not modified in this PR but will now fail because of this change.\n\nIn that test, it asserts that the first content block of the appended message starts with<turn_meta>\\n:\n```rust\n let last = engine.session.messages.last().expect("message appended");\n assert_eq!(last.role, "user");\n let meta = match &last.content[0] {\n crate::models::ContentBlock::Text { text, .. } => text.clone(),\n other => panic!("expected text block, got {other:?}"),\n };\n assert!(meta.starts_with("<turn_meta>\n"));\n```\nSince this PR swaps the order of the content blocks,last.content[0]is now the diagnostics text block, andlast.content[1](orlast.content.last()) is theturn_metablock. This assertion will fail.\n\nPlease update that test to checklast.content.last()instead oflast.content[0]for theturn_metablock.
…ix stability Move the <turn_meta> content block from position 0 to the last position in each user message's content array. The turn_meta block contains the current local date (chrono::Local::now()), model route info, and working set summary, all of which change across date boundaries or session state transitions. When turn_meta sits at position 0, the leading bytes of every user message change at every date boundary, which invalidates DeepSeek's KV prefix cache for the entire user message prefix. By moving the volatile metadata block to the tail, the user input text occupies position 0 and its bytes remain stable across date changes, model route updates, and working set mutations. This is a pure positional reorder with zero logic changes. The turn_metadata_block() function, the metadata content, and the messages_with_turn_metadata() passthrough are all unchanged. Changes: - Swap content vec order in user_text_message_with_turn_metadata_for_route - Update 4 existing tests to check content.last() instead of content.first() - Add new test user_message_turn_meta_is_appended_not_prepended
6201aad to
07c82a0
Compare
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
Summary
Move the
<turn_meta>content block from position 0 to the last position in each user message's content array. This ensures the leading bytes of every user message stay stable across date boundaries, model route changes, and working set mutations, which directly improves DeepSeek's KV prefix cache hit rate.Background
The
<turn_meta>block contains volatile per-turn data:chrono::Local::now().format("%Y-%m-%d"))When this block sits at content position 0, the leading bytes of every user message change at every date boundary. DeepSeek's KV prefix cache matches byte sequences from the start of each message; a change at position 0 invalidates the cache for the entire user message prefix, forcing a full re-prefill on the next turn.
What changed
The reorder is a single swap in
user_text_message_with_turn_metadata_for_route:Before:
After:
Why this is safe
messages_with_turn_metadata()function returns stored messages unchanged (it clonesself.session.messages). The metadata is baked into messages at append time, not injected at request time. The reorder only affects the initial storage layout.ToolResultcontent block and are never mixed with turn_meta.Testing
content.first()was the turn_meta block to usecontent.last()instead.user_message_turn_meta_is_appended_not_prependedthat explicitly verifies the user text is at position 0 and turn_meta is at position 1.Cache impact
For sessions that span midnight or change model routes mid-conversation, the user message prefix bytes now remain stable. This means: