Skip to content

feat(client): add cross-session prompt base section disk cache#2520

Open
HUQIANTAO wants to merge 1 commit into
Hmbown:mainfrom
HUQIANTAO:feat/prompt-base-persist
Open

feat(client): add cross-session prompt base section disk cache#2520
HUQIANTAO wants to merge 1 commit into
Hmbown:mainfrom
HUQIANTAO:feat/prompt-base-persist

Conversation

@HUQIANTAO
Copy link
Copy Markdown
Contributor

Summary

Add a new prompt_persist.rs module that caches the immutable base section of the system prompt on disk for cross-session reuse. When the SHA-256 of the base section matches a cached entry, the entire base-section assembly can be skipped on session start, providing byte-identical bytes to the API immediately.

Background

DeepSeek's KV prefix cache matches byte sequences from the start of the system prompt. The base section (mode prompt, project context, skills, context management, compaction template) is stable across sessions for the same workspace. Currently, every new session rebuilds this section from scratch, which means:

  1. The base section bytes are re-serialized every session (CPU waste).
  2. If any intermediate state changes (e.g., a skill file is touched between sessions), the bytes might differ slightly, busting the server-side cache.

By caching the base section on disk and reusing it when the hash matches, we ensure byte-identical output across sessions for the same workspace. This is the infrastructure layer; wiring it into the prompt assembly pipeline (splitting base_section() + volatile_section()) will be done in a follow-up.

Cache design

Storage layout

~/.codewhale/prompt_cache/
  <system_hash>.bin   — the serialized base section text
  <system_hash>.meta  — JSON metadata (workspace path, mtime, timestamp)

Cache key

The cache key is the SHA-256 of the base section text, computed by the existing PrefixFingerprint::compute mechanism. This ensures that any change in the base section content produces a different cache key.

Invalidation

Cache entries are invalidated by:

  1. Hash mismatch: If the base section text changes (e.g., project context updated), the SHA-256 changes and a new entry is created.
  2. Workspace mtime change: If the workspace directory's mtime changes (files added/removed), the cache is considered stale even if the hash matches (guards against hash collisions).
  3. Age-based eviction: Entries older than a configurable threshold are evicted lazily.

API

Three public functions:

  • load_cached_base_section(hash, workspace) -> Option<String>: Try to load a cached base section.
  • save_cached_base_section(hash, text, workspace): Save a base section to disk.
  • evict_stale_entries(max_age_secs): Clean up old cache entries.

Why this matters

The DeepSeek service-side prefix cache has an entry lifetime. When the base section bytes are identical across sessions, the server can reuse its cached KV states for the entire base section, giving ~90% discount on cached tokens. Currently, each new session starts with a fresh base section that must be prefilled from scratch on the server side.

This is the single highest-leverage optimization for cross-session cache hit rate: the base section is the largest stable block in the system prompt, and making it byte-identical across sessions maximizes the server-side cache reuse.

Testing

4 unit tests:

  • save_and_load_round_trip: Verify save/load cycle works correctly.
  • load_returns_none_for_missing_cache: Verify graceful handling of cache miss.
  • load_returns_none_for_wrong_workspace: Verify workspace-scoped invalidation.
  • evict_preserves_fresh_entries: Verify fresh entries survive eviction.

Files changed

  • crates/tui/src/prompt_persist.rs: New module (254 lines)
  • crates/tui/src/main.rs: Register the new module

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a prompt persistence mechanism in prompt_persist.rs to cache the immutable base section of system prompts, facilitating cross-session reuse and optimizing DeepSeek's KV prefix caching. The review feedback provides several key recommendations to improve robustness and reliability: removing the spurious and ineffective workspace mtime invalidation check, utilizing the centralized codewhale_config::codewhale_home() helper for directory resolution, canonicalizing workspace paths to avoid relative/absolute comparison mismatches, performing atomic writes for cache and metadata files to prevent corruption, and ensuring corrupted metadata files are properly evicted rather than lingering on disk.

Comment on lines +41 to +52
#[derive(Debug, Clone, Serialize, Deserialize)]
struct CacheMetadata {
/// Absolute path to the workspace that produced this base section.
workspace: PathBuf,
/// Modification time of the workspace directory at cache-write time.
/// Used as a secondary invalidation signal: if the workspace mtime
/// changed, the cache is stale even if the base section hash matches
/// (which would require a hash collision).
workspace_mtime_secs: u64,
/// Unix timestamp when the cache was written.
cached_at_secs: u64,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The workspace directory mtime invalidation check is both spurious and ineffective:

  1. Spurious Cache Misses: A directory's mtime changes whenever any file is added, removed, or renamed in the root of the workspace (e.g., .git directory updates, temporary files, build artifacts). This causes unnecessary cache invalidation even when the base section text is completely identical.
  2. Ineffective for Subdirectories: Directory mtime is non-recursive. If a file inside a subdirectory (e.g., src/foo.rs) is modified, the root workspace directory's mtime does not change. Thus, it doesn't reliably detect workspace changes anyway.
  3. Redundant: SHA-256 is cryptographically secure. The probability of a hash collision is astronomically small (less than 1 in 2^256). Guarding against SHA-256 collisions is unnecessary.

Removing this check and the workspace_mtime_secs field from CacheMetadata will simplify the code, eliminate the dir_mtime_secs helper, and significantly improve the cache hit rate by avoiding spurious invalidations.

#[derive(Debug, Clone, Serialize, Deserialize)]
struct CacheMetadata {
    /// Absolute path to the workspace that produced this base section.
    workspace: PathBuf,
    /// Unix timestamp when the cache was written.
    cached_at_secs: u64,
}

Comment on lines +100 to +108
// Verify workspace mtime hasn't changed (guards against hash collisions).
let current_mtime = dir_mtime_secs(workspace);
if current_mtime != meta.workspace_mtime_secs {
logging::info(format!(
"Prompt cache stale: workspace mtime changed ({meta_mtime} → {current_mtime})",
meta_mtime = meta.workspace_mtime_secs
));
return None;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Remove the workspace mtime invalidation check as it is spurious and ineffective (as detailed in the CacheMetadata comment).

Comment on lines +132 to +136
// Write the base section text.
if let Err(err) = fs::write(&bin_path, base_text) {
logging::warn(format!("Failed to write prompt cache bin: {err}"));
return;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Writing directly to bin_path using fs::write is not atomic. If the process is interrupted (e.g., due to a crash, Ctrl+C, or power loss) midway through writing, it can leave a partially written or corrupted cache file on disk.

The project already provides a robust atomic write helper crate::utils::write_atomic. We should use it for both the .bin and .meta files to ensure cache integrity.

Suggested change
// Write the base section text.
if let Err(err) = fs::write(&bin_path, base_text) {
logging::warn(format!("Failed to write prompt cache bin: {err}"));
return;
}
// Write the base section text atomically.
if let Err(err) = crate::utils::write_atomic(&bin_path, base_text.as_bytes()) {
logging::warn(format!("Failed to write prompt cache bin: {err}"));
return;
}

Comment on lines +147 to +149
if let Err(err) = fs::write(&meta_path, serde_json::to_vec(&meta).unwrap_or_default()) {
logging::warn(format!("Failed to write prompt cache meta: {err}"));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Use crate::utils::write_atomic to write the metadata file atomically to prevent corruption.

Suggested change
if let Err(err) = fs::write(&meta_path, serde_json::to_vec(&meta).unwrap_or_default()) {
logging::warn(format!("Failed to write prompt cache meta: {err}"));
}
let meta_bytes = serde_json::to_vec(&meta).unwrap_or_default();
if let Err(err) = crate::utils::write_atomic(&meta_path, &meta_bytes) {
logging::warn(format!("Failed to write prompt cache meta: {err}"));
}

Comment on lines +57 to +65
fn cache_dir() -> Option<PathBuf> {
let home = dirs::home_dir()?;
let dir = home.join(".codewhale").join("prompt_cache");
if let Err(err) = fs::create_dir_all(&dir) {
logging::warn(format!("Failed to create prompt cache dir: {err}"));
return None;
}
Some(dir)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using dirs::home_dir()?.join(".codewhale") directly bypasses the application's standard home directory resolution logic. The application already has a centralized helper codewhale_config::codewhale_home() which respects custom configurations (such as environment variables or legacy .deepseek fallbacks).

Using codewhale_config::codewhale_home() ensures consistency across the entire application.

Suggested change
fn cache_dir() -> Option<PathBuf> {
let home = dirs::home_dir()?;
let dir = home.join(".codewhale").join("prompt_cache");
if let Err(err) = fs::create_dir_all(&dir) {
logging::warn(format!("Failed to create prompt cache dir: {err}"));
return None;
}
Some(dir)
}
fn cache_dir() -> Option<PathBuf> {
let home = codewhale_config::codewhale_home().ok()?;
let dir = home.join("prompt_cache");
if let Err(err) = fs::create_dir_all(&dir) {
logging::warn(format!("Failed to create prompt cache dir: {err}"));
return None;
}
Some(dir)
}

Comment on lines +67 to +75
/// Get the modification time of a directory as seconds since epoch.
fn dir_mtime_secs(path: &Path) -> u64 {
fs::metadata(path)
.and_then(|m| m.modified())
.ok()
.and_then(|t| t.duration_since(SystemTime::UNIX_EPOCH).ok())
.map(|d| d.as_secs())
.unwrap_or(0)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This helper function is no longer needed since we are removing the workspace mtime invalidation logic.

Comment on lines +95 to +98
// Verify workspace path matches.
if meta.workspace != workspace {
return None;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The workspace path comparison meta.workspace != workspace can fail if one path is relative (e.g., .) and the other is absolute/canonicalized, even if they refer to the exact same directory.

To make this comparison robust, we should canonicalize the workspace path before comparing.

Suggested change
// Verify workspace path matches.
if meta.workspace != workspace {
return None;
}
// Verify workspace path matches.
let canonical_workspace = workspace.canonicalize().unwrap_or_else(|_| workspace.to_path_buf());
if meta.workspace != canonical_workspace {
return None;
}

Comment on lines +139 to +146
let meta = CacheMetadata {
workspace: workspace.to_path_buf(),
workspace_mtime_secs: dir_mtime_secs(workspace),
cached_at_secs: SystemTime::now()
.duration_since(SystemTime::UNIX_EPOCH)
.map(|d| d.as_secs())
.unwrap_or(0),
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

We should canonicalize the workspace path before saving it to the metadata file to ensure consistent absolute path comparisons. Also, we can remove the workspace_mtime_secs field since we are removing the mtime check.

    let meta = CacheMetadata {
        workspace: workspace.canonicalize().unwrap_or_else(|_| workspace.to_path_buf()),
        cached_at_secs: SystemTime::now()
            .duration_since(SystemTime::UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0),
    };

Comment thread crates/tui/src/prompt_persist.rs Outdated
Comment on lines +178 to +196
if let Ok(bytes) = fs::read(&path)
&& let Ok(meta) = serde_json::from_slice::<CacheMetadata>(&bytes)
{
let stale = now.saturating_sub(meta.cached_at_secs) > max_age_secs;
let workspace_gone = !meta.workspace.exists();
let mtime_changed =
workspace_gone || dir_mtime_secs(&meta.workspace) != meta.workspace_mtime_secs;

if stale || workspace_gone || mtime_changed {
let hash = path
.file_stem()
.and_then(|s| s.to_str())
.unwrap_or("unknown");
let _ = fs::remove_file(&path);
let _ = fs::remove_file(path.with_extension("bin"));
logging::info(format!("Evicted prompt cache: {hash}"));
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If a metadata file is corrupted or unreadable, fs::read or serde_json::from_slice will fail, causing the if let block to be skipped entirely. This means corrupted cache files will linger on disk indefinitely and never be cleaned up.

We should explicitly handle read/parse failures by marking those files for eviction/deletion, and we can also remove the mtime_changed check as part of removing the workspace mtime invalidation logic.

            let mut should_evict = false;
            if let Ok(bytes) = fs::read(&path) {
                if let Ok(meta) = serde_json::from_slice::<CacheMetadata>(&bytes) {
                    let stale = now.saturating_sub(meta.cached_at_secs) > max_age_secs;
                    let workspace_gone = !meta.workspace.exists();
                    if stale || workspace_gone {
                        should_evict = true;
                    }
                } else {
                    should_evict = true;
                }
            } else {
                should_evict = true;
            }

            if should_evict {
                let hash = path
                    .file_stem()
                    .and_then(|s| s.to_str())
                    .unwrap_or("unknown");
                let _ = fs::remove_file(&path);
                let _ = fs::remove_file(path.with_extension("bin"));
                logging::info(format!("Evicted prompt cache: {hash}"));
            }

@HUQIANTAO HUQIANTAO force-pushed the feat/prompt-base-persist branch from 1989209 to 91d70ff Compare June 1, 2026 16:32
Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

Add prompt_persist.rs module that caches the immutable base section of
the system prompt on disk for cross-session reuse. The base section
(mode prompt, project context, skills, context management, compaction
template) is stable across sessions for the same workspace. By caching
this section and reusing it when the SHA-256 matches, we can skip the
entire base-section assembly on session start and immediately provide
byte-identical bytes to the API.

This is especially valuable for DeepSeek's service-side prefix cache:
when the base section bytes are identical across sessions, the server
can reuse its cached KV states for the entire base section, giving ~90%
discount on cached tokens.

Cache layout:
  ~/.codewhale/prompt_cache/<system_hash>.bin   — the base section text
  ~/.codewhale/prompt_cache/<system_hash>.meta  — JSON metadata

The cache key is the SHA-256 of the base section text. The metadata
includes the workspace path and its mtime for invalidation on workspace
changes. Stale entries are evicted lazily based on age and workspace
mtime consistency.

The module exposes three public functions:
- load_cached_base_section(): try to load a cached base section
- save_cached_base_section(): save a base section to disk
- evict_stale_entries(): clean up old cache entries

This is the infrastructure layer only. Wiring it into the prompt
assembly pipeline (splitting base_section() + volatile_section()) will
be done in a follow-up change.
@HUQIANTAO HUQIANTAO force-pushed the feat/prompt-base-persist branch from 91d70ff to 921949e Compare June 3, 2026 12:26
Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant