feat(caching): add seed_cache_dir for cross-run cache reuse #777
Open
Contributor
Author
|
/ok to test 4a15ea8 |
Add a seed_cache_dir parameter to CachingInterceptor that enables
reusing cached responses from a previous evaluation run. On cache
miss in the primary cache, the interceptor falls back to the seed
cache directory (read-only). New responses are always written to
the primary cache only.
This is useful when migrating evaluations between clusters with
separate filesystems (e.g., CW-PDX to CW-DFW). The cache keys
are identical (SHA-256 of request JSON body), but previously the
caches were physically isolated. Users can now copy the cache
directory from one run and point seed_cache_dir at it.
Usage (legacy config):
  adapter_config:
    seed_cache_dir: /path/to/previous/run/cache
Usage (interceptor config):
  interceptors:
    - name: caching
      config:
        seed_cache_dir: /path/to/previous/run/cache
Changes:
- caching_interceptor.py: Add seed_cache_dir param, initialize
  read-only seed Cache instances, fall back on primary miss
- adapter_config.py: Add seed_cache_dir to LegacyAdapterConfig,
  pass through to caching interceptor in from_legacy_config()
- test_seed_cache.py: 9 tests covering fallback, precedence,
  isolation, nonexistent dirs, and legacy config passthrough
Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
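The lookup order described in this commit (primary first, then the read-only seed) can be sketched roughly as follows. This is an illustration only, not the interceptor's code: `FileCache`, `cache_key`, and `lookup` are hypothetical names, the real `Cache` class is replaced by a one-file-per-key stand-in, and the canonical JSON serialization (sorted keys) is an assumption about how the request body is hashed.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def cache_key(request_body: dict) -> str:
    """SHA-256 of the request JSON body.

    Assumption: the body is serialized with sorted keys so the key does not
    depend on dict ordering. The key depends only on the request content,
    which is why it is identical on every cluster.
    """
    payload = json.dumps(request_body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class FileCache:
    """Hypothetical stand-in for the real Cache class: one file per key."""

    def __init__(self, directory: Path):
        self.directory = Path(directory)

    def get(self, key: str) -> Optional[bytes]:
        path = self.directory / key
        return path.read_bytes() if path.is_file() else None

    def set(self, key: str, value: bytes) -> None:
        self.directory.mkdir(parents=True, exist_ok=True)
        (self.directory / key).write_bytes(value)


def lookup(primary: FileCache, seed: Optional[FileCache], request_body: dict) -> Optional[bytes]:
    """Check the primary cache, then fall back to the seed cache (never written)."""
    key = cache_key(request_body)
    hit = primary.get(key)
    if hit is not None:
        return hit
    if seed is not None:
        return seed.get(key)  # read-only fallback, nothing is written here
    return None
```

In this sketch a seed hit leaves the primary directory untouched, which matches the behavior of this first commit.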
Seed cache entries are now automatically copied into the primary cache on hit, making the output cache self-contained after a run. This means future runs can use the primary cache directly without needing the original seed cache. Previously, seed hits were returned without writing to primary, leaving the output cache incomplete (only containing newly generated responses).
- Updated _get_from_cache to call _save_to_cache on seed hits
- Updated field description to document promotion behavior
- Added test_seed_hit_promoted_to_primary test
- Updated existing tests to verify promotion side effects
Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
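Promotion on a seed hit might look like the minimal sketch below. It uses plain dicts in place of the on-disk caches, and `PromotingCache` is a hypothetical name chosen for illustration; the actual change lives in `_get_from_cache` and is not reproduced here.

```python
from typing import Dict, Optional


class PromotingCache:
    """Hypothetical two-level cache: a writable primary and a read-only seed."""

    def __init__(self, primary: Dict[str, bytes], seed: Optional[Dict[str, bytes]] = None):
        self.primary = primary
        self.seed = seed if seed is not None else {}

    def save(self, key: str, value: bytes) -> None:
        # New responses are written to the primary cache only.
        self.primary[key] = value

    def get(self, key: str) -> Optional[bytes]:
        if key in self.primary:
            return self.primary[key]
        if key in self.seed:
            value = self.seed[key]
            # Promote the seed hit so the primary cache is self-contained.
            self.save(key, value)
            return value
        return None
```

After a run in which every request hits the seed, the primary cache in this sketch holds a full copy of the entries that were used, and the seed dict is unchanged.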
Document the seed cache feature including:
- Configuration via interceptor config and legacy adapter config
- Cache lookup and promotion behavior
- Cross-cluster usage guide with rsync + mount example
- Cache key portability explanation
Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
- Warn when seed_cache_dir is configured but subdirs are missing
- Avoid inflating response counter during seed-to-primary promotion
- Add test for partial seed directory (responses/ without headers/)
- Add interceptor config example to seed cache documentation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
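The missing-subdirectory warning could be sketched along these lines. The subdirectory names `responses` and `headers` come from the commit message above; `resolve_seed_dirs`, the logger name, and the choice to keep whichever subdirectories do exist are assumptions made for illustration, not the interceptor's confirmed behavior.

```python
import logging
from pathlib import Path
from typing import Dict, Optional, Sequence

logger = logging.getLogger("seed_cache_sketch")  # hypothetical logger name


def resolve_seed_dirs(
    seed_cache_dir: Optional[str],
    subdirs: Sequence[str] = ("responses", "headers"),
) -> Dict[str, Path]:
    """Return the seed subdirectories that exist, warning about missing ones."""
    if not seed_cache_dir:
        return {}  # seed cache not configured
    root = Path(seed_cache_dir)
    found: Dict[str, Path] = {}
    for name in subdirs:
        path = root / name
        if path.is_dir():
            found[name] = path
        else:
            # Assumption: a missing subdirectory is reported but not fatal.
            logger.warning("seed_cache_dir %s has no %s/ subdirectory", root, name)
    return found
```

With a partial seed directory (only `responses/` present), this sketch returns one entry and logs one warning for `headers/`.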
Contributor
Author
|
/ok to test 8ddbf16 |
piojanu
reviewed
Mar 12, 2026
Comment on lines +107 to +108
    self.seed_responses_cache = Cache(directory=seed_responses_dir)
    self.seed_headers_cache = Cache(directory=seed_headers_dir)
Contributor
Please also seed the requests cache. Right now it's confusing that it is the only one we omit.
Problem
When the same evaluation is run on different clusters (e.g., CW-PDX → CW-DFW), each gets a separate output_dir and therefore a separate cache_dir. The cache keys are identical (SHA-256 of the request JSON body), but the caches are physically isolated on separate Lustre filesystems. If a run times out on one cluster and is restarted on another, all cached results are lost, even though the exact same requests would produce the exact same cache keys.
Solution
Add a seed_cache_dir parameter to CachingInterceptor that acts as a read-only fallback cache. On a cache miss in the primary cache, the interceptor checks the seed cache. Seed cache hits are automatically promoted into the primary cache, so the output cache is always self-contained after a run. The seed cache itself is never modified.
Usage
Legacy config:
Interceptor config:
Workflow
1. The first run writes its cache to {output_dir}/cache/
2. Copy that cache to the other cluster: rsync -a clusterA:/path/cache/ clusterB:/path/seed-cache/
3. Configure the new run with seed_cache_dir: /path/seed-cache
Changes
- caching_interceptor.py: Added seed_cache_dir param to Params, seed Cache initialization with directory existence validation (warns when subdirs are missing), fallback logic in _get_from_cache, automatic promotion of seed hits into primary cache via direct cache writes (bypasses response counter to avoid inflating _cached_responses_count)
- adapter_config.py: Added seed_cache_dir to LegacyAdapterConfig, wired through to caching interceptor in from_legacy_config()
- test_seed_cache.py: 11 tests covering fallback, promotion to primary, primary precedence, both-miss, no-seed, nonexistent dir, partial seed dir, write isolation, seed immutability, and legacy config passthrough
- docs/libraries/nemo-evaluator/interceptors/caching.md: Added "Seed Cache" section documenting both legacy and interceptor configuration formats, promotion behavior, and cross-cluster usage guide
Testing
Unit tests
Real cluster validation
1. End-to-end test (AA_math_test_500, CW-PDX → CW-DFW):
- pre_cmd installing this branch
2. Production use (HLE benchmark, CW-DFW → CW-PDX):
- seed_cache_dir pointing to copied cache
Test details
- 84774da8d9154adc (AA_math test), d8f3b1db276dd6e2 (HLE production)
- pip install nemo-evaluator @ git+...@feature/cache-seed-dir via pre_cmd