
feat(caching): add seed_cache_dir for cross-run cache reuse#777

Open
gchlebus wants to merge 4 commits into main from feature/cache-seed-dir

Conversation

@gchlebus
Contributor

@gchlebus gchlebus commented Feb 26, 2026

Problem

When the same evaluation is run on different clusters (e.g., CW-PDX → CW-DFW), each gets a separate output_dir and therefore a separate cache_dir. The cache keys are identical (SHA-256 of the request JSON body), but the caches are physically isolated on separate Lustre filesystems. If a run times out on one cluster and is restarted on another, all cached results are lost — even though the exact same requests would produce the exact same cache keys.
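Because both runs send byte-identical requests, the keys line up across clusters. A minimal sketch of such a content-addressed key (the exact serialization used by the interceptor is an assumption here):

```python
import hashlib
import json

def cache_key(request_body: dict) -> str:
    # Hash a canonical JSON encoding so logically identical requests map
    # to the same key on any cluster. (Illustrative sketch; the real
    # interceptor's serialization details may differ.)
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Since the key depends only on the request body, a cache copied to another filesystem keeps working unchanged.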

Solution

Add a seed_cache_dir parameter to CachingInterceptor that acts as a read-only fallback cache. On a cache miss in the primary cache, the interceptor checks the seed cache. Seed cache hits are automatically promoted into the primary cache, so the output cache is always self-contained after a run. The seed cache itself is never modified.
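The lookup order can be sketched with dict-backed stand-ins (names like `SeedFallbackCache` are illustrative, not the actual `CachingInterceptor` API):

```python
from typing import Optional

class SeedFallbackCache:
    """Dict-backed sketch of the fallback-and-promote lookup."""

    def __init__(self, primary: dict, seed: Optional[dict] = None):
        self.primary = primary       # read-write cache for this run
        self.seed = seed or {}       # read-only fallback, never written

    def get(self, key: str) -> Optional[str]:
        if key in self.primary:
            return self.primary[key]          # primary hit
        if key in self.seed:
            value = self.seed[key]
            # Promote the seed hit so the primary cache is
            # self-contained after the run; the seed stays untouched.
            self.primary[key] = value
            return value
        return None                           # miss in both caches

    def put(self, key: str, value: str) -> None:
        # Newly generated responses go to the primary cache only.
        self.primary[key] = value
```

After a full run, `primary` holds every entry that was looked up or written, so it can itself serve as the seed for a later run.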

Usage

Legacy config:

```yaml
adapter_config:
  seed_cache_dir: /path/to/previous/run/cache
```

Interceptor config:

```yaml
interceptors:
  - name: caching
    config:
      seed_cache_dir: /path/to/previous/run/cache
```

Workflow

  1. Run eval on Cluster A → cache populated at {output_dir}/cache/
  2. Copy cache directory to Cluster B: rsync -a clusterA:/path/cache/ clusterB:/path/seed-cache/
  3. Run eval on Cluster B with seed_cache_dir: /path/seed-cache
  4. Cluster B gets instant cache hits for all previously completed items
  5. Output cache on Cluster B is self-contained — includes both promoted seed entries and newly generated responses
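The end state of this workflow can be simulated with two local directories standing in for the clusters' filesystems (one file per key here; the real caches are directory-backed, and all paths are illustrative):

```python
import shutil
import tempfile
from pathlib import Path

# Stand-ins for the two clusters' Lustre paths (illustrative only).
root = Path(tempfile.mkdtemp())
cluster_a_cache = root / "clusterA" / "run1" / "cache"
seed_cache = root / "clusterB" / "seed-cache"      # rsync destination
cluster_b_cache = root / "clusterB" / "run2" / "cache"

# Step 1: Cluster A's run populates its cache.
cluster_a_cache.mkdir(parents=True)
(cluster_a_cache / "abc123").write_text("cached response")

# Step 2: copy it over (stand-in for `rsync -a`).
shutil.copytree(cluster_a_cache, seed_cache)

# Steps 3-5: Cluster B promotes every seed hit into its own cache and
# adds newly generated responses, so run2/cache ends up self-contained.
cluster_b_cache.mkdir(parents=True)
for entry in seed_cache.iterdir():
    shutil.copy2(entry, cluster_b_cache / entry.name)   # promotion
(cluster_b_cache / "def456").write_text("new response")
```

The seed directory is left exactly as copied, so it can be reused by further restarts or discarded.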

Changes

  • caching_interceptor.py: Added seed_cache_dir param to Params, seed Cache initialization with directory existence validation (warns when subdirs are missing), fallback logic in _get_from_cache, automatic promotion of seed hits into primary cache via direct cache writes (bypasses response counter to avoid inflating _cached_responses_count)
  • adapter_config.py: Added seed_cache_dir to LegacyAdapterConfig, wired through to caching interceptor in from_legacy_config()
  • test_seed_cache.py: 11 tests covering fallback, promotion to primary, primary precedence, both-miss, no-seed, nonexistent dir, partial seed dir, write isolation, seed immutability, and legacy config passthrough
  • docs/libraries/nemo-evaluator/interceptors/caching.md: Added "Seed Cache" section documenting both legacy and interceptor configuration formats, promotion behavior, and cross-cluster usage guide

Testing

Unit tests

  • 11 tests, all passing
  • Existing tests pass with zero regressions

Real cluster validation

1. End-to-end test (AA_math_test_500, CW-PDX → CW-DFW):

  • Copied cache (2,500 entries, 56MB) from CW-PDX to CW-DFW
  • Ran eval with pre_cmd installing this branch
  • 2,500 / 2,500 LLM responses served from seed cache — 100% hit rate
  • Zero GPU inference on DFW — all responses from cache

2. Production use (HLE benchmark, CW-DFW → CW-PDX):

  • DFW multi-node run timed out at 77.7% (1,677/2,158 items)
  • Copied 151MB cache from DFW to PDX
  • Ran HLE eval on PDX with seed_cache_dir pointing to copied cache
  • 1,677 cached items served in ~11 seconds, then generated remaining 481 items in ~2h
  • GPT-4o judging completed in ~30 min
  • Result: 23.17% judge_correct — complete end-to-end evaluation with cross-cluster cache reuse
  • Primary cache on PDX ended up self-contained (all 2,158 entries) thanks to seed promotion

Test details

  • Invocations: 84774da8d9154adc (AA_math test), d8f3b1db276dd6e2 (HLE production)
  • Clusters: CW-DFW, CW-PDX
  • Model: Qwen3.5-122B-A10B (vLLM, 8×GPU per node, 8 nodes)
  • Branch install: pip install "nemo-evaluator @ git+...@feature/cache-seed-dir" via pre_cmd

@gchlebus gchlebus requested review from a team as code owners February 26, 2026 23:14
@copy-pr-bot

copy-pr-bot bot commented Feb 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@gchlebus gchlebus force-pushed the feature/cache-seed-dir branch 2 times, most recently from 72a6da8 to 16c4c92 on February 28, 2026 00:13
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 28, 2026
@gchlebus gchlebus force-pushed the feature/cache-seed-dir branch from 4a5bc90 to 4a15ea8 on February 28, 2026 00:17
@gchlebus
Contributor Author

/ok to test 4a15ea8

gchlebus added 3 commits March 3, 2026 08:54
Add a seed_cache_dir parameter to CachingInterceptor that enables
reusing cached responses from a previous evaluation run. On cache
miss in the primary cache, the interceptor falls back to the seed
cache directory (read-only). New responses are always written to
the primary cache only.

This is useful when migrating evaluations between clusters with
separate filesystems (e.g., CW-PDX to CW-DFW). The cache keys
are identical (SHA-256 of request JSON body), but previously the
caches were physically isolated. Users can now copy the cache
directory from one run and point seed_cache_dir at it.

Usage (legacy config):
  adapter_config:
    seed_cache_dir: /path/to/previous/run/cache

Usage (interceptor config):
  interceptors:
    - name: caching
      config:
        seed_cache_dir: /path/to/previous/run/cache

Changes:
- caching_interceptor.py: Add seed_cache_dir param, initialize
  read-only seed Cache instances, fall back on primary miss
- adapter_config.py: Add seed_cache_dir to LegacyAdapterConfig,
  pass through to caching interceptor in from_legacy_config()
- test_seed_cache.py: 9 tests covering fallback, precedence,
  isolation, nonexistent dirs, and legacy config passthrough

Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
Seed cache entries are now automatically copied into the primary cache
on hit, making the output cache self-contained after a run. This means
future runs can use the primary cache directly without needing the
original seed cache.

Previously, seed hits were returned without writing to primary, leaving
the output cache incomplete (only containing newly generated responses).

- Updated _get_from_cache to call _save_to_cache on seed hits
- Updated field description to document promotion behavior
- Added test_seed_hit_promoted_to_primary test
- Updated existing tests to verify promotion side effects

Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
Document the seed cache feature including:
- Configuration via interceptor config and legacy adapter config
- Cache lookup and promotion behavior
- Cross-cluster usage guide with rsync + mount example
- Cache key portability explanation

Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
@gchlebus gchlebus force-pushed the feature/cache-seed-dir branch from 4a15ea8 to 01868ba on March 3, 2026 07:55
- Warn when seed_cache_dir is configured but subdirs are missing
- Avoid inflating response counter during seed-to-primary promotion
- Add test for partial seed directory (responses/ without headers/)
- Add interceptor config example to seed cache documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Grzegorz Chlebus <gchlebus@nvidia.com>
@gchlebus
Contributor Author

gchlebus commented Mar 3, 2026

/ok to test 8ddbf16

Comment on lines +107 to +108
self.seed_responses_cache = Cache(directory=seed_responses_dir)
self.seed_headers_cache = Cache(directory=seed_headers_dir)
Contributor


Please also seed the requests cache; as it stands, it's confusing that we omit only that one.


Labels

documentation (Improvements or additions to documentation), nemo-evaluator, tests
