Mcv binary cache #166

Merged

maryamtahhan merged 14 commits into redhat-et:main from maryamtahhan:mcv-binary-cache
Feb 25, 2026
Conversation

@maryamtahhan
Collaborator

maryamtahhan commented Feb 9, 2026

Enable vllm binary cache support for MCV

fixes: #147

@maryamtahhan maryamtahhan force-pushed the mcv-binary-cache branch 3 times, most recently from c602a2a to 89b9145 Compare February 9, 2026 13:16
@maryamtahhan maryamtahhan marked this pull request as ready for review February 9, 2026 13:17
@maryamtahhan maryamtahhan requested a review from Billy99 February 10, 2026 10:43
@maryamtahhan
Collaborator Author

TODO - add torch_inductor dir

@maryamtahhan maryamtahhan removed the request for review from Billy99 February 10, 2026 16:07
@maryamtahhan maryamtahhan marked this pull request as draft February 10, 2026 16:07
@maryamtahhan maryamtahhan marked this pull request as ready for review February 23, 2026 11:13
@maryamtahhan
Collaborator Author

Without pre-cache:

(EngineCore_DP0 pid=22) INFO 02-23 01:35:37 [backends.py:812] Using cache directory: /root/.cache/vllm/torch_compile_cache/8d0a361fbc/rank_0_0/backbone for vLLM's torch.compile
(EngineCore_DP0 pid=22) INFO 02-23 01:35:37 [backends.py:872] Dynamo bytecode transform time: 28.30 s
(EngineCore_DP0 pid=22) [rank0]:W0223 01:35:45.613000 22 torch/_inductor/utils.py:1613] Not enough SMs to use max_autotune_gemm mode
(EngineCore_DP0 pid=22) INFO 02-23 01:35:55 [backends.py:302] Cache the graph of compile range (1, 2048) for later use
(EngineCore_DP0 pid=22) INFO 02-23 01:36:01 [backends.py:319] Compiling a graph for compile range (1, 2048) takes 18.20 s
(EngineCore_DP0 pid=22) INFO 02-23 01:36:01 [monitor.py:34] torch.compile takes 46.50 s in total

With pre-cache:

(EngineCore_DP0 pid=22) INFO 02-23 03:12:47 [backends.py:812] Using cache directory: /root/.cache/vllm/torch_compile_cache/8d0a361fbc/rank_0_0/backbone for vLLM's torch.compile
(EngineCore_DP0 pid=22) INFO 02-23 03:12:47 [backends.py:872] Dynamo bytecode transform time: 7.85 s
(EngineCore_DP0 pid=22) INFO 02-23 03:12:54 [backends.py:267] Directly load the compiled graph(s) for compile range (1, 2048) from the cache, took 1.273 s
(EngineCore_DP0 pid=22) INFO 02-23 03:12:54 [monitor.py:34] torch.compile takes 9.12 s in total
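For reference, the two torch.compile totals reported above work out to roughly a 5x reduction:

```python
# Quick arithmetic on the torch.compile totals from the logs above.
no_cache = 46.50   # seconds, without pre-cache
pre_cache = 9.12   # seconds, with pre-cache
print(f"speedup: {no_cache / pre_cache:.1f}x")  # speedup: 5.1x
```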

Collaborator

@Billy99 Billy99 left a comment


Looks good, only minor comments.
One part I wasn't sure about is the GPU detection. I feel like there is a limited (small) number of GPUs that we detect, or am I missing something?

@maryamtahhan
Collaborator Author

> Looks good, only minor comments.
> One part I wasn't sure about is the GPU detection. I feel like there is a limited (small) number of GPUs that we detect, or am I missing something?

ATM it's using CUDA or ROCm, so it should detect all NVIDIA or AMD GPUs?

This is something I plan on changing and doing through kube moving forward.
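As a rough sketch of what CUDA/ROCm-based platform detection could look like (hypothetical code, not the actual MCV implementation; it probes for the vendor SMI tools on PATH):

```python
# Hypothetical sketch of CUDA/ROCm platform detection, not the actual MCV code.
import shutil
from typing import Optional

def detect_gpu_platform() -> Optional[str]:
    """Return 'cuda' for NVIDIA, 'rocm' for AMD, or None if neither stack is found."""
    if shutil.which("nvidia-smi"):   # NVIDIA driver tooling present
        return "cuda"
    if shutil.which("rocm-smi"):     # AMD ROCm tooling present
        return "rocm"
    return None
```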

maryamtahhan and others added 12 commits February 25, 2026 14:57
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Bumps [github.com/sigstore/fulcio](https://github.com/sigstore/fulcio) from 1.8.3 to 1.8.5.
- [Release notes](https://github.com/sigstore/fulcio/releases)
- [Changelog](https://github.com/sigstore/fulcio/blob/main/CHANGELOG.md)
- [Commits](sigstore/fulcio@v1.8.3...v1.8.5)

---
updated-dependencies:
- dependency-name: github.com/sigstore/fulcio
  dependency-version: 1.8.5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
_has_artifact_compile_range_with_triton() only checked for a triton/
subdirectory, which cannot exist when the artifact is a packed binary
file. Recognize binary artifact_compile_range_* files as valid vLLM
cache indicators so detect_cache_mode() returns 'vllm' instead of
falling through to 'triton'.

Also:
- sync requirements.txt with pyproject.toml (typer[all], structlog)
- silence pylint R0903 on Pydantic data models
- disable pylint import-error for declared but not-installed deps
Signed-off-by: Alessandro Sangiorgi <asangior@redhat.com>
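A minimal sketch of the detection logic the commit message describes (hypothetical code: `detect_cache_mode` and the artifact/triton layout follow the description above, everything else is assumed):

```python
# Hypothetical sketch of the fix described above, not the actual MCV code.
from pathlib import Path

def detect_cache_mode(cache_dir: str) -> str:
    """Classify a torch.compile cache as 'vllm', 'triton', or 'unknown'."""
    root = Path(cache_dir)
    # Packed binary artifacts (artifact_compile_range_*) mark a vLLM cache
    # even when no triton/ subdirectory exists.
    if any(root.rglob("artifact_compile_range_*")):
        return "vllm"
    # Fall back to the older heuristic: a triton/ subdirectory.
    if any(p.is_dir() for p in root.rglob("triton")):
        return "triton"
    return "unknown"
```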
@maryamtahhan maryamtahhan merged commit e59d588 into redhat-et:main Feb 25, 2026
7 checks passed

Successfully merging this pull request may close these issues:

MCV: Add support for vllm binary cache (labelling)