Commit 15b8aff
[CI] Add max_split_size_mb for e2e test to avoid oom (#3252)
### What this PR does / why we need it?
We added a patch for the model weight loader to avoid using vLLM weight loader v2, since v2 leads to unknown issues for torchair. However, this patch introduced an unexpected memory usage problem. As a quick fix, let's raise `max_split_size_mb` to a larger value to avoid the weight-load OOM issue. The longer-term solution is to remove the patch and adopt weight loader v2 from vLLM.

Closes: #3251

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main: vllm-project/vllm@releases/v0.11.0

Signed-off-by: wangxiyuan <[email protected]>
1 parent 050d202 commit 15b8aff

File tree

1 file changed, +1 −0 lines changed


tests/e2e/singlecard/spec_decode_v1/test_v1_spec_decode.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -11,6 +11,7 @@
 from tests.e2e.conftest import VllmRunner

 os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
+os.environ["PYTORCH_NPU_ALLOC_CONF"] = "max_split_size_mb:256"


 @pytest.fixture
```
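The environment variable is set at module import time because the allocator config must be exported before the NPU caching allocator is initialized. A minimal sketch of what the config string looks like, with a hypothetical `parse_alloc_conf` helper (not part of this repo) to illustrate its key:value format:

```python
import os

# Set before any torch/torch_npu allocation happens, mirroring the
# module-level placement in test_v1_spec_decode.py.
os.environ["PYTORCH_NPU_ALLOC_CONF"] = "max_split_size_mb:256"

def parse_alloc_conf(conf: str) -> dict:
    """Parse a comma-separated 'key:value' allocator config string.

    Hypothetical helper for illustration; the real parsing happens
    inside the allocator.
    """
    options = {}
    for entry in conf.split(","):
        key, _, value = entry.partition(":")
        options[key.strip()] = value.strip()
    return options

opts = parse_alloc_conf(os.environ["PYTORCH_NPU_ALLOC_CONF"])
# With max_split_size_mb:256, cached blocks larger than 256 MB are not
# split to serve smaller requests, which reduces fragmentation during
# large contiguous allocations such as model weight loading.
print(opts["max_split_size_mb"])
```

This mirrors the `PYTORCH_CUDA_ALLOC_CONF` convention from PyTorch's CUDA caching allocator, applied here to the Ascend NPU backend.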
