[CI] Add max_split_size_mb for e2e test to avoid oom (#3252)
### What this PR does / why we need it?
We carry a patch for the model weight loader that avoids vLLM's weight
loader v2, because v2 causes unknown issues with torchair. However, this
patch itself introduces an unexplained memory usage problem. As a quick
fix, increase `max_split_size_mb` to a larger value to avoid OOM during
weight loading.
The longer-term solution is to remove the patch and address the weight
loader v2 issues from vLLM.
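
For illustration, a minimal sketch of how a test harness might raise `max_split_size_mb` before the e2e run. The environment variable name `PYTORCH_NPU_ALLOC_CONF`, the helper `with_max_split_size`, and the value `256` are assumptions for this example and are not taken from this PR; the option only takes effect if it is set before the framework initializes its allocator.

```python
import os


def with_max_split_size(env, size_mb, var="PYTORCH_NPU_ALLOC_CONF"):
    """Return a copy of env with max_split_size_mb set in the allocator config.

    `var` is an assumed variable name; other comma-separated allocator
    options already present in it are preserved.
    """
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    updated = dict(env)
    # Drop any previous max_split_size_mb entry, keep every other option.
    options = [
        opt.strip()
        for opt in updated.get(var, "").split(",")
        if opt.strip() and not opt.strip().startswith("max_split_size_mb:")
    ]
    options.append(f"max_split_size_mb:{size_mb}")
    updated[var] = ",".join(options)
    return updated


if __name__ == "__main__":
    # 256 is a placeholder value, not the value chosen in this PR.
    test_env = with_max_split_size(os.environ, 256)
    print(test_env["PYTORCH_NPU_ALLOC_CONF"])
```

The returned mapping could then be passed as `env=` to `subprocess.run` when launching the test process, so the setting applies only to that run.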
Closes: #3251
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main: vllm-project/vllm@releases/v0.11.0
Signed-off-by: wangxiyuan <[email protected]>