Commit 1dab9bc

[Bugfix] set OMP_NUM_THREADS to 1 by default for multiprocessing (#6109)
Signed-off-by: Travis Johnson <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
1 parent 3de6e6a commit 1dab9bc

1 file changed: +5 -0 lines changed

vllm/executor/multiproc_gpu_executor.py

Lines changed: 5 additions & 0 deletions
@@ -37,6 +37,11 @@ def _init_executor(self) -> None:
         # Disable torch async compiling which won't work with daemonic processes
         os.environ["TORCHINDUCTOR_COMPILE_THREADS"] = "1"
 
+        # Set OMP_NUM_THREADS to 1 if it is not set explicitly, avoids CPU
+        # contention amongst the shards
+        if "OMP_NUM_THREADS" not in os.environ:
+            os.environ["OMP_NUM_THREADS"] = "1"
+
         assert world_size <= cuda_device_count_stateless(), (
             "please set tensor_parallel_size to less than max local gpu count")
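
The behavior added by this hunk can be sketched in isolation as follows. This is a minimal illustration, not code from the commit: the helper name default_omp_num_threads and the plain dictionaries standing in for os.environ are hypothetical, chosen so the example runs without touching the real process environment.

```python
from typing import MutableMapping


def default_omp_num_threads(env: MutableMapping[str, str],
                            default: str = "1") -> str:
    """Set OMP_NUM_THREADS to `default` only if the caller has not set it.

    Mirrors the check in the diff: a value that is already present,
    whatever it is, is left untouched. Returns the value in effect.
    """
    if "OMP_NUM_THREADS" not in env:
        env["OMP_NUM_THREADS"] = default
    return env["OMP_NUM_THREADS"]


# Hypothetical stand-ins for os.environ (assumption: used here only to
# keep the example free of side effects).
unset_env = {}
explicit_env = {"OMP_NUM_THREADS": "8"}

print(default_omp_num_threads(unset_env))     # default is applied
print(default_omp_num_threads(explicit_env))  # explicit value is kept
```

In real use the mapping would be os.environ, and the call would have to happen before the worker processes are created, since child processes inherit the parent's environment at creation time.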
