
Commit 1eb169c

Fix torch compile cache dir conflicts when enforce_eager=False (#368)
1 parent c670d0a commit 1eb169c

File tree

10 files changed: +16 −9 lines changed


benchmark/config/countdown-template.yaml

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ explorer:
   rollout_model:
     engine_num: 2
     tensor_parallel_size: 1
-    enforce_eager: true
+    enforce_eager: false
     enable_prefix_caching: false
     enable_chunked_prefill: false
     gpu_memory_utilization: 0.9

docs/sphinx_doc/source/tutorial/example_step_wise.md

Lines changed: 1 addition & 1 deletion
@@ -140,7 +140,7 @@ explorer:
     engine_num: 2
     tensor_parallel_size: 2
     enable_prefix_caching: false
-    enforce_eager: true
+    enforce_eager: false
     dtype: bfloat16
     seed: 42
     gpu_memory_utilization: 0.7

docs/sphinx_doc/source_zh/tutorial/example_step_wise.md

Lines changed: 1 addition & 1 deletion
@@ -135,7 +135,7 @@ explorer:
     engine_num: 2
     tensor_parallel_size: 2
     enable_prefix_caching: false
-    enforce_eager: true
+    enforce_eager: false
     dtype: bfloat16
     seed: 42
     gpu_memory_utilization: 0.7

examples/agentscope_react/gsm8k.yaml

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ explorer:
     engine_num: 4
     tensor_parallel_size: 1
     enable_prefix_caching: false
-    enforce_eager: true
+    enforce_eager: false
     enable_openai_api: true
     enable_history: true
     enable_auto_tool_choice: true

examples/agentscope_tool_react/agentscopev0_tool_react_dapo.yaml

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ explorer:
     engine_num: 4
     tensor_parallel_size: 1
     enable_prefix_caching: false
-    enforce_eager: true
+    enforce_eager: false
     enable_openai_api: true
     enable_history: true
     dtype: bfloat16

examples/agentscope_tool_react/agentscopev0_tool_react_gsm8k.yaml

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ explorer:
     engine_num: 4
     tensor_parallel_size: 1
     enable_prefix_caching: false
-    enforce_eager: true
+    enforce_eager: false
     enable_openai_api: true
     enable_history: true
     dtype: bfloat16

examples/agentscope_tool_react/agentscopev1_tool_react_dapo.yaml

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ explorer:
     engine_num: 4
     tensor_parallel_size: 1
     enable_prefix_caching: false
-    enforce_eager: true
+    enforce_eager: false
     enable_openai_api: true
     enable_history: true
     dtype: bfloat16

examples/agentscope_websearch/agentscopev1_websearch_agent.yaml

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ explorer:
     engine_num: 4
     tensor_parallel_size: 1
     enable_prefix_caching: false
-    enforce_eager: true
+    enforce_eager: false
     dtype: bfloat16
     seed: 42
     gpu_memory_utilization: 0.7

trinity/common/config.py

Lines changed: 1 addition & 1 deletion
@@ -447,7 +447,7 @@ class InferenceModelConfig:
     engine_num: int = 1
     tensor_parallel_size: int = 1
     use_v1: bool = True
-    enforce_eager: bool = True
+    enforce_eager: bool = False
     enable_prefix_caching: bool = False
     enable_chunked_prefill: bool = False
     gpu_memory_utilization: float = 0.9
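
Because this commit flips the default, any deployment that relied on eager mode must now opt in explicitly. A minimal override fragment, using only keys that appear in the diffs above:

```yaml
explorer:
  rollout_model:
    engine_num: 2
    tensor_parallel_size: 1
    enforce_eager: true   # opt back into eager mode; the default is now false
```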

trinity/common/models/vllm_model.py

Lines changed: 7 additions & 0 deletions
@@ -53,6 +53,13 @@ def __init__(
         os.environ["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0"
         if get_vllm_version() >= parse_version("0.11.0"):
             os.environ["VLLM_ALLREDUCE_USE_SYMM_MEM"] = "0"
+        if not config.enforce_eager:
+            # Avoid torch compile cache dir conflicts when multiple models
+            # are started simultaneously. Remove this once the following PR
+            # is released: https://github.com/vllm-project/vllm/pull/27616
+            os.environ["VLLM_CACHE_ROOT"] = os.path.expanduser(
+                f"~/.cache/vllm/{config.bundle_indices}"
+            )
         self.default_sampling_params = vllm.SamplingParams(
             n=1,
             temperature=0.0,
