
Commit 89e6254

Remove inference_mode() from platforms.hpu (#1691)
`inference_mode()` causes recompilations with `torch.compile`, and we don't need it here: `inference_mode` is already applied to the relevant functions in the model runner. The platform-level call was introduced by Rebase 0.9.0.1 (#1507); before that, no such call existed.
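The pattern the message describes can be sketched as follows: instead of a platform-wide `inference_mode()` hook, the no-grad context is applied per-function as a decorator. This is a minimal illustration, not vLLM's actual model runner; `ModelRunner` and `execute_model` here are hypothetical stand-ins.

```python
import torch


class ModelRunner:
    # Hypothetical stand-in for the HPU model runner: the inference-mode
    # context is scoped to the specific method that runs the model, rather
    # than being returned from a platform-level inference_mode() hook.
    @torch.inference_mode()
    def execute_model(self, x: torch.Tensor) -> torch.Tensor:
        # Autograd tracking is disabled inside this call; tensors created
        # here are inference tensors.
        return x * 2


runner = ModelRunner()
out = runner.execute_model(torch.ones(3))
print(out.requires_grad)  # False
```

Scoping the context to particular functions keeps the rest of the code path free of a global mode switch, which is what interacts badly with `torch.compile` recompilation.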
Parent: 646db5e · Commit: 89e6254

File tree

1 file changed: +0 −4 lines


vllm/platforms/hpu.py

Lines changed: 0 additions & 4 deletions
@@ -68,10 +68,6 @@ def is_async_output_supported(cls, enforce_eager: Optional[bool]) -> bool:
     def get_device_name(cls, device_id: int = 0) -> str:
         return cls.device_name
 
-    @classmethod
-    def inference_mode(cls):
-        return torch.no_grad()
-
     @classmethod
     def check_and_update_config(cls, vllm_config: VllmConfig) -> None:
