Commit b2e0ad3

[Perf] Reduce peak memory usage of llama (#10339)
Signed-off-by: andoorve <[email protected]>
1 parent 4a18fd1 commit b2e0ad3

File tree: 1 file changed (+2, −2 lines)


vllm/model_executor/models/llama.py

Lines changed: 2 additions & 2 deletions
@@ -90,8 +90,8 @@ def __init__(
         self.act_fn = SiluAndMul()
 
     def forward(self, x):
-        gate_up, _ = self.gate_up_proj(x)
-        x = self.act_fn(gate_up)
+        x, _ = self.gate_up_proj(x)
+        x = self.act_fn(x)
         x, _ = self.down_proj(x)
         return x
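Why this two-line change lowers peak memory: in the old code the name `gate_up` kept the large gate-up projection output alive until `forward` returned, so that tensor was still resident while `down_proj` allocated its own output. Rebinding `x` at every step drops the last reference to each intermediate as soon as the next one exists, letting CPython's reference counting (and hence the allocator) reclaim it earlier. A minimal sketch of the effect, using a toy `Tensor` class and stand-in projection functions (all names here are hypothetical, for illustration only):

```python
# Toy model of activation lifetimes under CPython reference counting.
# Each "Tensor" records how many are alive at once; the peak mimics
# peak activation memory during one MLP forward pass.

live = set()
peak = {"n": 0}


class Tensor:
    """Stand-in for a large activation tensor."""

    def __init__(self, name):
        self.name = name
        live.add(name)
        peak["n"] = max(peak["n"], len(live))

    def __del__(self):
        live.discard(self.name)


# Stand-ins for the real projections; vLLM's linear layers return
# an (output, bias) tuple, hence the trailing None.
def gate_up_proj(t):
    return Tensor("gate_up_out"), None


def act_fn(t):
    return Tensor("act_out")


def down_proj(t):
    return Tensor("down_out"), None


def forward_old(x):
    gate_up, _ = gate_up_proj(x)
    x = act_fn(gate_up)        # gate_up stays bound here...
    x, _ = down_proj(x)        # ...so it is still alive during down_proj
    return x


def forward_new(x):
    x, _ = gate_up_proj(x)     # rebinding x drops the previous tensor
    x = act_fn(x)              # gate_up output freed as soon as x is rebound
    x, _ = down_proj(x)
    return x


def measure_peak(forward):
    """Run one forward pass; report the max number of tensors alive at once."""
    live.clear()
    peak["n"] = 0
    forward(Tensor("input"))
    return peak["n"]
```

On CPython, `measure_peak(forward_old)` comes out one tensor higher than `measure_peak(forward_new)`: the extra resident tensor is the gate-up output, which in the real model is the widest activation in the MLP (twice the intermediate size), so freeing it before `down_proj` runs trims the peak. Note this toy relies on CPython's eager refcounting; a tracing GC would not show the same deterministic timing.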

0 commit comments
