This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Commit e3acb5c

Revert "[AOTI] Remove the original model weights in Python deployment"
This reverts commit 962ec0d.
1 parent 6c27b00 commit e3acb5c

1 file changed: 0 additions, 13 deletions

torchchat/cli/builder.py

Lines changed: 0 additions & 13 deletions
@@ -558,19 +558,6 @@ def _initialize_model(
             # attributes will NOT be seen on by AOTI-compiled forward
             # function, e.g. calling model.setup_cache will NOT touch
             # AOTI compiled and maintained model buffers such as kv_cache.
-            # Using cpp runner to run AOTI compiled model is recommended.
-            #
-            # Released the loaded model to free up device memory.
-            # The AOTI-compiled model contains a copy of the model weights.
-            model.model = None
-            import gc
-            gc.collect()
-            torch.cuda.empty_cache()
-
-            def do_nothing(max_batch_size, max_seq_length):
-                pass
-            model.setup_caches = do_nothing
-
             model.forward = torch._export.aot_load(
                 str(builder_args.dso_path.absolute()), builder_args.device
             )
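
For readers following the revert, the 13 removed lines did three things: they dropped the reference to the eagerly loaded weights, forced a garbage-collection pass (plus a CUDA cache flush), and replaced `setup_caches` with a no-op. The sketch below reproduces that pattern in isolation. It is only an illustration under stated assumptions: `Weights`, `Wrapper`, and `release_eager_weights` are hypothetical stand-ins, not torchchat code, and torch is treated as optional so the snippet runs without it. After this commit, none of these steps run in `_initialize_model`.

```python
import gc
import weakref


class Weights:
    """Hypothetical stand-in for the eager model's parameters."""


class Wrapper:
    """Hypothetical stand-in for the object mutated in _initialize_model."""

    def __init__(self):
        self.model = Weights()

    def setup_caches(self, max_batch_size, max_seq_length):
        # A real implementation would allocate kv-cache buffers on self.model.
        return self.model


def release_eager_weights(wrapper):
    """Mirror the removed steps: drop the weights, collect, stub setup_caches."""
    wrapper.model = None
    gc.collect()
    try:
        import torch  # optional here; only needed to flush the CUDA cache

        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass

    def do_nothing(max_batch_size, max_seq_length):
        pass

    # Assigned on the instance, so it is called without `self`.
    wrapper.setup_caches = do_nothing


wrapper = Wrapper()
probe = weakref.ref(wrapper.model)  # observes when the weights object is freed
release_eager_weights(wrapper)
weights_freed = probe() is None
```

Because `do_nothing` is assigned on the instance rather than the class, it is not bound as a method, which is why the removed code defined it with only `max_batch_size` and `max_seq_length` as parameters.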
