Commit 82c55b7

Authored by mmy360
fix: prevent OOM when converting DeepSeek-V3 models by enabling memory-efficient loading (#524)
Co-authored-by: mmy360 <mmy360@foxmail.com>
1 parent 7c2856a commit 82c55b7

File tree

1 file changed: 1 addition, 1 deletion


tools/convert_hf_to_torch_dist.py

Lines changed: 1 addition & 1 deletion
@@ -86,7 +86,7 @@ def main():
     # Load model
     hf_model_path = args.hf_checkpoint
     bridge = AutoBridge.from_pretrained(hf_model_path, trust_remote_code=True)
-    bridge.load_weights(model, hf_model_path)
+    bridge.load_weights(model, hf_model_path, memory_efficient=True)
     print(f"Model loaded: {hf_model_path}")

     save_checkpoint(1, model, None, None, 0)
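The change passes `memory_efficient=True` to `bridge.load_weights`. The sketch below is not the AutoBridge implementation; it is a hypothetical toy (the `ToyCheckpoint` class and its `read`/`release` methods are invented for illustration) showing the general idea such a flag usually toggles: streaming checkpoint tensors one at a time instead of materializing the entire state dict, so peak memory stays at one tensor rather than the whole model.

```python
class ToyCheckpoint:
    """Stands in for an on-disk HF checkpoint; tracks peak live tensors."""

    def __init__(self, tensors):
        self.tensors = tensors  # name -> "tensor" (a plain list here)
        self.live = 0           # tensors currently materialized in memory
        self.peak = 0           # high-water mark of materialized tensors

    def keys(self):
        return self.tensors.keys()

    def read(self, name):
        # Materialize one tensor from "disk".
        self.live += 1
        self.peak = max(self.peak, self.live)
        return list(self.tensors[name])

    def release(self):
        # Free one materialized tensor.
        self.live -= 1


def load_weights(model, ckpt, memory_efficient=False):
    """Copy checkpoint tensors into `model` (a dict standing in for params)."""
    if memory_efficient:
        # Stream: materialize one tensor, copy it in, free it, repeat.
        # Peak extra memory is a single tensor.
        for name in ckpt.keys():
            tensor = ckpt.read(name)
            model[name] = tensor
            ckpt.release()
    else:
        # Materialize the full state dict first. Peak extra memory is the
        # whole checkpoint, which is what can OOM on a model as large as
        # DeepSeek-V3.
        state = {name: ckpt.read(name) for name in ckpt.keys()}
        model.update(state)
        for _ in state:
            ckpt.release()
```

With three tensors, the streaming path peaks at one live tensor while the eager path peaks at three, yet both leave the model identically loaded. The real `load_weights` presumably does something analogous with per-shard or per-tensor reads.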

0 commit comments
