
Commit f12bb5f

SimengLiu-nv authored and farazkh80 committed

[None][fix] Reduce load_weight_shard warning message on integrated systems to only log once.

Signed-off-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>

1 parent: c2c5331

File tree: 1 file changed (+3, -3)

tensorrt_llm/_torch/modules/linear.py (3 additions, 3 deletions)

```diff
@@ -75,9 +75,9 @@ def load_weight_shard(
         # For integrated GPU systems (e.g., DGX Spark), CPU and GPU share limited physical memory.
         # Avoiding device transfers reduces memory consumption and unnecessary data copies,
         # enabling support for larger models on memory-constrained systems.
-        logger.debug(
-            f"[load_weight_shard] Skipping device transfer from {weight.device} to {device} on integrated GPU to conserve shared memory."
-        )
+        logger.warning_once(
+            f"[load_weight_shard] Skipping device transfer from {weight.device} to {device} on integrated GPU to conserve shared memory.",
+            key="load_weight_shard_skip_device_transfer_with_integrated_gpu")
         device = weight.device
     if isinstance(weight, torch.Tensor):
         tensor_shape = weight.shape
```
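The effect of the change is that a warning sharing the same deduplication key is emitted only on the first call, rather than once per weight shard. A minimal sketch of a `warning_once`-style helper with that behavior (the method name and `key` parameter mirror the diff; the class below is an illustrative stand-in, not TensorRT-LLM's actual logger implementation):

```python
import logging

class DedupLogger:
    """Illustrative wrapper: warnings sharing a key are logged only once."""

    def __init__(self, name: str = "demo") -> None:
        self._logger = logging.getLogger(name)
        self._seen_keys: set[str] = set()

    def warning_once(self, msg: str, key: str) -> bool:
        # First call for a given key emits the warning; later calls
        # (e.g., one per weight shard during loading) are suppressed.
        if key in self._seen_keys:
            return False
        self._seen_keys.add(key)
        self._logger.warning(msg)
        return True

logger = DedupLogger()
first = logger.warning_once(
    "[load_weight_shard] Skipping device transfer on integrated GPU.",
    key="load_weight_shard_skip_device_transfer_with_integrated_gpu")
repeat = logger.warning_once(
    "[load_weight_shard] Skipping device transfer on integrated GPU.",
    key="load_weight_shard_skip_device_transfer_with_integrated_gpu")
print(first, repeat)  # True False
```

This is why the fix also raises the level from `debug` to `warning`: the message is now visible by default, but the key-based deduplication keeps it from flooding the log on integrated systems that skip the transfer for every shard.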

0 commit comments