use torch.cuda.mem_get_info() instead of nvidia-smi to get free mem gpu #132

luclmt · 2025-07-17T09:59:34Z

This function queries the GPUs available on the system and determines which one has
the most free memory. It uses PyTorch's CUDA APIs instead of nvidia-smi, which keeps
device indices consistent with the rest of the code and makes the lookup reliable when
environment variables such as CUDA_VISIBLE_DEVICES are set.
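
As a rough illustration of the approach described above (not the actual code in networks.py), the selection could be sketched as follows. The helper names `free_memory_per_device` and `index_of_most_free` are hypothetical. The only PyTorch calls used are `torch.cuda.is_available()`, `torch.cuda.device_count()` and `torch.cuda.mem_get_info()`, which returns a `(free_bytes, total_bytes)` tuple for a device.

```python
from typing import List, Optional


def free_memory_per_device() -> List[int]:
    """Return the free bytes of each CUDA device visible to this process."""
    import torch  # imported lazily so the helper below also works without PyTorch

    if not torch.cuda.is_available():
        return []
    # mem_get_info(i) returns (free_bytes, total_bytes) for device i.
    # The indices are process-local, i.e. after CUDA_VISIBLE_DEVICES is applied.
    return [torch.cuda.mem_get_info(i)[0] for i in range(torch.cuda.device_count())]


def index_of_most_free(free_bytes: List[int]) -> Optional[int]:
    """Return the index of the largest entry, or None if the list is empty."""
    if not free_bytes:
        return None
    # On ties, max() keeps the first (lowest) index.
    return max(range(len(free_bytes)), key=free_bytes.__getitem__)
```

A caller might then use `index_of_most_free(free_memory_per_device())` and fall back to CPU when the result is `None`. Because the index comes from PyTorch itself, it is always a valid ordinal for `torch.cuda.set_device()` in the same process.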

Using nvidia-smi together with CUDA_VISIBLE_DEVICES can lead to errors like this:

torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal

because nvidia-smi lists every GPU on the system, whereas CUDA_VISIBLE_DEVICES restricts only the GPUs visible at the CUDA API level. An index taken from nvidia-smi can therefore refer to a device that the PyTorch process cannot see.
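
To make the mismatch concrete, here is a small sketch of how a system-wide index relates to the index a process sees. The function `logical_index` is hypothetical and rests on two simplifying assumptions: CUDA_VISIBLE_DEVICES contains only integer indices (not GPU UUIDs), and CUDA enumerates devices in the same order as nvidia-smi (for example with CUDA_DEVICE_ORDER=PCI_BUS_ID).

```python
from typing import Optional


def logical_index(physical_index: int, cuda_visible_devices: str) -> Optional[int]:
    """Map a system-wide GPU index to the ordinal seen inside the process.

    Returns None when the GPU is hidden from the process.
    Assumption: cuda_visible_devices holds comma-separated integers only.
    """
    visible = [int(tok) for tok in cuda_visible_devices.split(",") if tok.strip()]
    try:
        # Visible devices are renumbered from 0 in the order they are listed.
        return visible.index(physical_index)
    except ValueError:
        return None
```

With CUDA_VISIBLE_DEVICES=2,3 the process sees only ordinals 0 and 1. If nvidia-smi reports GPU 3 as having the most free memory, `logical_index(3, "2,3")` gives 1, whereas passing 3 directly to `torch.cuda.set_device()` would request an ordinal that does not exist in the process.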

Use torch instead of nvidia-smi to get free mem gpu

Update networks.py

0d483f3
