Commit 1e618fa

Allow multiple workers to share a CUDA device, intended for use with MPS mode (#3509)
This change allows, for CUDA devices, the same value of CUDA_VISIBLE_DEVICES to be set for multiple Parsl workers on a node when using the high throughput executor. This lets the user make use of the MPS mode for CUDA devices to partition a GPU and run multiple processes per GPU. To use MPS mode with this functionality, several settings must be set by the user in their config:

* available_accelerators should be set to the total number of GPU processes to be run on the node. For example, for a node with 4 Nvidia GPUs, if you wish to run 4 processes per GPU, available_accelerators should be set to 16.

* worker_init should include commands to start the MPS service and set any associated environment variables. For example, on the ALCF machine Polaris it is recommended that the user make use of a bash script, enable_mps_polaris.sh, that starts the MPS service on a node. worker_init should then contain:

  worker_init='export NNODES=`wc -l < $PBS_NODEFILE`; mpiexec -n ${NNODES} --ppn 1 /path/to/mps/script/enable_mps_polaris.sh'
1 parent 2e8b10e commit 1e618fa
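
As a concrete illustration of the two settings described in the commit message, here is a minimal sketch of a config for the 4-GPU, 4-processes-per-GPU example. The executor label, the choice of PBSProProvider, and its arguments are illustrative assumptions, not part of this commit:

# Sketch: oversubscribing 4 GPUs with 4 MPS processes each (16 workers total).
# The provider choice and label are illustrative assumptions.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import PBSProProvider

config = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_mps",
            available_accelerators=16,  # 4 GPUs x 4 processes per GPU
            provider=PBSProProvider(
                # Start the MPS control daemon on every node in the block.
                worker_init=(
                    "export NNODES=`wc -l < $PBS_NODEFILE`; "
                    "mpiexec -n ${NNODES} --ppn 1 "
                    "/path/to/mps/script/enable_mps_polaris.sh"
                ),
            ),
        )
    ]
)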

2 files changed: +22 -2 lines

docs/userguide/configuring.rst

Lines changed: 2 additions & 1 deletion
@@ -346,7 +346,8 @@ Provide either the number of executors (Parsl will assume they are named in inte
         strategy='none',
     )
 
-
+For hardware that uses Nvidia devices, Parsl allows for the oversubscription of workers to GPUs. This is intended to make use of Nvidia's `Multi-Process Service (MPS) <https://docs.nvidia.com/deploy/mps/>`_, available on many of their GPUs, which allows users to run multiple concurrent processes on a single GPU. The user needs to include in ``worker_init`` the commands to start MPS on every node in the block (this is machine dependent). The ``available_accelerators`` option should then be set to the total number of GPU partitions run on a single node in the block. For example, for a node with 4 Nvidia GPUs, to create 8 workers per GPU, set ``available_accelerators=32``. GPUs will be assigned to workers in ascending order in contiguous blocks. In this example, workers 0-7 will be placed on GPU 0, workers 8-15 on GPU 1, workers 16-23 on GPU 2, and workers 24-31 on GPU 3.
+
 Multi-Threaded Applications
 ---------------------------
 
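
To make the contiguous-block assignment described in the new documentation paragraph concrete, here is a small illustrative Python sketch (not Parsl code) of how 32 workers map onto 4 GPUs under this scheme:

# Illustrative sketch of the contiguous-block mapping described above:
# 4 GPUs and available_accelerators=32, so 8 workers share each GPU.
available_accelerators = 32
num_gpus = 4
workers_per_gpu = available_accelerators // num_gpus  # 8

for worker_id in range(available_accelerators):
    gpu = worker_id // workers_per_gpu
    print(f"worker {worker_id:2d} -> CUDA_VISIBLE_DEVICES={gpu}")
# workers 0-7 land on GPU 0, 8-15 on GPU 1, 16-23 on GPU 2, 24-31 on GPU 3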

parsl/executors/high_throughput/process_worker_pool.py

Lines changed: 20 additions & 1 deletion
@@ -9,6 +9,7 @@
 import pickle
 import platform
 import queue
+import subprocess
 import sys
 import threading
 import time
@@ -731,9 +732,27 @@ def worker(
         os.sched_setaffinity(0, my_cores)  # type: ignore[attr-defined, unused-ignore]
         logger.info("Set worker CPU affinity to {}".format(my_cores))
 
+    # If CUDA devices, find total number of devices to allow for MPS
+    # See: https://developer.nvidia.com/system-management-interface
+    nvidia_smi_cmd = "nvidia-smi -L > /dev/null && nvidia-smi -L | wc -l"
+    nvidia_smi_ret = subprocess.run(nvidia_smi_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+    if nvidia_smi_ret.returncode == 0:
+        num_cuda_devices = int(nvidia_smi_ret.stdout.split()[0])
+    else:
+        num_cuda_devices = None
+
     # If desired, pin to accelerator
     if accelerator is not None:
-        os.environ["CUDA_VISIBLE_DEVICES"] = accelerator
+        try:
+            if num_cuda_devices is not None:
+                procs_per_cuda_device = pool_size // num_cuda_devices
+                partitioned_accelerator = str(int(accelerator) // procs_per_cuda_device)  # multiple workers will share a GPU
+                os.environ["CUDA_VISIBLE_DEVICES"] = partitioned_accelerator
+                logger.info(f'Pinned worker to partitioned cuda device: {partitioned_accelerator}')
+            else:
+                os.environ["CUDA_VISIBLE_DEVICES"] = accelerator
+        except (TypeError, ValueError, ZeroDivisionError):
+            os.environ["CUDA_VISIBLE_DEVICES"] = accelerator
         os.environ["ROCR_VISIBLE_DEVICES"] = accelerator
         os.environ["ZE_AFFINITY_MASK"] = accelerator
         os.environ["ZE_ENABLE_PCI_ID_DEVICE_ORDER"] = '1'
