docs/userguide/configuring.rst (39 additions, 4 deletions)
@@ -306,9 +306,13 @@ and Work Queue does not require Python to run.

Accelerators
------------

Many modern clusters provide multiple accelerators per compute node, yet many applications are best suited to using a
single accelerator per task. Parsl supports pinning each worker to a different accelerator using the
``available_accelerators`` option of the :class:`~parsl.executors.HighThroughputExecutor`. Provide either the number of
accelerators (Parsl will assume they are numbered starting from zero) or a list of the names of the accelerators
available on the node. Parsl will limit the number of workers it launches to the number of accelerators specified;
in other words, you cannot have more workers per node than there are accelerators. By default, Parsl will launch
as many workers as there are accelerators listed in ``available_accelerators``.
.. code-block:: python

@@ -327,7 +331,38 @@ Provide either the number of executors (Parsl will assume they are named in inte

        strategy='none',
    )
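
The body of this first example is collapsed between the two diff hunks above. As a rough sketch only (the executor
label and the choice of four accelerators below are illustrative assumptions, not the contents of the original file),
a configuration that pins one worker to each of four accelerators could look like:

.. code-block:: python

    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor

    # Illustrative sketch: four accelerators, which Parsl names "0" through "3".
    # At most four workers run per node, one pinned to each accelerator.
    local_config = Config(
        executors=[
            HighThroughputExecutor(
                label="htex_accelerated",     # assumed label
                available_accelerators=4,     # equivalent to ["0", "1", "2", "3"]
            )
        ],
        strategy='none',
    )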

It is possible to bind multiple or specific accelerators to each worker by specifying a list of comma-separated strings,
each naming a group of accelerators. In the context of binding to NVIDIA GPUs, this works by setting
``CUDA_VISIBLE_DEVICES`` on each worker to a specific string from the list supplied to ``available_accelerators``.

Here's an example:

.. code-block:: python

    # The following config is trimmed for clarity
    local_config = Config(
        executors=[
            HighThroughputExecutor(
                # Starts 2 workers per node, each bound to 2 GPUs
                available_accelerators=["0,1", "2,3"],

                # Start a single worker bound to all 4 GPUs
                # available_accelerators=["0,1,2,3"]
            )
        ],
    )
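
Because each worker exports its assignment through ``CUDA_VISIBLE_DEVICES``, a task running on that worker inherits
the binding. The following app is a hypothetical illustration (not part of the original documentation) of how a task
could confirm which devices it was handed:

.. code-block:: python

    from parsl import python_app

    @python_app
    def report_devices():
        """Return the GPU binding inherited from the worker (illustrative)."""
        import os
        # With available_accelerators=["0,1", "2,3"], this returns either
        # "0,1" or "2,3" depending on which worker ran the task.
        return os.environ.get("CUDA_VISIBLE_DEVICES")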

GPU Oversubscription
""""""""""""""""""""

For hardware that uses Nvidia devices, Parsl allows for the oversubscription of workers to GPUs. This is intended to
make use of Nvidia's `Multi-Process Service (MPS) <https://docs.nvidia.com/deploy/mps/>`_, available on many of their
GPUs, which allows users to run multiple concurrent processes on a single GPU. The user needs to include, in the
``worker_init`` commands, the commands that start MPS on every node in the block (this is machine dependent). The
``available_accelerators`` option should then be set to the total number of GPU partitions run on a single node in the
block. For example, for a node with 4 Nvidia GPUs, to create 8 workers per GPU, set ``available_accelerators=32``.
GPUs will be assigned to workers in ascending order in contiguous blocks. In this example, workers 0-7 will be placed
on GPU 0, workers 8-15 on GPU 1, workers 16-23 on GPU 2, and workers 24-31 on GPU 3.
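
As a sketch of what such a configuration could look like (the provider choice and the MPS start command below are
assumptions that vary by machine, not prescriptions from the documentation):

.. code-block:: python

    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor
    from parsl.providers import SlurmProvider

    # Hypothetical sketch: a node with 4 Nvidia GPUs, oversubscribed with
    # 8 MPS-backed workers per GPU (32 workers in total).
    mps_config = Config(
        executors=[
            HighThroughputExecutor(
                available_accelerators=32,
                provider=SlurmProvider(
                    # Starting MPS is machine dependent; launching the MPS
                    # control daemon in worker_init is one common approach.
                    worker_init="nvidia-cuda-mps-control -d",
                ),
            )
        ],
    )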