articles/cyclecloud/openpbs.md (7 additions & 7 deletions)
@@ -8,10 +8,10 @@ ms.author: adjohnso
# OpenPBS

-[//]: #(Need to link to the scheduler README on Github)
+[//]: #(Need to link to the scheduler README on GitHub)

::: moniker range="=cyclecloud-7"

-[OpenPBS](http://openpbs.org/) can easily be enabled on a CycleCloud cluster by modifying the "run_list" in the configuration section of your cluster definition. A PBS Professional (PBS Pro) cluster has two main parts: the 'master' node, which runs the software on a shared filesystem, and the 'execute' nodes, which mount that filesystem and run the submitted jobs. For example, a simple cluster template snippet may look like:
+[OpenPBS](http://openpbs.org/) can easily be enabled on a CycleCloud cluster by modifying the "run_list", in the configuration section of your cluster definition. A PBS Professional (PBS Pro) cluster has two main parts: the 'master' node, which runs the software on a shared filesystem, and the 'execute' nodes, which mount that filesystem and run the submitted jobs. For example, a simple cluster template snippet may look like:

```ini
[cluster my-pbspro]
@@ -50,11 +50,11 @@ These resources can be used in combination as:
-Which will autoscale only if the 'Standard_HB60rs' machines are specified in the 'hpc' node array.
+Which autoscales only if the 'Standard_HB60rs' machines are specified in the 'hpc' node array.

## Adding extra queues assigned to nodearrays

-On clusters with multiple nodearrays, it's common to create separate queues to automatically route jobs to the appropriate VM type. In this example, we'll assume the following "gpu" nodearray has been defined in your cluster template:
+On clusters with multiple nodearrays, it's common to create separate queues to automatically route jobs to the appropriate VM type. In this example, we assume the following "gpu" nodearray is defined in your cluster template:

```bash
[[nodearray gpu]]
@@ -80,7 +80,7 @@ After importing the cluster template and starting the cluster, the following com
```

> [!NOTE]
-> The above queue definition packs all VMs in the queue into a single VM scale set to support MPI jobs. To define the queue for serial jobs and allow multiple VM Scalesets, set `ungrouped = true` for both `resources_default` and `default_chunk`. You can also set `resources_default.place = pack` if you want the scheduler to pack jobs onto VMs instead of round-robin allocation of jobs. For more information on PBS job packing, see the official [PBS Professional OSS documentation](https://www.altair.com/pbs-works-documentation/).
+> As shown in the example, queue definition packs all VMs in the queue into a single VM scale set to support MPI jobs. To define the queue for serial jobs and allow multiple VM Scalesets, set `ungrouped = true` for both `resources_default` and `default_chunk`. You can also set `resources_default.place = pack` if you want the scheduler to pack jobs onto VMs instead of round-robin allocation of jobs. For more information on PBS job packing, see the official [PBS Professional OSS documentation](https://www.altair.com/pbs-works-documentation/).
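
Editor's note: the note above refers to queue attributes (`resources_default`, `default_chunk`, `ungrouped`, `place`) without showing the commands that set them. Below is a minimal, hedged `qmgr` sketch of a serial-oriented queue under the assumptions that the queue is named `serial` and that the `ungrouped` resource is provided by the CycleCloud PBS integration; names and values are illustrative, not taken from this diff.

```bash
# Hypothetical queue for serial jobs that may span multiple VM scale sets.
# The queue name 'serial' and resource values are assumptions; adjust them
# to match your cluster's CycleCloud integration.
qmgr -c "create queue serial queue_type=execution"
qmgr -c "set queue serial resources_default.ungrouped = true"
qmgr -c "set queue serial default_chunk.ungrouped = true"
# Optional: pack jobs onto nodes instead of round-robin placement.
qmgr -c "set queue serial resources_default.place = pack"
qmgr -c "set queue serial enabled = true"
qmgr -c "set queue serial started = true"
```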
## PBS Professional Configuration Reference
@@ -89,7 +89,7 @@ The following are the PBS Professional(PBS Pro) specific configuration options y
| PBS Pro Options | Description |
| --------------- | ----------- |
| pbspro.slots | The number of slots for a given node to report to PBS Pro. The number of slots is the number of concurrent jobs a node can execute, this value defaults to the number of CPUs on a given machine. You can override this value in cases where you don't run jobs based on CPU but on memory, GPUs, etc. |
-| pbspro.slot_type | The name of type of 'slot' a node provides. The default is 'execute'. When a job is tagged with the hard resource `slot_type=<type>`, that job runs *only* on the machine of same slot type. It allows you to create a different software and hardware configurations per node and ensure an appropriate job is always scheduled on the correct type of node. |
+| pbspro.slot_type | The name of type of 'slot' a node provides. The default is 'execute'. When a job is tagged with the hard resource `slot_type=<type>`, that job runs *only* on the machine of the same slot type. It allows you to create a different software and hardware configurations per node and ensure an appropriate job is always scheduled on the correct type of node. |
| pbspro.version | Default: '18.1.3-0'. This is currently the default version and *only* option to install and run. In the future more versions of the PBS Pro software may be supported. |

::: moniker-end
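
Editor's note: to make the `pbspro.slot_type` option in the table above concrete, a job can request a specific slot type at submission time. This is a hedged sketch; the `gpu` slot type name is only an example and assumes `slot_type` is exposed as a host-level resource by the CycleCloud integration.

```bash
# Hypothetical: run only on nodes whose CycleCloud configuration sets
# pbspro.slot_type = gpu (the 'gpu' name is an example, not from the diff).
qsub -l select=1:slot_type=gpu -- /bin/hostname
```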
@@ -182,7 +182,7 @@ Currently, disk size is hardcoded to `size::20g`. Here's an example of handling
### Autoscale and Scalesets

-CycleCloud treats spanning and serial jobs differently in OpenPBS clusters. Spanning jobs will land on nodes that are part of the same placement group. The placement group has a particular platform meaning (VirtualMachineScaleSet with SinglePlacementGroup=true) and CycleCloud will manage a named placement group for each spanned node set. Use the PBS resource `group_id` for this placement group name.
+CycleCloud treats spanning and serial jobs differently in OpenPBS clusters. Spanning jobs land on nodes that are part of the same placement group. The placement group has a particular platform meaning (VirtualMachineScaleSet with SinglePlacementGroup=true) and CycleCloud manages a named placement group for each spanned node set. Use the PBS resource `group_id` for this placement group name.

The `hpc` queue appends the equivalent of `-l place=scatter:group=group_id` by using native queue defaults.
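
Editor's note: a hedged submission sketch of the spanning versus serial behavior described above. The queue name `hpc`, chunk counts, and core counts (Standard_HB60rs has 60 cores) are assumptions based on earlier parts of this diff, and `ungrouped` is assumed to be the CycleCloud-provided resource mentioned in the note above.

```bash
# Hypothetical spanning (MPI-style) job: four chunks that must share one
# placement group (a single VM scale set), mirroring the 'hpc' queue defaults.
qsub -q hpc -l select=4:ncpus=60:mpiprocs=60 -l place=scatter:group=group_id -- /bin/true

# Hypothetical serial job: opt out of placement grouping so nodes can come
# from multiple VM scale sets ('ungrouped' comes from the CycleCloud integration).
qsub -l select=1:ncpus=1:ungrouped=true -- /bin/hostname
```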