articles/cyclecloud/gridengine.md (18 additions & 19 deletions)
@@ -50,7 +50,7 @@ Azure CycleCloud supports autoscaling for Grid Engine. This behavior means the s
Autoscale = True
```
By default, all jobs submitted to the Grid Engine queue run on machines of type 'execute'. These machines are defined by the node array named 'execute'. You aren't limited to the name 'execute', nor are you limited to a single type of machine configuration to run jobs and autoscale on.
For example, a common scenario involves a cluster with two different node definitions. One is designed for running 'normal' jobs that use standard CPUs. The other is intended for jobs that require GPU-enabled machines. In this case, you would want to independently scale your queue by both normal jobs and GPU jobs to make sure you have an appropriate number of each machine type to consume the work queue. An example definition would be something like:
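A minimal sketch of such a definition, assuming standard CycleCloud nodearray syntax; the machine types are illustrative, and the csge:sgeexec recipe and other standard settings are omitted:

```
[[nodearray execute]]
MachineType = Standard_D4s_v3

    [[[configuration]]]
    gridengine.slot_type = execute

[[nodearray gpu]]
MachineType = Standard_NV12
CoreCount = 2

    [[[configuration]]]
    gridengine.slot_type = gpu
    gridengine.slots = 2
```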
@@ -85,7 +85,7 @@ Autoscale = True
gridengine.slots = 2
```
In the example shown, there are now two node arrays: one is a 'standard' execute node array, and the second is named 'gpu' and provides a MachineType that has two NVIDIA GPUs (Standard_NV12 in Azure). Also note that there are now two new items in the configuration section besides the 'csge:sgeexec' recipe. Adding `gridengine.slot_type = gpu` tells the Grid Engine scheduler that these nodes should be named 'gpu' nodes and thus should only run 'gpu' jobs. The name 'gpu' is arbitrary, but a name that describes the node is most useful. Setting `gridengine.slots = 2` tells the software to make sure that this type of node can only run two jobs at once (Standard_NV12 has only 2 GPUs).
By default, Grid Engine assigns the number of slots per node based on the system's CPU count. In this case, that default behavior could result in too many jobs running concurrently on a single node. In the example shown, `CoreCount=2` is set on the nodearray to match the number of GPUs available on the MachineType, allowing CycleCloud to correctly scale that array on GPU count rather than CPU count.
@@ -104,7 +104,7 @@ You can verify the number of slots and slot_type your machines have by running t
all.q@ip-0A000406 BIP 0/0/4 0.25 linux-x64
```
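If the `slot_type` complex is configured as described above, you can also query it per queue instance; a hedged example using standard Grid Engine tooling:

```bash
# Show the slot_type resource reported by each queue instance
qstat -F slot_type
```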
Notice that there is one of each specified 'slot_type': 'execute' and 'gpu'. The slot_types are configured individually, and the number of slots for the 'execute' slot is 4, which is the number of CPUs on the machine. The number of slots for the 'gpu' slot type is 2, which we specified in our cluster configuration template. The third machine is the master node, which doesn't run jobs.
## Grid Engine Advanced Usage
@@ -135,7 +135,7 @@ This command ensures that the job only runs on a 'slot_type' of 'gpu'.
If slot_type is omitted, 'execute' is automatically assigned to the job. You can modify the mechanism that automatically assigns slot_types to jobs by creating a Python script at _/opt/cycle/jetpack/config/autoscale.py_ that defines a single function, "sge_job_handler". This function receives a dictionary representation of the job, similar to the output of a `qstat -j JOB_ID` command, and should return a dictionary of hard resources that need to be updated for the job.
As an example, the following script assigns a job to the 'gpu' slot_type if the job's name includes the letters 'gpu'. This lets users submit their jobs without changing the job parameters, while still ensuring the jobs run on and autoscale the correct nodes:
```python
#!/usr/bin/env python
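# A minimal sketch of the "sge_job_handler" described above; the exact fields
# available on the job dictionary (such as 'job_name') are assumptions, not a
# documented schema.
def sge_job_handler(job):
    # 'job' is a dictionary representation of the job, similar to `qstat -j JOB_ID` output.
    # Return a dictionary of hard resources to update for the job.
    if "gpu" in job.get("job_name", ""):
        return {"slot_type": "gpu"}
    # Otherwise leave the job on the default slot_type.
    return {"slot_type": "execute"}
```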
@@ -208,15 +208,15 @@ The following are the Grid Engine specific configuration options you can toggle
| gridengine.slots | The number of slots for a given node to report to Grid Engine. The number of slots is the number of concurrent jobs a node can execute; this value defaults to the number of CPUs on a given machine. You can override this value in cases where you don't run jobs based on CPU but on memory, GPUs, etc. |
| gridengine.slot_type | The name of the type of 'slot' a node provides. The default is 'execute'. When a job is tagged with the hard resource 'slot_type=', that job *only* runs on a machine of the same slot type. This tagging allows you to create different software and hardware configurations per node and ensure an appropriate job is always scheduled on the correct type of node. |
| gridengine.ignore_fqdn | Default: true. Set to false if the nodes in your cluster aren't all part of a single DNS domain. |
| gridengine.version | Default: '2011.11'. This configuration option specifies the Grid Engine version to install and run. Currently, it's the default and the *only* available option. Other versions of the Grid Engine software may be supported in the future. |
| gridengine.root | Default: '/sched/sge/sge-2011.11'. This location is where Grid Engine is installed and mounted on each node in the system. It's recommended to keep this value unchanged. However, if you do modify it, be sure to set the same value on **every** node in the cluster. |
Submitting a job by `qsub -q short.q -pe mpi02 12 my-script.sh` starts at least one VM. When the VM is added, it joins the _@mpihg02_ hostgroup because that's the hostgroup available to both the queue and the parallel environment. It also joins _@allhosts_, a special hostgroup.
If you submit a job with `qsub -q short.q my-script.sh` and don't specify a parallel environment (pe), the resulting VM joins the _@allhosts_ and _@lowpriority_ hostgroups, the hostgroups linked to the queue that aren't assigned any pes.
Finally, a job submitted with `qsub -q short.q -pe mpi0* 12 my-script.sh` results in a VM added to either _@mpihg01_ or _@mpihg02_ depending on CycleCloud allocation predictions.
@@ -290,10 +290,9 @@ Here we opt out of placement groups for the _make_ pe:
```
### CycleCloud Placement Groups
CycleCloud placement groups map one-to-one to Azure VMSS with SinglePlacementGroup: VMs in a placement group share an InfiniBand fabric and share it only with VMs within the same placement group. To intuitively preserve these silos, placement groups also map 1:1 with gridengine parallel environments.
Specifying a parallel environment for a job restricts the job to run in a placement group via smart hostgroup assignment logic. You can disable this behavior through the corresponding configuration in _autoscale.json_: `"required_placement_groups" : false`.
### Autoscale config
@@ -321,7 +320,7 @@ The autoscaling program only considers *Relevant Resource*
### Another autoscaling resource
By default, jobs request a number of slots, and the cluster scales based on those requests.
Let's say we want to autoscale by the job resource request for `m_mem_free`.
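For example, a job might request this resource explicitly; a hedged illustration, assuming `m_mem_free` is defined as a requestable complex in your Grid Engine configuration:

```bash
# Submit a job with a hard request for m_mem_free
qsub -l m_mem_free=4g my-script.sh
```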
@@ -438,7 +437,7 @@ CycleCloud finds the intersection of hostgroups which:
1. Are included in the _pe\_list_ for _cloud.q_ and match the pe name, for example, `pe_list [@allhosts=mpislots],[@hpc1=mpi]`.
1. Have adequate resources and subscription quota to provide all job resources.
1. Aren't filtered out by the hostgroup constraints configuration.
Multiple hostgroups might meet these requirements. When that happens, the system must decide which one to use. There are three ways to resolve hostgroup membership conflicts:
@@ -473,7 +472,7 @@ When a queue or xproject defines multiple hostgroups, any of those groups can po
> Inspect all the available node properties with `azge buckets`.
## azge
This package comes with a command-line tool, _azge_. This program is used to perform autoscaling and breaks the autoscale process into separate subcommands. These commands rely on the gridengine environment variables being set - you must be able to call `qconf` and `qsub` from the same profile where `azge` is called.
| _azge_ commands | Description |
|---|---|
| demand | Matches job requirements to cyclecloud buckets and provides autoscale result |
| autoscale | Does full autoscale, starting and removing nodes according to configurations |
When modifying scheduler configurations (_qconf_) or autoscale configurations (_autoscale.json_), or even setting up for the first time, you can use _azge_ to check that autoscale behavior matches expectations. As root, you can run the following operations. It's important to get familiar with these to understand how autoscale works.
1. Run `azge validate` to verify configurations for known issues.
1. Run `azge buckets` to check the resources offered by the CycleCloud cluster.
1. Run `azge jobs` to inspect the queued job details.
1. Run `azge demand` to perform the job-to-bucket matching and examine which jobs are matched to which buckets and hostgroups.
1. Run `azge autoscale` to kick off the node allocation process, or to add nodes that are ready to join.
Once the commands are working as expected, enable ongoing autoscale by adding the `azge autoscale` command to the root crontab. Be sure to source the gridengine environment variables in advance.
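For example, a root crontab entry might look like the following sketch; the settings.sh path assumes the default `gridengine.root` and a standard SGE cell layout:

```bash
# Run autoscale every minute, sourcing the Grid Engine environment first
* * * * * . /sched/sge/sge-2011.11/default/common/settings.sh && azge autoscale
```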
@@ -502,7 +501,7 @@ Once the commands are working as expected, enable ongoing autoscale by adding th
CycleCloud supports a cloud-bursting scenario. The base configuration assumes that the `$SGE_ROOT` directory is available to the cloud nodes. This assumption can be relaxed by setting `gridengine.shared.spool = false`, `gridengine.shared.bin = false`, and installing GridEngine locally.
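As a sketch, these keys might be set in the execute nodes' `[[[configuration]]]` section; exact placement depends on your cluster template:

```
[[[configuration]]]
    gridengine.shared.spool = false
    gridengine.shared.bin = false
```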
For a simple case, you should provide a filesystem that the execute nodes can mount. This filesystem must include the … directory, and you configure the mount in the optional settings. Once the dependencies on the sched and shared directories are released, you can shut down the scheduler node that's part of the cluster by default and use the configurations from the external filesystem.
1. Create a new gridengine cluster.
1. Disable return proxy.
@@ -545,7 +544,7 @@ Make a local copy of the gridengine template and modify it to use the UGE instal
In the _gridengine.txt_ file, locate the first occurrence of `[[[configuration]]]` and insert text to match the following snippet. The file isn't sensitive to indentation.
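A hedged illustration of such a snippet, using the configuration keys documented above; the version string and path here are placeholders and must match your UGE installer file name:

```
[[[configuration]]]
    gridengine.version = 8.6.12-demo
    gridengine.root = /sched/ge/ge-8.6.12-demo
```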
> NOTE:
> The details in the configuration, particularly version, should match the installer file name.