Commit 844492e

Merge pull request #120 from matyasselmeci/pr/pilot-section-docs-SOFTWARE-4430
Pilot sections docs
2 parents 2fd9b7e + 0b8c7df commit 844492e

3 files changed: 55 additions, 14 deletions

README.md

Lines changed: 44 additions & 7 deletions
@@ -17,7 +17,7 @@ once the repos are enabled, install by running
 OSG-Configure can also be installed from a checkout.
 Run
 
-git clone https://github.com/opensciencegrid/org-configure
+git clone https://github.com/opensciencegrid/osg-configure
 cd osg-configure
 make install

@@ -93,12 +93,12 @@ In the tables below:
 Syntax and layout
 -----------------
 
-The configuration files used by `osg-configure` are the ones supported by Python's [SafeConfigParser](http://docs.python.org/library/configparser.html), similar in format to the [INI configuration file](http://en.wikipedia.org/wiki/INI_file) used by MS Windows:
+The configuration files used by `osg-configure` are the ones supported by Python's [SafeConfigParser](https://docs.python.org/library/configparser.html), similar in format to the [INI configuration file](https://en.wikipedia.org/wiki/INI_file) used by MS Windows:
 
 - Config files are separated into sections, specified by a section name in square brackets (e.g. `[Section 1]`)
 - Options should be set using `name = value` pairs
 - Lines that begin with `;` or `#` are comments
-- Long lines can be split up using continuations: each white space character can be preceded by a newline to fold/continue the field on a new line (same syntax as specified in [email RFC 822](http://tools.ietf.org/html/rfc822.html))
+- Long lines can be split up using continuations: each white space character can be preceded by a newline to fold/continue the field on a new line (same syntax as specified in [email RFC 822](https://tools.ietf.org/html/rfc822.html))
 - Variable substitutions are supported -- [see below](#variable-substitution)
 
 `osg-configure` reads and uses all of the files in `/etc/osg/config.d` that have a ".ini" suffix. The files in this directory are ordered with a numeric prefix with higher numbers being applied later and thus having higher precedence (e.g. `00-foo.ini` has a lower precedence than `99-local-site-settings.ini`). Configuration sections and options can be specified multiple times in different files. E.g. a section called `[PBS]` can be given in `20-pbs.ini` as well as `99-local-site-settings.ini`.
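
The precedence rule described in the hunk above (files with higher numeric prefixes are applied later and override earlier ones) can be illustrated with a small Python sketch. It uses the standard-library `configparser.ConfigParser`, the current name for what older Python versions called `SafeConfigParser`, and parses in-memory strings rather than real files. The file contents, the option values, and the assumption that files are applied in sorted file-name order are illustrative only and are not taken from the `osg-configure` source.

```python
import configparser

# Hypothetical file contents keyed by file name (illustration only;
# a real site would have these files under /etc/osg/config.d).
files = {
    "99-local-site-settings.ini": "[PBS]\nenabled = True\n",
    "20-pbs.ini": "[PBS]\n; shipped defaults\nenabled = False\npbs_location = /usr\n",
    "50-notes.txt": "[PBS]\nenabled = Maybe\n",  # no .ini suffix, so it is skipped
}


def load_in_order(files):
    """Apply every *.ini entry in sorted name order; later files win."""
    parser = configparser.ConfigParser()
    for name in sorted(files):
        if name.endswith(".ini"):
            # Re-reading a section that already exists merges its options,
            # with the later value replacing the earlier one.
            parser.read_string(files[name], source=name)
    return parser


config = load_in_order(files)
print(config.get("PBS", "enabled"))       # value from 99-local-site-settings.ini
print(config.get("PBS", "pbs_location"))  # only set in 20-pbs.ini, so it survives
```

Options that appear only in the lower-precedence file are kept, while options set in both files take the value from the file that sorts last.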
@@ -417,11 +417,14 @@ If you would like to properly advertise multiple CEs per cluster, make sure that
 
 #### Subcluster Configuration ####
 
-Each homogeneous set of worker node hardware is called a **subcluster**. For each subcluster in your cluster, fill in the information about the worker node hardware by creating a new Subcluster section with a unique name in the following format: `[Subcluster CHANGEME]`, where CHANGEME is the globally unique subcluster name (yes, it must be a **globally** unique name for the whole grid, not just unique to your site. Get creative.)
+Each homogeneous set of worker node hardware is called a **subcluster**.
+For each subcluster in your cluster, fill in the information about the worker node hardware by creating a new Subcluster section in the following format:
+`[Subcluster CHANGEME]`, where CHANGEME is the subcluster name.
+If you have multiple subclusters, they must have different names.
 
 | Option | Values Accepted | Explanation |
 |----------------------|-----------------------------|-------------------------------------------------------------------------------|
-| **name** | String | The same name that is in the Section label; it should be **globally unique** |
+| **name** | String | The same name that is in the Section label |
 | **ram\_mb** | Positive Integer | Megabytes of RAM per node |
 | **cores\_per\_node** | Positive Integer | Number of cores per node |
 | **allowed\_vos** | Comma-separated List or `*` | The VOs that are allowed to run jobs on this subcluster (autodetected if `*`) |
@@ -458,6 +461,42 @@ The following attributes are optional:
 
 
 
+### 35-pilot.ini / [Pilot] ###
+
+These sections describe the size and scale of GlideinWMS pilots that your site is willing to accept.
+This file contains multiple sections of the form `[Pilot <PILOT_TYPE>]`,
+where `<PILOT_TYPE>` is a free-form name of a type of pilot.
+The name should only have lower-case letters, numbers, and `-` or `_` characters.
+In addition, it must be unique within your cluster.
+We recommend a name that describes the capabilities of the pilots you accept.
+Good names are `singularity_8core`, `gpu`, `bigmem`, `main`.
+
+The following attributes are required:
+| Option | Values Accepted | Explanation |
+|--------------------------|-----------------------------|------------------------------------------------------------------------------------------------------------------------------------|
+| **allowed\_vos** | Comma-separated List or `*` | The VOs that are allowed to run jobs on this resource (autodetected if `*`) |
+| **max\_pilots** | Positive Integer | The maximum number of pilots of this type that the factory can send to this CE |
+| **os** | Choice (see below) | The operating system on the workers the pilot should request. Not set by default. Only required if **require\_singularity** is `False` |
+| **require\_singularity** | `True`, `False` | `True` if the pilot should require Singularity support on any worker it lands on. Default `False`; **os** is optional if this is `True` |
+
+Valid values for the **os** option are: `rhel6`, `rhel7`, `rhel8`, or `ubuntu18`.
+
+The following attributes are optional:
+| Option | Values Accepted | Explanation |
+|---------------------|----------------------|------------------------------------------------------------------------------------------------------------------------------------------|
+| **cpucount** | Positive Integer | Number of cores that a job using this type of pilot can get. Default `1`; ignored if **whole\_node** is `True` |
+| **ram\_mb** | Positive Integer | Maximum amount of memory (in MB) that a job using this type of pilot can get. Default `2500`; ignored if **whole\_node** is `True` |
+| **whole\_node** | `True`, `False` | Whether this type of pilot can use all the resources on a node. Default `False`; **cpucount** and **ram\_mb** are ignored if this is `True` |
+| **gpucount** | Non-negative Integer | The number of GPUs to request. Default `0` |
+| **max\_wall\_time** | Positive Integer | Maximum wall-clock time, in minutes, that a job is allowed to run on this resource. Default `1440`, i.e. 24 hours |
+| **queue** | String | The queue or partition which jobs should be submitted to in order to run on this resource (see note). Not set by default |
+| **send\_tests** | `True`, `False` | Send test pilots. Default `False`; set it to `True` for testing job routes or pilot types |
+
+**Note:** **queue** is equivalent to the HTCondor grid universe classad attribute **remote\_queue**.
+
+
+
+
 ### 40-localsettings.ini / [Local Settings] ###
 
 This section differs from other sections in that there are no set options in this section. Rather, the options set in this section will be placed in the `osg-local-job-environment.conf` verbatim. The options in this section are case sensitive and the case will be preserved when they are converted to environment variables. The `osg-local-job-environment.conf` file gets sourced by jobs run on your cluster so any variables set in this section will appear in the environment of jobs run on your system.
@@ -499,5 +538,3 @@ If your resource has multiple sponsors, you can separate them using commas or sp
 `osg, atlas, cms` or `osg:10, atlas:45, cms:45`.
 The percentages must add up to 100 if multiple sponsors are used.
 If you have a sponsor that is not an OSG VO, you can indicate this by using 'local' as the VO.
-
-

config/30-gip.ini

Lines changed: 2 additions & 3 deletions
@@ -3,9 +3,8 @@
 ;===================================================================
 
 ; For each subcluster, add a new subcluster section.
-; Each subcluster name must be unique for the entire grid, so make sure to not
-; pick anything generic like "MAIN". Each subcluster section must start with
-; the words "Subcluster", and cannot be named "CHANGEME".
+; Each subcluster section must start with the words "Subcluster", and cannot be
+; named "CHANGEME".
 
 ; There should be one subcluster section per set of homogeneous nodes in the
 ; cluster.

config/35-pilot.ini

Lines changed: 9 additions & 4 deletions
@@ -3,15 +3,20 @@
 ;===================================================================
 
 ; For each pilot type, add a new pilot section.
-; Each pilot name must be unique for the entire grid, so make sure to not
-; pick anything generic like "MAIN". The name will be used as-is as the
-; "Name" attribute in the OSG_ResourceCatalog entry.
+; If you accept multiple pilot types, each section must have a different name.
+; Names should only contain lowercase letters, numbers, "-" or "_", and should
+; describe the capabilities of that type of pilot.
+
+; Good names are "singularity_8core", "gpu", "bigmem", "main".
+
+; The name will be used as-is as the "Name" attribute in the
+; OSG_ResourceCatalog entry.
 
 ; This data is used to determine the resources requested by pilot jobs submitted by the OSG, so it's
 ; important to keep it up to date.
 
 
-;[Pilot PILOT_NAME]
+;[Pilot PILOT_TYPE]
 ;; The number of cores for this pilot type.
 ;cpucount = 1
 ;; The amount of memory (in megabytes) for this pilot type.
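
As an illustration of the `[Pilot <PILOT_TYPE>]` rules documented in the README.md changes of this commit, the following Python sketch parses two sample pilot sections and checks a subset of those rules: the allowed characters in the name, the two always-required options, and the rule that **os** is required only when **require\_singularity** is `False`. The sample values (VO lists, pilot counts, queue name) and the helper `check_pilot_sections` are hypothetical and exist only for this example; this is not how `osg-configure` itself validates the file.

```python
import configparser
import re

# Hypothetical sample configuration; the values are for illustration only.
SAMPLE = """
[Pilot gpu]
allowed_vos = *
max_pilots = 50
os = rhel7
require_singularity = False
gpucount = 1
max_wall_time = 1440
queue = gpu

[Pilot singularity_8core]
allowed_vos = osg, cms
max_pilots = 100
require_singularity = True
cpucount = 8
ram_mb = 20000
"""

VALID_OS = {"rhel6", "rhel7", "rhel8", "ubuntu18"}
# Lower-case letters, numbers, "-" and "_" only, as the README states.
NAME_RE = re.compile(r"[a-z0-9_-]+")


def check_pilot_sections(parser):
    """Return a list of human-readable problems found in [Pilot ...] sections."""
    problems = []
    for section in parser.sections():
        if not section.startswith("Pilot "):
            continue
        name = section[len("Pilot "):]
        if not NAME_RE.fullmatch(name):
            problems.append(f"{section}: invalid characters in name {name!r}")
        opts = parser[section]
        for required in ("allowed_vos", "max_pilots"):
            if required not in opts:
                problems.append(f"{section}: missing required option {required}")
        # os may be omitted only when require_singularity is True.
        needs_os = not opts.getboolean("require_singularity", fallback=False)
        os_value = opts.get("os")
        if needs_os and os_value is None:
            problems.append(f"{section}: os is required without Singularity")
        if os_value is not None and os_value not in VALID_OS:
            problems.append(f"{section}: unsupported os {os_value!r}")
    return problems


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
problems = check_pilot_sections(parser)
print(problems or "no problems found")
```

Section names are compared as written because `configparser` preserves their case, whereas option names are lower-cased on read, which is why the lookups above use lower-case keys.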
