`docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md`
## Batch vs Interactive Jobs
- HPC workloads are usually better suited to *batch processing* than *interactive* workflows.
- A batch job is submitted to the system with the **sbatch** command.
- The working pattern we are all familiar with is *interactive*: we type (or click) something, the computer performs the associated action, and then we type (or click) the next thing.
- Comments at the start of the script that match a special pattern (`#SBATCH`) are read as Slurm options.
### Challenges of Interactive Work
There is a reason why GUIs are less common in HPC environments: **point-and-click** is **necessarily interactive**. In HPC environments (*as we'll see in section 3*) work is scheduled to allow exclusive use of the shared resources. On a busy system there may be a wait of several hours between when you submit a job and when the resources become available, so reliance on user interaction is not viable. In Unix, commands need not be run interactively at the prompt: you can write a sequence of commands into a file to be run as a script, either manually (for sequences you find yourself repeating frequently) or by another program such as the batch system.
:::tip
The job might not start immediately, and might take hours or days, so we prefer a *batch* approach:
- Submit the script to a batch system, to run on dedicated resources when they become available.
:::
### Job Output
- The batch system writes a job's stdout and stderr to a file named after the job ID, for example *slurm-12345.out*.
- You can change either stdout or stderr using sbatch options.
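For example, the destination of each stream can be set with a directive (the filenames here are illustrative, not prescribed):

```shell
#SBATCH --output=myjob_%j.out   # stdout goes here; %j expands to the job ID
#SBATCH --error=myjob_%j.err    # stderr goes here; omit to merge with stdout
```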
There are two aspects to a batch job script:
- A set of *SBATCH* directives describing the resources required and other information about the job.
- The script itself, comprising the commands to set up and perform the computations without additional user interaction.
### A Simple Job Example
A typical batch script on an NYU HPC cluster looks something like these two examples:
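As a rough sketch of the general shape (the module and script names below are placeholders, not taken from the original examples), such a script pairs `#SBATCH` directives with ordinary shell commands:

```shell
#!/bin/bash
#SBATCH --job-name=myjob        # job name; defaults to the script filename
#SBATCH --nodes=1               # one node
#SBATCH --ntasks-per-node=1     # one task on that node
#SBATCH --cpus-per-task=1       # one CPU core for the task
#SBATCH --time=1:00:00          # wall-time limit of one hour
#SBATCH --mem=2GB               # memory per node

module purge                    # start from a clean software environment
module load python/intel/3.8.6  # placeholder module name
python myscript.py              # placeholder computation
```

Submit it with `sbatch`, and check its status in the queue with `squeue -u $USER`.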
- Give the job a name. The default is the filename of the job script. Within the job, `$SLURM_JOB_NAME` expands to the job name.
- `--mail-type=type`
- Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.
### Job Environment Options
- `--export=VAR1,VAR2="some value",VAR3`
- Pass variables to the job, either with a specific value (the `VAR=value` form) or from the submitting environment (the form without "`=`").
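For instance (the variable and script names are illustrative):

```shell
# Export the whole submitting environment plus one extra variable:
sbatch --export=ALL,INPUT_FILE=data.csv job.sh
```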
- `--get-user-env[=timeout][mode]`
- Run something like `su - <username> -c /usr/bin/env` and parse the output. The default timeout is 8 seconds. The mode value can be `S` or `L`; with `L`, `su` is executed with the `-` option.
### Resource Request Options
- `-t, --time=time`
- Set a limit on the total run time. Acceptable formats include `minutes`, `minutes:seconds`, `hours:minutes:seconds`, `days-hours`, `days-hours:minutes`, and `days-hours:minutes:seconds`.
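A few sample requests in the different formats (each job uses only one `--time` directive; the values here are illustrative):

```shell
#SBATCH --time=90        # 90 minutes
#SBATCH --time=36:00:00  # 36 hours
#SBATCH --time=2-12      # 2 days and 12 hours
```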
- Require `ncpus` CPU cores per task. Without this option, one core is allocated per task.
- Requesting the resources you need, as accurately as possible, allows your job to start at the earliest opportunity and helps the system schedule work efficiently, to everyone's benefit.
### srun & Interactive Job Options
- `-n num`
- Specify the number of tasks to run, e.g. `-n4`. The default is one CPU core per task. Unlike `sbatch`, `srun` does not just submit the job: it waits for the job to start and connects `stdout`, `stderr`, and `stdin` to the current terminal.
- Enable X forwarding, so programs using a GUI can be used during the session (provided you have X forwarding to your workstation set up).
- To leave an interactive batch session, type `exit` at the command prompt.
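Putting these options together, a typical interactive request might look like this (the resource amounts are illustrative):

```shell
# One task with 4 cores and 8 GB for two hours, attached to a terminal:
srun -n1 -c4 --mem=8GB -t2:00:00 --pty /bin/bash
```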
### Delaying Jobs
- `--begin=time`
- Delay starting this job until after the specified date and time, e.g. `--begin=9:42:00` to start the job at 9:42:00 am.
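Both relative and absolute forms are accepted (the script name is illustrative):

```shell
sbatch --begin=now+1hour job.sh            # relative delay
sbatch --begin=2030-01-15T09:42:00 job.sh  # absolute date and time
```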
- Schedule a second job to start when the first one ends.
- `sbatch job2.sh`
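One way to chain the two submissions is to capture the first job's ID, a sketch with illustrative script names:

```shell
jid=$(sbatch --parsable job1.sh)           # --parsable prints just the job ID
sbatch --dependency=afterany:$jid job2.sh  # start job2 when job1 ends
```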
### Submitting Similar Jobs
- `-a, --array=indexes`
- Submit an array of jobs with array ids as specified. Array ids can be given as a numerical range, a comma-separated list of numbers, or some combination of the two. Each job instance will have the environment variables `SLURM_ARRAY_JOB_ID` and `SLURM_ARRAY_TASK_ID`. For example:
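A minimal array-job sketch (the `echo` command stands in for real work):

```shell
#!/bin/bash
#SBATCH --array=1-5               # five instances, with task ids 1..5
#SBATCH --output=slurm-%A_%a.out  # %A = job ID, %a = task ID

echo "Processing chunk ${SLURM_ARRAY_TASK_ID}"
```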
Also, as shown above, two additional filename patterns, `%A` and `%a`, denoting the job ID and the task ID (i.e. the job array index) respectively, are available for specifying a job's stdout and stderr file names.
## Additional Examples
You can find more examples in the slurm jobarray examples directory: