Commit 55581f7

Merge pull request #173 from NYU-RTS/mdweisner-patch-1
Update 01_slurm_submitting_jobs.md
2 parents 3c0f4ab + db90a05 commit 55581f7

1 file changed, +12 -12 lines changed

docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md

Lines changed: 12 additions & 12 deletions
@@ -3,14 +3,14 @@
 
 ## Batch vs Interactive Jobs
 
-- HPC workloads are usually better suited to *batch processing* than *interactive* working.
+- HPC workloads are usually better suited to *batch processing* than *interactive* workflows.
 - A batch job is sent to the system when submitted with an **sbatch** command.
 - The working pattern we are all familiar with is *interactive* - where we type ( or click ) something interactively, and the computer performs the associated action. Then we type ( or click ) the next thing.
 - Comments at the start of the script, which match a special pattern ( `#SBATCH` ) are read as Slurm options.
 
-### The trouble with interactive environments
+### Challenges of Interactive Work
 
-There is a reason why GUIs are less common in HPC environments: **point-and-click** is **necessarily interactive**. In HPC environments (*as we'll see in session 3*) work is scheduled in order to allow exclusive use of the shared resources. On a busy system there may be several hours wait between when you submit a job and when the resources become available, so a reliance on user interaction is not viable. In Unix, commands need not be run interactively at the prompt, you can write a sequence of commands into a file to be run as a script, either manually (for sequences you find yourself repeating frequently) or by another program such as the batch system.
+There is a reason why GUIs are less common in HPC environments: **point-and-click** is **necessarily interactive**. In HPC environments (*as we'll see in section 3*) work is scheduled in order to allow exclusive use of the shared resources. On a busy system there may be several hours wait between when you submit a job and when the resources become available, so a reliance on user interaction is not viable. In Unix, commands need not be run interactively at the prompt, you can write a sequence of commands into a file to be run as a script, either manually (for sequences you find yourself repeating frequently) or by another program such as the batch system.
 
 :::tip
 The job might not start immediately, and might take hours or days, so we prefer a *batch* approach:
@@ -21,7 +21,7 @@ You can now run the script interactively, which is a great way to save effort if
 - Submit the script to a batch system, to run on dedicated resources when they become available.
 :::
 
-### Where does the output go ?
+### Job Output
 
 - The batch system writes stdout and stderr from a job to a file named for example *"slurm-12345.out"*
 - You can change either stdout or stderr using sbatch options.
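
As an illustration of the last bullet (a minimal sketch, not part of this commit): the sbatch options `--output` and `--error` redirect stdout and stderr to separate files, and `%j` in a file name is replaced by the job ID. The job name and file names below are hypothetical.

```shell
#!/bin/bash
#SBATCH --job-name=demo          # hypothetical job name
#SBATCH --output=demo-%j.out     # stdout goes here; %j expands to the job ID
#SBATCH --error=demo-%j.err      # stderr goes to a separate file

# Written to stdout, so it ends up in demo-<jobid>.out
message="running on $(hostname)"
echo "$message"

# Written to stderr, so it ends up in demo-<jobid>.err
echo "example warning" >&2
```

Without these two directives, both streams would go to the default `slurm-<jobid>.out` file described above.
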
@@ -34,7 +34,7 @@ There are two aspects to a batch job script:
 - A set of *SBATCH* directives describing the resources required and other information about the job.
 - The script itself, comprised of commands to set up and perform the computations without additional user interaction.
 
-### A simple example
+### A Simple Job Example
 
 A typical batch script on an NYU HPC cluster looks something like these two examples:
 
@@ -227,7 +227,7 @@ or as a command-line option to sbatch when you submit the job:
 [NetID@log-1 ~]$ sbatch --nodes=2 --ntasks-per-node=4 my_script.sh
 ```
 
-### Options to manage job output
+### Job Output Options
 
 - `-J jobname`
 - Give the job a name. The default is the filename of the job script. Within the job, `$SLURM_JOB_NAME` expands to the job name.
@@ -244,15 +244,15 @@ or as a command-line option to sbatch when you submit the job:
 - `--mail-type=type`
 - Valid type values are NONE, BEGIN, END, FAIL, REQUIRE, ALL.
 
-### Options to set the job environment:
+### Job Environment Options
 
 - `--export=VAR1,VAR2="some value",VAR3`
 - Pass variables to the job, either with a specific value (the `VAR=` form) or from the submitting environment ( without "`=`" )
 
 - `--get-user-env`\[=timeout]\[mode]
 - Run something like "su `-` \<username\> -c /usr/bin/env" and parse the output. Default timeout is 8 seconds. The mode value can be "S", or "L" in which case "su" is executed with "`-`" option.
 
-### Options to request compute resources
+### Resource Request Options
 
 - `-t, --time=time`
 - `Set a limit on the total run time. Acceptable formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds"`.
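
The formats listed for `--time` can be sketched as follows (an illustrative header, not part of this commit; the limits themselves are arbitrary). Only one `--time` directive should be active, so the alternatives are shown as ordinary comments, which sbatch ignores.

```shell
#!/bin/bash
#SBATCH --time=01:30:00    # hours:minutes:seconds -> 1 hour 30 minutes

# Alternative spellings of a time limit, shown as plain comments:
#   --time=90            minutes                    -> 1 hour 30 minutes
#   --time=1-12          days-hours                 -> 1 day 12 hours
#   --time=2-06:30       days-hours:minutes         -> 2 days 6 hours 30 minutes
#   --time=2-06:30:00    days-hours:minutes:seconds -> same limit as the line above

# The variable below only mirrors the directive for display purposes.
limit="01:30:00"
echo "requested wall time: $limit"
```
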
@@ -273,7 +273,7 @@ or as a command-line option to sbatch when you submit the job:
 - Require ncpus number of CPU cores per task. Without this option, allocate one core per task.
 - Requesting the resources you need, as accurately as possible, allows your job to be started at the earliest opportunity as well as helping the system to schedule work efficiently to everyone's benefit.
 
-### Options for running interactively on the compute nodes with srun
+### srun & Interactive Job Options
 
 - `-nnum`
 - `Specify the number of tasks to run, eg. -n4. Default is one CPU core per task.` Don't just submit the job, but also wait for it to start and connect `stdout`, `stderr`and `stdin` to the current terminal.
@@ -290,7 +290,7 @@ or as a command-line option to sbatch when you submit the job:
 - Enable X forwarding, so programs using a GUI can be used during the session (provided you have X forwarding to your workstation set up)
 - To leave an interactive batch session, type `exit` at the command prompt
 
-### Options for delaying starting a job
+### Delaying Jobs
 
 - `--begin=time`
 - Delay starting this job until after the specified date and time, eg. `--begin=9:42:00`, to start the job at 9:42:00 am
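
A short sketch of `--begin` used as a directive inside a job script (illustrative only, not part of this commit). The relative and absolute forms below are ones sbatch accepts; the specific times are arbitrary.

```shell
#!/bin/bash
#SBATCH --begin=now+1hour    # start no earlier than one hour after submission

# Other forms accepted by --begin, shown as plain comments:
#   --begin=16:00                  time of day; the next day is assumed if already past
#   --begin=2030-01-20T12:34:00    a specific date and time (arbitrary example)
#   --begin=tomorrow

# Record when the job body actually begins running.
started_at="$(date +%H:%M:%S)"
echo "job started at $started_at"
```

The job remains pending until the requested time has passed and resources are available.
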
@@ -308,7 +308,7 @@ or as a command-line option to sbatch when you submit the job:
 - Schedule second jobs to start when the first one ends
 - `sbatch job2.sh`
 
-### Options for running many similar jobs
+### Submitting Similar Jobs
 
 - `-a, --array=indexes`
 - Submit an array of jobs with array ids as specified. Array ids can be specified as a numerical range, a comma-separated list of numbers, or as some combination of the two. Each job instance will have an environment variable `SLURM_ARRAY_JOB_ID` and `SLURM_ARRAY_TASK_ID`. For example:
@@ -406,7 +406,7 @@ Job array submission introduces an environment variable, `SLURM_ARRAY_TASK_ID`,
 
 Also as shown above: two additional options `%A` and `%a`, denoting the job ID and the task ID ( i.e. job array index ) respectively, are available for specifying a job's stdout, and stderr file names.
 
-## More examples
+## Additional Examples
 
 You can find more examples in the slurm jobarray examples directory:
 