
Commit 577b2d1 (parent 8106217)

SLURM -> Slurm

1 file changed: +6 -6 lines changed


docs/running/hyperqueue.md (6 additions, 6 deletions)
@@ -4,7 +4,7 @@
 GREASY is not supported at CSCS anymore. We recommend using HyperQueue instead.

 [HyperQueue](https://it4innovations.github.io/hyperqueue/stable/) is a meta-scheduler designed for high-throughput computing on high-performance computing (HPC) clusters.
-It addresses the inefficiency of using traditional schedulers like SLURM for a large number of small, short-lived tasks by allowing you to bundle them into a single, larger SLURM job.
+It addresses the inefficiency of using traditional schedulers like Slurm for a large number of small, short-lived tasks by allowing you to bundle them into a single, larger Slurm job.
 This approach minimizes scheduling overhead and improves resource utilization.

 By using a meta-scheduler like HyperQueue, you get fine-grained control over your tasks within the allocated resources of a single batch job.
@@ -42,8 +42,8 @@ echo "$(date): end task ${HQ_TASK_ID}: $(hostname) CUDA_VISIBLE_DEVICES=${CUDA_V
 ```

 [](){#ref-hyperqueue-example-script-simple}
-### Simple SLURM batch job script
-Next, create a SLURM batch script that will launch the HyperQueue server and workers, submit your tasks, wait for the tasks to finish, and then shut everything down.
+### Simple Slurm batch job script
+Next, create a Slurm batch script that will launch the HyperQueue server and workers, submit your tasks, wait for the tasks to finish, and then shut everything down.

 ```bash title="job.sh"
 #!/usr/local/bin/bash
@@ -83,7 +83,7 @@ $ sbatch job.sh
 ```

 [](){#ref-hyperqueue-example-script-simple}
-### More robust SLURM batch job script
+### More robust Slurm batch job script
 A powerful feature of HyperQueue is the ability to resume a job that was interrupted, for example, by reaching a time limit or a node failure.
 You can achieve this by using a journal file to save the state of your tasks.
 By adding a journal file, HyperQueue can track which tasks were completed and which are still pending.
@@ -113,7 +113,7 @@ else
 export JOURNAL=~/.hq-journal-${SLURM_JOBID}
 fi

-# Ensure each SLURM job has its own HyperQueue server directory
+# Ensure each Slurm job has its own HyperQueue server directory
 export HQ_SERVER_DIR=~/.hq-server-${SLURM_JOBID}

 # Start the HyperQueue server with the journal file
@@ -155,7 +155,7 @@ To submit a new job, use `sbatch`:
 $ sbatch job.sh
 ```

-If the job fails for any reason, you can resubmit it and tell HyperQueue to pick up where it left off by passing the original SLURM job ID as an argument:
+If the job fails for any reason, you can resubmit it and tell HyperQueue to pick up where it left off by passing the original Slurm job ID as an argument:

 ```bash
 $ sbatch job.sh <job-id>
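For context on the hunks above: the resume behavior this doc describes hinges on picking the right journal path, reusing a previous Slurm job's journal when its ID is passed as the first argument, and starting a fresh one otherwise. A minimal sketch of that selection logic, in isolation (`pick_journal` is a hypothetical helper name for illustration; the actual `job.sh` inlines the if/else shown in the `@@ -113,7 +113,7 @@` hunk):

```shell
#!/usr/bin/env bash
# Sketch of the journal-selection step (assumption: mirrors job.sh's if/else).
# A stand-in value for illustration; Slurm sets SLURM_JOBID in a real batch job.
SLURM_JOBID=12345

pick_journal() {
    if [ -n "$1" ]; then
        # Resume: point HyperQueue at the old job's journal so completed
        # tasks are skipped and only pending ones are rerun.
        echo "$HOME/.hq-journal-$1"
    else
        # Fresh run: create a journal keyed to the current job ID.
        echo "$HOME/.hq-journal-${SLURM_JOBID}"
    fi
}

pick_journal          # fresh run: journal keyed to SLURM_JOBID
pick_journal 99999    # resume: journal of previous job 99999
```

The server is then started against this path so HyperQueue's state survives a requeue; flag names vary between HyperQueue versions, so consult `hq server start --help` on your cluster.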
