- A normal job submitted with `sbatch my_job.sh` gets a single JobID, e.g. `2473820`, and a job name:
  - By default the job name is the script filename (`my_job.sh`).
  - You can override it with `--job-name`, for example: `sbatch --job-name=my_job_name my_job.sh`
- An array job like `sbatch --job-name=my_array --array=0-3 my_array_job.sh` creates:
  - One array JobID (parent), e.g. `2473824`, with `JobName = my_array`.
  - One task per index: `2473824_0`, `2473824_1`, `2473824_2`, `2473824_3` (all share the same `JobName`).
  - SLURM may also log helper steps such as `2473824_0.batch` and `2473824_0.extern`.
So a single submission can produce many JobIDs in history. All of the array
elements share the same array JobID (e.g. `2473824`), but the recorded `JobName`
values can differ between rows: task rows carry the submission's name (e.g.
`my_array`), while helper step rows are typically recorded under step names such
as `batch` and `extern`. The rows are therefore tied together primarily by the
array JobID, not always by an identical job name.
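As a rough illustration of how rows can be tied back to one submission, the sketch below splits a JobID string into its array JobID, task index, and step. The helper name `parse_job_id` is hypothetical, and the pattern assumes only the three forms shown above (`2473820`, `2473824_0`, `2473824_0.batch`); other forms Slurm may print, such as a bracketed range for pending array tasks, are not handled.

```python
import re
from typing import NamedTuple, Optional

# Assumed JobID shapes: "2473820", "2473824_0", "2473824_0.batch".
# Hypothetical helper for illustration; not part of Slurm or the dashboard.
_JOBID_RE = re.compile(r"^(?P<base>\d+)(?:_(?P<task>\d+))?(?:\.(?P<step>\w+))?$")


class ParsedJobID(NamedTuple):
    base: str            # array JobID, or the plain JobID of a normal job
    task: Optional[str]  # array index, None for non-array rows
    step: Optional[str]  # "batch", "extern", ... or None for the job row


def parse_job_id(job_id: str) -> ParsedJobID:
    match = _JOBID_RE.match(job_id.strip())
    if match is None:
        raise ValueError(f"unrecognised JobID format: {job_id!r}")
    return ParsedJobID(match["base"], match["task"], match["step"])


# Group history rows by their shared array JobID.
rows = ["2473824_0", "2473824_0.batch", "2473824_0.extern", "2473824_1"]
by_array = {}
for row in rows:
    by_array.setdefault(parse_job_id(row).base, []).append(row)
print(by_array)
```

Grouping on `base` rather than on the job name keeps helper steps attached to the task that produced them.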
- In the QUEUED JOBS table:
  - The `JOB NAME` column comes from the `Name` field in `squeue` (which is based on the job name / script name as described above).
- In the FINISHED JOBS and FAILURES sections:
  - The `JOB NAME` column comes from `JobName` in `sacct`, which is the job name recorded by accounting (usually matching the submission's job name, but it can include step variants such as `.batch` / `.extern`).
Where the dashboard groups things “by name” (for example in QUEUED JOBS and the failures summary), it uses this job name label, not the script filename directly, even though the default job name is often the script filename.
The diagram below summarises how a single array submission flows through:
- User – the `sbatch` command you run.
- Slurm accounting – the array JobID, its task JobIDs, and helper steps.
- Dashboard – how those jobs and steps show up in QUEUED JOBS (live queue, grouped by `JobName`), FINISHED JOBS (one row per successful JobID), and FAILURES (non-zero-exit rows grouped by `JobName`).
{{SLURM_JOB_ARRAY_DIAGRAM}}
- `squeue` – live view of what is still in the queue (PENDING / RUNNING / etc.).
- `sacct` – history of jobs that SLURM accounting has recorded as finished or failed.
- `scontrol show job` – detailed dump for one specific JobID (used on the Job inspector page).
If a job never appears in sacct (because of cluster settings or retention), the dashboard cannot show it under FINISHED JOBS or FAILURES once it leaves the queue.
- Based on `squeue`.
- Shows counts right now:
  - TOTAL jobs
  - RUNNING jobs
  - WAITING jobs
  - DEP problems (blocked by `DependencyNeverSatisfied`).
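The four counters can be sketched as a single pass over a queue snapshot. This is a minimal sketch, not the dashboard's actual code: the function name `summarise_queue` and the row layout are hypothetical, the sample rows are made up for illustration, and it assumes that WAITING corresponds to the `PENDING` state and that the dependency problem appears verbatim in the reason field.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical row layout: (job_id, name, state, reason), e.g. parsed from squeue.
QueueRow = Tuple[str, str, str, str]


def summarise_queue(rows: Iterable[QueueRow]) -> Dict[str, int]:
    counts = {"TOTAL": 0, "RUNNING": 0, "WAITING": 0, "DEP": 0}
    for _job_id, _name, state, reason in rows:
        counts["TOTAL"] += 1
        if state == "RUNNING":
            counts["RUNNING"] += 1
        elif state == "PENDING":  # assumption: WAITING means PENDING
            counts["WAITING"] += 1
        if reason == "DependencyNeverSatisfied":
            counts["DEP"] += 1
    return counts


# Illustrative rows only; not real cluster output.
sample = [
    ("2473824_0", "my_array", "RUNNING", "None"),
    ("2473824_1", "my_array", "PENDING", "Resources"),
    ("2473830", "my_job.sh", "PENDING", "DependencyNeverSatisfied"),
]
print(summarise_queue(sample))
```

Note that in this sketch a job blocked by a dependency is counted both as WAITING and as a DEP problem.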
- Based on `squeue`.
- Groups by job name and shows counts of RUN / WAIT / TOTAL in the current queue snapshot.
- As soon as a job leaves the queue (finishes or fails) it disappears from this table; long-term success/failure is tracked via the FINISHED JOBS and FAILURES sections (from `sacct`).
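The per-name grouping described above might look roughly like the following. The function `group_queue_by_name` and its `(name, state)` input are hypothetical, and the sketch again assumes that WAIT counts `PENDING` jobs; any other state is counted only in TOTAL.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def group_queue_by_name(rows: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count RUN / WAIT / TOTAL per job name for one queue snapshot."""
    table = defaultdict(lambda: {"RUN": 0, "WAIT": 0, "TOTAL": 0})
    for name, state in rows:
        entry = table[name]
        entry["TOTAL"] += 1
        if state == "RUNNING":
            entry["RUN"] += 1
        elif state == "PENDING":  # assumption: WAIT means PENDING
            entry["WAIT"] += 1
    return dict(table)


# Illustrative snapshot: three array tasks and one standalone job.
snapshot = [
    ("my_array", "RUNNING"),
    ("my_array", "PENDING"),
    ("my_array", "PENDING"),
    ("my_job.sh", "RUNNING"),
]
print(group_queue_by_name(snapshot))
```

Because the table is rebuilt from each snapshot, a name simply vanishes once none of its jobs remain in the queue.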
- Based on `sacct` within a time window:
  - Starts roughly when your longest-running current job started (derived from the elapsed time in `squeue`), or from the beginning of today (UTC) if nothing is running, subject to whatever accounting history your cluster retains.
- Shows one row per JobID where:
  - `State` contains `COMPLETED`, and
  - `ExitCode` starts with `0:`.
- Split into:
  - Related to running jobs – finished jobs whose array JobID matches an array that currently has at least one RUNNING job.
  - Other finished jobs – all other successful jobs in the window.
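The window start and the success filter can be sketched as below. This is an illustration under stated assumptions rather than the dashboard's implementation: the function names are hypothetical, elapsed times are assumed to use the `[D-][HH:]MM:SS` shape, and `now` is assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def parse_elapsed(text: str) -> timedelta:
    """Parse an elapsed time of the assumed form [D-][HH:]MM:SS."""
    days, _, clock = text.strip().rpartition("-")  # days is "" without a "D-" prefix
    parts = [int(p) for p in clock.split(":")]
    while len(parts) < 3:  # pad missing hours (and minutes) with zeros
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return timedelta(days=int(days or 0), hours=hours, minutes=minutes, seconds=seconds)


def window_start(running_elapsed: Iterable[str], now: datetime) -> datetime:
    """Start of the longest-running job, or midnight UTC if nothing is running."""
    spans = [parse_elapsed(e) for e in running_elapsed]
    if spans:
        return now - max(spans)
    return now.replace(hour=0, minute=0, second=0, microsecond=0)


def is_successful(state: str, exit_code: str) -> bool:
    """Success rule from the list above: COMPLETED state and a 0:* exit code."""
    return "COMPLETED" in state and exit_code.startswith("0:")


now = datetime(2024, 1, 2, 12, 0, 0, tzinfo=timezone.utc)  # fixed time for the example
print(window_start(["1:05:32", "2-03:04:05"], now))
print(window_start([], now))
```

Passing `now` in explicitly keeps the calculation deterministic and easy to check.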
Additional columns shown here:
- `REQUESTED MEMORY` – the memory request as recorded by Slurm accounting (`ReqMem`) on the main job / array-element row (e.g. `2469691_5`).
- `MAX USED MEMORY in GB` – the peak resident set size (`MaxRSS`), converted to GiB, usually reported on the corresponding batch step (e.g. `2469691_5.batch`).
- For job arrays, you can match requested vs used memory for a given element by using the shared array JobID / the prefix of the `JOB ID` (e.g. `2469691_5` and `2469691_5.batch`).
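One way to pair the two values is to key both rows on the part of the JobID before the first dot. The sketch below is illustrative only: `max_rss_to_gib` and `match_memory` are hypothetical names, the sample values are made up, and the unit handling assumes `MaxRSS` carries a `K`, `M`, `G`, or `T` suffix in 1024-based multiples, with a bare number treated as bytes.

```python
from typing import Dict, Iterable, Optional, Tuple

# Assumed 1024-based suffixes, expressed as multipliers to GiB.
_UNIT_TO_GIB = {"K": 1 / 1024**2, "M": 1 / 1024, "G": 1.0, "T": 1024.0}


def max_rss_to_gib(max_rss: str) -> Optional[float]:
    text = max_rss.strip()
    if not text:
        return None  # accounting recorded no value on this row
    unit = text[-1].upper()
    if unit in _UNIT_TO_GIB:
        return float(text[:-1]) * _UNIT_TO_GIB[unit]
    return float(text) / 1024**3  # assumption: a bare number is in bytes


def match_memory(
    rows: Iterable[Tuple[str, str, str]]
) -> Dict[str, Tuple[str, Optional[float]]]:
    """rows are (job_id, req_mem, max_rss); returns {element: (req_mem, used_gib)}."""
    requested: Dict[str, str] = {}
    used: Dict[str, Optional[float]] = {}
    for job_id, req_mem, max_rss in rows:
        element, _, step = job_id.partition(".")
        if not step:
            requested[element] = req_mem  # main job / array-element row
        elif step == "batch":
            used[element] = max_rss_to_gib(max_rss)  # peak usage from the batch step
    return {e: (req, used.get(e)) for e, req in requested.items()}


# Illustrative rows only.
sample = [
    ("2469691_5", "4G", ""),
    ("2469691_5.batch", "", "2097152K"),
    ("2469691_5.extern", "", "0"),
]
print(match_memory(sample))
```

Elements without a batch-step row keep their request and report `None` for usage.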
- Also based on `sacct` in the same time window.
- Includes rows where:
  - `State` matches `FAILED`, `CANCELLED`, `TIMEOUT`, `OUT_OF_MEMORY`, or
  - `ExitCode` is non-zero.
- Split into:
  - Related to running job names – failures for names that still have something RUNNING.
  - Other failures – everything else.
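The inclusion rule and the two-way split could be expressed roughly as follows. The names `is_failure` and `split_failures` are hypothetical, the sample rows are made up, and two interpretations are assumptions: a state "matches" when it contains one of the listed keywords (so `CANCELLED by <uid>` qualifies), and an `ExitCode` is non-zero when either side of the colon is not `0`.

```python
from typing import Iterable, List, Set, Tuple

FAILURE_STATES = ("FAILED", "CANCELLED", "TIMEOUT", "OUT_OF_MEMORY")

# Hypothetical row layout: (job_id, name, state, exit_code).
HistoryRow = Tuple[str, str, str, str]


def is_failure(state: str, exit_code: str) -> bool:
    if any(keyword in state for keyword in FAILURE_STATES):
        return True
    code, _, signal = exit_code.partition(":")
    # Assumption: non-zero means either the exit code or the signal is not 0.
    return code not in ("", "0") or signal not in ("", "0")


def split_failures(
    rows: Iterable[HistoryRow], running_names: Set[str]
) -> Tuple[List[HistoryRow], List[HistoryRow]]:
    """Return (related to running job names, other failures)."""
    related: List[HistoryRow] = []
    other: List[HistoryRow] = []
    for row in rows:
        _job_id, name, state, exit_code = row
        if not is_failure(state, exit_code):
            continue
        (related if name in running_names else other).append(row)
    return related, other


# Illustrative rows only.
rows = [
    ("2473824_0", "instant_test", "COMPLETED", "0:0"),
    ("2473824_1", "instant_test", "FAILED", "1:0"),
    ("2473900", "other_job", "CANCELLED by 1000", "0:0"),
]
related, other = split_failures(rows, running_names={"instant_test"})
print(related, other)
```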
The failures tables reuse many of the same columns as FINISHED JOBS and add a
per-name summary view (grouped by JOB NAME) with:
- `Count` – how many failures have been recorded for that name.
- `Last*` columns – details from the most recent failing job for that name, including `LastJobID`, `LastState`, `LastExitCode`, `LastElapsed`, `LastNode`, and `MaxRSS`.

Conditional styling highlights important failure states such as `OUT_OF_MEMORY` and `TIMEOUT` so they stand out visually.
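A per-name summary of this shape can be sketched as a count plus a "latest row wins" lookup. This is a hedged illustration: `summarise_failures` is a hypothetical name, the input is assumed to be already filtered to failing rows, the sample rows and node names are made up, and "most recent" is assumed to mean the largest `End` timestamp in ISO format (which sorts correctly as plain text).

```python
from typing import Dict, Iterable, List


def summarise_failures(rows: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    """Build one summary row per JobName from already-filtered failure rows."""
    counts: Dict[str, int] = {}
    latest: Dict[str, Dict[str, str]] = {}
    for row in rows:
        name = row["JobName"]
        counts[name] = counts.get(name, 0) + 1
        # Assumption: ISO timestamps, so string comparison orders them by time.
        if name not in latest or row["End"] > latest[name]["End"]:
            latest[name] = row
    return [
        {
            "JobName": name,
            "Count": counts[name],
            "LastJobID": last["JobID"],
            "LastState": last["State"],
            "LastExitCode": last["ExitCode"],
            "LastElapsed": last["Elapsed"],
            "LastNode": last["NodeList"],
            "MaxRSS": last["MaxRSS"],
        }
        for name, last in sorted(latest.items())
    ]


# Illustrative rows only; node names and values are made up.
failures = [
    {"JobID": "2473824_1", "JobName": "instant_test", "State": "FAILED",
     "ExitCode": "1:0", "Elapsed": "00:00:01", "NodeList": "node01",
     "MaxRSS": "", "End": "2024-01-02T10:00:00"},
    {"JobID": "2473831_1", "JobName": "instant_test", "State": "TIMEOUT",
     "ExitCode": "0:0", "Elapsed": "01:00:10", "NodeList": "node02",
     "MaxRSS": "", "End": "2024-01-02T11:30:00"},
]
summary = summarise_failures(failures)
print(summary)
```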
Array script:

    #!/bin/bash
    #SBATCH --job-name=instant_test
    #SBATCH --array=0-1
    # Task 0 succeeds and task 1 fails, so one submission yields both outcomes.
    if [ "$SLURM_ARRAY_TASK_ID" -eq 0 ]; then
        exit 0  # success
    else
        exit 1  # failure
    fi

Submitting this once may produce history rows like:
- `2473824_0` with `COMPLETED 0:0` → appears in FINISHED JOBS.
- `2473824_1` with `FAILED 1:0` → appears in FAILURES.
- Additional helper rows `2473824_0.batch`, `2473824_0.extern`, etc., depending on cluster settings.
While the array is still in the queue, instant_test also appears in QUEUED JOBS. After it leaves the queue it is only visible via FINISHED / FAILURES (if sacct logs it).