Purpose
This issue captures the full SLURM configuration from CU Boulder's Alpine cluster as a working reference for Bodhi's SLURM setup. Alpine is a large production cluster (460 nodes, SLURM 24.11.5) with HA controllers, and job submission from compute nodes works correctly there.
See #1 for the specific compute-node job submission investigation this supports.
Alpine scontrol show config (2026-02-23)
Configuration data as of 2026-02-23T06:37:42
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = associations,limits,qos,safe
AccountingStorageHost = slurmdb1
AccountingStorageExternalHost = (null)
AccountingStorageParameters = (null)
AccountingStoragePort = 6819
AccountingStorageTRES = cpu,mem,energy,node,billing,fs/disk,vmem,pages,gres/gpu,gres/gpu:a100,gres/gpu:a100_3g.20gb,gres/gpu:gh200,gres/gpu:l40,gres/gpu:mi100,gres/gpumem,gres/gpuutil
AccountingStorageType = accounting_storage/slurmdbd
AccountingStorageUser = N/A
AccountingStoreFlags = job_script
AcctGatherEnergyType = (null)
AcctGatherFilesystemType = (null)
AcctGatherInterconnectType = (null)
AcctGatherNodeFreq = 0 sec
AcctGatherProfileType = (null)
AllowSpecResourcesUsage = no
AuthAltTypes = auth/jwt
AuthAltParameters = jwt_key=/etc/jwt_hs256.key
AuthInfo = (null)
AuthType = auth/munge
BatchStartTimeout = 10 sec
BcastExclude = /lib,/usr/lib,/lib64,/usr/lib64
BcastParameters = (null)
BOOT_TIME = 2026-02-16T14:17:24
BurstBufferType = (null)
CertmgrParameters = (null)
CertmgrType = (null)
CliFilterPlugins = (null)
ClusterName = alpine
CommunicationParameters = (null)
CompleteWait = 0 sec
CpuFreqDef = Unknown
CpuFreqGovernors = OnDemand,Performance,UserSpace
CredType = cred/munge
DataParserParameters = (null)
DebugFlags = NO_CONF_HASH
DefMemPerNode = UNLIMITED
DependencyParameters = (null)
DisableRootJobs = no
EioTimeout = 60
EnforcePartLimits = NO
EpilogMsgTime = 2000 usec
FairShareDampeningFactor = 1
FederationParameters = (null)
FirstJobId = 1
GetEnvTimeout = 2 sec
GresTypes = gpu
GpuFreqDef = (null)
GroupUpdateForce = 1
GroupUpdateTime = 600 sec
HASH_VAL = Match
HashPlugin = hash/k12
HealthCheckInterval = 300 sec
HealthCheckNodeState = ANY
HealthCheckProgram = /usr/sbin/nhc
InactiveLimit = 0 sec
InteractiveStepOptions = --interactive --preserve-env --pty $SHELL
JobAcctGatherFrequency = 15
JobAcctGatherType = jobacct_gather/cgroup
JobAcctGatherParams = (null)
JobCompHost = localhost
JobCompLoc = (null)
JobCompParams = (null)
JobCompPort = 0
JobCompType = (null)
JobCompUser = root
JobContainerType = (null)
JobDefaults = (null)
JobFileAppend = 0
JobRequeue = 1
JobSubmitPlugins = lua
KillOnBadExit = 0
KillWait = 30 sec
LaunchParameters = (null)
Licenses = (null)
LogTimeFormat = iso8601_ms
MailDomain = (null)
MailProg = /usr/bin/smail
MaxArraySize = 4000001
MaxBatchRequeue = 5
MaxDBDMsgs = 101840
MaxJobCount = 50000
MaxJobId = 67043328
MaxMemPerNode = UNLIMITED
MaxNodeCount = 460
MaxStepCount = 40000
MaxTasksPerNode = 512
MCSPlugin = (null)
MCSParameters = (null)
MessageTimeout = 90 sec
MinJobAge = 300 sec
MpiDefault = (null)
MpiParams = (null)
NodeFeaturesPlugins = (null)
OverTimeLimit = 0 min
PluginDir = /usr/lib64/slurm
PlugStackConfig = (null)
PreemptMode = REQUEUE
PreemptParameters = (null)
PreemptType = preempt/qos
PreemptExemptTime = 00:00:00
PrEpParameters = (null)
PrEpPlugins = prep/script
PriorityParameters = (null)
PrioritySiteFactorParameters = (null)
PrioritySiteFactorPlugin = (null)
PriorityDecayHalfLife = 14-00:00:00
PriorityCalcPeriod = 00:05:00
PriorityFavorSmall = no
PriorityFlags =
PriorityMaxAge = 14-00:00:00
PriorityType = priority/multifactor
PriorityUsageResetPeriod = NONE
PriorityWeightAge = 20160
PriorityWeightAssoc = 0
PriorityWeightFairShare = 20160
PriorityWeightJobSize = 40320
PriorityWeightPartition = 0
PriorityWeightQOS = 30240
PriorityWeightTRES = (null)
PrivateData = none
ProctrackType = proctrack/cgroup
PrologEpilogTimeout = 65534
PrologFlags = Alloc,Contain,X11
PropagatePrioProcess = 0
PropagateResourceLimits = NONE
PropagateResourceLimitsExcept = (null)
RebootProgram = /usr/sbin/reboot
ReconfigFlags = (null)
RequeueExit = (null)
RequeueExitHold = (null)
ResumeFailProgram = (null)
ResumeProgram = /curc/slurm/alpine/scripts/resume_node
ResumeRate = 60 nodes/min
ResumeTimeout = 60 sec
ResvEpilog = (null)
ResvOverRun = 0 min
ResvProlog = (null)
ReturnToService = 2
SchedulerParameters = bf_max_job_test=12000,bf_max_job_user_part=200,bf_window=10080,bf_resolution=120,bf_continue,kill_invalid_depend,default_queue_depth=1000,max_switch_wait=604800,max_array_tasks=1000,bf_job_part_count_reserve=10
SchedulerTimeSlice = 30 sec
SchedulerType = sched/backfill
ScronParameters = enable
SelectType = select/cons_tres
SelectTypeParameters = CR_CORE_MEMORY
SlurmUser = slurm(515)
SlurmctldAddr = (null)
SlurmctldDebug = debug
SlurmctldHost[0] = alpine-slurmctl1
SlurmctldHost[1] = alpine-slurmctl2
SlurmctldLogFile = (null)
SlurmctldPort = 6817
SlurmctldSyslogDebug = debug
SlurmctldPrimaryOffProg = (null)
SlurmctldPrimaryOnProg = (null)
SlurmctldTimeout = 120 sec
SlurmctldParameters = idle_on_node_suspend
SlurmdDebug = error
SlurmdLogFile = (null)
SlurmdParameters = (null)
SlurmdPidFile = /var/run/slurmd.pid
SlurmdPort = 6818
SlurmdSpoolDir = /var/spool/slurmd
SlurmdSyslogDebug = error
SlurmdTimeout = 600 sec
SlurmdUser = root(0)
SlurmSchedLogFile = (null)
SlurmSchedLogLevel = 0
SlurmctldPidFile = /var/run/slurmctld.pid
SLURM_CONF = /etc/slurm/slurm.conf
SLURM_VERSION = 24.11.5
SrunEpilog = (null)
SrunPortRange = 0-0
SrunProlog = (null)
StateSaveLocation = /curc/slurm/alpine/state
SuspendExcNodes = (null)
SuspendExcParts = acompile,atesting,ahub,amilan,amilan128c,aa100,ami100,amem,amc,csu,rmacc,atesting_a100,atesting_mi100,gh200,al40,dtn
SuspendExcStates = (null)
SuspendProgram = /curc/slurm/alpine/scripts/suspend_node
SuspendRate = 60 nodes/min
SuspendTime = 3600 sec
SuspendTimeout = 60 sec
SwitchParameters = (null)
SwitchType = (null)
TaskEpilog = /etc/slurm/taskepilog
TaskPlugin = task/cgroup
TaskPluginParam = (null type)
TaskProlog = /etc/slurm/taskprolog
TCPTimeout = 2 sec
TLSParameters = (null)
TLSType = tls/none
TmpFS = /tmp
TopologyParam = TopoOptional
TopologyPlugin = topology/tree
TrackWCKey = no
TreeWidth = 16
UsePam = yes
UnkillableStepProgram = /curc/slurm/alpine/scripts/unkillable_step_program
UnkillableStepTimeout = 500 sec
VSizeFactor = 0 percent
WaitTime = 0 sec
X11Parameters = (null)
Cgroup Support Configuration:
AllowedRAMSpace = 100.0%
AllowedSwapSpace = 0.0%
CgroupMountpoint = /sys/fs/cgroup
CgroupPlugin = autodetect
ConstrainCores = yes
ConstrainDevices = yes
ConstrainRAMSpace = yes
ConstrainSwapSpace = yes
EnableControllers = no
IgnoreSystemd = no
IgnoreSystemdOnFailure = no
MaxRAMPercent = 100.0%
MaxSwapPercent = 100.0%
MemorySwappiness = (null)
MinRAMSpace = 30MB
SystemdTimeout = 1000 ms
MPI Plugins Configuration:
PMIxCliTmpDirBase = (null)
PMIxCollFence = (null)
PMIxDebug = 0
PMIxDirectConn = yes
PMIxDirectConnEarly = no
PMIxDirectConnUCX = no
PMIxDirectSameArch = no
PMIxEnv = (null)
PMIxFenceBarrier = no
PMIxNetDevicesUCX = (null)
PMIxTimeout = 300
PMIxTlsUCX = (null)
Slurmctld(primary) at alpine-slurmctl1 is UP
Slurmctld(backup) at alpine-slurmctl2 is UP
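When comparing this dump against Bodhi's config, it helps to diff the settings programmatically rather than by eye. A minimal sketch (the function name `parse_scontrol_config` is illustrative, not a SLURM API) that turns `scontrol show config` output into a dict, treating `(null)` as unset and skipping section headers and status lines:

```python
def parse_scontrol_config(text):
    """Parse `scontrol show config` output into a dict.

    Lines look like `Key = Value`; `(null)` means unset. Section
    headers ("Cgroup Support Configuration:") and controller status
    lines contain no `=` and are skipped.
    """
    config = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # header or status line, not a setting
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key:
            config[key] = None if value == "(null)" else value
    return config

# A few lines from the Alpine dump above:
sample = """\
ClusterName             = alpine
SlurmctldPort           = 6817
MpiDefault              = (null)
"""
cfg = parse_scontrol_config(sample)
```

Running the same parser over both clusters' dumps and diffing the two dicts surfaces every divergent parameter at once.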
Key Settings Relevant to Bodhi
| Parameter | Alpine Value | Why It Matters |
|---|---|---|
| `AuthType` | `auth/munge` | Primary auth; munge must be running on all nodes |
| `AuthAltTypes` | `auth/jwt` | Fallback auth; helps with transient munge issues |
| `SlurmctldPort` | 6817 | Must be reachable from compute nodes |
| `SlurmdPort` | 6818 | Daemon port on compute nodes |
| `MessageTimeout` | 90 sec | 9x the default (10s); critical under load |
| `SlurmdTimeout` | 600 sec | Generous timeout before marking nodes down |
| `SlurmctldTimeout` | 120 sec | HA failover window |
| `ReturnToService` | 2 | Nodes automatically return to service after being marked down |
| `SrunPortRange` | 0-0 | Not explicitly set; uses ephemeral ports |
| `SelectType` | `select/cons_tres` | Tracks CPU, memory, and GPU resources |
| `JobAcctGatherType` | `jobacct_gather/cgroup` | Cgroup-based resource tracking |
| `TaskPlugin` | `task/cgroup` | Cgroup-based task containment |
| `PrologFlags` | `Alloc,Contain,X11` | Node preparation and containment |
| `SchedulerType` | `sched/backfill` | Standard backfill scheduler |
| `SLURM_VERSION` | 24.11.5 | Current version in use |
| HA Controllers | alpine-slurmctl1, alpine-slurmctl2 | Dual controllers for high availability |
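Since `SlurmctldPort` reachability from compute nodes is central to the submission investigation in #1, a quick TCP check is a useful first triage step before digging into munge or JWT issues. A sketch, assuming nothing beyond the standard library (`port_open` is an illustrative helper, not part of SLURM):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds
    within `timeout` seconds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example, using Alpine's controller names; substitute Bodhi's:
# for host in ("alpine-slurmctl1", "alpine-slurmctl2"):
#     print(host, port_open(host, 6817))
```

If this reports False from a compute node, the problem is network or firewall policy, not SLURM auth.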
Notes
- Alpine uses `DebugFlags = NO_CONF_HASH`, which disables config hash checking. This is useful during rolling updates, but Bodhi should keep hash checking enabled for consistency
- `HealthCheckProgram = /usr/sbin/nhc` (Node Health Check) runs every 300 sec
- Cgroup configuration constrains cores, devices, RAM, and swap
- `MaxArraySize = 4000001` supports very large job arrays