faganihajizada commented Oct 15, 2025

Summary

Add the SSH port (22) to the worker pod container specification to enable SSH access to Slurm worker nodes. Users can then SSH into worker pods where they have running jobs. This relies on the updated slurmd images, which include pam_slurm_adopt for job-based access control.

Changes:

  • Add SshPort constant (22) to worker container port definitions
  • Expose SSH port (22) on worker pods alongside the existing slurmd port (6818)
  • Add test coverage for SSH port validation in worker pod specifications
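One way to confirm the change on a running cluster is to list the container ports declared on a worker pod. The following is a minimal sketch, not part of this PR: the helper names `worker_ports` and `has_port` are hypothetical, and the pod name and namespace are placeholders you would fill in for your deployment.

```shell
# Print the container ports declared on a pod as a space-separated list.
# $1 = pod name, $2 = namespace (both placeholders supplied by the caller).
worker_ports() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.spec.containers[*].ports[*].containerPort}'
}

# Succeed if the space-separated port list in $1 contains the port in $2.
# Padding with spaces avoids matching 22 inside 2222.
has_port() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires cluster access):
#   ports=$(worker_ports <worker-pod-name> <namespace>)
#   has_port "$ports" 22 && has_port "$ports" 6818 && echo "ssh and slurmd ports declared"
```

Note that a declared container port only documents the port on the pod; whether sshd is actually listening depends on the slurmd image.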

Related PR in containers repo: SlinkyProject/containers#6

Breaking Changes

N/A

Testing Notes

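Built the operator from this branch and the slurmd image from SlinkyProject/containers#6, then ran the following:

# 1. Submit a job that holds a node for five minutes
srun -p <partition> -n1 --time=5:00 sleep 300 &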

# 2. Find the worker node running the job
squeue -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R"

# 3. SSH from login node to worker node (should succeed)
ssh <worker-pod-hostname>

# 4. Verify you're on the worker node
hostname
ps aux | grep sleep

# 5. Cancel the job
scancel <job-id>

# 6. Try to SSH again (should be denied: no running job on the node)
ssh <worker-pod-hostname>
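The manual steps above could be scripted roughly as follows. This is a sketch under assumptions: `can_ssh` and `expect_ssh` are hypothetical helper names, and `BatchMode=yes` assumes non-interactive (key-based or host-based) authentication is already set up, so a denial may also reflect an authentication problem rather than pam_slurm_adopt.

```shell
# Return 0 if a non-interactive SSH login to host $1 succeeds.
# BatchMode=yes disables password prompts so the check never hangs.
can_ssh() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Compare the observed result with the expectation.
# $1 = host, $2 = "allowed" or "denied".
expect_ssh() {
  host=$1
  want=$2
  if can_ssh "$host"; then got=allowed; else got=denied; fi
  if [ "$got" = "$want" ]; then
    echo "ok: ssh to $host $got"
  else
    echo "FAIL: ssh to $host $got, expected $want"
    return 1
  fi
}

# Usage (run from the login node; <job-id> is a placeholder):
#   node=$(squeue -h -j <job-id> -o %N)   # node running the job
#   expect_ssh "$node" allowed
#   scancel <job-id>
#   expect_ssh "$node" denied
```

After `scancel`, the job may take a moment to leave the node, so a short wait before the second check may be needed.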

Verified:

  • SSH daemon starts alongside slurmd
  • Unique SSH host keys generated per pod
  • Users can SSH to nodes where they have running jobs
  • Users cannot SSH to nodes without running jobs
  • PAM configuration applied correctly
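Two of the checks above (unique host keys and PAM configuration) lend themselves to small helpers. The sketch below is illustrative only: the function names are hypothetical, the PAM file location (`/etc/pam.d/sshd`) and the ed25519 key type are assumptions about the image, and the exact PAM line used in SlinkyProject/containers#6 may differ.

```shell
# Succeed if the PAM config on stdin has an active (uncommented)
# account line that loads pam_slurm_adopt.so.
pam_adopt_enabled() {
  grep -Eq '^[[:space:]]*-?account[[:space:]]+.*pam_slurm_adopt\.so'
}

# Succeed if every line on stdin is distinct (no duplicate host keys).
all_unique() {
  dupes=$(sort | uniq -d)
  [ -z "$dupes" ]
}

# Usage (worker pod hostnames are placeholders):
#   ssh <worker-pod-hostname> cat /etc/pam.d/sshd | pam_adopt_enabled \
#     && echo "pam_slurm_adopt is configured"
#   for h in <worker-1> <worker-2>; do
#     ssh-keyscan -t ed25519 "$h" 2>/dev/null | awk '{print $3}'
#   done | all_unique && echo "host keys are unique per pod"
```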

Add SSH port (22) to worker container specification to enable users to
SSH into worker nodes where they have running jobs. This works with
updated slurmd images that include pam_slurm_adopt for access control.