Soperator can optionally enable Tailscale on Slurm login pods so you can SSH to login nodes securely over your Tailnet.
High-level steps:
- Apply RBAC so Tailscale can store state in Kubernetes Secrets.
- Add a Tailscale container to login pods via the `SlurmCluster` resource.
- A Tailnet admin authenticates each login pod device using the short-lived URL printed in the Tailscale container logs.
The number of login pods is configurable. Authenticate every login pod you want reachable via Tailnet.
Note: This is install-tool agnostic. If you deploy Soperator via Helm/Flux, you still apply the RBAC and update the `SlurmCluster` resource the same way.
Note: Some deployment templates may restrict RBAC to specific Secret names (for example `login-0`/`login-1`). If you run more than two login pods, ensure the RBAC allows `get`/`update`/`patch` on Secrets for all login pods (or remove the `resourceNames` restrictions).
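As an illustration, a restricted Role for four login pods might look like the sketch below (the Secret names assume the default pod naming `login-0` … `login-3`; `create` stays in a separate unrestricted rule because `resourceNames` cannot match objects that do not exist yet):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: soperator
  name: tailscale
rules:
  # "create" cannot be limited by resourceNames: the name is unknown at create time.
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["login-0", "login-1", "login-2", "login-3"]
    verbs: ["get", "update", "patch"]
```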
Create a Role/RoleBinding in the `soperator` namespace:
```bash
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: soperator
  name: tailscale
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "create", "patch"]
EOF
```

Bind the Role to the ServiceAccount used by login pods. Many deployments use `default` (adjust if yours differs):
```bash
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tailscale
  namespace: soperator
subjects:
  - kind: ServiceAccount
    name: default
    namespace: soperator
roleRef:
  kind: Role
  name: tailscale
  apiGroup: rbac.authorization.k8s.io
EOF
```

Patch the `SlurmCluster` to add a Tailscale container under `spec.slurmNodes.login.customInitContainers[]`:
```bash
kubectl -n soperator patch SlurmCluster soperator \
  --type='json' \
  -p='[{
    "op":"add",
    "path":"/spec/slurmNodes/login/customInitContainers/-",
    "value":{
      "name":"tailscale",
      "image":"ghcr.io/tailscale/tailscale:latest",
      "imagePullPolicy":"Always",
      "restartPolicy":"Always",
      "securityContext":{"privileged":true},
      "env":[
        {"name":"POD_NAME","valueFrom":{"fieldRef":{"fieldPath":"metadata.name"}}},
        {"name":"POD_UID","valueFrom":{"fieldRef":{"fieldPath":"metadata.uid"}}},
        {"name":"TS_DEBUG_FIREWALL_MODE","value":"auto"},
        {"name":"TS_KUBE_SECRET","valueFrom":{"fieldRef":{"fieldPath":"metadata.name"}}},
        {"name":"TS_USERSPACE","value":"false"}
      ]
    }
  }]'
```

Wait for login pods to restart:
```bash
kubectl -n soperator get pods -l app.kubernetes.io/component=login -w
```

Note: The Kubernetes Secret referenced by `TS_KUBE_SECRET` is created/updated automatically by the Tailscale container to persist its state. In this setup `TS_KUBE_SECRET` is set to the pod name (`metadata.name`), so each login pod typically uses a Secret with the same name as the pod. Deleting that Secret forces the pod to re-authenticate (a new auth URL will appear in the logs).
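The `kubectl patch --type='json'` command above uses JSON Patch (RFC 6902), where a path ending in `/-` appends to a list. A minimal Python sketch of that append semantics (the `json_patch_append` helper and the sample document are illustrative, not part of Soperator or kubectl):

```python
import copy

def json_patch_append(doc, path, value):
    """Apply a minimal JSON Patch 'add' op whose path ends in '/-',
    which appends `value` to the list at the parent path (RFC 6902)."""
    doc = copy.deepcopy(doc)  # patches operate on a copy, not in place
    keys = path.strip("/").split("/")
    assert keys[-1] == "-", "this sketch only handles list appends"
    target = doc
    for key in keys[:-1]:
        target = target[key]  # walk down to the target list
    target.append(value)
    return doc

cluster = {"spec": {"slurmNodes": {"login": {"customInitContainers": []}}}}
patched = json_patch_append(
    cluster,
    "/spec/slurmNodes/login/customInitContainers/-",
    {"name": "tailscale"},
)
print(patched["spec"]["slurmNodes"]["login"]["customInitContainers"][0]["name"])
# → tailscale
```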
List login pods:

```bash
kubectl -n soperator get pods -l app.kubernetes.io/component=login
```

For each login pod, fetch the auth URL from its logs:

```bash
kubectl -n soperator logs <login-pod-name> -c tailscale
```

Look for:
```
To authenticate, visit:

	https://login.tailscale.com/a/...
```
A Tailnet admin must open the URL and approve/authenticate the device.
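If you want to script this collection step, the auth URL can be pulled from the log text with a regular expression. A sketch (the `extract_auth_url` helper and the sample log text are hypothetical; the real log output may differ slightly):

```python
import re

# Hypothetical sample of Tailscale container logs, mimicking the format above.
SAMPLE_LOGS = """\
boot: starting tailscaled
To authenticate, visit:

\thttps://login.tailscale.com/a/0123456789abcd
"""

def extract_auth_url(log_text):
    """Return the first Tailscale auth URL found in the logs, or None."""
    m = re.search(r"https://login\.tailscale\.com/a/[^\s]+", log_text)
    return m.group(0) if m else None

print(extract_auth_url(SAMPLE_LOGS))
# → https://login.tailscale.com/a/0123456789abcd
```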
Tip: print auth URLs for all login pods:

```bash
for p in $(kubectl -n soperator get pods -l app.kubernetes.io/component=login -o jsonpath='{.items[*].metadata.name}'); do
  echo "=== $p ==="
  kubectl -n soperator logs "$p" -c tailscale | tail -n 50
  echo
done
```

Once approved, SSH to the Tailnet IP of a login pod:

```bash
ssh <user>@100.x.y.z
```

To undo the setup:

- Remove the `tailscale` entry from `SlurmCluster.spec.slurmNodes.login.customInitContainers` (edit the `SlurmCluster` and remove the container, then apply).
- Delete the RBAC:
```bash
kubectl -n soperator delete rolebinding tailscale
kubectl -n soperator delete role tailscale
```

- This enables Tailnet connectivity to login pods only (not cluster-wide subnet routing).
- Auth URLs are short-lived; coordinate with a Tailnet admin so each device is approved while its URL is still fresh.