seqeralabs/seqera-nebius-kubernetes-tf

Nebius Kubernetes + Seqera Platform Setup

Deploy a managed Kubernetes cluster on Nebius AI Cloud with GPU support, shared filesystem storage, and Seqera Platform integration for bioinformatics workflow orchestration.



Overview

This repository contains Terraform configurations and Kubernetes manifests for deploying a production-ready Kubernetes cluster on Nebius AI Cloud. It provisions CPU and GPU node groups with InfiniBand fabric, shared ReadWriteMany storage via virtiofs + CSI, and Seqera Platform integration for running Nextflow bioinformatics pipelines at scale.

Features

  • Managed Kubernetes on Nebius: CPU and GPU (H100/H200/B200) node groups with InfiniBand fabric
  • Shared Filesystem: Nebius managed filesystem (network_ssd) exposed as ReadWriteMany PVC via virtiofs + CSI driver
  • GPU Support: Pre-installed CUDA drivers via drivers_preset, InfiniBand GPU cluster for high-performance networking
  • Seqera Platform Integration: Kubernetes credentials and compute environment managed via Terraform

Architecture

```mermaid
graph LR
  subgraph nebius["☁️ Nebius AI Cloud"]
    subgraph k8s["⎈ Managed Kubernetes Cluster"]
      cpu1["🖥️ CPU Node<br/>(cpu-e2)"] & cpu2["🖥️ CPU Node<br/>(cpu-e2)"] & gpu["⚡ GPU Node<br/>(H100 · InfiniBand)"]
      fs["💾 Shared FS · /mnt/data<br/>(network_ssd)"]
      pvc["📦 PVC · /scratch<br/>ReadWriteMany"]
      cpu1 & cpu2 & gpu -- "virtiofs" --> fs -- "CSI driver" --> pvc
    end
  end

  pvc -- "K8s API" --> seqera["🧬 Seqera Platform<br/>Credentials · Compute Env · Nextflow Runs"]

  style nebius fill:#e2f7f3,stroke:#087f68,color:#201637
  style k8s fill:#ffffff,stroke:#31C9AC,color:#201637
  style cpu1 fill:#e2f7f3,stroke:#087f68,color:#201637
  style cpu2 fill:#e2f7f3,stroke:#087f68,color:#201637
  style gpu fill:#087f68,stroke:#201637,color:#ffffff
  style fs fill:#e2f7f3,stroke:#31C9AC,color:#201637
  style pvc fill:#e2f7f3,stroke:#31C9AC,color:#201637
  style seqera fill:#201637,stroke:#31C9AC,color:#31C9AC
```

Getting Started

Prerequisites

  • Terraform installed
  • Nebius CLI installed and configured
  • kubectl installed
  • Helm installed
  • Tower CLI (tw) installed (for launching pipelines)
  • A Nebius AI Cloud account with a project and VPC subnet
  • A Seqera Platform account with an access token

Quick Start

The infrastructure deploys in two Terraform modules with a manual K8s configuration step in between.

Step 1: Nebius Infrastructure

```bash
export NEBIUS_TENANT_ID='tenant-...'
export NEBIUS_PROJECT_ID='project-...'
export NEBIUS_REGION='eu-north1'

cd nebius
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your cluster_name, filesystem_name, filestore_disk_size_gibibytes

source environment.sh
terraform init && terraform plan -out=tfplan && terraform apply tfplan
```

Step 2: Kubernetes Configuration

```bash
cd ..
./scripts/setup.sh
```

Step 3: Seqera Platform Integration

```bash
cd seqera
export TF_VAR_seqera_access_token="your-token"
terraform init && terraform apply
```

Usage

Launching Pipelines

Pipeline launches use the tw CLI (not Terraform):

```bash
# Simple test
tw launch https://github.com/nextflow-io/hello \
  --name "test-hello" \
  --workspace <your-workspace-id> \
  --compute-env "nebius-k8s-tf" \
  --work-dir "/scratch/work"

# nf-core pipeline
tw launch https://github.com/nf-core/demo \
  --name "test-nfcore-demo" \
  --workspace <your-workspace-id> \
  --compute-env "nebius-k8s-tf" \
  --work-dir "/scratch/work" \
  --profile "test,docker" \
  --params-file <(echo '{"outdir": "/scratch/results"}')

# Monitor
kubectl -n tower-nf get pods -w
```

Documentation

Deployment Details

Step 1: Nebius Infrastructure

This creates the cluster stack in a single apply: Kubernetes cluster, CPU and GPU node groups, shared filesystem, and InfiniBand GPU fabric.

environment.sh bootstraps all Nebius resources and credentials needed before terraform apply. It must be sourced (not executed) so that exported variables are available in your shell:

  1. IAM token — fetches a short-lived access token via nebius iam get-access-token
  2. VPC subnet — looks up the default subnet in your project
  3. S3 bucket — creates (or reuses) an Object Storage bucket for Terraform remote state, named with a deterministic hash of your tenant + project IDs
  4. Service account — creates (or reuses) a mk8s-seqera-sa service account in the project, ensures it is a member of the project-level editors group (creating the group if it doesn't exist)
  5. Access key — creates a temporary access key (expires in 24h) for the service account, used as AWS-compatible credentials for the S3 backend
  6. Backend config — writes terraform_backend_override.tf pointing at the S3 bucket
  7. TF_VAR exports — exports all required Terraform variables (iam_token, parent_id, region, subnet_id, tenant_id, ssh_user_name, ssh_public_key)
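
The deterministic bucket naming in step 3 can be sketched as follows. This is a hypothetical reconstruction: the `tfstate-` prefix, the hash input format, and the 12-character truncation are all assumptions; see environment.sh for the actual scheme.

```shell
# Hypothetical sketch: derive a stable state-bucket name from tenant + project
# IDs, so re-running environment.sh reuses the same bucket instead of creating
# a new one each time. (Prefix and truncation length are assumptions.)
NEBIUS_TENANT_ID='tenant-example'
NEBIUS_PROJECT_ID='project-example'
HASH=$(printf '%s:%s' "$NEBIUS_TENANT_ID" "$NEBIUS_PROJECT_ID" | sha256sum | cut -c1-12)
BUCKET="tfstate-${HASH}"
echo "$BUCKET"
```

Because the name is a pure function of the tenant and project IDs, the "creates (or reuses)" behavior falls out naturally: the same inputs always resolve to the same bucket.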

Copy terraform.tfvars.example to terraform.tfvars and fill in at minimum the three required variables (cluster_name, filesystem_name, filestore_disk_size_gibibytes). See Variables below for all options. Then deploy.

Step 2: Kubernetes Configuration

The setup.sh script:

  1. Fetches kubeconfig from the Nebius cluster
  2. Applies the consolidated K8s manifest (namespace tower-nf, service account, RBAC, token secret, PVC)
  3. Installs the csi-mounted-fs-path CSI driver via Helm
  4. Extracts the SA token and cluster connection details
  5. Writes cluster_server, cluster_ca_cert, and seqera_k8s_token to seqera/terraform.tfvars
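
Step 5 amounts to templating a tfvars file from the extracted values. A minimal sketch with placeholder values (the real setup.sh may gather and format these differently; the variable names come from this repo's seqera module):

```shell
# Minimal sketch of writing cluster connection details into a tfvars file.
# All three values below are placeholders standing in for what setup.sh
# extracts from the cluster.
CLUSTER_SERVER="https://203.0.113.10:6443"          # placeholder API endpoint
CLUSTER_CA_CERT="LS0tLS1CRUdJTiBDRVJUSUZJQ0FURQ"    # placeholder base64 CA
SEQERA_K8S_TOKEN="placeholder-token"                # placeholder SA token

cat > /tmp/terraform.tfvars <<EOF
cluster_server   = "${CLUSTER_SERVER}"
cluster_ca_cert  = "${CLUSTER_CA_CERT}"
seqera_k8s_token = "${SEQERA_K8S_TOKEN}"
EOF
```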

Step 3: Seqera Platform Integration

setup.sh has populated seqera/terraform.tfvars with the cluster connection details. Add your Seqera credentials:

```hcl
# seqera/terraform.tfvars — already written by setup.sh:
# cluster_server   = "https://..."
# cluster_ca_cert  = "-----BEGIN CERTIFICATE-----\n..."
# seqera_k8s_token = "<token>"

# Add these manually:
seqera_access_token = "YOUR_SEQERA_ACCESS_TOKEN"
seqera_workspace_id = 123456789
```

This creates:

  • A seqera_kubernetes_credential with the SA token
  • A seqera_compute_env (type k8s-platform) pointing at the Nebius cluster

Repository Structure

```
nebius/                          # Nebius infrastructure (Terraform root module)
  main.tf                        # K8s cluster, node groups, filesystem, GPU fabric
  variables.tf                   # All configuration variables
  provider.tf                    # Nebius, Kubernetes, Helm, kubectl providers
  environment.sh                 # Sets IAM token, service account, S3 backend
  k8s-cloud-init.tftpl           # Cloud-init template (virtiofs mount, SSH keys)
  pvc.yaml                       # Example CSI PVC + test pod
  tower-launcher.yml             # Example tower launcher manifest

seqera/                          # Seqera Platform integration (Terraform root module)
  terraform.tf                   # Seqera provider requirement
  providers.tf                   # Bearer token auth
  variables.tf                   # Cluster connection + Seqera platform variables
  seqera.tf                      # K8s credential + compute environment
  outputs.tf                     # credential_id, compute_env_id
  terraform.tfvars.example

kubernetes/
  seqera-k8s-config.yaml         # Consolidated manifest: NS, SA, RBAC, token, PVC

scripts/
  setup.sh                       # Post-cluster K8s setup (RBAC, CSI, PVC, token extraction)
```

Shared Filesystem: The Full Chain

Nextflow requires ReadWriteMany storage. The complete chain is:

  1. Terraform creates a nebius_compute_v1_filesystem (NETWORK_SSD)
  2. Node group attaches the filesystem via template.filesystems
  3. cloud-init mounts it on each node at /mnt/data via virtiofs
  4. CSI driver (csi-mounted-fs-path) exposes node-local mounts as K8s volumes
  5. PVC with storageClassName: csi-mounted-fs-path-sc and ReadWriteMany access

Without the cloud-init virtiofs mount, the CSI driver crashes with: mounted on ext4 fs, data loss may occur, aborting
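
Steps 4–5 of the chain correspond to a PVC along these lines. The name, namespace, and storage class come from this repo's defaults; the size is illustrative. See kubernetes/seqera-k8s-config.yaml for the real manifest.

```yaml
# Sketch of the ReadWriteMany scratch PVC (illustrative size).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tower-scratch              # default seqera_pvc_name
  namespace: tower-nf              # default seqera_namespace
spec:
  accessModes:
    - ReadWriteMany                # required by Nextflow
  storageClassName: csi-mounted-fs-path-sc
  resources:
    requests:
      storage: 100Gi               # illustrative size
```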

Terraform Variables

Nebius Variables

These are configured in nebius/terraform.tfvars. Sensitive values (tenant_id, parent_id, subnet_id, region) are set via environment variables by environment.sh.

Cluster

| Variable | Default | Description |
| --- | --- | --- |
| cluster_name | — | (required) Kubernetes cluster name |
| ssh_user_name | ubuntu | SSH username for node access |
| ssh_public_key | — | (required) SSH public key |

CPU Node Group

| Variable | Default | Description |
| --- | --- | --- |
| cpu_nodes_fixed_count | 3 | Number of CPU nodes |
| cpu_nodes_platform | — | (required) CPU platform (e.g. cpu-e2) |
| cpu_nodes_preset | — | (required) CPU preset (e.g. 4vcpu-16gb) |
| cpu_disk_type | NETWORK_SSD | Disk type for CPU nodes |
| cpu_disk_size | 96 | Disk size in GB for CPU nodes |

GPU Node Group (optional — created when gpu_nodes_platform is set)

| Variable | Default | Description |
| --- | --- | --- |
| gpu_nodes_autoscaling | {} | Autoscaling config (enabled, min_size, max_size) |
| gpu_nodes_platform | null | GPU platform (gpu-h100-sxm, gpu-h200-sxm, gpu-b200-sxm). Set to enable GPU nodes. |
| gpu_nodes_preset | null | GPU preset (e.g. 8gpu-128vcpu-1600gb) |
| infiniband_fabric | null | InfiniBand fabric name (e.g. fabric-3) |
| gpu_disk_type | NETWORK_SSD | Disk type for GPU nodes |
| gpu_disk_size | 96 | Disk size in GB for GPU nodes |

Storage

| Variable | Default | Description |
| --- | --- | --- |
| filesystem_name | — | (required) Name for the shared filesystem |
| filestore_disk_size_gibibytes | — | (required) Filesystem size in GiB |
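
For example, enabling the optional GPU node group is a matter of setting the GPU variables in nebius/terraform.tfvars (values illustrative, using the platform/preset/fabric names and autoscaling keys listed above):

```hcl
# Illustrative terraform.tfvars fragment enabling H100 GPU nodes.
gpu_nodes_platform = "gpu-h100-sxm"
gpu_nodes_preset   = "8gpu-128vcpu-1600gb"
infiniband_fabric  = "fabric-3"
gpu_nodes_autoscaling = {
  enabled  = true
  min_size = 0
  max_size = 2
}
```

Leaving gpu_nodes_platform at its null default skips GPU node group creation entirely.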

Seqera Variables

These are configured in seqera/terraform.tfvars.

| Variable | Default | Description |
| --- | --- | --- |
| cluster_server | "" | K8s API URL — written by setup.sh |
| cluster_ca_cert | "" | Cluster CA certificate — written by setup.sh |
| seqera_k8s_token | "" | K8s SA token — written by setup.sh |
| seqera_access_token | "" | Seqera Platform access token |
| seqera_workspace_id | 0 | Seqera Platform workspace ID |
| seqera_ce_name | nebius-k8s-tf | Compute environment name |
| seqera_cred_name | nebius-k8s-credentials | Credential name |
| seqera_namespace | tower-nf | K8s namespace for Nextflow workloads |
| seqera_sa_name | tower-launcher-sa | Service account name |
| seqera_pvc_name | tower-scratch | PVC name for shared scratch storage |
| seqera_storage_mount | /scratch | Mount path inside pods |
| seqera_work_dir | /scratch/work | Nextflow work directory |

Troubleshooting

CSI driver pods in CrashLoopBackOff

Symptom: mounted on ext4 fs, data loss may occur, aborting

The virtiofs filesystem is not mounted on the nodes. Verify:

```bash
kubectl debug node/<node-name> -it --image=busybox -- mount | grep virtiofs
# Expected: data on /host/mnt/data type virtiofs (rw,relatime)
```

If not mounted, the node group's cloud-init may not have run. Nodes may need to be replaced.

Pods stuck in Pending

Usually caused by the CSI driver not running. Fix the CSI driver first (see above), then pods will schedule.

Seqera CE creation fails

Check:

  • cluster_server, cluster_ca_cert, and seqera_k8s_token are all populated in seqera/terraform.tfvars
  • The SA token is valid (re-run setup.sh to refresh)
  • The cluster endpoint is reachable from Seqera Platform

terraform destroy fails for Seqera resources with "Missing Configuration for Required Attribute"

Symptom: Must set a configuration value for the compute_env.credentials_id attribute

The Seqera provider requires credentials_id to be resolvable even during destroy, but Terraform may destroy the credential and compute environment simultaneously. Use targeted destroy to remove them in the correct order:

```bash
cd seqera
terraform destroy -target=seqera_compute_env.nebius_k8s
terraform destroy -target=seqera_kubernetes_credential.nebius_k8s
```

nf-core pipeline fails with "Missing required parameter: outdir"

nf-core pipelines require --outdir. Pass it via --params-file:

```bash
--params-file <(echo '{"outdir": "/scratch/results"}')
```

Contributing

We welcome contributions from the community! Please see our Contributing Guidelines for details on:

  • Code of Conduct
  • Development process
  • How to submit pull requests
  • Coding standards

Support

This is not an official Seqera Labs product; support is community-driven.


License

This project is licensed under the MIT License.

About Seqera

Seqera is the company behind Nextflow and Seqera Platform, providing solutions for data analysis and workflow orchestration in life sciences and beyond.

