Deploy a managed Kubernetes cluster on Nebius AI Cloud with GPU support, shared filesystem storage, and Seqera Platform integration for bioinformatics workflow orchestration.
This repository contains Terraform configurations and Kubernetes manifests for deploying a production-ready Kubernetes cluster on Nebius AI Cloud. It provisions CPU and GPU node groups with InfiniBand fabric, shared ReadWriteMany storage via virtiofs + CSI, and Seqera Platform integration for running Nextflow bioinformatics pipelines at scale.
- Managed Kubernetes on Nebius: CPU and GPU (H100/H200/B200) node groups with InfiniBand fabric
- Shared Filesystem: Nebius managed filesystem (`network_ssd`) exposed as a ReadWriteMany PVC via virtiofs + CSI driver
- GPU Support: pre-installed CUDA drivers via `drivers_preset`, InfiniBand GPU cluster for high-performance networking
- Seqera Platform Integration: Kubernetes credentials and compute environment managed via Terraform
```mermaid
graph LR
    subgraph nebius["☁️ Nebius AI Cloud"]
        subgraph k8s["⎈ Managed Kubernetes Cluster"]
            cpu1["🖥️ CPU Node<br/>(cpu-e2)"] & cpu2["🖥️ CPU Node<br/>(cpu-e2)"] & gpu["⚡ GPU Node<br/>(H100 · InfiniBand)"]
            fs["💾 Shared FS · /mnt/data<br/>(network_ssd)"]
            pvc["📦 PVC · /scratch<br/>ReadWriteMany"]
            cpu1 & cpu2 & gpu -- "virtiofs" --> fs -- "CSI driver" --> pvc
        end
    end
    pvc -- "K8s API" --> seqera["🧬 Seqera Platform<br/>Credentials · Compute Env · Nextflow Runs"]
    style nebius fill:#e2f7f3,stroke:#087f68,color:#201637
    style k8s fill:#ffffff,stroke:#31C9AC,color:#201637
    style cpu1 fill:#e2f7f3,stroke:#087f68,color:#201637
    style cpu2 fill:#e2f7f3,stroke:#087f68,color:#201637
    style gpu fill:#087f68,stroke:#201637,color:#ffffff
    style fs fill:#e2f7f3,stroke:#31C9AC,color:#201637
    style pvc fill:#e2f7f3,stroke:#31C9AC,color:#201637
    style seqera fill:#201637,stroke:#31C9AC,color:#31C9AC
```
- Terraform installed
- Nebius CLI installed and configured
- kubectl installed
- Helm installed
- Tower CLI (`tw`) installed (for launching pipelines)
- A Nebius AI Cloud account with a project and VPC subnet
- A Seqera Platform account with an access token
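Before starting, it may help to confirm the required CLIs are on your `PATH`. This is a convenience check only, not part of the deployment:

```shell
# Report any prerequisite CLI that is not installed
# (tool names taken from the prerequisites list above).
missing=0
for tool in terraform nebius kubectl helm tw; do
  command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; missing=$((missing+1)); }
done
echo "$missing of 5 tools missing"
```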
The infrastructure deploys in two Terraform modules with a manual K8s configuration step in between.
Step 1: Nebius Infrastructure
```shell
export NEBIUS_TENANT_ID='tenant-...'
export NEBIUS_PROJECT_ID='project-...'
export NEBIUS_REGION='eu-north1'

cd nebius
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your cluster_name, filesystem_name, filestore_disk_size_gibibytes
source environment.sh
terraform init && terraform plan -out=tfplan && terraform apply tfplan
```

Step 2: Kubernetes Configuration

```shell
cd ..
./scripts/setup.sh
```

Step 3: Seqera Platform Integration
```shell
cd seqera
export TF_VAR_seqera_access_token="your-token"
terraform init && terraform apply
```

Pipeline launches use the `tw` CLI (not Terraform):
```shell
# Simple test
tw launch https://github.com/nextflow-io/hello \
  --name "test-hello" \
  --workspace <your-workspace-id> \
  --compute-env "nebius-k8s-tf" \
  --work-dir "/scratch/work"

# nf-core pipeline
tw launch https://github.com/nf-core/demo \
  --name "test-nfcore-demo" \
  --workspace <your-workspace-id> \
  --compute-env "nebius-k8s-tf" \
  --work-dir "/scratch/work" \
  --profile "test,docker" \
  --params-file <(echo '{"outdir": "/scratch/results"}')

# Monitor
kubectl -n tower-nf get pods -w
```

This creates the cluster stack in a single apply: Kubernetes cluster, CPU and GPU node groups, shared filesystem, and InfiniBand GPU fabric.
`environment.sh` bootstraps all Nebius resources and credentials needed before `terraform apply`. It must be sourced (not executed) so that the exported variables are available in your shell:

- IAM token — fetches a short-lived access token via `nebius iam get-access-token`
- VPC subnet — looks up the default subnet in your project
- S3 bucket — creates (or reuses) an Object Storage bucket for Terraform remote state, named with a deterministic hash of your tenant + project IDs
- Service account — creates (or reuses) a `mk8s-seqera-sa` service account in the project and ensures it is a member of the project-level `editors` group (creating the group if it doesn't exist)
- Access key — creates a temporary access key (expires in 24h) for the service account, used as AWS-compatible credentials for the S3 backend
- Backend config — writes `terraform_backend_override.tf` pointing at the S3 bucket
- TF_VAR exports — exports all required Terraform variables (`iam_token`, `parent_id`, `region`, `subnet_id`, `tenant_id`, `ssh_user_name`, `ssh_public_key`)
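The deterministic bucket naming can be sketched roughly as follows. This is a hypothetical illustration of the idea; the actual hash scheme and prefix used by `environment.sh` may differ:

```shell
# Hypothetical sketch: derive a stable bucket name from tenant + project IDs,
# so re-running the script finds the same state bucket every time.
tenant_id='tenant-demo'     # stand-in for $NEBIUS_TENANT_ID
project_id='project-demo'   # stand-in for $NEBIUS_PROJECT_ID
hash=$(printf '%s-%s' "$tenant_id" "$project_id" | sha256sum | cut -c1-8)
bucket="tfstate-${hash}"
echo "$bucket"
```

Because the hash depends only on the two IDs, the same tenant/project pair always maps to the same bucket, making the bootstrap idempotent.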
Copy `terraform.tfvars.example` to `terraform.tfvars` and fill in at minimum the three required variables (`cluster_name`, `filesystem_name`, `filestore_disk_size_gibibytes`). See Variables below for all options. Then deploy.
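A minimal `terraform.tfvars` might look like this. All values here are placeholders; the CPU platform/preset examples come from the variables tables below:

```hcl
cluster_name                  = "seqera-k8s"       # placeholder
filesystem_name               = "seqera-shared-fs" # placeholder
filestore_disk_size_gibibytes = 1024               # placeholder size
cpu_nodes_platform            = "cpu-e2"           # example value from the variables table
cpu_nodes_preset              = "4vcpu-16gb"       # example value from the variables table
```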
The `setup.sh` script:

- Fetches kubeconfig from the Nebius cluster
- Applies the consolidated K8s manifest (namespace `tower-nf`, service account, RBAC, token secret, PVC)
- Installs the `csi-mounted-fs-path` CSI driver via Helm
- Extracts the SA token and cluster connection details
- Writes `cluster_server`, `cluster_ca_cert`, and `seqera_k8s_token` to `seqera/terraform.tfvars`
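The token-extraction step boils down to base64-decoding the SA token out of the bound token Secret. A sketch, with an inline sample standing in for the real `kubectl` output:

```shell
# Kubernetes stores the SA token base64-encoded in the Secret's .data.token
# field. The sample value below is a stand-in for:
#   kubectl -n tower-nf get secret <token-secret> -o jsonpath='{.data.token}'
token_b64='c2VjcmV0LXRva2Vu'
token=$(printf '%s' "$token_b64" | base64 -d)
echo "$token"   # → secret-token (the decoded bearer token)
```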
`setup.sh` has populated `seqera/terraform.tfvars` with the cluster connection details. Add your Seqera credentials:

```hcl
# seqera/terraform.tfvars — already written by setup.sh:
# cluster_server   = "https://..."
# cluster_ca_cert  = "-----BEGIN CERTIFICATE-----\n..."
# seqera_k8s_token = "<token>"

# Add these manually:
seqera_access_token = "YOUR_SEQERA_ACCESS_TOKEN"
seqera_workspace_id = 123456789
```

This creates:

- A `seqera_kubernetes_credential` with the SA token
- A `seqera_compute_env` (type `k8s-platform`) pointing at the Nebius cluster
```
nebius/                    # Nebius infrastructure (Terraform root module)
  main.tf                  # K8s cluster, node groups, filesystem, GPU fabric
  variables.tf             # All configuration variables
  provider.tf              # Nebius, Kubernetes, Helm, kubectl providers
  environment.sh           # Sets IAM token, service account, S3 backend
  k8s-cloud-init.tftpl     # Cloud-init template (virtiofs mount, SSH keys)
  pvc.yaml                 # Example CSI PVC + test pod
  tower-launcher.yml       # Example tower launcher manifest
seqera/                    # Seqera Platform integration (Terraform root module)
  terraform.tf             # Seqera provider requirement
  providers.tf             # Bearer token auth
  variables.tf             # Cluster connection + Seqera platform variables
  seqera.tf                # K8s credential + compute environment
  outputs.tf               # credential_id, compute_env_id
  terraform.tfvars.example
kubernetes/
  seqera-k8s-config.yaml   # Consolidated manifest: NS, SA, RBAC, token, PVC
scripts/
  setup.sh                 # Post-cluster K8s setup (RBAC, CSI, PVC, token extraction)
```
Nextflow requires ReadWriteMany storage. The complete chain is:

- Terraform creates a `nebius_compute_v1_filesystem` (`NETWORK_SSD`)
- The node group attaches the filesystem via `template.filesystems`
- cloud-init mounts it on each node at `/mnt/data` via virtiofs
- The CSI driver (`csi-mounted-fs-path`) exposes node-local mounts as K8s volumes
- A PVC with `storageClassName: csi-mounted-fs-path-sc` and `ReadWriteMany` access completes the chain

Without the cloud-init virtiofs mount, the CSI driver crashes with: `mounted on ext4 fs, data loss may occur, aborting`
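As a concrete illustration, a claim matching that chain might look like the following. The storage size is a placeholder; the name, namespace, and storage class match the defaults in the variables tables:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tower-scratch          # seqera_pvc_name default
  namespace: tower-nf          # seqera_namespace default
spec:
  accessModes:
    - ReadWriteMany            # required by Nextflow
  storageClassName: csi-mounted-fs-path-sc
  resources:
    requests:
      storage: 100Gi           # placeholder size
```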
These are configured in `nebius/terraform.tfvars`. Sensitive values (`tenant_id`, `parent_id`, `subnet_id`, `region`) are set via environment variables by `environment.sh`.
| Variable | Default | Description |
|---|---|---|
| `cluster_name` | — (required) | Kubernetes cluster name |
| `ssh_user_name` | `ubuntu` | SSH username for node access |
| `ssh_public_key` | — (required) | SSH public key |
| Variable | Default | Description |
|---|---|---|
| `cpu_nodes_fixed_count` | `3` | Number of CPU nodes |
| `cpu_nodes_platform` | — (required) | CPU platform (e.g. `cpu-e2`) |
| `cpu_nodes_preset` | — (required) | CPU preset (e.g. `4vcpu-16gb`) |
| `cpu_disk_type` | `NETWORK_SSD` | Disk type for CPU nodes |
| `cpu_disk_size` | `96` | Disk size in GB for CPU nodes |
| Variable | Default | Description |
|---|---|---|
| `gpu_nodes_autoscaling` | `{}` | Autoscaling config (`enabled`, `min_size`, `max_size`) |
| `gpu_nodes_platform` | `null` | GPU platform (`gpu-h100-sxm`, `gpu-h200-sxm`, `gpu-b200-sxm`). Set to enable GPU nodes. |
| `gpu_nodes_preset` | `null` | GPU preset (e.g. `8gpu-128vcpu-1600gb`) |
| `infiniband_fabric` | `null` | InfiniBand fabric name (e.g. `fabric-3`) |
| `gpu_disk_type` | `NETWORK_SSD` | Disk type for GPU nodes |
| `gpu_disk_size` | `96` | Disk size in GB for GPU nodes |
| Variable | Default | Description |
|---|---|---|
| `filesystem_name` | — (required) | Name for the shared filesystem |
| `filestore_disk_size_gibibytes` | — (required) | Filesystem size in GiB |
These are configured in `seqera/terraform.tfvars`.

| Variable | Default | Description |
|---|---|---|
| `cluster_server` | `""` | K8s API URL — written by `setup.sh` |
| `cluster_ca_cert` | `""` | Cluster CA certificate — written by `setup.sh` |
| `seqera_k8s_token` | `""` | K8s SA token — written by `setup.sh` |
| `seqera_access_token` | `""` | Seqera Platform access token |
| `seqera_workspace_id` | `0` | Seqera Platform workspace ID |
| `seqera_ce_name` | `nebius-k8s-tf` | Compute environment name |
| `seqera_cred_name` | `nebius-k8s-credentials` | Credential name |
| `seqera_namespace` | `tower-nf` | K8s namespace for Nextflow workloads |
| `seqera_sa_name` | `tower-launcher-sa` | Service account name |
| `seqera_pvc_name` | `tower-scratch` | PVC name for shared scratch storage |
| `seqera_storage_mount` | `/scratch` | Mount path inside pods |
| `seqera_work_dir` | `/scratch/work` | Nextflow work directory |
Symptom: `mounted on ext4 fs, data loss may occur, aborting`

The virtiofs filesystem is not mounted on the nodes. Verify:

```shell
kubectl debug node/<node-name> -it --image=busybox -- mount | grep virtiofs
# Expected: data on /host/mnt/data type virtiofs (rw,relatime)
```

If not mounted, the node group's cloud-init may not have run. Nodes may need to be replaced.
Usually caused by the CSI driver not running. Fix the CSI driver first (see above), then pods will schedule.
Check:

- `cluster_server`, `cluster_ca_cert`, and `seqera_k8s_token` are all populated in `seqera/terraform.tfvars`
- The SA token is valid (re-run `setup.sh` to refresh)
- The cluster endpoint is reachable from Seqera Platform
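The first check can be scripted by parsing the value back out of the tfvars file. Here the file contents are inlined so the parsing step is reproducible; against the real deployment you would read `seqera/terraform.tfvars` instead:

```shell
# Parse cluster_server out of a tfvars-style line (inline sample stands in
# for the real seqera/terraform.tfvars).
sample='cluster_server = "https://203.0.113.10:443"'
server=$(printf '%s\n' "$sample" | sed -n 's/^cluster_server *= *"\(.*\)"/\1/p')
echo "$server"
# Against the real file, then probe the endpoint:
#   server=$(sed -n 's/^cluster_server *= *"\(.*\)"/\1/p' seqera/terraform.tfvars)
#   curl -sk --max-time 5 "$server"
```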
Symptom: `Must set a configuration value for the compute_env.credentials_id attribute`

The Seqera provider requires `credentials_id` to be resolvable even during destroy, but Terraform may destroy the credential and compute environment simultaneously. Use targeted destroy to remove them in the correct order:
```shell
cd seqera
terraform destroy -target=seqera_compute_env.nebius_k8s
terraform destroy -target=seqera_kubernetes_credential.nebius_k8s
```

nf-core pipelines require `--outdir`. Pass it via `--params-file`:

```shell
--params-file <(echo '{"outdir": "/scratch/results"}')
```

We welcome contributions from the community! Please see our Contributing Guidelines for details on:
- Code of Conduct
- Development process
- How to submit pull requests
- Coding standards
This is not an official product of Seqera Labs; support is community-driven.
- Community Forum: Seqera Community
- Slack: Join our Slack workspace
- Email: support@seqera.io
- Issues: GitHub Issues
- Nebius: Filesystem over CSI
- Nebius: mk8s_v1_node_group Resource
- Nebius: GPU Configurations
- Nebius: InfiniBand Fabrics
- Seqera: Kubernetes Compute Environments
- Seqera Terraform Provider
- Tower CLI Documentation
This project is licensed under the MIT License.
Seqera is the company behind Nextflow and Seqera Platform, providing solutions for data analysis and workflow orchestration in life sciences and beyond.