SCHED-1071: Local NVMe support #874

Merged
ali-sattari merged 4 commits into soperator-release-3.0 from SCHED-1071/local-nvme-support
Mar 16, 2026

@ali-sattari
Collaborator

Add support for (raw) local NVMe disks.

- check region/platform/preset for validation
- small fix
- one mount_path to rule them all
- if nvme then enable hc script
- fixes and xfs

based on info in #proj-local-disks
@ali-sattari ali-sattari marked this pull request as ready for review March 13, 2026 15:59
theyoprst previously approved these changes Mar 13, 2026
Collaborator

@theyoprst theyoprst left a comment

couple of things:

  1. cloud_init.yaml.tftpl - mdadm --create without --force may hang on recycled nodes with existing superblocks. consider running wipefs -a on each disk before creating the array, or pass --force

  2. cloud_init.yaml.tftpl:136 - NVMe detection via nvme list + awk is fragile across tool versions. lsblk -d -n -o NAME,TRAN | awk '$2=="nvme" {print "/dev/"$1}' would be more reliable

  3. cloud_init.yaml.tftpl:153 - hard fail on DISK_COUNT < 2. is 2+ disks guaranteed for all supported presets? if a single-disk config is possible, could just format directly without mdadm

  4. cloud_init.yaml.tftpl:210 - mount_path interpolated without quoting in runcmd: prepare-disks.sh ${local_nvme_mount_path}. minor since paths are validated, but quoting is cheap

  5. flux_release_nodesets.tf:29-37 / main.tf:148-155 - source_type, host_path, filesystem_type fields added to jail_submounts but only used in the nodesets template path. the source_type == "host_path" branch in the template is dead code today. is this pre-work for a future PR? if so, might be cleaner to defer

  6. no validation that local_nvme.mount_path doesn't collide with other jail submount paths or system mounts
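
Items 1-3 could be handled together along these lines. This is only a dry-run sketch, not the script from the PR: it prints the commands instead of running them, takes the wipefs route from item 1, and formats a lone disk directly as item 3 suggests. The function names, /dev/md0, and RAID level 0 are assumptions made for the example.

```shell
# Filter `lsblk -d -n -o NAME,TRAN` output (read from stdin) down to
# NVMe device paths, as suggested in item 2.
nvme_disks_from_lsblk() {
  awk '$2=="nvme" {print "/dev/"$1}'
}

# Print the preparation commands for the disks passed as arguments.
# Hypothetical helper: /dev/md0 and --level=0 are assumptions.
plan_disk_prep() {
  local count=$#
  local disk
  if [ "$count" -eq 0 ]; then
    echo "no NVMe disks found" >&2
    return 1
  fi
  for disk in "$@"; do
    # Clear stale filesystem/RAID signatures left on recycled nodes.
    echo "wipefs -a $disk"
  done
  if [ "$count" -eq 1 ]; then
    # Single disk: format it directly, no array needed.
    echo "mkfs.xfs -f $1"
  else
    echo "mdadm --create /dev/md0 --level=0 --raid-devices=$count $*"
    echo "mkfs.xfs -f /dev/md0"
  fi
}
```

On a node this would be driven by something like `plan_disk_prep $(lsblk -d -n -o NAME,TRAN | nvme_disks_from_lsblk)`, with the printed commands reviewed before anything is executed.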

otherwise looks solid - good validation chain, init container mount checks, and per-nodeset design is clean
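
For item 6, the check itself is small. A minimal sketch in shell, assuming the list of other submount and system paths is available to whatever runs the validation; `paths_collide` and `check_mount_path` are hypothetical names, and the PR's own validation may well belong elsewhere in the stack.

```shell
# Succeed when path $1 equals, contains, or is nested under path $2.
# Trailing slashes are ignored; comparison is on whole path components,
# so /mnt/local and /mnt/local2 do not collide.
paths_collide() {
  local a="${1%/}"
  local b="${2%/}"
  case "$a/" in "$b/"*) return 0 ;; esac
  case "$b/" in "$a/"*) return 0 ;; esac
  return 1
}

# Check a candidate mount path ($1) against every other path given.
# Returns 1 and reports the first collision on stderr.
check_mount_path() {
  local candidate="$1"
  local other
  shift
  for other in "$@"; do
    if paths_collide "$candidate" "$other"; then
      echo "mount path '$candidate' collides with '$other'" >&2
      return 1
    fi
  done
  return 0
}
```

For example, `check_mount_path /mnt/nvme /home /mnt/data` passes, while adding `/mnt` to the list fails because `/mnt/nvme` is nested under it. Note the arguments are quoted throughout, which is the same habit item 4 asks for in runcmd.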

@ali-sattari ali-sattari merged commit 7a0ca67 into soperator-release-3.0 Mar 16, 2026
1 check passed
@ali-sattari ali-sattari deleted the SCHED-1071/local-nvme-support branch March 16, 2026 16:50