feat(nvmeof): implement configurable subsystem NQN#117
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughNQN generation was refactored to produce UUID-prefixed NQNs with a configurable prefix and propagate an overrideable Subnqn through NVMe-oF create/clone/adopt flows; the NVMe-oF subsystem API struct gained a Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/driver/controller_nvmeof.go (2)
1019-1022:⚠️ Potential issue | 🔴 CriticalMissing
Subnqnfield in clone path'sCreateNVMeOFSubsystemcall.The creation path (line 494) sets
Subnqn: params.subsystemNQN, but the clone path here omits it. This means TrueNAS will auto-generate the SubNQN for cloned volumes, defeating the purpose of this PR.🐛 Add Subnqn field to match the creation path
subsystem, err := s.apiClient.CreateNVMeOFSubsystem(ctx, tnsapi.NVMeOFSubsystemCreateParams{ Name: subsystemNQN, + Subnqn: subsystemNQN, AllowAnyHost: true, })
1225-1228:⚠️ Potential issue | 🔴 CriticalMissing
Subnqnfield in adopt path'sCreateNVMeOFSubsystemcall.Same issue as the clone path (line 1019): the
Subnqnfield is not populated, so TrueNAS will auto-generate the SubNQN for adopted volumes.🐛 Add Subnqn field to match the creation path
newSubsys, err := s.apiClient.CreateNVMeOFSubsystem(ctx, tnsapi.NVMeOFSubsystemCreateParams{ Name: subsystemNQN, + Subnqn: subsystemNQN, AllowAnyHost: true, })
🤖 Fix all issues with AI agents
In `@pkg/driver/controller_nvmeof.go`:
- Around line 1218-1222: The logic that sets subsystemNQN is inverted: currently
it uses generateNQN(defaultNQNPrefix, volumeName) then checks
params["subsystemNQN"] with if !ok and wrongly tries to use nqnPrefix when the
param is absent. Fix by flipping the condition so that when
params["subsystemNQN"] is present (ok == true) you call subsystemNQN =
generateNQN(nqnPrefix, volumeName), otherwise keep the default generated with
generateNQN(defaultNQNPrefix, volumeName); adjust the block around subsystemNQN,
nqnPrefix, params["subsystemNQN"], and generateNQN(volumeName) accordingly.
- Around line 999-1003: The clone path in controller_nvmeof.go incorrectly
inverts the presence check for the "subsystemNQN" param: currently it uses `if
!ok` which causes generateNQN to be called with an empty prefix; change the
branch to use `if ok` so that when the params map contains "subsystemNQN" you
call subsystemNQN = generateNQN(nqnPrefix, volumeName) (keeping the initial
defaultNQNPrefix assignment otherwise) to match the logic used in
validateNVMeOFParams and ensure the provided NQN prefix is applied for clones.
🧹 Nitpick comments (1)
pkg/driver/controller_nvmeof.go (1)
74-80: Variableuuidshadows the imported packageuuid.On line 78,
uuid := uuid.New().String()declares a local variable that shadows theuuidpackage import. While this works because the package is only used once, it's a Go anticonvention that can confuse readers and will break if the function is ever extended to calluuidagain.♻️ Rename local variable to avoid shadowing
func generateNQN(nqnPrefix, volumeName string) string { - uuid := uuid.New().String() - return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, uuid, volumeName) + id := uuid.New().String() + return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, id, volumeName) }
f4f2aea to
e8ded60
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/driver/controller_nvmeof_test.go (2)
70-74:⚠️ Potential issue | 🟠 MajorTest assertion broken by new UUID-based NQN format — test will fail.
generateNQNnow producesnqn.2137.csi.tns:uuid:<uuid>:test-nvmeof-volume, but the mock still expects the old format"nqn.2137.csi.tns:test-nvmeof-volume". The assertion at line 72 will always fire, marking this test case as failed.The mock should validate the new format (e.g., check prefix/suffix/UUID) similar to
TestGenerateNQN, instead of exact string equality.🐛 Proposed fix
m.CreateNVMeOFSubsystemFunc = func(ctx context.Context, params tnsapi.NVMeOFSubsystemCreateParams) (*tnsapi.NVMeOFSubsystem, error) { - expectedNQN := "nqn.2137.csi.tns:test-nvmeof-volume" - if params.Name != expectedNQN { - t.Errorf("Expected NQN %s, got %s", expectedNQN, params.Name) + expectedPrefix := defaultNQNPrefix + ":uuid:" + expectedSuffix := ":test-nvmeof-volume" + if !strings.HasPrefix(params.Name, expectedPrefix) { + t.Errorf("Expected NQN prefix %s, got %s", expectedPrefix, params.Name) + } + if !strings.HasSuffix(params.Name, expectedSuffix) { + t.Errorf("Expected NQN suffix %s, got %s", expectedSuffix, params.Name) } if !params.AllowAnyHost { t.Error("Expected AllowAnyHost to be true") } return &tnsapi.NVMeOFSubsystem{ ID: 100, - Name: expectedNQN, - NQN: expectedNQN, + Name: params.Name, + NQN: params.Name, }, nil
132-135:⚠️ Potential issue | 🟠 MajorResponse NQN assertion also uses old format — needs updating to match UUID-based NQN.
After fixing the mock to return the actual UUID-based NQN (see comment above), this assertion will also fail. It should validate the NQN format (prefix + UUID + suffix) rather than exact string equality.
🐛 Proposed fix
- expectedNQN := "nqn.2137.csi.tns:test-nvmeof-volume" - if resp.Volume.VolumeContext["nqn"] != expectedNQN { - t.Errorf("Expected NQN %s, got %s", expectedNQN, resp.Volume.VolumeContext["nqn"]) - } + nqn := resp.Volume.VolumeContext["nqn"] + if !strings.HasPrefix(nqn, defaultNQNPrefix+":uuid:") { + t.Errorf("Expected NQN with prefix %s:uuid:, got %s", defaultNQNPrefix, nqn) + } + if !strings.HasSuffix(nqn, ":test-nvmeof-volume") { + t.Errorf("Expected NQN with suffix :test-nvmeof-volume, got %s", nqn) + }
🤖 Fix all issues with AI agents
In `@pkg/driver/controller_nvmeof.go`:
- Around line 25-28: The comment for defaultNQNPrefix is outdated: update the
descriptive comment to match the UUID-based NQN produced by generateNQN. Replace
the line that says "Format: nqn.2137.csi.tns:<volume-name>" with the correct
format "Format: nqn.2137.csi.tns:uuid:<uuid>:<volume-name>" (referencing
defaultNQNPrefix and the generateNQN function) so the comment accurately
documents the current NQN structure.
🧹 Nitpick comments (1)
pkg/driver/controller_nvmeof.go (1)
77-79: Variableuuidshadows the importeduuidpackage.The local variable on line 78 shadows the package-level
uuidimport. If this function is ever extended to make anotheruuid.X()call after line 78, it will fail to compile.♻️ Rename to avoid shadowing
func generateNQN(nqnPrefix, volumeName string) string { - uuid := uuid.New().String() - return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, uuid, volumeName) + id := uuid.New().String() + return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, id, volumeName) }
|
Thanks Arturo for your contribution but I've got few questions. But that's a minor issue. The bigger issue I'm seeing is the fact that you're planning to change NQN only on controller side. What about node side? It will try to connect to non-existing subsystem when we apply your change. We would have to pass NQN via VolumeContext to node to make it work. I think your changes also break indempotency. generateNQN is calling uuid.New every time so in case of some connection glitches we will create duplicate subsystems on TrueNAS side. |
|
Hi, the NQN is being passed down to the node through the VolumeContext, I've also verified its working by deployin it to my own test k8s cluster so that's not an issue. I'll fix the Teststhose broke due to the truenas api changes, and as for generateNQN you are right that can cause issues if CreateVolume is called multiple times so i'll get that sorted out. Going to remove the UUID logic and assume the user provides a unique NQN prefix per storageClass, should probably add a disclaimer encouraging users to provide their own NQN prefix. |
e8ded60 to
92d3842
Compare
|
I guess I will have to change the way integration tests set up env to make it working... |
|
With the changes i made the nqn prefix including the UUID can be set by the user in the storageclass params and if no nqn is passed it should use nqn.2137.csi.tns as the prefix giving a predicatable nqn of |
|
How exactly do you test it? |
|
Sorry haven't had much time to look at the test, I'll have a look and make sure the tests pass / modify whatever is needed for it. |
|
some tests are still failing, ill have the rest passing by tomorrow |
|
Can you please tell me what exactly is the problem you're trying to solve here? Maybe I could help somehow. |
|
Overall, this change allows passing the desired NQN prefix through the StorageClass. Right now, it only supports a statically defined prefix. This matters because Kubernetes does not guarantee PVC names are unique across clusters, so this prevents potential cross-cluster collisions in that edge case. I had to rework part of the unstage logic to retrieve the NQN from the filesystem, since the CSI spec does not expose volumeContext during the NodeUnstage call. Additionally, several tests had the old NQN hardcoded, so I updated them to derive the NQN from the PVC instead. I’ll resolve the conflicts and rebase shortly. |
Implement configurable NVMe-oF subsystem NQN handling that was intended but not fully wired through the stack. - Add StorageClass support for `subsystemNQN` in chart templates. - Generate subsystem NQNs in controller create/clone/adopt paths using configured prefix with default fallback, and pass both `name`/`subnqn` to TrueNAS API. - Update node unstage to derive NQN from staged device metadata (`findmnt`/symlink + /sys/class/nvme/*/subsysnqn) instead of rebuilding from volume ID. - Add shared dataset path parsing helpers. - Align unit tests with configurable NQN behavior. - Update NVMe-oF e2e tests to read NQN from PV CSI attributes and use configured TrueNAS pool for detached snapshot independence checks. Signed-off-by: Arturo Guerra <ar2roguerra@gmail.com>
3a41520 to
df2bce5
Compare
|
I ran all the unit tests and e2e tests locally and everything passes should be good to merge. |
- Add sentinel errors (ErrNVMeEmptyNQN, ErrNVMeNotNVMeDevice, ErrNVMeNonNVMeStagingDevice) to satisfy err113 linter - Replace filepath.Join with string concatenation for sysfs path to fix gocritic/filepathJoin warning - Remove unused utils.go (extractParentDatasetFromDataset, extractPoolFromDataset were never called)
Description
Fix NVMe-oF subsystem NQN handling by
subnqnexplicitly in subsystem creation requests.Type of Change
Motivation
Prevent SubNQN naming collisions across clusters (where PVC names may overlap) and ensure the TrueNAS API receives an explicit
subnqnvalue during NVMe-oF subsystem creation.Changes Made
subnqninCreateNVMeOFSubsystemAPI payload.NVMeOFSubsystemCreateParamswithSubnqn string.Testing
Manual NVMe-oF testing completed for the updated subsystem/NQN behavior.
Test environment:
Tests run:
make test)make lint)Documentation
Checklist
Security Considerations
Related Issues
Fixes #
Related to #
Additional Notes
This change is scoped to NVMe-oF subsystem naming and creation payload handling.
Summary by CodeRabbit