Skip to content

feat(nvmeof): implement configurable subsystem NQN#117

Merged
fenio merged 2 commits intofenio:mainfrom
ArturoGuerra:fix_subsysnqn
Feb 28, 2026
Merged

feat(nvmeof): implement configurable subsystem NQN#117
fenio merged 2 commits intofenio:mainfrom
ArturoGuerra:fix_subsysnqn

Conversation

@ArturoGuerra
Copy link
Contributor

@ArturoGuerra ArturoGuerra commented Feb 11, 2026

Description

Fix NVMe-oF subsystem NQN handling by subnqn explicitly in subsystem creation requests.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Test addition or update

Motivation

Prevent SubNQN naming collisions across clusters (where PVC names may overlap) and ensure the TrueNAS API receives an explicit subnqn value during NVMe-oF subsystem creation.

Changes Made

  • Added support for configurable subsystem NQN prefix via request/storage parameters.
  • Included subnqn in CreateNVMeOFSubsystem API payload.
  • Extended NVMeOFSubsystemCreateParams with Subnqn string.

Testing

Manual NVMe-oF testing completed for the updated subsystem/NQN behavior.

Test environment:

  • Kubernetes version: Not specified
  • TrueNAS version: Not specified
  • Protocol tested: [ ] NFS [x] NVMe-oF [ ] Both

Tests run:

  • Unit tests (make test)
  • Linting (make lint)
  • Sanity tests
  • Integration tests (manual or CI)

Documentation

  • Code comments updated
  • README updated (if needed)
  • Docs updated (if needed)
  • No documentation needed

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings or errors
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Security Considerations

  • This PR does not introduce security vulnerabilities
  • No secrets or credentials are included in this PR
  • I have reviewed SECURITY.md and followed best practices

Related Issues

Fixes #
Related to #

Additional Notes

This change is scoped to NVMe-oF subsystem naming and creation payload handling.

Summary by CodeRabbit

  • New Features
    • NQN generation now includes a UUID segment and supports a configurable prefix.
    • StorageClass parameter can override generated NQNs for created, cloned, and adopted volumes.
  • Tests
    • NQN generation tests updated to validate UUID format and dynamic components rather than exact strings.

@coderabbitai
Copy link

coderabbitai bot commented Feb 11, 2026

Warning

Rate limit exceeded

@ArturoGuerra has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 15 minutes and 55 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

NQN generation was refactored to produce UUID-prefixed NQNs with a configurable prefix and propagate an overrideable Subnqn through NVMe-oF create/clone/adopt flows; the NVMe-oF subsystem API struct gained a Subnqn field to send the full NQN to TrueNAS.

Changes

Cohort / File(s) Summary
NQN Generation & Flows
pkg/driver/controller_nvmeof.go
Changed generateNQN signature to generateNQN(nqnPrefix, volumeName string) and renamed prefix constant to defaultNQNPrefix; generateNQN now embeds a UUID. Propagated generated/overrideable subsystemNQN through create, clone, and adopt paths and logged/used subsystem.NQN from TrueNAS.
API Parameter Extension
pkg/tnsapi/client.go
Added Subnqn string (JSON subnqn) to NVMeOFSubsystemCreateParams to allow passing the full NQN in subsystem creation requests.
Tests
pkg/driver/controller_nvmeof_test.go
Updated tests to call generateNQN(defaultNQNPrefix, volumeName) and validate NQN format by checking prefix, UUID parseability, and volume-name suffix; added imports for strings and github.com/google/uuid.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I stitched an NQN with care and cheer,
A UUID tucked so very near,
Prefix set or overridden true,
Subnqn carried through and through,
Hop—now volumes find their name anew! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: UUID-based SubNQN generation and inclusion of subnqn in subsystem create requests.
Description check ✅ Passed The PR description follows the template structure with all major sections completed: motivation, changes made, testing details, and security considerations clearly documented.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ArturoGuerra ArturoGuerra changed the title NVMe-oF: Deterministic Subsystem NQN with UUID and explicit subnqn API usage fix(nvmeof): generate UUID-based SubNQN and send subnqn in subsystem create request Feb 11, 2026
@ArturoGuerra ArturoGuerra marked this pull request as ready for review February 11, 2026 09:34
@ArturoGuerra ArturoGuerra requested a review from fenio as a code owner February 11, 2026 09:34
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/driver/controller_nvmeof.go (2)

1019-1022: ⚠️ Potential issue | 🔴 Critical

Missing Subnqn field in clone path's CreateNVMeOFSubsystem call.

The creation path (line 494) sets Subnqn: params.subsystemNQN, but the clone path here omits it. This means TrueNAS will auto-generate the SubNQN for cloned volumes, defeating the purpose of this PR.

🐛 Add Subnqn field to match the creation path
 	subsystem, err := s.apiClient.CreateNVMeOFSubsystem(ctx, tnsapi.NVMeOFSubsystemCreateParams{
 		Name:         subsystemNQN,
+		Subnqn:       subsystemNQN,
 		AllowAnyHost: true,
 	})

1225-1228: ⚠️ Potential issue | 🔴 Critical

Missing Subnqn field in adopt path's CreateNVMeOFSubsystem call.

Same issue as the clone path (line 1019): the Subnqn field is not populated, so TrueNAS will auto-generate the SubNQN for adopted volumes.

🐛 Add Subnqn field to match the creation path
 		newSubsys, err := s.apiClient.CreateNVMeOFSubsystem(ctx, tnsapi.NVMeOFSubsystemCreateParams{
 			Name:         subsystemNQN,
+			Subnqn:       subsystemNQN,
 			AllowAnyHost: true,
 		})
🤖 Fix all issues with AI agents
In `@pkg/driver/controller_nvmeof.go`:
- Around line 1218-1222: The logic that sets subsystemNQN is inverted: currently
it uses generateNQN(defaultNQNPrefix, volumeName) then checks
params["subsystemNQN"] with if !ok and wrongly tries to use nqnPrefix when the
param is absent. Fix by flipping the condition so that when
params["subsystemNQN"] is present (ok == true) you call subsystemNQN =
generateNQN(nqnPrefix, volumeName), otherwise keep the default generated with
generateNQN(defaultNQNPrefix, volumeName); adjust the block around subsystemNQN,
nqnPrefix, params["subsystemNQN"], and generateNQN(volumeName) accordingly.
- Around line 999-1003: The clone path in controller_nvmeof.go incorrectly
inverts the presence check for the "subsystemNQN" param: currently it uses `if
!ok` which causes generateNQN to be called with an empty prefix; change the
branch to use `if ok` so that when the params map contains "subsystemNQN" you
call subsystemNQN = generateNQN(nqnPrefix, volumeName) (keeping the initial
defaultNQNPrefix assignment otherwise) to match the logic used in
validateNVMeOFParams and ensure the provided NQN prefix is applied for clones.
🧹 Nitpick comments (1)
pkg/driver/controller_nvmeof.go (1)

74-80: Variable uuid shadows the imported package uuid.

On line 78, uuid := uuid.New().String() declares a local variable that shadows the uuid package import. While this works because the package is only used once, it's a Go anticonvention that can confuse readers and will break if the function is ever extended to call uuid again.

♻️ Rename local variable to avoid shadowing
 func generateNQN(nqnPrefix, volumeName string) string {
-	uuid := uuid.New().String()
-	return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, uuid, volumeName)
+	id := uuid.New().String()
+	return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, id, volumeName)
 }

@ArturoGuerra ArturoGuerra force-pushed the fix_subsysnqn branch 2 times, most recently from f4f2aea to e8ded60 Compare February 11, 2026 09:48
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/driver/controller_nvmeof_test.go (2)

70-74: ⚠️ Potential issue | 🟠 Major

Test assertion broken by new UUID-based NQN format — test will fail.

generateNQN now produces nqn.2137.csi.tns:uuid:<uuid>:test-nvmeof-volume, but the mock still expects the old format "nqn.2137.csi.tns:test-nvmeof-volume". The assertion at line 72 will always fire, marking this test case as failed.

The mock should validate the new format (e.g., check prefix/suffix/UUID) similar to TestGenerateNQN, instead of exact string equality.

🐛 Proposed fix
 			m.CreateNVMeOFSubsystemFunc = func(ctx context.Context, params tnsapi.NVMeOFSubsystemCreateParams) (*tnsapi.NVMeOFSubsystem, error) {
-				expectedNQN := "nqn.2137.csi.tns:test-nvmeof-volume"
-				if params.Name != expectedNQN {
-					t.Errorf("Expected NQN %s, got %s", expectedNQN, params.Name)
+				expectedPrefix := defaultNQNPrefix + ":uuid:"
+				expectedSuffix := ":test-nvmeof-volume"
+				if !strings.HasPrefix(params.Name, expectedPrefix) {
+					t.Errorf("Expected NQN prefix %s, got %s", expectedPrefix, params.Name)
+				}
+				if !strings.HasSuffix(params.Name, expectedSuffix) {
+					t.Errorf("Expected NQN suffix %s, got %s", expectedSuffix, params.Name)
 				}
 				if !params.AllowAnyHost {
 					t.Error("Expected AllowAnyHost to be true")
 				}
 				return &tnsapi.NVMeOFSubsystem{
 					ID:   100,
-					Name: expectedNQN,
-					NQN:  expectedNQN,
+					Name: params.Name,
+					NQN:  params.Name,
 				}, nil

132-135: ⚠️ Potential issue | 🟠 Major

Response NQN assertion also uses old format — needs updating to match UUID-based NQN.

After fixing the mock to return the actual UUID-based NQN (see comment above), this assertion will also fail. It should validate the NQN format (prefix + UUID + suffix) rather than exact string equality.

🐛 Proposed fix
-			expectedNQN := "nqn.2137.csi.tns:test-nvmeof-volume"
-			if resp.Volume.VolumeContext["nqn"] != expectedNQN {
-				t.Errorf("Expected NQN %s, got %s", expectedNQN, resp.Volume.VolumeContext["nqn"])
-			}
+			nqn := resp.Volume.VolumeContext["nqn"]
+			if !strings.HasPrefix(nqn, defaultNQNPrefix+":uuid:") {
+				t.Errorf("Expected NQN with prefix %s:uuid:, got %s", defaultNQNPrefix, nqn)
+			}
+			if !strings.HasSuffix(nqn, ":test-nvmeof-volume") {
+				t.Errorf("Expected NQN with suffix :test-nvmeof-volume, got %s", nqn)
+			}
🤖 Fix all issues with AI agents
In `@pkg/driver/controller_nvmeof.go`:
- Around line 25-28: The comment for defaultNQNPrefix is outdated: update the
descriptive comment to match the UUID-based NQN produced by generateNQN. Replace
the line that says "Format: nqn.2137.csi.tns:<volume-name>" with the correct
format "Format: nqn.2137.csi.tns:uuid:<uuid>:<volume-name>" (referencing
defaultNQNPrefix and the generateNQN function) so the comment accurately
documents the current NQN structure.
🧹 Nitpick comments (1)
pkg/driver/controller_nvmeof.go (1)

77-79: Variable uuid shadows the imported uuid package.

The local variable on line 78 shadows the package-level uuid import. If this function is ever extended to make another uuid.X() call after line 78, it will fail to compile.

♻️ Rename to avoid shadowing
 func generateNQN(nqnPrefix, volumeName string) string {
-	uuid := uuid.New().String()
-	return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, uuid, volumeName)
+	id := uuid.New().String()
+	return fmt.Sprintf("%s:uuid:%s:%s", nqnPrefix, id, volumeName)
 }

@fenio
Copy link
Owner

fenio commented Feb 11, 2026

Thanks Arturo for your contribution but I've got few questions.
First of all I'm relying closely on various tests and your PR is failing linter and in turn sanity test.

But that's a minor issue. The bigger issue I'm seeing is the fact that you're planning to change NQN only on controller side. What about node side? It will try to connect to non-existing subsystem when we apply your change. We would have to pass NQN via VolumeContext to node to make it work.

I think your changes also break indempotency. generateNQN is calling uuid.New every time so in case of some connection glitches we will create duplicate subsystems on TrueNAS side.

@ArturoGuerra
Copy link
Contributor Author

Hi, the NQN is being passed down to the node through the VolumeContext, I've also verified its working by deployin it to my own test k8s cluster so that's not an issue. I'll fix the Teststhose broke due to the truenas api changes, and as for generateNQN you are right that can cause issues if CreateVolume is called multiple times so i'll get that sorted out. Going to remove the UUID logic and assume the user provides a unique NQN prefix per storageClass, should probably add a disclaimer encouraging users to provide their own NQN prefix.

@fenio
Copy link
Owner

fenio commented Feb 11, 2026

I guess I will have to change the way integration tests set up env to make it working...
https://github.com/fenio/tns-csi/actions/runs/21923017334

@ArturoGuerra
Copy link
Contributor Author

With the changes i made the nqn prefix including the UUID can be set by the user in the storageclass params and if no nqn is passed it should use nqn.2137.csi.tns as the prefix giving a predicatable nqn of nqn.2137.csi.tns:<pvc>, truenas wont preppend anything to it now.

@fenio
Copy link
Owner

fenio commented Feb 14, 2026

How exactly do you test it?
https://github.com/fenio/tns-csi/actions/runs/22020966830

@ArturoGuerra
Copy link
Contributor Author

Sorry haven't had much time to look at the test, I'll have a look and make sure the tests pass / modify whatever is needed for it.

@ArturoGuerra
Copy link
Contributor Author

some tests are still failing, ill have the rest passing by tomorrow

@fenio
Copy link
Owner

fenio commented Feb 27, 2026

Can you please tell me what exactly is the problem you're trying to solve here? Maybe I could help somehow.
In the meantime there was some development done so you will have to rebase your changes anyway as they are now conflicting.

@ArturoGuerra
Copy link
Contributor Author

Overall, this change allows passing the desired NQN prefix through the StorageClass. Right now, it only supports a statically defined prefix. This matters because Kubernetes does not guarantee PVC names are unique across clusters, so this prevents potential cross-cluster collisions in that edge case.

I had to rework part of the unstage logic to retrieve the NQN from the filesystem, since the CSI spec does not expose volumeContext during the NodeUnstage call.

Additionally, several tests had the old NQN hardcoded, so I updated them to derive the NQN from the PVC instead.

I’ll resolve the conflicts and rebase shortly.

@ArturoGuerra ArturoGuerra changed the title fix(nvmeof): generate UUID-based SubNQN and send subnqn in subsystem create request fix(nvmeof): allow custom SubNQN Feb 28, 2026
@ArturoGuerra ArturoGuerra marked this pull request as draft February 28, 2026 05:49
Implement configurable NVMe-oF subsystem NQN handling that was intended
but not fully wired through the stack.

- Add StorageClass support for `subsystemNQN` in chart templates.
- Generate subsystem NQNs in controller create/clone/adopt paths using
  configured prefix with default fallback, and pass both `name`/`subnqn`
  to TrueNAS API.
- Update node unstage to derive NQN from staged device metadata
  (`findmnt`/symlink + /sys/class/nvme/*/subsysnqn) instead of rebuilding
  from volume ID.
- Add shared dataset path parsing helpers.
- Align unit tests with configurable NQN behavior.
- Update NVMe-oF e2e tests to read NQN from PV CSI attributes and use
  configured TrueNAS pool for detached snapshot independence checks.

Signed-off-by: Arturo Guerra <ar2roguerra@gmail.com>
@ArturoGuerra ArturoGuerra changed the title fix(nvmeof): allow custom SubNQN feat(nvmeof): implement configurable subsystem NQN Feb 28, 2026
@ArturoGuerra ArturoGuerra marked this pull request as ready for review February 28, 2026 07:51
@ArturoGuerra
Copy link
Contributor Author

I ran all the unit tests and e2e tests locally and everything passes should be good to merge.

- Add sentinel errors (ErrNVMeEmptyNQN, ErrNVMeNotNVMeDevice,
  ErrNVMeNonNVMeStagingDevice) to satisfy err113 linter
- Replace filepath.Join with string concatenation for sysfs path
  to fix gocritic/filepathJoin warning
- Remove unused utils.go (extractParentDatasetFromDataset,
  extractPoolFromDataset were never called)
@fenio fenio merged commit 456c09d into fenio:main Feb 28, 2026
13 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants