Add NVMe-oF connect concurrency limiter #131

Closed
ashtonian wants to merge 2 commits into fenio:main from ashtonian:feat/nvme-connect-concurrency-limiter
ashtonian (Contributor) commented Feb 26, 2026

Description

As before, mass reconnect events appear to cause contention while NVMe-oF connections are being initialized. This PR introduces a tunable that limits how many connects run concurrently.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Test addition or update

Motivation

Connecting 50+ volumes to a single host causes cascading connection failures.

Changes Made

  • Add a concurrency limiter around NVMe-oF connections.

Testing

Deployed in my cluster; still needs verification.

Test environment:

  • Kubernetes version:
  • TrueNAS version:
  • Protocol tested: [ ] NFS [x] NVMe-oF [ ] Both

Tests run:

  • Unit tests (make test)
  • Linting (make lint)
  • Sanity tests
  • Integration tests (manual or CI)

Documentation

  • Code comments updated
  • README updated (if needed)
  • Docs updated (if needed)
  • No documentation needed

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings or errors
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Security Considerations

  • This PR does not introduce security vulnerabilities
  • No secrets or credentials are included in this PR
  • I have reviewed SECURITY.md and followed best practices

Related Issues

Related to #130

Additional Notes

ashtonian requested a review from fenio as a code owner February 26, 2026 21:29
ashtonian changed the title from "add nvme concurrent limiter" to "Add NVMe-oF connect concurrency limiter" Feb 26, 2026
fenio added a commit that referenced this pull request Feb 26, 2026
Prevent kernel NVMe subsystem registration lock contention when staging
many volumes simultaneously on a single node. Uses a channel-based
semaphore (default: 5) configurable via --max-concurrent-nvme-connects
flag and node.maxConcurrentNVMeConnects Helm value. Adds Prometheus
metrics for concurrent/waiting connect operations.

Co-authored-by: Ashton Kinslow <github@ashtonkinslow.com>
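
For readers unfamiliar with the pattern, the channel-based semaphore described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from 49eb208: every identifier (`connectLimiter`, `acquire`, `release`, `simulateConnects`) is hypothetical, the connect itself is faked with a short sleep, and plain atomic counters stand in for the Prometheus concurrent/waiting metrics so the example needs only the standard library.

```go
package main

import (
	"context"
	"sync"
	"sync/atomic"
	"time"
)

// connectLimiter bounds how many connect operations run at once.
// Hypothetical type for illustration; not taken from the driver's source.
type connectLimiter struct {
	slots   chan struct{} // buffered channel used as a counting semaphore
	waiting atomic.Int64  // callers currently blocked waiting for a slot
	active  atomic.Int64  // callers currently holding a slot
}

// newConnectLimiter creates a limiter with max slots (at least 1).
func newConnectLimiter(max int) *connectLimiter {
	if max < 1 {
		max = 1
	}
	return &connectLimiter{slots: make(chan struct{}, max)}
}

// acquire blocks until a slot is free or ctx is cancelled.
func (l *connectLimiter) acquire(ctx context.Context) error {
	l.waiting.Add(1)
	defer l.waiting.Add(-1)
	select {
	case l.slots <- struct{}{}: // succeeds only while fewer than max slots are held
		l.active.Add(1)
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// release returns a slot taken by a successful acquire.
func (l *connectLimiter) release() {
	l.active.Add(-1)
	<-l.slots
}

// simulateConnects pushes n fake connects through a limiter of size max
// and returns the highest number observed running at the same time.
func simulateConnects(max, n int) int64 {
	l := newConnectLimiter(max)
	var peak atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := l.acquire(context.Background()); err != nil {
				return
			}
			defer l.release()
			cur := l.active.Load()
			// Record a new peak if this reading exceeds the stored one.
			for {
				p := peak.Load()
				if cur <= p || peak.CompareAndSwap(p, cur) {
					break
				}
			}
			time.Sleep(5 * time.Millisecond) // stand-in for the real connect call
		}()
	}
	wg.Wait()
	return peak.Load()
}
```

Calling `simulateConnects(5, 50)` from a `main` function mirrors the scenario in this PR: 50 volumes staged at once with the default limit of 5, so the returned peak should never exceed 5. Passing a context to `acquire` lets a caller that has a deadline give up while it is still queued rather than waiting for a slot indefinitely.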
fenio (Owner) commented Feb 26, 2026

Integrated manually onto main as 49eb208 — thanks @ashtonian!

fenio closed this Feb 26, 2026