Contributor

@gadididi gadididi commented Feb 9, 2026

Describe what this PR does

Add a retry mechanism for the listenerList gRPC call.
There is an unknown issue (under investigation) in the nvmeof GW: when a user creates a subsystem with the autoListener option (which creates the listener automatically), the controller may get an empty list when it then retrieves the listeners list, because of some delay in the GW.
FYI: the listeners list comes from the ceph mon command nvme-gw listeners.

Related issues

fixed:
#6037

Future concerns

List items that are not part of the PR and do not impact its
functionality, but are work items that can be taken up subsequently.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow
    guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes
    for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

add retry mechanism for listenerList gRPC call

There is an unknown (under investigation) issue
in the nvmeof GW: when a user creates a subsystem with
the autoListener option (it creates the listener automatically)
and the controller then retrieves the
listeners list, it gets an empty list. There is some
delay in the GW.

Signed-off-by: gadi-didi <gadi.didi@ibm.com>
@gadididi gadididi self-assigned this Feb 9, 2026
@gadididi gadididi added the component/nvme-of Issues and PRs related to NVMe-oF. label Feb 9, 2026
@gadididi gadididi marked this pull request as ready for review February 11, 2026 10:19
@gadididi gadididi requested review from Madhu-1 and nixpanic February 11, 2026 10:19

return ConvertListenersFromProto(autoListeners.GetListeners()), nil
},
retry.Attempts(6), // ~100ms, 200ms, 400ms, 800ms, 1.6s, 3.2s = ~6.3s total
Collaborator

@gadididi will this work on all clusters, for example ones with 100 PVCs, or when the Ceph cluster is under some light stress? Retries are always very tricky.

Contributor Author

I will try to check it soon
