Conversation

@landreasyan
Contributor

@landreasyan landreasyan commented Aug 1, 2025

What type of PR is this?
/kind bug

What this PR does / why we need it:

When the node disk attach limit is reached, the driver should return ResourceExhausted errors.

Which issue(s) this PR fixes:

Fixes #

Requirements:

Special notes for your reviewer:

Release note:

none

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 1, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: landreasyan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 1, 2025
@landreasyan landreasyan force-pushed the batching-fix-disk-limit-processing branch from d69464a to 81f5d4e on August 1, 2025 23:28
@nearora-msft
Contributor

Does this fix any issues, or is this just to ensure that we return the right error to K8s?

@landreasyan
Contributor Author

landreasyan commented Aug 2, 2025

Does this fix any issues, or is this just to ensure that we return the right error to K8s?

It fixes the logic for removing disks queued for batching. Disks that cannot be processed in the current batch are left in the queue, to be processed with the next request if there is capacity; otherwise, the right error is reported.
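
To illustrate the intended queue behavior, here is a minimal sketch (the `AttachDiskOptions` stand-in and the `takeBatch` helper are illustrative, not the driver's actual code):

```go
package main

import "fmt"

// AttachDiskOptions is an illustrative stand-in for provider.AttachDiskOptions.
type AttachDiskOptions struct{ DiskName string }

// takeBatch (hypothetical) moves at most capacity disks out of the queued
// batch and reports what is left in the queue for the next request.
func takeBatch(queued map[string]*AttachDiskOptions, capacity int) (process, remaining map[string]*AttachDiskOptions) {
	process = make(map[string]*AttachDiskOptions, capacity)
	remaining = make(map[string]*AttachDiskOptions)
	for diskURI, opts := range queued {
		if len(process) < capacity {
			process[diskURI] = opts // room on the node: attach in this batch
		} else {
			remaining[diskURI] = opts // stays queued for the next request
		}
	}
	return process, remaining
}

func main() {
	queued := map[string]*AttachDiskOptions{
		"disk-1": {DiskName: "disk-1"},
		"disk-2": {DiskName: "disk-2"},
		"disk-3": {DiskName: "disk-3"},
	}
	process, remaining := takeBatch(queued, 2)
	fmt.Printf("%d processed, %d left queued\n", len(process), len(remaining))
}
```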

@andyzhangx
Member

@landreasyan A ResourceExhausted error won't trigger a reschedule, so it won't mitigate the original MaxDataDiskNumExceeded issue unless the user deletes the pod manually to force a reschedule.

@landreasyan
Contributor Author

@landreasyan A ResourceExhausted error won't trigger a reschedule, so it won't mitigate the original MaxDataDiskNumExceeded issue unless the user deletes the pod manually to force a reschedule.

Yes, it won't, but regardless we should return the right error. This is called out in the CSI spec as well: https://github.com/container-storage-interface/spec/blob/master/spec.md#controllerpublishvolume-errors
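
For reference, a minimal sketch of returning the spec-mandated code (the `checkAttachLimit` helper and its parameters are hypothetical; the real check lives in the driver's ControllerPublishVolume path):

```go
package main

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// checkAttachLimit returns the CSI-spec-mandated ResourceExhausted
// status once a node can take no more data disks.
func checkAttachLimit(nodeName string, attachedCount, maxDisks int) error {
	if attachedCount >= maxDisks {
		return status.Errorf(codes.ResourceExhausted,
			"node %q has reached its maximum attached disk count (%d)", nodeName, maxDisks)
	}
	return nil
}

func main() {
	_ = checkAttachLimit("aks-nodepool1-0", 8, 8) // non-nil: ResourceExhausted
}
```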

```go
klog.V(2).Infof("no attach disk(%s) request on node(%s), diskMap len:%d, %+v", diskURI, nodeName, len(diskMap), diskMap)
return nil, nil
}
c.attachDiskMap.Store(nodeName, make(map[string]*provider.AttachDiskOptions))
```
Member

The original logic always clears the queue: if removeDisks > 0, keeping the remaining disks in the queue would not succeed in the end, and the CSI driver has retry logic. Clearing the queue makes the logic more straightforward.

Contributor Author

I really wanted to avoid dropping the disks that haven't been processed. Even though I agree that k8s will eventually retry, more changes would be needed to do this safely, so I am adding the queue-clearing behavior back.

Contributor

With this approach, would the next attach request add to the batch of already existing disks in the attachDiskMap? So we wouldn't create a fresh batch, but would reuse the originally dropped disks?
Is that the correct understanding?

@landreasyan landreasyan force-pushed the batching-fix-disk-limit-processing branch from abf84ac to 0384dab on August 5, 2025 23:01
@k8s-ci-robot
Contributor

@landreasyan: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-azuredisk-csi-driver-external-e2e-single-az | 0384dab | link | true | /test pull-azuredisk-csi-driver-external-e2e-single-az |
| pull-azuredisk-csi-driver-e2e-capz-multi-az | 0384dab | link | true | /test pull-azuredisk-csi-driver-e2e-capz-multi-az |
| pull-azuredisk-csi-driver-e2e-capz | 0384dab | link | true | /test pull-azuredisk-csi-driver-e2e-capz |
| pull-azuredisk-csi-driver-e2e-capz-vmssflex | 0384dab | link | true | /test pull-azuredisk-csi-driver-e2e-capz-vmssflex |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@andyzhangx
Member

This PR made the following test fail:

Summarizing 1 Failure:
  [FAIL] Dynamic Provisioning [single-az] [It] should succeed without MaximumDataDisksExceeded
  /home/prow/go/src/sigs.k8s.io/azuredisk-csi-driver/test/e2e/testsuites/dynamically_provisioned_job_attach_batch_tester.go:97

https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_azuredisk-csi-driver/3237/pull-azuredisk-csi-driver-e2e-capz/1952867569819979776

@andyzhangx andyzhangx requested a review from Copilot August 9, 2025 14:55
Copilot AI left a comment

Pull Request Overview

This PR addresses a bug in the Azure disk CSI driver's batching logic, ensuring that only processable disks are removed from batches and that proper errors are returned when node disk limits are exceeded. The changes implement early detection of disk limit violations and return the appropriate ResourceExhausted errors per the CSI specification.

  • Adds logic to check node disk limits before processing attach requests and to return ResourceExhausted errors early
  • Modifies batch retrieval to only remove disks that can actually be processed, keeping the requested disk in the batch
  • Updates error handling to return ResourceExhausted status codes when disk limits are exceeded

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/azuredisk/controllerserver.go | Adds ResourceExhausted error handling for maximum disk exceeded scenarios |
| pkg/azuredisk/azure_controller_common.go | Refactors batching logic to check disk limits early and only remove processable disks from batches |
| pkg/azuredisk/azure_controller_common_test.go | Updates tests to validate new batching behavior and adds a test for clearing attach disk requests |

```go
numDisksAllowed = 0
currentNodeDisks := len(attachedDisks)
maxNodeDisks := int(maxNodeDisks)
if currentNodeDisks > maxNodeDisks {
```
Copilot AI Aug 9, 2025

The condition should use >= instead of > since reaching the maximum number of disks should also trigger the error. If currentNodeDisks equals maxNodeDisks, there's no room for additional disks.

Suggested change:

```diff
-if currentNodeDisks > maxNodeDisks {
+if currentNodeDisks >= maxNodeDisks {
```
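To make the boundary concrete, here is a small sketch using the hunk's variable names (the standalone function and example values are hypothetical):

```go
package main

import "fmt"

// numDisksAllowed mirrors the reviewed hunk's intent: how many more
// disks fit on the node. With >=, a full node yields zero capacity.
func numDisksAllowed(currentNodeDisks, maxNodeDisks int) int {
	if currentNodeDisks >= maxNodeDisks {
		return 0 // node is full: caller should return ResourceExhausted
	}
	return maxNodeDisks - currentNodeDisks
}

func main() {
	fmt.Println(numDisksAllowed(30, 32)) // 2: two more attaches fit
	fmt.Println(numDisksAllowed(32, 32)) // 0: limit reached
}
```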

klog.Info("First call to CreateOrUpdate succeeded", "VM Name:", name, "Params:", params)
time.Sleep(100 * time.Millisecond) // Simulate some processing time to hold the node lock while the 3rd detach request is made
return nil, nil // First call succeeds
time.Sleep(1000 * time.Millisecond) // Simulate some processing time to hold the node lock while the 3rd detach request is made
Copilot AI Aug 9, 2025

The sleep duration was increased from 100ms to 1000ms (10x increase) without explanation. This significantly slows down the test. Consider using a more reasonable duration or explaining why such a long delay is necessary.

Suggested change:

```diff
-time.Sleep(1000 * time.Millisecond) // Simulate some processing time to hold the node lock while the 3rd detach request is made
+time.Sleep(100 * time.Millisecond) // Simulate processing time to hold the node lock while the 3rd detach request is made; 100ms is sufficient for concurrency in tests
```

@landreasyan landreasyan changed the title chore: only remove disks that can be processed from the batch chore: when node disk attach limit is reached, the driver should return ResourceExhausted errors. Aug 20, 2025