Description
Describe the bug
Our trial cluster is experiencing a wave of "context cancelled" and "client rate limiter Wait returned an error: context canceled" errors. Volume provisioning and attachment are very slow, with many retries. Please advise on how to troubleshoot this issue.
example logs:
I have tried increasing the k8sAPIQPS parameter to 300 in the TridentOrchestrator resource (we install Trident with the operator), but this does not seem to help.
The cluster currently holds 2600 tridentvolumes.
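For reference, a minimal sketch of where this setting sits in the TridentOrchestrator resource. The resource name `trident` and the layout of the surrounding fields are assumptions for illustration, not copied from our cluster. Only the `k8sAPIQPS` value of 300 and the `trident` namespace come from this report.

```yaml
# Sketch only: metadata.name is assumed, not taken from the affected cluster.
apiVersion: trident.netapp.io/v1
kind: TridentOrchestrator
metadata:
  name: trident          # assumed resource name
spec:
  namespace: trident     # matches the "-n trident" install flag listed below
  k8sAPIQPS: 300         # raised QPS limit for controller-to-API-server calls
```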
Environment
We are using Trident 25.06 with AWS FSx for ONTAP and the "ontap-san-economy" driver.
The Trident controller runs in a Gardener cluster.
- Trident version: 25.06
- Trident installation flags used: -n trident
- Container runtime: containerd 2.0
- Kubernetes version: v1.31.13
- Kubernetes orchestrator: Gardener
- Kubernetes enabled feature gates:
- OS: Garden Linux 1877.6
- NetApp backend types: AWS Ontap (linux)
- Other:
To Reproduce
This happens sporadically. I do not know the exact root cause, but it renders the system almost unusable.
Expected behavior
Provisioning, attaching, and managing volumes happen relatively quickly.
Additional context
This is happening on one of our production clusters (SAP).
We would be grateful for any tips or support on this case.
I tried to create an account at
https://mysupport.netapp.com/site/user/registration
to open a support ticket, but the site seems to be broken as well: it does not let me past the email verification step. It keeps asking for new codes and sending them by email, but entering the codes does not advance the registration process.