Open
Labels: bug (Something isn't working)
### **Describe the bug**
The hubagent crashes hard and restarts when listing ClusterResourceSnapshot objects fails due to a timeout.
### **Environment**
Please provide the following:
- Hub cluster details
The hub cluster is an AWS EKS cluster.
The hubagent was installed with the following values:
replicaCount: 1
logVerbosity: 1
enableWebhook: false
webhookServiceName: fleetwebhook
enableGuardRail: false
webhookClientConnectionType: service
enableV1Alpha1APIs: false
enableV1Beta1APIs: true
resources:
  requests:
    cpu: 2
    memory: 8Gi
  limits:
    cpu: 4
    memory: 16Gi
### **To Reproduce**
Steps to reproduce the behavior:
- Install the hubagent in the cluster
- Create enough objects that a kubectl get clusterresourcesnapshot request takes more than 30 seconds
You should see the hubagent container restart, with the following errors in the logs:
```
I0308 16:31:41.079738 1 controller/controller.go:190] "Starting controller" controller="cluster-resource-placement-controller-v1beta1"
I0308 16:31:41.147758 1 informer/informermanager.go:152] "Disabled an informer for a disappeared resource" res={"GroupVersionKind":{"Group":"","Version":"v1","Kind":"Event"},"GroupVersionResource":{"Group":"","Version":"v1","Resource":"events"},"IsClusterScoped":false}
W0308 16:32:21.369841 1 cache/reflector.go:535] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: failed to list *v1beta1.ClusterResourceSnapshot: the server was unable to return a response in the time allotted, but may still be processing the request (get clusterresourcesnapshots.placement.kubernetes-fleet.io)
I0308 16:32:21.369941 1 trace/trace.go:236] Trace[1547232315]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229 (08-Mar-2024 16:31:21.299) (total time: 60070ms):
Trace[1547232315]: ---"Objects listed" error:the server was unable to return a response in the time allotted, but may still be processing the request (get clusterresourcesnapshots.placement.kubernetes-fleet.io) 60069ms (16:32:21.369)
Trace[1547232315]: [1m0.070028406s] [1m0.070028406s] END
E0308 16:32:21.369964 1 cache/reflector.go:147] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: Failed to watch *v1beta1.ClusterResourceSnapshot: failed to list *v1beta1.ClusterResourceSnapshot: the server was unable to return a response in the time allotted, but may still be processing the request (get clusterresourcesnapshots.placement.kubernetes-fleet.io)
```
### **Expected behavior**
The pod should not be restarted when the list request times out.
### **Additional context**
I think this could be solved by adding an option to increase the timeout of the Kubernetes client.
Sorry for my English :S
Haladinoq