CPU iowait > 10%. A high iowait means that you are disk or network bound. #8972
Replies: 10 comments 2 replies
-
This is unexpected. There should be only one running maintenance job for a specific BackupRepository.
-
Could you please help collect the debug bundle? Another thing: when there is no running repo maintenance job, do you still see high iowait?
-
I generated a bundle, but it is 71 MB in size and the limit is 25 MB. Is there some other way I can send this to you?
-
How about using Slack?
-
Thanks. I checked the debug bundle and found 8 BackupRepositories, so one repo maintenance job per BackupRepository is normal.
-
Is there anything we can do to stagger or throttle the jobs so they're not thrashing the disks all at once? Or maybe make them run in series instead of in parallel?
-
Actually, where are you seeing 8 backup repositories? We have five backup jobs, and they all go to the same storage location.
-
Please run this command to list all the Velero backup repositories: `kubectl -n velero get backuprepositories`. The number of BackupRepositories is not tied to the number of backups: Velero creates one BackupRepository per combination of backed-up namespace, backup storage location, and repository type. So, for example, a single backup covering three namespaces results in three BackupRepositories, and each BackupRepository requires its own repository maintenance jobs.
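To see why you end up with 8 of them, it can help to group the repositories by the namespace they cover. A sketch using the `spec.volumeNamespace`, `spec.backupStorageLocation`, and `spec.repositoryType` fields of the BackupRepository CR (requires access to your cluster; column names are illustrative):

```shell
# List each BackupRepository with the namespace/location/type combination
# that caused Velero to create it
kubectl -n velero get backuprepositories \
  -o custom-columns=NAME:.metadata.name,NAMESPACE:.spec.volumeNamespace,BSL:.spec.backupStorageLocation,TYPE:.spec.repositoryType
```

Each distinct row corresponds to one repository, and therefore one set of maintenance jobs.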
-
Repository maintenance can be resource-consuming. There are some configurations that can mitigate that.
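One such configuration is capping the resources each maintenance job may use. A hedged sketch, assuming a recent Velero (v1.13+) where the server accepts maintenance-job resource flags; check the flag names against your version's docs (newer releases also support a repo-maintenance-job ConfigMap instead):

```shell
# Add resource limits for maintenance jobs to the velero server args
# (flag names per Velero v1.13+; verify for your version)
kubectl -n velero patch deployment velero --type=json -p='[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
   "value": "--maintenance-job-cpu-limit=1"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
   "value": "--maintenance-job-mem-limit=1Gi"}
]'
```

Limiting CPU indirectly throttles how fast the jobs can issue IO; it does not stagger them, but it reduces how hard they hit the disks at once.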
-
We're getting the following alert in Alertmanager on our Kubernetes cluster:
Investigating the issue, we see that it's the Velero maintenance job causing all the IO.
I understand Velero maintenance jobs are IO-intensive by nature. But is there some way to limit the number of jobs running concurrently, and/or can we limit how much IO they generate at once?
Or should I consider changing Alertmanager to exclude Velero processes from being monitored?
Thanks!
Brad
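For context on what the alert is measuring: system-wide iowait comes from the `cpu` line of `/proc/stat` (field 5, after user/nice/system/idle), so it is a whole-node figure and cannot be attributed to one process by itself. A minimal Linux-only sketch of computing the same percentage the alert is based on:

```python
import time

def cpu_times():
    # First line of /proc/stat: "cpu user nice system idle iowait irq softirq ..."
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    return [int(x) for x in fields]

def iowait_percent(interval=1.0):
    """Percentage of CPU time spent in iowait over `interval` seconds."""
    a = cpu_times()
    time.sleep(interval)
    b = cpu_times()
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0  # index 4 = iowait

if __name__ == "__main__":
    print(f"iowait: {iowait_percent():.1f}%")
```

Because the metric is node-wide, throttling the maintenance jobs (rather than silencing the alert) is usually the better fix, since other workloads on the node suffer the same IO contention.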