
PVCs that are slow to provision cause the autoscaler to choke (for that volume) #33

@jacksontj

Description
Describe the bug
We are running in GCP and we had a workload spawn a 10GB PVC. The underlying storage controller was having issues provisioning (due to GCP ratelimits) which lasted for ~30m. During that time the volume-autoscaler noticed the disk and treated the disk as 0 size; from our slack:

@channel ERROR: <project> FAILED requesting to scale up <volume> by 20% from 0 to 2G, it was using more than 70% disk or inode space over the last 1380 seconds

From looking at the code, this appears to be due to here -- where any exception causes the volume to be treated as 0 size.
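To illustrate the failure mode, here is a minimal hypothetical sketch (the function and metric names are illustrative, not the actual volume-autoscaler code), assuming the size lookup swallows all exceptions and falls back to 0:

```python
# Hypothetical sketch of the failure mode described above; names are
# illustrative, not the actual volume-autoscaler code.
def get_volume_size_bytes(metrics, pvc_name):
    try:
        return metrics[pvc_name]["capacity_bytes"]
    except Exception:
        # Any failure (e.g. a still-provisioning PVC that reports no
        # metrics yet) is collapsed into "size 0", which later reads
        # as a volume that is over its usage threshold.
        return 0

metrics = {}  # PVC still provisioning: no stats reported yet
size = get_volume_size_bytes(metrics, "data-pvc")
print(size)  # 0 -> produces the "scale up ... from 0 to 2G" alert
```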

To Reproduce
Steps to reproduce the behavior

  1. create PVC where underlying disk won't be provisioned due to failure
  2. wait for volumeautoscaler to kick in
  3. profit

Expected behavior
In the event that the underlying disk doesn't exist, it seems more appropriate for volume-autoscaler to skip that PVC; if there is no underlying disk, we can't resize it anyway. So the correct behavior here would be to skip the PVC instead of assuming its size is 0.
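A minimal sketch of the proposed behavior, assuming hypothetical helper names (not the actual volume-autoscaler API): return a sentinel when the size is unknown and skip the PVC rather than comparing usage against 0 bytes.

```python
# Hypothetical sketch of the proposed fix; names are illustrative.
def get_volume_size_bytes(metrics, pvc_name):
    try:
        return metrics[pvc_name]["capacity_bytes"]
    except KeyError:
        return None  # unknown size: disk likely not provisioned yet

def should_scale(metrics, pvc_name, used_bytes=0, threshold=0.7):
    size = get_volume_size_bytes(metrics, pvc_name)
    if not size:
        # Skip this PVC: there is nothing to resize until the
        # underlying disk actually exists and reports a size.
        return False
    return used_bytes / size >= threshold

print(should_scale({}, "data-pvc"))  # False: PVC skipped, no alert
```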

Screenshots
n/a

Extra Information Requested

  • Kubernetes Version: v1.33.5-gke.1162000
  • Prometheus Version: GCP managed
