snapshot-controller throttled by default client-side rate limits, causing 8x slower performance for batch snapshot operations #1344

@vivakeram

Description

Is your feature request related to a problem?/Why is this needed

The snapshot-controller uses the default Kubernetes client rate limits (QPS 5, Burst 10), which severely throttle batch VolumeSnapshot creation regardless of the --worker-threads setting. Creating 80 snapshots took 143s with the defaults versus 17s with raised limits, which is 8.4x slower.

Describe the solution you'd like in detail

Give the --kube-api-qps and --kube-api-burst flags higher defaults (e.g., QPS=50, Burst=100), so that batch snapshot workloads are not throttled out of the box.
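To make the request concrete, here is a minimal sketch of how the two flags could be parsed with the proposed defaults. It is not the snapshot-controller's actual code: `clientLimits` and `parseLimits` are hypothetical names, and `clientLimits` only stands in for the `QPS` and `Burst` fields of client-go's `rest.Config` so that the example compiles without client-go.

```go
package main

import (
	"flag"
	"fmt"
)

// clientLimits is a hypothetical stand-in for the QPS and Burst fields
// of client-go's rest.Config, so the sketch compiles without client-go.
type clientLimits struct {
	QPS   float32
	Burst int
}

// parseLimits reads the two rate-limit flags from args, falling back to
// the defaults proposed in this issue (an assumption, not shipped values).
func parseLimits(args []string) (clientLimits, error) {
	fs := flag.NewFlagSet("snapshot-controller", flag.ContinueOnError)
	qps := fs.Float64("kube-api-qps", 50, "sustained requests per second to the API server")
	burst := fs.Int("kube-api-burst", 100, "maximum short-term burst of requests")
	if err := fs.Parse(args); err != nil {
		return clientLimits{}, err
	}
	if *qps <= 0 || *burst <= 0 {
		return clientLimits{}, fmt.Errorf("qps and burst must be positive, got qps=%v burst=%d", *qps, *burst)
	}
	return clientLimits{QPS: float32(*qps), Burst: *burst}, nil
}

func main() {
	// Override only QPS; Burst keeps the proposed default.
	limits, err := parseLimits([]string{"--kube-api-qps=20"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("QPS=%v Burst=%d\n", limits.QPS, limits.Burst)
}
```

With this shape, operators who need stricter limits can still lower the values explicitly, while the defaults no longer penalize batch creation.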

Describe alternatives you've considered

Additional context

Environment

  • Kubernetes Version: v1.34.1
  • snapshot-controller Version: v8.2.0 (registry.k8s.io/sig-storage/snapshot-controller:v8.2.0)
  • csi-snapshotter Version: v8.3.0 (sidecar in CSI driver deployment)
  • CSI Driver: NetApp Trident
  • Test Scale: 80 VolumeSnapshots created simultaneously
  • Cluster: 3-node Kubernetes cluster

Problem Description

Current Behavior

When creating 80 VolumeSnapshots simultaneously with default snapshot-controller configuration:

  • Time: 143-145 seconds (1.8s per snapshot)
  • Worker threads: Configured with --worker-threads=100 (no effect)

Expected Behavior

When tuning both snapshot-controller and csi-snapshotter with --kube-api-qps=50 --kube-api-burst=100:

  • Time: 16-17 seconds (0.2s per snapshot)
  • Performance improvement: 8.4x faster
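As a rough plausibility check, the two timings can be compared against a simple token-bucket model: the first `burst` requests pass immediately and the rest are admitted at `qps` per second. The sketch below is a back-of-envelope estimate, not a measurement. `drainTime` is a hypothetical helper, and the figure of 9 API calls per snapshot is an assumption chosen for illustration; the real number was not measured in this issue.

```go
package main

import (
	"fmt"
	"time"
)

// drainTime estimates how long a token-bucket limiter needs to admit n
// back-to-back requests: the first `burst` pass immediately, the rest
// are admitted at `qps` per second.
func drainTime(n int, qps float64, burst int) time.Duration {
	if n <= burst {
		return 0
	}
	seconds := float64(n-burst) / qps
	return time.Duration(seconds * float64(time.Second))
}

func main() {
	// Hypothetical workload: 80 snapshots x 9 API calls each. The
	// per-snapshot call count is an assumption, not a measurement.
	const requests = 80 * 9

	fmt.Println("QPS=5,  Burst=10: ", drainTime(requests, 5, 10))
	fmt.Println("QPS=50, Burst=100:", drainTime(requests, 50, 100))
}
```

Under that assumption the model gives roughly 142 seconds at the default limits and roughly 12 seconds at the raised limits, which is in the same range as the observed 143-145s and 16-17s. That is consistent with client-side throttling being the dominant cost, though it does not prove it.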

Evidence: Client-Side Throttling

snapshot-controller logs showed extensive throttling:

I1103 16:56:15.123456 1 client.go:xxx] Waited for 2.858493583s due to client-side throttling, not priority and fairness
I1103 16:56:18.234567 1 client.go:xxx] Waited for 3.012345678s due to client-side throttling, not priority and fairness
I1103 16:56:21.345678 1 client.go:xxx] Waited for 3.456789012s due to client-side throttling, not priority and fairness
I1103 16:56:24.456789 1 client.go:xxx] Waited for 3.634567890s due to client-side throttling, not priority and fairness
I1103 16:56:27.567890 1 client.go:xxx] Waited for 3.812345678s due to client-side throttling, not priority and fairness
I1103 16:56:30.678901 1 client.go:xxx] Waited for 4.023456789s due to client-side throttling, not priority and fairness
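To quantify the throttling from a full log rather than a handful of lines, the wait durations can be summed. The following is a minimal sketch, assuming the log lines have the shape shown above; `totalThrottleWait` and `waitRe` are hypothetical names, and the sample input is the first two lines from this issue.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
	"time"
)

// waitRe captures the duration in the client-side throttling message.
var waitRe = regexp.MustCompile(`Waited for (\S+) due to client-side throttling`)

// totalThrottleWait sums the wait durations over all matching log lines
// and returns the total together with the number of matching lines.
func totalThrottleWait(logs string) (time.Duration, int) {
	var total time.Duration
	count := 0
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		m := waitRe.FindStringSubmatch(sc.Text())
		if m == nil {
			continue
		}
		d, err := time.ParseDuration(m[1])
		if err != nil {
			continue // skip lines whose duration does not parse
		}
		total += d
		count++
	}
	return total, count
}

func main() {
	logs := `I1103 16:56:15.123456 1 client.go:xxx] Waited for 2.858493583s due to client-side throttling, not priority and fairness
I1103 16:56:18.234567 1 client.go:xxx] Waited for 3.012345678s due to client-side throttling, not priority and fairness`

	total, n := totalThrottleWait(logs)
	fmt.Printf("%d throttled requests, %v total wait\n", n, total)
}
```

Because requests from different workers are throttled concurrently, the summed wait can exceed wall-clock time; it is best read as a measure of how much work was stalled, not as elapsed time.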
