
[Bug] autoscale up and down are slow #3909

@caican00

Description

Search before asking

  • I searched the issues and found no similar issues.

KubeRay Component

ray-operator

What happened + What you expected to happen

  1. At around 09:59:20 on July 23, 2025, raycluster-219843 updated its RayCluster CR (as shown in Figure 3), requesting a scale-up to 500 replicas.
     [screenshots]
  2. However, from 09:59 to 10:08 on July 23, 2025, KubeRay was continuously handling the scale-up of another cluster, raycluster-538811, which lasted about 10 minutes. The first pod of raycluster-538811 was scaled up at 2025-07-23 09:57:47.520.
     [screenshot]
  3. The last pod of raycluster-538811 was scaled up at 2025-07-23 10:08:32.
     [screenshot]
  4. The scale-up request for raycluster-219843 was not processed until 10:08 on July 23, 2025.
     [screenshot]
  5. The current ReconcileConcurrency value is 1, so the entire request stayed pending for 10 minutes.
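For context: the ray-operator is built on controller-runtime, where a controller's reconcile requests are drained by a fixed pool of worker goroutines (`MaxConcurrentReconciles`, default 1), so a single long-running reconcile loop for one RayCluster can serialize and delay every other cluster in the queue. A minimal sketch of raising that setting via the operator Deployment, assuming the ReconcileConcurrency value mentioned above is exposed as a `--reconcile-concurrency` command-line flag (flag name and image tag are illustrative; verify them against your KubeRay version):

```yaml
# Hypothetical fragment of the kuberay-operator Deployment:
# raise reconcile concurrency so one RayCluster's slow scale-up
# cannot block reconciliation of other RayClusters.
spec:
  template:
    spec:
      containers:
        - name: kuberay-operator
          image: quay.io/kuberay/operator:v1.1.0  # example tag
          args:
            - --reconcile-concurrency=4  # assumed flag; default is 1
```

The trade-off is the usual one for controller workers: more concurrent reconciles shortens queue latency for independent clusters but increases load on the Kubernetes API server.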

Reproduction script

None

Anything else

No response

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!
