Description
What happened + What you expected to happen
The default autoscaling policy interprets downscale_delay_s and upscale_delay_s by counting control-loop iterations and comparing to int(delay_s / CONTROL_LOOP_INTERVAL_S).
ray/python/ray/serve/autoscaling_policy.py, line 101 (commit 23e587f):

    if decision_counter < -int(delay_s / CONTROL_LOOP_INTERVAL_S):
That implicitly assumes each iteration corresponds to roughly CONTROL_LOOP_INTERVAL_S seconds of wall time. In reality, the controller spends time executing run_control_loop_step() before sleeping for CONTROL_LOOP_INTERVAL_S, so the interval between two policy invocations is approximately loop_step_duration + CONTROL_LOOP_INTERVAL_S.
When loop_step_duration is not negligible (e.g. large clusters, heavy deployment state), the observed wall-clock time before a scaling action can be much larger than the configured delay, even though the parameter name and documentation describe it as seconds.
The same counting logic applies to upscale (upscale_delay_s).
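The effect is easy to see with back-of-the-envelope arithmetic. A minimal sketch (not Ray's actual code; the per-step duration below is a hypothetical value chosen for illustration):

```python
# How counting loop iterations inflates the effective delay when each
# control-loop step takes non-negligible time.
CONTROL_LOOP_INTERVAL_S = 0.1   # Ray Serve's default loop sleep
downscale_delay_s = 1.0         # what the user configures

# The policy converts the delay into a number of loop iterations:
required_iterations = int(downscale_delay_s / CONTROL_LOOP_INTERVAL_S)  # 10

# But each iteration actually takes loop_step_duration + the sleep:
loop_step_duration = 0.4        # hypothetical per-step work on a large cluster
actual_interval_s = loop_step_duration + CONTROL_LOOP_INTERVAL_S        # 0.5

effective_delay_s = required_iterations * actual_interval_s
print(effective_delay_s)  # 5.0 seconds observed instead of the configured 1.0
```

In other words, the observed delay scales with `1 + loop_step_duration / CONTROL_LOOP_INTERVAL_S`, not with the configured value alone.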
Versions / Dependencies
Ray 2.54
Reproduction script
Adding an extra await asyncio.sleep(0.5) inside async def run_control_loop_step to simulate a time-consuming step increased the observed downscale time by roughly a factor of 5 over the configured downscale_delay_s.
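The reproduction can be approximated without Ray by simulating the controller's loop shape (do work, then sleep for the fixed interval) and measuring how long the counted iterations actually take. The step duration and shortened delay below are hypothetical values, not Ray defaults:

```python
import asyncio
import time

CONTROL_LOOP_INTERVAL_S = 0.1   # Ray Serve's default loop sleep
DOWNSCALE_DELAY_S = 0.5         # hypothetical configured delay (shortened to keep the demo fast)
STEP_DURATION_S = 0.4           # hypothetical expensive control-loop step

async def run_control_loop_step():
    # Stand-in for the controller's per-iteration work.
    await asyncio.sleep(STEP_DURATION_S)

async def measure_observed_delay() -> float:
    # The policy waits int(delay_s / CONTROL_LOOP_INTERVAL_S) iterations
    # before acting; measure how much wall time those iterations take.
    required_iterations = int(DOWNSCALE_DELAY_S / CONTROL_LOOP_INTERVAL_S)
    start = time.monotonic()
    for _ in range(required_iterations):
        await run_control_loop_step()
        await asyncio.sleep(CONTROL_LOOP_INTERVAL_S)
    return time.monotonic() - start

if __name__ == "__main__":
    observed = asyncio.run(measure_observed_delay())
    print(f"configured: {DOWNSCALE_DELAY_S}s, observed: {observed:.2f}s")
```

With these numbers, 5 iterations of roughly 0.5 s each yield about 2.5 s of observed delay for a 0.5 s configuration, mirroring the slowdown reported above.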
Issue Severity
Low: It annoys or frustrates me.