Operating System
Linux
Version Information
Using the latest version of ML online endpoint
Steps to reproduce
I have a project to run an AI model using an online endpoint as a backend service, the endpoint is configured (manually set in the portal) to be auto-scale based on the number of requests.

Expected behavior
Expect the endpoint will scale up between 1-2 minutes like other services such as virtual machine scaleset, etc...
Actual behavior
With endpoint, scaling takes a long time, about 12-18 minutes.
Addition information
Do you have suggestions for speeding up the scaling time?