Hi all,
I noticed in the scaling doc page (https://github.com/kserve/modelmesh-serving/blob/main/docs/production-use/scaling.md) that it is now possible to configure ServingRuntime autoscaling with HPA, but using metrics based on CPU utilization.
Is it possible to scale the ServingRuntime using GPU metrics?
Thanks in advance