Check the model status before traffic gets redirected to the pods #66

@bdattoma

/kind feature

Describe the solution you'd like
As of now, while the model is being loaded into memory, user queries may still reach a pod that is not yet ready to answer (since the model is not loaded). The larger the model, the longer it takes to load, and the longer this "downtime" lasts.
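
To illustrate the kind of check being requested, here is a minimal sketch of a readiness endpoint that reports "not ready" until the model has finished loading. It is not taken from this project: `ModelState`, `readiness_status_code`, `start_server`, and the `/ready` path are hypothetical names chosen for the example, and the model load is simulated with a sleep. The assumption is that something in front of the pod (for example a Kubernetes readiness probe) polls this endpoint and only routes traffic once it returns 200.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class ModelState:
    """Tracks whether the (hypothetical) model has finished loading."""

    def __init__(self):
        self._ready = threading.Event()

    def load(self, load_seconds=0.0):
        # Stand-in for the real, potentially slow, model load.
        time.sleep(load_seconds)
        self._ready.set()

    def is_ready(self):
        return self._ready.is_set()


def readiness_status_code(state):
    # 200 once the model is loaded; 503 signals "do not send traffic yet".
    return 200 if state.is_ready() else 503


def make_handler(state):
    class ReadinessHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/ready":  # hypothetical probe path
                self.send_response(404)
                self.end_headers()
                return
            self.send_response(readiness_status_code(state))
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ReadinessHandler


def start_server(state, port=0):
    # port=0 lets the OS pick a free port; a real pod would use a fixed one.
    server = HTTPServer(("127.0.0.1", port), make_handler(state))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In this sketch the server starts answering probes immediately, while `ModelState.load` would run in a separate thread; the pod would only be marked ready after the load completes, which is the behaviour the issue asks for.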

Anything else you would like to add:
This could be useful for tasks such as model/runtime canary rollouts, replica auto-scaling, etc.

Metadata

Assignees: No one assigned
Status: To-do/Groomed
Milestone: No milestone