Description
Currently, pre- and post-processing have to be done on the client side. I could not find a custom Python backend or ensemble scheduling for running pre- and post-processing on the server side, as Triton Inference Server provides. Ensemble scheduling also makes it possible to build complex pipelines that chain multiple models at inference time (primary, secondary, or parallel inference), which is essential for complex models. When I build a pipeline with GStreamer or any other video decoding system, I have to add custom plugins for inference, so everything becomes more complicated on the client side and I cannot create complex model pipelines dynamically. If you could support a Python backend and ensemble scheduling, we could define complex model pipelines in a model configuration file.
https://github.com/triton-inference-server/python_backend
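For reference, this is roughly how Triton expresses such a server-side pipeline: a sketch of an ensemble `config.pbtxt` that routes a raw input through a preprocessing model and then a classifier, with no client-side work between the steps. The model names, tensor names, and shapes below are hypothetical placeholders, not part of any existing deployment:

```
name: "ensemble_pipeline"
platform: "ensemble"
input [
  { name: "RAW_INPUT", data_type: TYPE_UINT8, dims: [ -1 ] }
]
output [
  { name: "SCORES", data_type: TYPE_FP32, dims: [ 1000 ] }
]
ensemble_scheduling {
  step [
    {
      # Hypothetical Python-backend model doing server-side preprocessing
      model_name: "preprocess"
      model_version: -1
      input_map  { key: "INPUT"  value: "RAW_INPUT" }
      output_map { key: "OUTPUT" value: "preprocessed" }
    },
    {
      # Hypothetical primary inference model consuming the intermediate tensor
      model_name: "classifier"
      model_version: -1
      input_map  { key: "INPUT0"  value: "preprocessed" }
      output_map { key: "OUTPUT0" value: "SCORES" }
    }
  ]
}
```

With a configuration like this, the scheduler passes intermediate tensors between steps on the server, so the client only sends the raw input and receives the final output.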
The best way to support dynamically created ensemble models would be to provide these features the way NVIDIA does.