It might be interesting to have an option to initialize classifiers lazily and to free their resources if they go unused for a while:
- load and initialize the models only when requested
- keep the classifiers active at their endpoints for some time after their last use, then discard them
Not sure if this is something that should/could be implemented as part of this framework or if it would rather be something to handle at the level of the production server (e.g. gunicorn).
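
To make the idea a bit more concrete, here is a rough sketch of what the two points above could look like at the framework level. Everything in it is hypothetical: `LazyResource`, `loader`, `idle_seconds` and `evict_if_idle` are names made up for illustration and are not part of this framework's API. The `loader` callable stands in for whatever currently builds a classifier.

```python
import threading
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyResource(Generic[T]):
    """Hypothetical wrapper: load on first use, drop after an idle period."""

    def __init__(
        self,
        loader: Callable[[], T],
        idle_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader              # builds the classifier/model
        self._idle_seconds = idle_seconds  # how long to keep it after last use
        self._clock = clock                # injectable so it can be tested
        self._lock = threading.Lock()
        self._value: Optional[T] = None
        self._loaded = False
        self._last_used = 0.0

    @property
    def loaded(self) -> bool:
        return self._loaded

    def get(self) -> T:
        """Return the resource, loading it first if necessary."""
        with self._lock:
            if not self._loaded:
                self._value = self._loader()
                self._loaded = True
            self._last_used = self._clock()
            return self._value  # type: ignore[return-value]

    def evict_if_idle(self) -> bool:
        """Drop the resource if it has been idle long enough.

        Returns True if the resource was discarded by this call.
        """
        with self._lock:
            if not self._loaded:
                return False
            if self._clock() - self._last_used < self._idle_seconds:
                return False
            self._value = None
            self._loaded = False
            return True
```

An endpoint would call `get()` on each request, and something periodic (a background thread or timer, for example) would call `evict_if_idle()`. One caveat with a pre-fork server that runs several worker processes: each worker would presumably hold its own copy of the resource, so this only controls memory per worker, which may be an argument for handling it at the server level instead.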