Support for caching #1142
Unanswered
fernandocamargoai asked this question in Ideas
-
Hello, guys.
First, I'd like to congratulate you on the 0.9.0 release. I really liked the discard() mechanism, and I'm now able to validate inputs with Pydantic easily.
I'd like to propose a new feature building on this new API: the ability to return a value early. Right now, I have a cache implemented with Redis. But since there is no way to return a cached value early, when I'm processing a micro-batch and some of its requests have cached responses, I can skip recomputing those requests, but their responses still can't be sent until the whole micro-batch has been processed.
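To make this concrete, here is a rough sketch of my current workaround (only a sketch: the handler shape, the key scheme, and run_model() are illustrative stand-ins, not my exact code; redis-py is the only real dependency):

```python
import hashlib
import json

import redis

cache = redis.Redis(host="localhost", port=6379)


def _cache_key(payload: dict) -> str:
    # Derive a deterministic key from the request payload (assumes payloads
    # are JSON-serializable and that identical payloads may share a result).
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return "pred:" + digest


def run_model(batch: list) -> list:
    # Stand-in for the real model inference over a list of inputs.
    return [{"echo": p} for p in batch]


def predict_batch(payloads: list) -> list:
    keys = [_cache_key(p) for p in payloads]
    cached = cache.mget(keys)  # one round trip for all lookups

    # Run the model only on the cache misses.
    misses = [(i, p) for i, (p, c) in enumerate(zip(payloads, cached)) if c is None]
    if misses:
        fresh = run_model([p for _, p in misses])
        for (i, _), result in zip(misses, fresh):
            cached[i] = json.dumps(result).encode()
            cache.set(keys[i], cached[i])

    # Even the cache hits are only returned here, after run_model() has
    # finished for the entire micro-batch -- which is exactly the problem.
    return [json.loads(c) for c in cached]
```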
So, my idea is that, similar to the current task.discard(), we would have a task.return() that takes a value. That value would be sent back to the client immediately, instead of waiting for the whole micro-batch to be processed.
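Here is a hypothetical sketch of how that could look in a batch handler, reusing the helpers from the sketch above. Since return is a reserved word in Python, I use task.complete() as a stand-in name for the proposed method, and I assume each task exposes its parsed input as task.data, mirroring the existing discard() pattern (none of this is a real API yet):

```python
def predict(tasks: list) -> list:
    pending = []
    for task in tasks:
        hit = cache.get(_cache_key(task.data))
        if hit is not None:
            # Proposed behavior: resolve this task right away, so the
            # framework sends its HTTP response immediately instead of
            # holding it until the whole micro-batch is done.
            task.complete(json.loads(hit))
        else:
            pending.append(task)

    # Only the cache misses go through the model.
    results = run_model([t.data for t in pending])
    for task, result in zip(pending, results):
        cache.set(_cache_key(task.data), json.dumps(result).encode())
    return results
```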
What do you guys think?
Replies: 1 comment
-
Hi @fernandocamargoti - I think that is a great idea! It is a bit challenging to implement with the current architecture, since every batch's input and output are transferred between the frontend batching layer and the model backend in one HTTP request. But @bojiang and I have been discussing moving this to a …