Limit the memory usage for pretrained models inside the container programmatically #9521
How to reproduce the behaviour: This is related to my previous issue #8554. Is there any way to limit memory usage programmatically, similar to limiting the CPUs being used (as in #8554)?
Replies: 1 comment 1 reply
We don't have any feature for this, no. You might want to limit the longest input you'll handle, limit the number of simultaneous requests, or use a worker queue.
As far as I'm aware, the models don't check the total available memory; like most programs, they just use what's available as needed. So it's not making a mistake about the amount of memory available, and there's no bug or anything that we can fix. It's just that you're using more memory than is available.
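The mitigations above can be sketched in application code. This is a minimal, hypothetical example using only the standard library: `MAX_CHARS`, `MAX_WORKERS`, and `process` are illustrative names, not a spaCy API, and `process` is a stand-in for calling your `nlp` pipeline.

```python
# Sketch of the suggested mitigations: truncate overly long inputs and
# bound the number of simultaneous requests with a small worker pool.
# MAX_CHARS / MAX_WORKERS values are illustrative; tune them for your container.
from concurrent.futures import ThreadPoolExecutor

MAX_CHARS = 100_000   # cap on input length before it reaches the model
MAX_WORKERS = 2       # at most 2 documents processed concurrently

def process(text: str) -> int:
    """Stand-in for nlp(text); here we just count whitespace tokens."""
    return len(text.split())

def handle_request(text: str) -> int:
    # Truncating before processing bounds per-document peak memory.
    return process(text[:MAX_CHARS])

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    results = list(pool.map(handle_request, ["a b c", "d e " * 100_000]))

print(results)
```

The second input is 400,000 characters long but gets truncated to `MAX_CHARS` before processing, and the pool ensures no more than `MAX_WORKERS` documents are in memory at once.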