Yes, this is expected. Several of these GitHub discussions mention that you can fit into less RAM by leaving beam_size and best_of unset, so that the greedy decoder is used instead of the beam search decoder. Using FP16 also takes less memory than FP32. I posted some GDDR measurements for small beam search values in #391, where you can see that beam_size=7 has a slightly larger memory requirement than the others, but not by much.
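For anyone who wants to try this from Python, here is a rough sketch of what I mean. It assumes the openai-whisper package and its `model.transcribe(...)` keyword arguments `beam_size`, `best_of` and `fp16`. The helper names (`low_memory_decode_options`, `weight_bytes`, `transcribe_low_memory`) and the `"small"` model choice are made up for illustration and are not part of the library. The byte estimate covers model weights only, not activations or decoder caches, so treat it as a lower bound rather than a measurement.

```python
from typing import Any, Dict


def low_memory_decode_options(on_gpu: bool = True) -> Dict[str, Any]:
    """Hypothetical helper: decoding options that select the greedy decoder."""
    return {
        "beam_size": None,  # unset, so no beam search is run
        "best_of": None,    # unset, so no extra sampled candidates are kept
        "fp16": on_gpu,     # half precision; Whisper falls back to FP32 on CPU
    }


def weight_bytes(n_params: int, fp16: bool) -> int:
    """Rough size of the model weights alone: 2 bytes per parameter in FP16, 4 in FP32."""
    return n_params * (2 if fp16 else 4)


def transcribe_low_memory(audio_path: str, model_name: str = "small") -> str:
    """Transcribe one file with the low-memory options (needs openai-whisper installed)."""
    import whisper  # imported here so the helpers above work without the package

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **low_memory_decode_options())
    return result["text"]
```

Calling `transcribe_low_memory("audio.wav")` (the file name is a placeholder) should then run with greedy decoding. If you use the command-line tool instead, check its defaults for beam size and best-of, since they may differ from the Python function's defaults.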

Answer selected by jongwook