Effect of in-flight batching or continuous batching on the results? #8

Hi authors of MagicDec,

I found your paper very insightful and thought-provoking!

I have a few questions and was wondering if you could share your thoughts:
1. Was S inflection calculated in a benchmark setup that used in-flight batching (IFB), also known as continuous batching, or in one without it?
2. With IFB, each step can be a mix of prefill and decode work, so would the S inflection results change?
3. Could you share any pointers to the benchmark code, or a description of how batching was done?


Looking forward to hearing from you!
