Effect of Inflight batching or Continuous batching on the Results? #8

@ekagra-ranjan

Hi authors of MagicDec,

I found your paper very insightful and thought-provoking!

I have a few questions and was wondering if you could share your thoughts:

  1. Was S inflection calculated in a benchmark setup with inflight batching (IFB) / continuous batching, or without it?
  2. With IFB, each step can be a mix of prefill and decode work. Would the S inflection results change in that case?
  3. Are there any pointers to the benchmark code, or a description of how batching was done?

Looking forward to hearing from you!
