Inference speed details #139

@angermanc

Hi all!
I tried to replicate the inference times described in this simple plot. After multiple tries, I was able to reach the same performance (approx. 10 seconds for 20 + 10 steps) using the following config (a minimal timing sketch follows the list):

  • A100 GPU
  • small-big version (small stage B model, large stage C model)
  • compile = True
  • bs = 4
  • bfloat16
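
For reference, this is roughly the harness I use for timing. It is a minimal sketch, not your official benchmark: it assumes the diffusers Stable Cascade pipelines, guesses the model IDs (`stabilityai/stable-cascade-prior`, `stabilityai/stable-cascade`), interprets `compile = True` as `torch.compile` on the two denoisers, and loads the default stage B/C variants rather than the small-big combination:

```python
# Minimal timing sketch (assumptions: diffusers Stable Cascade pipelines,
# guessed model IDs, compile = True taken to mean torch.compile on the denoisers).
import time

import torch
from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

device = "cuda"  # A100
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
).to(device)
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.bfloat16
).to(device)

# compile = True: compile the stage C (prior) and stage B (decoder) denoisers
prior.prior = torch.compile(prior.prior)
decoder.decoder = torch.compile(decoder.decoder)

prompt = "a photo of a cat"

def run():
    # 20 prior (stage C) steps + 10 decoder (stage B) steps, bs = 4
    emb = prior(
        prompt=prompt, num_inference_steps=20, num_images_per_prompt=4
    ).image_embeddings
    return decoder(
        image_embeddings=emb, prompt=prompt,
        num_inference_steps=10, guidance_scale=0.0,
    ).images

run()  # warm-up run so torch.compile time is excluded from the measurement
torch.cuda.synchronize()
t0 = time.perf_counter()
run()
torch.cuda.synchronize()
print(f"{time.perf_counter() - t0:.1f} s for 20 + 10 steps at bs=4")
```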

However, when I applied the same config to the SDXL model, I measured an average inference speed of 16.1 seconds, which is much faster than stated in the histogram. Can you give some details on how you compared the inference speeds for your implementation?
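
For the SDXL number, I timed it with the same harness (continuing the sketch above); the model ID and the 50-step count are my assumptions, since the plot does not state the SDXL settings:

```python
# Same timing harness applied to SDXL (model ID and step count are assumptions).
from diffusers import StableDiffusionXLPipeline

sdxl = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
).to(device)
sdxl.unet = torch.compile(sdxl.unet)  # same compile = True treatment

def run_sdxl():
    # 50 steps is the diffusers default for SDXL; the plot's setting is unclear
    return sdxl(
        prompt=prompt, num_inference_steps=50, num_images_per_prompt=4
    ).images

run_sdxl()  # warm-up (compilation)
torch.cuda.synchronize()
t0 = time.perf_counter()
run_sdxl()
torch.cuda.synchronize()
print(f"SDXL: {time.perf_counter() - t0:.1f} s at bs=4")
```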
