
Optimize inference speed ⚡️ #8

@maxbbraun


Experimenting with compiler options in the `fast-opts` branch.

Switching from `-Os` to `-O3` seems to have a significant impact on tokens per second. (`-Ofast` doesn't noticeably add anything on top.)

```diff
->>> Averaged 2.60 tokens/s
+>>> Averaged 3.79 tokens/s
```
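
For context, the flag change itself is small. A minimal sketch, assuming a CMake-based build; the variables below are standard CMake, but they're illustrative and not the actual build files from the `fast-opts` branch:

```cmake
# Illustrative only: raise the optimization level from size (-Os) to speed (-O3).
set(CMAKE_C_FLAGS_RELEASE   "-O3")  # was "-Os"
set(CMAKE_CXX_FLAGS_RELEASE "-O3")  # was "-Os"

# -Ofast would additionally enable -ffast-math, which relaxes IEEE 754
# semantics; per the numbers above, it added no measurable speedup here.
```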

Unfortunately, something about this change seems to break the camera input or TPU inference, and I haven't debugged it yet.
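
One way to narrow down the breakage could be per-target optimization levels, so that only the token-generation code gets `-O3` while the camera and TPU paths stay at `-Os`. A sketch, again assuming CMake; the target names are hypothetical placeholders, not the project's real ones:

```cmake
# Bisect the regression with per-target flags (target names are hypothetical).
target_compile_options(llm_inference   PRIVATE -O3)  # hot path: keep the speedup
target_compile_options(camera_pipeline PRIVATE -Os)  # suspect path: revert to -Os
target_compile_options(tpu_runtime     PRIVATE -Os)  # suspect path: revert to -Os
```

If the camera and TPU code work again under this split, the problem (a miscompilation, or latent undefined behavior exposed by `-O3`) is confined to those translation units and can then be toggled file by file.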
