Open
Labels
enhancement, high priority, module: android, module: benchmark, triage review, triaged
Description
🐛 Describe the bug
Issue
The metrics frequently vary by more than 10% from run to run, which makes them hard to interpret and makes it hard to decide on actions.
It's happening to pretty much ALL models and ALL configs:



link to the dashboard
What areas should we look into?
- iOS benchmark app
- Android benchmark app
Solution Space
I think we can start by parameterizing the number of iterations for each model and its benchmark config, then find the "right" value so that the run-to-run fluctuation of the metrics (load, inference, tps) stays within a reasonable range (<10%).
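To make the idea concrete, here is a rough sketch of how the "right" iteration count could be picked once we have repeated runs at several candidate values. Everything in it is an assumption for illustration: the function names, the metric keys (`load_ms`, `inference_ms`, `tps`), the choice of (max - min) / mean as the fluctuation measure, and the sample numbers, which are made up and not taken from the dashboard.

```python
from statistics import mean

THRESHOLD = 0.10  # the <10% run-to-run target proposed above


def relative_spread(values):
    """Return (max - min) / mean for one metric across repeated runs."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; relative spread is undefined")
    return (max(values) - min(values)) / avg


def smallest_stable_iterations(runs_by_iterations, threshold=THRESHOLD):
    """Return the smallest iteration count for which every metric's
    run-to-run spread is below the threshold, or None if none qualifies.

    runs_by_iterations maps an iteration count to a dict of
    metric name -> list of values, one value per repeated run.
    """
    for iterations in sorted(runs_by_iterations):
        metrics = runs_by_iterations[iterations]
        if all(relative_spread(v) < threshold for v in metrics.values()):
            return iterations
    return None


# Hypothetical data: three repeated runs per candidate iteration count.
example = {
    5: {
        "load_ms": [100.0, 130.0, 110.0],
        "inference_ms": [50.0, 62.0, 55.0],
        "tps": [20.0, 24.0, 21.0],
    },
    20: {
        "load_ms": [100.0, 108.0, 112.0],
        "inference_ms": [50.0, 54.0, 52.0],
        "tps": [20.0, 21.5, 20.5],
    },
    50: {
        "load_ms": [100.0, 103.0, 105.0],
        "inference_ms": [50.0, 52.0, 51.0],
        "tps": [20.0, 21.0, 20.5],
    },
}

if __name__ == "__main__":
    print(smallest_stable_iterations(example))
```

With this made-up data, 5 and 20 iterations both fail because the `load_ms` spread is above 10%, so 50 is selected. A different fluctuation measure (for example the coefficient of variation) could be swapped into `relative_spread` without changing the rest.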
Versions
trunk
cc @mergennachin @kimishpatel @iseeyuan @kirklandsign @cbilgin @huydhn @shoumikhin