Tolerances for CI Workflow - Grind & Exec Times (#750) #876
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #876      +/-   ##
==========================================
+ Coverage   42.95%   45.64%   +2.69%
==========================================
  Files          69       68       -1
  Lines       19504    18646     -858
  Branches     2366     2249     -117
==========================================
+ Hits         8377     8511     +134
+ Misses       9704     8775     -929
+ Partials     1423     1360      -63
|
|
I picked the 0.98 tolerance for this. I looked through some old CI runs, and the grind times reported were basically all 0.99 or 1.00, so something worse than 0.98 seemed like it was worth looking at. @sbryngelson can comment on whether he thinks this is a reasonable tolerance or not. |
|
@mohdsaid497566 @wilfonba I really like the idea of this, but I would much rather have continuous benchmarking, which is more robust/repeatable, and insightful (fixing this PR is just a bit of a convenience for me). |
|
Sample benchmark output. It needs a minor fix.
Comparing Benchmarks: Speedups from ../master/bench-cpu.yaml to bench-cpu.yaml are displayed below. Thus, numbers > 1 represent increases in performance.
Warning: Exec time speedup for pre_process is less than 0.9 - Case: ibm
Case Pre Process Simulation Post Process
──────────────────────────────────────────────────────────────────────────────
5eq_rk3_weno3_hllc Exec: Exec: 1.08 Exec: Exec: 1.00 Exec: Exec: 1.15
& Grind: N/A &
Grind: 1.00
ibm Exec: Exec: 0.84 Exec: Exec: 1.08 Exec: Exec: 1.08
& Grind: N/A &
Grind: 1.09
viscous_weno5_sgb… Exec: Exec: 1.04 Exec: Exec: 1.00 Exec: Exec: 1.01
& Grind: N/A &
Grind: 1.01
hypo_hll Exec: Exec: 1.02 Exec: Exec: 0.99 Exec: Exec: 0.99
& Grind: N/A &
Grind: 1.02
mfc: (venv) Exiting the Python virtual environment. |
|
We only really care about simulation grind time and exec. time; the pre/post process times aren't reliable anyway. We could probably even just get rid of them. |
|
Can you make the PR source code slow so we can see it fail benchmarking? |
toolchain/mfc/bench.py (outdated)
  ["--output-summary", summary_filepath] +
  case.args +
- ["--", "--gbpp", ARG('mem')],
+ ["--", "--gbpp", 0.5],
I removed case optimization and capped gbpp at 0.5.
This should slow all cases down drastically. I am not sure, but it might exceed the allocated time. If that happens, I will hop onto Phoenix and work out a suitable time limit for the bench job until I reach the grind-time failure mode.
Currently testing the failure mode on Delta; I will post the outcomes here.
I guess I will cap just the memory per process and keep case optimization, so that it runs slower but does not take forever.
I overthought this. I can just induce an artificial failure by modifying the grind times in the PR/master benchmark YAML files locally and then running bench_diff.
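A minimal sketch of that idea, for illustration only. It works on in-memory dictionaries rather than the real YAML files, and the layout (`summary[target]["grind"]`) is an assumption modelled on the lookups quoted elsewhere in this PR; the helper `slow_down` and the numbers are hypothetical.

```python
import copy

# Hypothetical summary layout; the real benchmark YAML may nest these differently.
master_summary = {"simulation": {"exec": 10.0, "grind": 2.0}}


def slow_down(summary, target="simulation", factor=1.10):
    """Return a copy of `summary` whose grind time for `target` is inflated by `factor`."""
    tampered = copy.deepcopy(summary)
    tampered[target]["grind"] *= factor
    return tampered


# Pretend the PR run is 10% slower than master.
pr_summary = slow_down(master_summary)

# Speedup is master / PR, so a value below 1 means the PR is slower.
grind_speedup = master_summary["simulation"]["grind"] / pr_summary["simulation"]["grind"]
print(f"Grind: {grind_speedup:.2f}")
```

Inflating the PR grind time by 10% pushes the speedup well below either tolerance discussed in this thread, which should be enough to trigger the failure path without running a slow job on a cluster.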
Luckily, it worked out.
|
It failed, as intended. To replicate the artificial failure, download the benchmark YAML artifacts from any PR's results, then tweak the numbers to create unreasonable discrepancies. Finally, run |
|
very good thanks! |
|
It looks like it failed the CPU benchmark because the exec. time was below 0.9 on pre_process? We should only run the test for the difference in speed on simulation. |
|
@sbryngelson No, the grind time is actually below the threshold: 0.97. Ours is set to >= 0.98. Exec time is meant to throw only warnings; it can't terminate the job. |
toolchain/mfc/bench.py (outdated)
  grind_time_value = lhs_summary[target.name]["grind"] / rhs_summary[target.name]["grind"]
  speedups[i] += f" & Grind: {grind_time_value:.2f}"
  if grind_time_value <0.98:
This is the threshold for grind time.
Got it. Well, contrary to @wilfonba's suggestion, I think 0.98 is too strict. I would use 0.95 for now; we can adjust later.
It is kind of stringent, I agree. I will change it now.
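The policy discussed in this thread (fail on a simulation grind-time regression, only warn on exec time) could be sketched roughly as follows. This is not the code in toolchain/mfc/bench.py: `BenchmarkRegression` is a hypothetical stand-in for the project's exception type, `check_simulation` and the dictionary layout are assumptions, and the two tolerances are simply the values mentioned in the conversation.

```python
class BenchmarkRegression(Exception):
    """Hypothetical stand-in for the exception type raised by the real toolchain."""


GRIND_TOLERANCE = 0.95  # value proposed in the review; assumed to be adjustable later
EXEC_TOLERANCE = 0.90   # warning level that appears in the sample output


def check_simulation(master, pr, case):
    """Compare the simulation target only: warn on exec time, raise on grind time."""
    warnings = []

    # Speedups are master / PR, so values below 1 mean the PR is slower.
    exec_speedup = master["simulation"]["exec"] / pr["simulation"]["exec"]
    if exec_speedup < EXEC_TOLERANCE:
        warnings.append(
            f"Warning: Exec time speedup for simulation is less than "
            f"{EXEC_TOLERANCE} - Case: {case}"
        )

    grind_speedup = master["simulation"]["grind"] / pr["simulation"]["grind"]
    if grind_speedup < GRIND_TOLERANCE:
        raise BenchmarkRegression(
            f"Grind time speedup {grind_speedup:.2f} is below "
            f"{GRIND_TOLERANCE} - Case: {case}"
        )

    return exec_speedup, grind_speedup, warnings
```

Keeping the tolerances as named constants makes a later adjustment a one-line change, and collecting warnings separately from the exception keeps the exec-time check from ever terminating the job.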
|
Cool, it will probably pass and I will merge. Thanks! |
…wCode#876) Co-authored-by: mohdsaid497566 <[email protected]> Co-authored-by: mohdsaid497566 <[email protected]>

Description
Enhancement for the CI workflow. Raises MFCException if the grind-time speedup falls below an acceptable threshold. For exec time, it prints an error without failing the job.
Resolves/closes #750