-
Notifications
You must be signed in to change notification settings - Fork 125
benchmarks hardening! #1078
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
benchmarks hardening! #1078
Conversation
|
CodeAnt AI is reviewing your PR. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughCI workflows shorten timeouts and make job names matrix-driven; bench workflow now submits SLURM jobs and monitors them via a new script; bench tooling adds per-case validation and failure aggregation; tests gain per-test timeouts, global abort signaling, and failure-rate–based early abort. Changes
Sequence Diagram(s)sequenceDiagram
participant CI as CI Runner
participant WF as submit_and_monitor (workflow)
participant Monitor as monitor_slurm_job.sh
participant SLURM as SLURM (sbatch / squeue / sacct / scontrol)
participant FS as Filesystem (output file)
Note over CI,WF: Parallel submit_and_monitor for PR and Master
CI->>WF: invoke submit_and_monitor(branch)
WF->>SLURM: sbatch bench job
SLURM-->>WF: job_id
WF->>Monitor: run monitor_slurm_job.sh(job_id, output_file)
Monitor->>FS: wait for output_file (retries/backoff)
Monitor->>FS: tail -f output_file
loop while job running
Monitor->>SLURM: poll squeue
SLURM-->>Monitor: state info
FS-->>Monitor: appended logs
end
Monitor->>SLURM: scontrol / sacct for ExitCode (fallback)
alt ExitCode == 0:0
Monitor-->>WF: success (0)
else
Monitor-->>WF: failure (non-zero)
end
WF-->>CI: aggregate statuses and set workflow exit
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces "benchmarking hardening" features to improve the robustness and reliability of MFC's testing and benchmarking infrastructure. The changes add timeout mechanisms, early failure detection, enhanced error handling, and better job monitoring for SLURM-based workflows.
Key changes:
- Added per-test timeout (1 hour) and early abort on high failure rates (30% after 20 tests) to the test suite
- Enhanced benchmark validation with comprehensive error checking and failed case tracking
- Improved GitHub Actions workflows with better job monitoring and reduced timeout (from 1400 to 480 minutes)
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| toolchain/mfc/test/test.py | Added signal-based timeout handler (1 hour per test) and early abort logic when failure rate exceeds 30% after 20 tests |
| toolchain/mfc/bench.py | Added comprehensive validation (file existence, return codes, summary validation) and improved error reporting with failed case tracking |
| .github/workflows/test.yml | Updated job name formatting to include interface and reduced timeout from 1400 to 480 minutes |
| .github/workflows/bench.yml | Reduced timeout to 480 minutes and added detailed SLURM job monitoring with output streaming and exit code validation |
|
CodeAnt AI finished reviewing your PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (2)
.github/workflows/bench.yml (2)
105-164: Consider extracting submit_and_monitor to a standalone script.This ~60-line function inline in YAML is hard to test and maintain. Extracting it to
.github/scripts/monitor_slurm_job.shwould improve reusability and testability. This was also noted in a previous review.
146-149: Transient squeue failures could cause premature exit.If
squeuefails transiently (network blip, scheduler hiccup), the loop exits immediately even though the job may still be running. Consider adding a brief grace period or retry.# Wait for job to complete (will wait up to GitHub Actions timeout) - while squeue -j "$job_id" &>/dev/null; do + squeue_failures=0 + while true; do + if squeue -j "$job_id" &>/dev/null; then + squeue_failures=0 + else + squeue_failures=$((squeue_failures + 1)) + # Allow a few transient failures before concluding job is done + if [ $squeue_failures -ge 3 ]; then + break + fi + fi sleep 5 done
🧹 Nitpick comments (2)
toolchain/mfc/test/test.py (1)
337-339: Remove extraneousfprefix from strings without placeholders.- cons.print(f"[bold red]This suggests a systemic issue (bad build, broken environment, etc.)[/bold red]") - cons.print(f"[bold red]Aborting remaining tests to fail fast.[/bold red]\n") + cons.print("[bold red]This suggests a systemic issue (bad build, broken environment, etc.)[/bold red]") + cons.print("[bold red]Aborting remaining tests to fail fast.[/bold red]\n")toolchain/mfc/bench.py (1)
112-128:cons.unindent()may be skipped if validation fails without explicit continue.If an exception is raised inside the try block (line 124-126),
cons.unindent()at line 128 is still called. However, if validation_failed is True (line 112), you callcons.unindent()at 114 thencontinue, skipping line 128—which is correct. The logic is sound but fragile.Consider using a
finallyblock to guarantee unindent:try: + # ... existing try block code ... + except Exception as e: + cons.print(f"[bold red]ERROR[/bold red]: Unexpected error running {case.slug}: {e}") + failed_cases.append(case.slug) + finally: cons.unindent() - - cons.unindent()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.github/workflows/bench.yml(3 hunks).github/workflows/test.yml(1 hunks)toolchain/mfc/bench.py(5 hunks)toolchain/mfc/test/test.py(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Add and update focused tests for changes; ensure no regressions in golden test outputs without clear justification
Applied to files:
toolchain/mfc/bench.py
🧬 Code graph analysis (2)
toolchain/mfc/bench.py (3)
toolchain/mfc/common.py (3)
MFCException(31-32)system(34-41)file_load_yaml(65-70)toolchain/mfc/state.py (1)
ARG(85-93)toolchain/mfc/printer.py (1)
unindent(19-24)
toolchain/mfc/test/test.py (6)
toolchain/mfc/common.py (3)
MFCException(31-32)file_write(44-54)find(165-174)toolchain/mfc/test/case.py (1)
compute_tolerance(248-276)toolchain/mfc/state.py (2)
ARG(85-93)items(33-34)toolchain/mfc/packer/pack.py (3)
has_NaNs(90-96)save(46-88)find(34-35)toolchain/mfc/packer/tol.py (1)
compare(50-97)toolchain/mfc/test/cases.py (1)
modify_example_case(1035-1052)
🪛 Ruff (0.14.7)
toolchain/mfc/bench.py
42-42: Avoid specifying long messages outside the exception class
(TRY003)
124-124: Do not catch blind exception: Exception
(BLE001)
137-137: Avoid specifying long messages outside the exception class
(TRY003)
211-211: Do not catch blind exception: Exception
(BLE001)
toolchain/mfc/test/test.py
36-36: Unused function argument: signum
(ARG001)
36-36: Unused function argument: frame
(ARG001)
37-37: Avoid specifying long messages outside the exception class
(TRY003)
224-224: Avoid specifying long messages outside the exception class
(TRY003)
228-228: Avoid specifying long messages outside the exception class
(TRY003)
231-231: Avoid specifying long messages outside the exception class
(TRY003)
239-239: Avoid specifying long messages outside the exception class
(TRY003)
256-256: Avoid specifying long messages outside the exception class
(TRY003)
270-270: Avoid specifying long messages outside the exception class
(TRY003)
277-277: Avoid specifying long messages outside the exception class
(TRY003)
280-280: Avoid specifying long messages outside the exception class
(TRY003)
283-283: Avoid specifying long messages outside the exception class
(TRY003)
295-299: Avoid specifying long messages outside the exception class
(TRY003)
338-338: f-string without any placeholders
Remove extraneous f prefix
(F541)
339-339: f-string without any placeholders
Remove extraneous f prefix
(F541)
340-342: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Github (macos, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Code Cleanliness Check
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: Build & Publish
🔇 Additional comments (12)
.github/workflows/test.yml (3)
96-96: Dynamic job naming is clear and informative.The conditional formatting nicely handles the interface suffix, making job names like "Georgia Tech | Phoenix (gpu-acc)" or "Oak Ridge | Frontier (cpu)" more descriptive for debugging CI failures.
100-100: Significant timeout reduction from 1400 to 480 minutes.This is a substantial reduction (from ~23 hours to 8 hours). Verify this is sufficient for the longest-running test configurations, particularly on Frontier with GPU tests.
101-115: Matrix expansion and runner configuration look correct.The new matrix properly covers CPU and GPU variants across both clusters. The
runs-onconfiguration correctly uses thephoenixgroup with cluster-specific labels..github/workflows/bench.yml (2)
26-26: Job naming and timeout updates align with test.yml.Consistent naming pattern and timeout reduction across both workflows improves maintainability.
Also applies to: 76-76
173-174: Exit status capture pattern has subtle issues.The pattern
wait $pid && var=$? || var=$?always setsvarto 0 when the process succeeds because&&short-circuits. However, since you check both values at the end, this works correctly for the failure case.toolchain/mfc/test/test.py (2)
294-301: Good use of try/except/finally for alarm cleanup.The finally block ensures the alarm is always cancelled, preventing spurious timeouts on subsequent tests. The exception chaining with
from excpreserves the original traceback.
26-37: Timeout constants and exception class are well-defined.The 1-hour timeout and 30% failure threshold are reasonable defaults. Consider making these configurable via command-line arguments for flexibility in different environments.
toolchain/mfc/bench.py (5)
40-42: Good early validation of benchmark case files.Failing fast with a clear error message is better than failing deep in the execution with a cryptic error.
64-128: Robust error handling with comprehensive validation.The try/except block properly handles:
- Non-zero return codes
- Missing summary files
- Missing target data
- Unexpected exceptions
The structure ensures
failed_casesis always updated on failure. However, there's a potential issue withcons.unindent()placement.
124-126: Broad exception catch is acceptable here for robustness.In benchmark orchestration code, catching all exceptions ensures one failing case doesn't abort the entire benchmark run. The error is logged with context, and the case is tracked as failed.
131-137: Clear failure summary with actionable output.Listing all failed cases at the end and raising with the count helps quickly identify systemic issues. Good practice for CI diagnostics.
211-213: Improved error logging indiff()replaces silent pass.This is a significant improvement—previously, speedup computation failures were silently ignored. Now they're logged with context and properly set the error flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
toolchain/mfc/test/test.py (3)
202-204: SIGALRM in worker threads is still a critical issue.This has been flagged in previous reviews but appears unaddressed. The test scheduler uses
WorkerThread(seesched.py), andsignal.signal()/signal.alarm()only work in the main thread—calling from worker threads raisesValueError: signal only works in main thread.Consider one of these approaches:
- Guard with a thread check and skip timeout in worker threads
- Use
threading.Timerwith a flag-based approach- Move timeout handling to the subprocess level with
subprocess.run(timeout=...)
264-266: Directory existence check still missing.This was flagged in previous reviews.
os.listdir()will raiseFileNotFoundErrorif thesilo_hdf5/p0directory doesn't exist (which may happen for test cases that don't produce silo outputs).+ silo_dir = os.path.join(case.get_dirpath(), 'silo_hdf5', 'p0') + if not os.path.isdir(silo_dir): + # No silo outputs for this case; skip post-processing check + pass + else: - for silo_filepath in os.listdir(os.path.join(case.get_dirpath(), 'silo_hdf5', 'p0')): + for silo_filepath in os.listdir(silo_dir):
342-344: Exception propagation from worker task may break scheduler.As noted in previous reviews, raising
MFCExceptionhere propagates throughsched.sched, potentially leaving worker threads in an undefined state. Consider using a shared abort flag that the scheduler checks between tasks for a cleaner shutdown.
🧹 Nitpick comments (1)
toolchain/mfc/test/test.py (1)
340-341: Remove unnecessary f-string prefixes.These strings have no placeholders, so the
fprefix is extraneous.- cons.print(f"[bold red]This suggests a systemic issue (bad build, broken environment, etc.)[/bold red]") - cons.print(f"[bold red]Aborting remaining tests to fail fast.[/bold red]\n") + cons.print("[bold red]This suggests a systemic issue (bad build, broken environment, etc.)[/bold red]") + cons.print("[bold red]Aborting remaining tests to fail fast.[/bold red]\n")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
toolchain/mfc/test/test.py(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
toolchain/mfc/test/test.py (2)
toolchain/mfc/common.py (1)
MFCException(31-32)toolchain/mfc/state.py (1)
ARG(85-93)
🪛 Ruff (0.14.7)
toolchain/mfc/test/test.py
36-36: Unused function argument: signum
(ARG001)
36-36: Unused function argument: frame
(ARG001)
37-37: Avoid specifying long messages outside the exception class
(TRY003)
224-224: Avoid specifying long messages outside the exception class
(TRY003)
228-228: Avoid specifying long messages outside the exception class
(TRY003)
231-231: Avoid specifying long messages outside the exception class
(TRY003)
239-239: Avoid specifying long messages outside the exception class
(TRY003)
256-256: Avoid specifying long messages outside the exception class
(TRY003)
270-270: Avoid specifying long messages outside the exception class
(TRY003)
277-277: Avoid specifying long messages outside the exception class
(TRY003)
280-280: Avoid specifying long messages outside the exception class
(TRY003)
283-283: Avoid specifying long messages outside the exception class
(TRY003)
295-299: Avoid specifying long messages outside the exception class
(TRY003)
340-340: f-string without any placeholders
Remove extraneous f prefix
(F541)
341-341: f-string without any placeholders
Remove extraneous f prefix
(F541)
342-344: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Code Cleanliness Check
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Github (macos, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Build & Publish
🔇 Additional comments (2)
toolchain/mfc/test/test.py (2)
26-38: LGTM on timeout and abort constants.The timeout infrastructure is well-structured. The
TestTimeoutErrorproperly extendsMFCException, and the constants are reasonable (1-hour timeout, 30% failure threshold after 20 tests).Note: The unused
signumandframeparameters intimeout_handlerare required by the signal handler signature, so the Ruff warnings can be safely ignored.
294-301: Good timeout exception handling with cleanup.The
finallyblock ensures the alarm is always cancelled, preventing lingering signals. The re-raised exception provides helpful context including the log file path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 4 files
Prompt for AI agents (all 2 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="toolchain/mfc/test/test.py">
<violation number="1" location="toolchain/mfc/test/test.py:203">
P0: `signal.signal()` and `signal.alarm()` cannot be called from worker threads - they only work in the main thread. Since `_handle_case` is executed via `sched.sched()` which uses `WorkerThread`, this will raise `ValueError: signal only works in main thread`. Consider using `threading.Timer` or a timeout wrapper that works across threads instead.</violation>
</file>
<file name=".github/workflows/bench.yml">
<violation number="1" location=".github/workflows/bench.yml:159">
P2: Slurm reports exit codes as `X:Y`, but this regex only captures the first component. If a job finishes with `0:9` (success but terminated by signal), `$exit_code` becomes `0` and the failure check is bypassed, so signaled jobs are treated as success. Capture the full `X:Y` pair (or parse both fields) and ensure any non-`0:0` result fails the workflow.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 4 files (changes from recent commits).
Prompt for AI agents (all 2 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="toolchain/mfc/bench.py">
<violation number="1" location="toolchain/mfc/bench.py:127">
P2: The `finally` block will cause double `cons.unindent()` calls. Lines 81 and 89 still have explicit `cons.unindent()` before `continue`, but since `finally` always executes (even on `continue`), the unindent will be called twice in those error paths. Remove the `cons.unindent()` calls at lines 81 and 89, similar to how it was removed for the `validation_failed` case.</violation>
</file>
<file name=".github/scripts/monitor_slurm_job.sh">
<violation number="1" location=".github/scripts/monitor_slurm_job.sh:57">
P2: Exit code defaults to `0:0` (success) if `scontrol` or `grep` fails. This could mask real job failures when job information is unavailable. Consider failing explicitly if exit code cannot be determined.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
toolchain/mfc/bench.py (2)
58-83: Fix doublecons.unindent()on error paths to avoid printer stack underflowIn the non-zero return-code and missing-summary branches you call
cons.unindent()and thencontinue, butcons.unindent()is also executed in thefinallyblock. This will unindent twice for failing cases and can corrupt the printer’s indentation stack or raise underflow errors.Rely on the
finallyblock for unindent and drop the inner calls:- if rc != 0: - cons.print(f"[bold red]ERROR[/bold red]: Case {case.slug} failed with exit code {rc}") - cons.print(f"[bold red] Check log at: {log_filepath}[/bold red]") - failed_cases.append(case.slug) - cons.unindent() - continue + if rc != 0: + cons.print(f"[bold red]ERROR[/bold red]: Case {case.slug} failed with exit code {rc}") + cons.print(f"[bold red] Check log at: {log_filepath}[/bold red]") + failed_cases.append(case.slug) + continue @@ - if not os.path.exists(summary_filepath): - cons.print(f"[bold red]ERROR[/bold red]: Summary file not created for {case.slug}") - cons.print(f"[bold red] Expected: {summary_filepath}[/bold red]") - failed_cases.append(case.slug) - cons.unindent() - continue + if not os.path.exists(summary_filepath): + cons.print(f"[bold red]ERROR[/bold red]: Summary file not created for {case.slug}") + cons.print(f"[bold red] Expected: {summary_filepath}[/bold red]") + failed_cases.append(case.slug) + continueAlso applies to: 124-129
1-10: Movetracebackimport to the module top to satisfy lint and avoid repeated importsPylint is flagging
import tracebackinside theexceptblock (C0415 Import outside toplevel). Since this code path is not particularly hot, it’s simpler and clearer to import at the top-level and just use it in the handler.Suggested change:
-import os, sys, uuid, subprocess, dataclasses, typing, math +import os, sys, uuid, subprocess, dataclasses, typing, math, traceback @@ - except Exception as e: - import traceback + except Exception as e: cons.print( f"[bold red]ERROR[/bold red]: Failed to compute speedup for {target.name} in {slug}: {e}\n" f"{traceback.format_exc()}" )Also applies to: 211-217
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.github/scripts/monitor_slurm_job.sh(1 hunks).github/workflows/bench.yml(3 hunks).github/workflows/test.yml(1 hunks)toolchain/mfc/bench.py(5 hunks)toolchain/mfc/test/test.py(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
.github/workflows/bench.yml (1)
.github/workflows/phoenix/submit-bench.sh (1)
usage(5-7)
toolchain/mfc/test/test.py (2)
toolchain/mfc/common.py (4)
MFCException(31-32)file_write(44-54)find(165-174)does_command_exist(127-138)toolchain/mfc/state.py (1)
ARG(85-93)
toolchain/mfc/bench.py (2)
toolchain/mfc/common.py (3)
MFCException(31-32)system(34-41)file_load_yaml(65-70)toolchain/mfc/printer.py (1)
unindent(19-24)
🪛 actionlint (1.7.9)
.github/workflows/test.yml
105-105: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
(runner-label)
109-109: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
(runner-label)
113-113: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
(runner-label)
118-118: label "frontier" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
(runner-label)
122-122: label "frontier" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
(runner-label)
126-126: label "frontier" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file
(runner-label)
🪛 GitHub Actions: Lint Toolchain
toolchain/mfc/test/test.py
[error] 230-230: pylint: R1702 Too many nested blocks (6/5).
toolchain/mfc/bench.py
[error] 212-212: pylint: C0415 Import outside toplevel ('traceback').
🪛 Ruff (0.14.7)
toolchain/mfc/test/test.py
184-186: Avoid specifying long messages outside the exception class
(TRY003)
233-233: Abstract raise to an inner function
(TRY301)
233-233: Avoid specifying long messages outside the exception class
(TRY003)
238-238: Abstract raise to an inner function
(TRY301)
238-238: Avoid specifying long messages outside the exception class
(TRY003)
246-246: Avoid specifying long messages outside the exception class
(TRY003)
250-250: Avoid specifying long messages outside the exception class
(TRY003)
253-253: Avoid specifying long messages outside the exception class
(TRY003)
261-261: Avoid specifying long messages outside the exception class
(TRY003)
278-278: Avoid specifying long messages outside the exception class
(TRY003)
294-294: Avoid specifying long messages outside the exception class
(TRY003)
301-301: Avoid specifying long messages outside the exception class
(TRY003)
304-304: Avoid specifying long messages outside the exception class
(TRY003)
307-307: Avoid specifying long messages outside the exception class
(TRY003)
319-323: Avoid specifying long messages outside the exception class
(TRY003)
toolchain/mfc/bench.py
42-42: Avoid specifying long messages outside the exception class
(TRY003)
124-124: Do not catch blind exception: Exception
(BLE001)
137-137: Avoid specifying long messages outside the exception class
(TRY003)
211-211: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: Code Cleanliness Check
- GitHub Check: Georgia Tech | Phoenix (NVHPC) (cpu)
- GitHub Check: Github (macos, mpi, debug, false)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Build & Publish
🔇 Additional comments (4)
.github/workflows/bench.yml (1)
25-27: Nice extraction of submit/monitor logic and parallel PR/master runsThe new
submit_and_monitorhelper plus.github/scripts/monitor_slurm_job.shintegration gives you much clearer control over submission errors, job IDs, and exit-code handling, while preserving parallel PR/master execution and avoidingset -epitfalls in thewaitlogic. The dynamic job name and reduced timeout also look reasonable.Also applies to: 76-77, 100-143
.github/workflows/test.yml (1)
95-101: Matrix reshaping and dynamic self-hosted job naming look consistentExplicit Phoenix/Frontier variants, the
lbl-driven labels, and the dynamic name format align cleanly with the later build/test steps and log handling. The reduced timeout is also appropriate for CI hardening, assuming your self-hosted runners are configured with these labels.Also applies to: 104-129
toolchain/mfc/test/test.py (2)
26-37: Early-abort mechanism viaabort_testsevent is structured cleanlyThe combination of:
- global
abort_testsevent,- per-case pre-check in
handle_case,- failure-rate computation gated on a minimum number of completed tests, and
- a post-scheduler check in
test()that surfaces a clear aggregated failure,gives you a deterministic, thread-safe way to fast‑fail runs with systemic breakage without throwing from worker threads. This is a solid improvement over raising directly inside
_handle_case.Also applies to: 179-187, 333-373
30-41: Timer-based timeout only detects timeouts aftercase.runreturns; add note and fix pylint nesting errorUsing
threading.Timerwith a flag is safe in worker threads, but it can't interrupt a blockedcase.runcall — ifcase.runhangs (e.g., stuck MPI job), the worker thread will remain blocked and the timeout will only be noticed after (or if)case.runreturns. That means this mechanism detects slow tests but does not strictly bound wall-clock time for truly hung runs.Two concrete suggestions:
Document/verify behavior of
case.run
Confirm (and, if possible, document) thatcase.runitself uses a timeout-capable mechanism (e.g., subprocess withtimeout=or equivalent), or that hung simulations are otherwise impossible/handled elsewhere; otherwise the 1‑hour timeout expectation is misleading.Resolve pylint R1702 ("too many nested blocks") cheaply
Given this function is already structurally complex and you're explicitly disabling other size checks, the quickest unblock is to extend the existing pylint disable:-# pylint: disable=too-many-locals, too-many-branches, too-many-statements, trailing-whitespace -def _handle_case(case: TestCase, devices: typing.Set[int]): +# pylint: disable=too-many-locals, too-many-branches, too-many-statements, too-many-nested-blocks, trailing-whitespace +def _handle_case(case: TestCase, devices: typing.Set[int]):If you prefer not to disable, you'd need to extract parts of
_handle_case(e.g., golden-file comparison andtest_allpost-processing) into helpers to reduce nesting.
|
/improve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 12 comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 4 files (changes from recent commits).
Prompt for AI agents (all 2 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name=".github/workflows/bench.yml">
<violation number="1" location=".github/workflows/bench.yml:142">
P1: Bug: `$?` inside the `if` block captures the condition's exit status (0 after successful negation), not the process's actual exit code. This will always leave `pr_exit` as 0 even when the job fails, causing the workflow to incorrectly pass. Use `wait "$pr_pid" || pr_exit=$?` instead.</violation>
<violation number="2" location=".github/workflows/bench.yml:145">
P1: Bug: Same issue - `$?` will be 0 inside the block, not the actual exit code. Use `wait "$master_pid" || master_exit=$?` instead.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
toolchain/mfc/test/test.py (1)
266-289: Consider subprocess-level timeout for more precise interruption.The current timeout checks (lines 268-269, 272-274) detect timeouts between execution phases but don't interrupt a hung
case.run()call. For 1-hour timeouts this is acceptable, but if you need precise interruption, consider passing atimeoutparameter to the underlyingsubprocess.run()calls withincase.run().
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.github/scripts/monitor_slurm_job.sh(1 hunks).github/workflows/bench.yml(3 hunks)toolchain/mfc/bench.py(6 hunks)toolchain/mfc/test/test.py(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
.github/workflows/bench.yml (1)
.github/workflows/phoenix/submit-bench.sh (1)
usage(5-7)
toolchain/mfc/bench.py (5)
toolchain/mfc/common.py (3)
MFCException(31-32)system(34-41)file_load_yaml(65-70)toolchain/mfc/state.py (1)
ARG(85-93)toolchain/mfc/printer.py (1)
unindent(19-24)toolchain/mfc/run/run.py (1)
run(134-162)toolchain/mfc/run/input.py (1)
generate(109-114)
🪛 Ruff (0.14.7)
toolchain/mfc/bench.py
42-42: Avoid specifying long messages outside the exception class
(TRY003)
122-122: Do not catch blind exception: Exception
(BLE001)
136-136: Avoid specifying long messages outside the exception class
(TRY003)
210-210: Do not catch blind exception: Exception
(BLE001)
toolchain/mfc/test/test.py
185-188: Avoid specifying long messages outside the exception class
(TRY003)
189-191: Avoid specifying long messages outside the exception class
(TRY003)
220-220: Avoid specifying long messages outside the exception class
(TRY003)
226-229: Avoid specifying long messages outside the exception class
(TRY003)
232-235: Avoid specifying long messages outside the exception class
(TRY003)
238-241: Avoid specifying long messages outside the exception class
(TRY003)
269-269: Abstract raise to an inner function
(TRY301)
269-269: Avoid specifying long messages outside the exception class
(TRY003)
274-274: Abstract raise to an inner function
(TRY301)
274-274: Avoid specifying long messages outside the exception class
(TRY003)
282-282: Avoid specifying long messages outside the exception class
(TRY003)
286-286: Avoid specifying long messages outside the exception class
(TRY003)
289-289: Avoid specifying long messages outside the exception class
(TRY003)
297-297: Avoid specifying long messages outside the exception class
(TRY003)
314-314: Avoid specifying long messages outside the exception class
(TRY003)
320-320: Abstract raise to an inner function
(TRY301)
320-320: Avoid specifying long messages outside the exception class
(TRY003)
348-352: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Github (macos, mpi, debug, false)
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Code Cleanliness Check
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Build & Publish
🔇 Additional comments (24)
toolchain/mfc/bench.py (8)
1-1: LGTM: Traceback import added for enhanced error diagnostics.The
tracebackimport enables full stack traces in exception handlers (lines 124, 213), improving debuggability when benchmarks fail.
40-42: LGTM: Early file validation prevents cryptic failures.Checking case file existence before execution provides fast, clear feedback when benchmark definitions are missing.
52-52: LGTM: Failure tracking enables comprehensive error reporting.The
failed_caseslist allows the benchmark run to continue after individual failures and provides a complete failure summary at the end (lines 130-136).
64-81: LGTM: Robust execution with defensive error handling.The code properly addresses previous concerns:
- Line 71 converts
ARG('mem')to string, preventing potentialTypeErrorin subprocess calls- Lines 75-76 use
hasattrto safely handle different return types fromsystem(), avoidingAttributeErrorThe error messages provide clear diagnostics including exit codes and log file locations.
83-113: LGTM: Thorough post-execution validation ensures data integrity.The validation sequence properly checks:
- Summary file existence (lines 83-88)
- Target presence and required fields (lines 93-109)
- Specific validation for simulation's
grindmetric (lines 106-109)Each failure path provides clear diagnostics and continues to the next case, enabling complete failure visibility.
115-128: LGTM: Robust per-case error isolation with complete diagnostics.The broad
Exceptionhandler (line 122) is appropriate here because each benchmark case should be isolated—failures in one case shouldn't prevent others from running. The handler properly:
- Logs full tracebacks for debugging (line 124)
- Marks the case as failed
- Ensures cleanup via
finally(line 127)
129-136: LGTM: Clear failure summary for CI diagnostics.The final report provides actionable information:
- Complete list of failed case slugs
- Failure ratio for quick assessment
- Raises exception to properly fail the CI workflow
210-215: LGTM: Enhanced diff() error diagnostics.Replacing the silent
passwith explicit error logging including full traceback (line 213) significantly improves debuggability when speedup computations fail. The error is properly propagated viaerr = 1(line 215).toolchain/mfc/test/test.py (7)
26-41: LGTM: Well-designed early-abort mechanism with thread-safe signaling.The implementation properly addresses concurrency concerns:
abort_tests = threading.Event()provides thread-safe coordination (lines 33-37)- Custom
TestTimeoutErrorexception separates timeout failures from other errors- Public constants make thresholds configurable and self-documenting
This is a significant improvement over raising exceptions directly from worker threads.
179-191: LGTM: Abort logic properly handles edge cases.The code defensively prevents division by zero:
- Line 184 checks
total_completed > 0before calculating failure percentage (line 187)- Lines 189-191 provide a fallback message when no tests completed
This addresses the past review concern about potential
ZeroDivisionError.
214-241: LGTM: Well-structured helper improves code organization.Extracting silo file processing into a dedicated function improves maintainability. The function properly:
- Locates h5dump binary with fallback to system PATH
- Validates execution and output
- Detects NaN/Infinity values with clear error messages
The directory existence check is appropriately handled by the caller (line 326).
249-264: LGTM: Threading.Timer correctly replaces signal.alarm for worker threads.The implementation properly addresses the critical issue with signal handling in worker threads:
threading.Timerworks correctly in worker threads (lines 249-255), unlikesignal.alarm()- Clear comments explain the design rationale (lines 250-252)
- Timer is properly cancelled in all paths (line 263 for dry_run, line 354 in finally)
This resolves the past review concern that would have caused
ValueError: signal only works in main thread.
316-329: LGTM: Test_all flow properly validated and timeout-protected.The code addresses previous concerns:
- Line 319 adds timeout check before the potentially long post-process run
- Line 326 checks directory existence before listing, preventing
FileNotFoundError
340-354: LGTM: Comprehensive timeout error handling with conditional diagnostics.The exception handler properly addresses the past concern about missing log files:
- Lines 342-347 check file existence before referencing it
- Provides contextual error messages for debugging
- Line 354 ensures timer cleanup in all exit paths (normal, timeout, exception)
389-401: LGTM: Fail-fast logic correctly checks after every test.The early-abort check after each test (not just failures) is the correct design—it enables the fastest possible detection when the threshold is crossed. The implementation properly:
- Skips abort checks during dry-run (line 391)
- Uses thread-safe flag setting instead of raising (line 400), addressing past review concerns
- Returns gracefully to allow coordinated shutdown
.github/workflows/bench.yml (3)
26-26: LGTM: Enhanced job naming improves CI observability.The dynamic job name now includes the interface (e.g., "Phoenix (gpu-acc)"), making it easier to identify specific benchmark variants in the GitHub Actions UI.
76-76: LGTM: Timeout reduction aligns with fail-fast objectives.Reducing timeout from 1400 to 480 minutes (8 hours) is appropriate given the enhanced error detection, per-test timeouts (3600s), and early-abort logic introduced in this PR.
100-153: LGTM: Refactored workflow addresses all portability and reliability concerns.The implementation properly addresses previous review feedback:
- Line 117: Uses
sedfor portable job ID extraction (no-Pflag dependency)- Line 128: Delegates complex monitoring logic to a dedicated script for reusability
- Lines 138-153: Reliable exit code capture without shell evaluation pitfalls:
- Explicit initialization (lines 139-140)
- Clear
if ! waitpattern (lines 142-147)- Proper quoting and comparison (lines 150-153)
.github/scripts/monitor_slurm_job.sh (6)
1-17: LGTM: Well-structured script header with proper validation.The script follows bash best practices:
set -eensures error propagation- Clear usage documentation
- Input validation with helpful error message
18-42: LGTM: Robust retry logic handles transient SLURM failures.The exponential backoff mechanism (lines 37-40) properly distinguishes transient
squeuefailures from actual job completion, addressing the past review concern. The script only errors (lines 31-34) if the job is truly gone and no output file exists after multiple retries.
50-77: LGTM: Sophisticated completion detection with sacct fallback.The implementation comprehensively addresses SLURM monitoring best practices:
- Uses
sacctas fallback after repeatedsqueuefailures (lines 56-69)- Only treats terminal states as completion (COMPLETED, FAILED, etc., lines 61-62)
- Resets counter for non-terminal states to handle transient failures (line 67)
- Provides graceful degradation when
sacctis unavailable (lines 71-72)This resolves past review concerns about premature completion detection.
79-100: LGTM: File stabilization prevents output truncation.The stabilization logic (lines 79-97) ensures output is fully flushed before proceeding:
- Waits for file size to remain constant for 2 consecutive checks (~10s total)
- Prevents race conditions between SLURM output buffering and log capture
Line 100 safely terminates
tailwith|| trueto handle already-exited processes.
106-129: LGTM: Robust exit code retrieval with explicit failure handling.The implementation properly addresses all past review concerns:
- Line 112: Uses portable
grep -oE+cutinstead of PCRE-specific-Pwith\K- Lines 115-122: Falls back to
sacctfor aged-out jobs- Lines 124-129: Fails explicitly when exit code cannot be determined, preventing silent success on errors
This is a significant reliability improvement over defaulting to "0:0".
131-138: LGTM: Clear final exit status validation.The script properly validates only "0:0" as success (line 132), rejecting any signal or non-zero exit code, and provides clear success/failure messaging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
toolchain/mfc/test/test.py (1)
257-263: Timer cleanup is correct, but timeout doesn't interrupt blocking operations.The
finallyblock ensures the timer is always cancelled. However, the timeout mechanism only checks the flag before and aftercase.run()calls—it won't interrupt a blockingcase.run()that exceeds 1 hour. Ifcase.run()takes 2 hours, the timeout flag will be set after 1 hour, but the process continues until completion. This is a known limitation flagged by previous reviewers.For true interruption,
case.run()would need to usesubprocess.run(timeout=...)or check the timeout flag internally.Also applies to: 369-370
🧹 Nitpick comments (1)
toolchain/mfc/test/test.py (1)
307-322: Optimize: Reuse already-loaded golden file.At line 307, you load
golden = packer.load(golden_filepath). Then at line 320 (in the else branch), you reload the same file withpacker.load(golden_filepath). You can reuse thegoldenvariable instead.- err, msg = packtol.compare(pack, packer.load(golden_filepath), packtol.Tolerance(tol, tol)) + err, msg = packtol.compare(pack, golden, packtol.Tolerance(tol, tol))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
toolchain/mfc/test/test.py(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
toolchain/mfc/test/test.py (3)
toolchain/mfc/common.py (4)
MFCException(31-32)does_command_exist(127-138)get_program_output(104-106)find(165-174)toolchain/mfc/printer.py (1)
unindent(19-24)toolchain/mfc/state.py (1)
ARG(85-93)
🪛 Ruff (0.14.7)
toolchain/mfc/test/test.py
185-188: Avoid specifying long messages outside the exception class
(TRY003)
189-191: Avoid specifying long messages outside the exception class
(TRY003)
218-221: Avoid specifying long messages outside the exception class
(TRY003)
227-227: Avoid specifying long messages outside the exception class
(TRY003)
233-237: Avoid specifying long messages outside the exception class
(TRY003)
240-243: Avoid specifying long messages outside the exception class
(TRY003)
246-249: Avoid specifying long messages outside the exception class
(TRY003)
277-277: Abstract raise to an inner function
(TRY301)
277-277: Avoid specifying long messages outside the exception class
(TRY003)
282-282: Abstract raise to an inner function
(TRY301)
282-282: Avoid specifying long messages outside the exception class
(TRY003)
290-290: Avoid specifying long messages outside the exception class
(TRY003)
294-294: Avoid specifying long messages outside the exception class
(TRY003)
297-297: Avoid specifying long messages outside the exception class
(TRY003)
305-305: Avoid specifying long messages outside the exception class
(TRY003)
322-322: Avoid specifying long messages outside the exception class
(TRY003)
328-328: Abstract raise to an inner function
(TRY301)
328-328: Avoid specifying long messages outside the exception class
(TRY003)
336-339: Avoid specifying long messages outside the exception class
(TRY003)
364-368: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Code Cleanliness Check
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Build & Publish
🔇 Additional comments (5)
toolchain/mfc/test/test.py (5)
1-1: LGTM: Thread-safe timeout and abort mechanism.The use of
threading.Eventfor global abort signaling andthreading.Timerfor per-test timeouts is the correct approach for the worker-thread context. This properly addresses the past review concerns aboutsignal.alarm()not working in non-main threads.Also applies to: 26-41
342-345: LGTM: Directory existence check prevents errors.The
os.path.isdir(silo_dir)check properly guards againstFileNotFoundErrorwhen test cases don't produce silo output. This correctly addresses previous review feedback.
356-368: LGTM: Helpful timeout error messages.The conditional log message (lines 358-363) correctly handles cases where the timeout occurs before
out_pre_sim.txtis written, providing clear guidance to users. This addresses previous feedback about potentially misleading error messages.
179-191: LGTM: Thread-safe early abort with proper guards.The abort mechanism correctly uses
threading.Eventfor cross-thread signaling:
- Line 380: Pre-check allows graceful early exit
- Line 416:
abort_tests.set()from worker thread avoids raising exceptions that could leave the scheduler in an undefined state- Line 184: Guards against division by zero
- Line 407: Sensibly skips abort check during dry-run
The early abort check runs after every test (not just failures), which is slightly less efficient but ensures rapid detection when the failure rate crosses the threshold.
Also applies to: 378-380, 405-417
214-250: LGTM: Well-structured silo file validation.The
_process_silo_filehelper provides clear separation of concerns with appropriate checks:
- File existence validation before processing
- h5dump availability checks (both installed and system paths)
- Return code validation
- NaN/Infinity detection with actionable error messages
|
CodeAnt AI is running Incremental review Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
toolchain/mfc/test/test.py (2)
179-192: Abort check logic is sound, though the else branch appears unreachable.The division at line 185 is safe because
abort_testsis only set whentotal_completed >= MIN_CASES_BEFORE_ABORT(20) at line 465-472. However, the else branch at lines 189-191 appears technically unreachable given the current logic, though it provides a defensive safety net if the abort mechanism changes in the future.
257-263: Timeout mechanism is correct for threading context, but note it's soft.The
threading.Timerapproach properly works in worker threads and the timer is correctly cancelled in the finally block (line 426). The timeout flag checks at lines 276, 281, 327 provide reasonable timeout enforcement between operations.However, this is a soft timeout that only sets a flag—it won't interrupt a hung subprocess. If
case.run()blocks indefinitely (e.g., deadlocked simulation), the timeout won't trigger until the process completes or returns control. For stricter enforcement, consider passingtimeout=TEST_TIMEOUT_SECONDStosubprocess.run()withincase.run()if it uses subprocess internally.Also applies to: 276-283, 412-427
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
toolchain/mfc/test/test.py(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
toolchain/mfc/test/test.py (4)
toolchain/mfc/common.py (4)
MFCException(31-32)does_command_exist(127-138)get_program_output(104-106)find(165-174)toolchain/mfc/test/case.py (4)
TestCase(115-276)get_uuid(154-155)get_uuid(288-289)get_dirpath(157-158)toolchain/mfc/packer/pack.py (2)
has_NaNs(90-96)find(34-35)toolchain/mfc/packer/tol.py (1)
compare(50-97)
🪛 GitHub Actions: Lint Toolchain
toolchain/mfc/test/test.py
[error] 274-274: Pylint: R1702 Too many nested blocks (7/5) in test.py. (Command './mfc.sh lint' failed with exit code 8).
🪛 Ruff (0.14.7)
toolchain/mfc/test/test.py
185-188: Avoid specifying long messages outside the exception class
(TRY003)
189-191: Avoid specifying long messages outside the exception class
(TRY003)
218-221: Avoid specifying long messages outside the exception class
(TRY003)
227-227: Avoid specifying long messages outside the exception class
(TRY003)
233-237: Avoid specifying long messages outside the exception class
(TRY003)
240-243: Avoid specifying long messages outside the exception class
(TRY003)
246-249: Avoid specifying long messages outside the exception class
(TRY003)
277-277: Abstract raise to an inner function
(TRY301)
277-277: Avoid specifying long messages outside the exception class
(TRY003)
282-282: Abstract raise to an inner function
(TRY301)
282-282: Avoid specifying long messages outside the exception class
(TRY003)
290-290: Avoid specifying long messages outside the exception class
(TRY003)
294-294: Avoid specifying long messages outside the exception class
(TRY003)
297-297: Avoid specifying long messages outside the exception class
(TRY003)
305-305: Avoid specifying long messages outside the exception class
(TRY003)
322-322: Avoid specifying long messages outside the exception class
(TRY003)
328-328: Abstract raise to an inner function
(TRY301)
328-328: Avoid specifying long messages outside the exception class
(TRY003)
392-395: Avoid specifying long messages outside the exception class
(TRY003)
420-424: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Github (macos, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Oak Ridge | Frontier (gpu-acc)
- GitHub Check: Oak Ridge | Frontier (cpu)
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: Code Cleanliness Check
- GitHub Check: Oak Ridge | Frontier (CCE) (gpu-acc)
- GitHub Check: Oak Ridge | Frontier (CCE) (gpu-omp)
- GitHub Check: Georgia Tech | Phoenix (NVHPC) (cpu)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Build & Publish
🔇 Additional comments (3)
toolchain/mfc/test/test.py (3)
1-41: LGTM! Threading-based approach correctly addresses worker thread constraints.The use of
threading.Eventforabort_testsand the new timeout infrastructure usingthreading.Timer(lines 257-263) correctly addresses past review concerns aboutsignal.alarm()failing in worker threads. This approach works properly with thesched.schedscheduler that runs tests viaWorkerThread.
214-250: LGTM! Helper function is well-structured with proper error handling.The
_process_silo_filehelper correctly checks file existence (line 217) before processing, handles h5dump lookup gracefully, and provides informative error messages. The string-based NaN/Infinity detection is simple but effective for the use case.
429-475: LGTM! Early abort logic is correctly placed and thread-safe.The abort check at lines 461-473 correctly runs after each test completes (whether success or failure), allowing proper tracking of the overall failure rate. The placement after both the success path (line 452) and final failure path (line 456) is correct—contrary to a past review suggestion, this should not be moved inside the exception handler.
Setting
abort_testsflag (line 472) and returning gracefully (line 473) properly signals other worker threads without raising exceptions that could leave the scheduler in an undefined state. The check at line 434 ensures already-queued tasks exit cleanly when abort is requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
toolchain/mfc/test/test.py (1)
312-431: Timeout detection gaps around POST_PROCESS and non-preemptive behaviorThe
threading.Timer+timeout_flagapproach is a good fix for the earlier SIGALRM-in-worker-threads issue, and the timer is correctly cancelled infinally. Two points to consider:
No final timeout check after POST_PROCESS and packing
Fortest_allcases, you currently checktimeout_flag:
- Before the PRE_PROCESS+SIMULATION run.
- After PRE_PROCESS+SIMULATION completes.
- Before the POST_PROCESS run.
If POST_PROCESS (or subsequent packing / golden comparison) pushes the total case time beyond
TEST_TIMEOUT_SECONDS, the timeout will not be reported. You can add a final check near the end of_handle_case:if ARG("test_all"): ... if cmd.returncode != 0: ... raise MFCException( f"Test {case}: Failed to execute MFC (post-process). " f"See log at: {out_filepath}" )
case.delete_output()
# Final timeout check in case post-process or packing pushed us over the limitif timeout_flag.is_set():raise TestTimeoutError("Test case exceeded 1 hour timeout")case.delete_output()
- Timeout is still non-preemptive for hung
case.run()
As earlier reviewers noted, the current pattern can only detect a timeout oncecase.run(...)returns; it can’t interrupt a truly hung or extremely long-running subprocess. Fully enforcing hard per-test timeouts would require plumbing a timeout into the underlying execution layer (e.g.,subprocess.run(..., timeout=...)inside whatevercase.runuses) or running the test in a separate process that can be killed on timeout. That’s a larger change, but worth keeping in mind if “hard” time limits are required rather than post-facto detection.
🧹 Nitpick comments (1)
toolchain/mfc/test/test.py (1)
26-41: Global abort/timeout state persists across runs; consider resetting intest()
MIN_CASES_BEFORE_ABORT,FAILURE_RATE_THRESHOLD,TEST_TIMEOUT_SECONDS,abort_tests, and the global counters are all module-level, andtest()doesn’t reset them. In normal CLI usage (test()once per process) this is fine, but iftest()is ever invoked multiple times in the same interpreter, an earlier abort or non-zero nFAIL/nPASS will leak into subsequent runs.If you expect (now or later) to reuse
test()programmatically, consider resetting the global state at the top oftest():def test(): - # pylint: disable=global-statement, global-variable-not-assigned, too-many-statements - global nFAIL, nPASS, nSKIP, total_test_count - global errors + # pylint: disable=global-statement, global-variable-not-assigned, too-many-statements + global nFAIL, nPASS, nSKIP, total_test_count + global errors + + # Reset global state for a fresh test run (important if test() is reused) + abort_tests.clear() + nFAIL = nPASS = nSKIP = 0 + total_test_count = 0 + errors = []If
test()is strictly a one-shot CLI entry point, this can be left as-is.Also applies to: 109-113
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
toolchain/mfc/test/test.py(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
toolchain/mfc/test/test.py (2)
toolchain/mfc/common.py (4)
MFCException(31-32)does_command_exist(127-138)get_program_output(104-106)find(165-174)toolchain/mfc/state.py (1)
ARG(85-93)
🪛 Ruff (0.14.7)
toolchain/mfc/test/test.py
185-188: Avoid specifying long messages outside the exception class
(TRY003)
189-191: Avoid specifying long messages outside the exception class
(TRY003)
218-221: Avoid specifying long messages outside the exception class
(TRY003)
227-227: Avoid specifying long messages outside the exception class
(TRY003)
233-237: Avoid specifying long messages outside the exception class
(TRY003)
240-243: Avoid specifying long messages outside the exception class
(TRY003)
246-249: Avoid specifying long messages outside the exception class
(TRY003)
332-332: Abstract raise to an inner function
(TRY301)
332-332: Avoid specifying long messages outside the exception class
(TRY003)
337-337: Abstract raise to an inner function
(TRY301)
337-337: Avoid specifying long messages outside the exception class
(TRY003)
345-345: Avoid specifying long messages outside the exception class
(TRY003)
349-349: Avoid specifying long messages outside the exception class
(TRY003)
352-352: Avoid specifying long messages outside the exception class
(TRY003)
360-360: Avoid specifying long messages outside the exception class
(TRY003)
377-377: Avoid specifying long messages outside the exception class
(TRY003)
383-383: Abstract raise to an inner function
(TRY301)
383-383: Avoid specifying long messages outside the exception class
(TRY003)
397-400: Avoid specifying long messages outside the exception class
(TRY003)
425-429: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: Georgia Tech | Phoenix (NVHPC) (gpu-acc)
- GitHub Check: Oak Ridge | Frontier (CCE) (gpu-omp)
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: Code Cleanliness Check
- GitHub Check: Github (macos, mpi, debug, false)
- GitHub Check: Georgia Tech | Phoenix (cpu)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Oak Ridge | Frontier (gpu-omp)
- GitHub Check: Oak Ridge | Frontier (cpu)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Oak Ridge | Frontier (gpu-acc)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Georgia Tech | Phoenix (gpu-omp)
- GitHub Check: Georgia Tech | Phoenix (gpu-acc)
- GitHub Check: Build & Publish
🔇 Additional comments (2)
toolchain/mfc/test/test.py (2)
214-305: New silo post-process and multi-rank debug helpers look solid
_process_silo_fileand_print_multirank_debug_infoadd clear, localized validation and diagnostics: they check silo file existence, verifyh5dumpavailability with a PATH fallback, surface non-zero return codes and NaN/Inf findings asMFCExceptions, and print concrete multi-rank debug info (restart_data contents, key params, and post_process.inp snippet). This is a good hardening / observability improvement with no obvious correctness issues.
179-191: Early-abort coordination viaabort_testsEvent is well-structuredUsing a module-level
threading.Eventfor early abort solves the prior problem of raising from worker threads:
handle_casechecksabort_tests.is_set()before doing any work, so already-aborted runs exit cheaply.- After each case finishes (success or final failure),
handle_caserecomputes the global failure rate and setsabort_testsonceMIN_CASES_BEFORE_ABORTandFAILURE_RATE_THRESHOLDare met, then returns gracefully.- The main
test()function detectsabort_tests.is_set()aftersched.sched(...)completes and raises a single summaryMFCExceptionwith the aggregated failure statistics.This cleanly centralizes the “we are aborting the run” decision in the main thread while still letting worker threads stop themselves without throwing cross-thread exceptions.
Also applies to: 439-442, 466-478
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
toolchain/mfc/test/test.py (2)
239-249: String-based NaN/Infinity detection may have edge cases.The checks for
"nan,"and"inf,"rely on h5dump's output format. This works for typical cases but could miss variations like"nan]","-inf,", or locale-specific formatting. Consider using a more robust pattern if false negatives become an issue.- if "nan," in output: + if "nan," in output.lower():This at least handles case variations (e.g.,
"NaN,").
486-488: Minor race condition in failure rate calculation (benign).The reads of
nFAILandnPASSat lines 486-488 are not atomic across threads. However, since both counters only increment and the 30% threshold is a heuristic for detecting systemic failures, the imprecision is acceptable. The worst case is aborting slightly earlier or later than the exact threshold.If stricter accuracy is desired in the future, consider using
threading.Lockorthreading.atomiccounters, but this is not necessary for the current use case.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
toolchain/mfc/test/test.py(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
toolchain/mfc/test/test.py (3)
toolchain/mfc/common.py (5)
MFCException(31-32)does_command_exist(127-138)get_program_output(104-106)delete_file(81-83)find(165-174)toolchain/mfc/test/case.py (6)
TestCase(115-276)get_filepath(160-164)get_dirpath(157-158)delete_output(166-185)get_uuid(154-155)get_uuid(288-289)toolchain/mfc/state.py (1)
ARG(85-93)
🪛 Ruff (0.14.7)
toolchain/mfc/test/test.py
185-188: Avoid specifying long messages outside the exception class
(TRY003)
189-191: Avoid specifying long messages outside the exception class
(TRY003)
218-221: Avoid specifying long messages outside the exception class
(TRY003)
227-227: Avoid specifying long messages outside the exception class
(TRY003)
233-237: Avoid specifying long messages outside the exception class
(TRY003)
240-243: Avoid specifying long messages outside the exception class
(TRY003)
246-249: Avoid specifying long messages outside the exception class
(TRY003)
341-341: Abstract raise to an inner function
(TRY301)
341-341: Avoid specifying long messages outside the exception class
(TRY003)
346-346: Abstract raise to an inner function
(TRY301)
346-346: Avoid specifying long messages outside the exception class
(TRY003)
357-357: Avoid specifying long messages outside the exception class
(TRY003)
361-361: Avoid specifying long messages outside the exception class
(TRY003)
364-364: Avoid specifying long messages outside the exception class
(TRY003)
372-372: Avoid specifying long messages outside the exception class
(TRY003)
389-389: Avoid specifying long messages outside the exception class
(TRY003)
413-416: Avoid specifying long messages outside the exception class
(TRY003)
442-446: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: Github (macos, mpi, debug, false)
- GitHub Check: Github (macos, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Georgia Tech | Phoenix (NVHPC) (gpu-omp)
- GitHub Check: Oak Ridge | Frontier (CCE) (gpu-omp)
- GitHub Check: Georgia Tech | Phoenix (NVHPC) (cpu)
- GitHub Check: Code Cleanliness Check
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Build & Publish
🔇 Additional comments (6)
toolchain/mfc/test/test.py (6)
26-41: Well-designed abort and timeout mechanism.The use of
threading.Event()for cross-thread abort signaling is the correct approach—it avoids theValueErrorthat would occur withsignal.alarm()in worker threads. The constants and exception class are appropriately defined.
179-191: Good handling of abort state after scheduler completion.The check for
abort_tests.is_set()aftersched.schedreturns is correct. The division-by-zero edge case (total_completed > 0) is properly handled with a separate error message.
252-305: Well-implemented debug helper with proper error handling.The function handles filesystem errors gracefully, limits output to avoid log overflow, and provides useful diagnostic information for multi-rank failures. The
_paraminner function is a clean approach for defaulting missing parameters.
339-346: Timeout cannot interrupt a running subprocess.The
threading.Timersets a flag, but ifcase.run()spawns a long-running subprocess, it will continue until completion—the timeout won't interrupt it mid-execution. The checks at lines 340-341 and 345-346 only detect timeouts between operations.For true subprocess timeouts, consider passing a
timeoutparameter to the underlyingsubprocess.run()call withincase.run(). This is a known limitation that may be acceptable for now, but worth tracking for future improvement.
483-495: Graceful abort handling with thread-safe flag.Setting
abort_tests.set()instead of raising an exception from the worker thread is the correct approach—it avoids leaving the scheduler in an undefined state. The placement after every test (not just failures) is appropriate since the failure rate calculation depends on total completed tests.
406-416: Redundant return code check.This
cmd.returncode != 0check at line 406 is unreachable because the same condition was already checked at line 352, which raisesMFCException. SincetargetsincludesPOST_PROCESSwhenARG("test_all")is true, the singlecase.run(targets, ...)call at line 342 runs all stages including post-process, and any failure would have been caught earlier.- if cmd.returncode != 0: - cons.print(cmd.stdout) - - # Extra debug for multi-rank restart/post-process issues - if getattr(case, "ppn", 1) >= 2: - _print_multirank_debug_info(case) - - raise MFCException( - f"Test {case}: Failed to execute MFC (post-process). " - f"See log at: {out_post_filepath}" - )This entire block can be removed as the error is already handled at lines 352-357.
Likely an incorrect or invalid review comment.
User description
PR Type
Enhancement, Tests
Description
Add comprehensive error handling and validation to benchmark execution
Implement per-test timeout mechanism (1 hour) with signal handling
Add early abort logic when failure rate exceeds 30% threshold
Improve job monitoring in CI workflows with real-time output streaming
Reduce benchmark workflow timeout from 1400 to 480 minutes
Diagram Walkthrough
File Walkthrough
bench.py
Add comprehensive error handling and validation to benchmarkstoolchain/mfc/bench.py
execution
handling
fields
messages
silently passing
test.py
Add timeout mechanism and early abort on high failure ratetoolchain/mfc/test/test.py
timeout thresholds
timeout alarm
after 20 completed tests
references
bench.yml
Improve job monitoring with real-time output and error handling.github/workflows/bench.yml
submit_and_monitor function
codes
test.yml
Update job naming and reduce workflow timeout.github/workflows/test.yml
CodeAnt-AI Description
Add per-test 1-hour timeout, fail-fast on high failure rates, stricter benchmark validation, and improved CI bench monitoring
What Changed
Impact
✅ Fewer hung tests✅ Shorter CI runs on systemic failures✅ Clearer benchmark failure logs💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.