@marchdf marchdf commented Aug 25, 2025

Summary

This speeds up the ABL stats considerably. In the precursor case I was profiling, compute_zi accounted for 27% of the total runtime (!). This change brings it down to just 1%. It is currently implemented only for NVIDIA GPUs and still needs to be generalized.

This also adds more granular profiling of the ABLStats.
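
As a rough sanity check on what the compute_zi profile share implies for the whole run, here is a minimal Python sketch. It assumes that everything outside compute_zi takes the same time before and after the change, and that both percentages are shares of the total runtime of their respective runs. The helper name `overall_speedup` is hypothetical and not part of the code base.

```python
def overall_speedup(fraction_before: float, fraction_after: float) -> float:
    """Estimate whole-run speedup when one routine's runtime share drops.

    fraction_before: the routine's share of the original total runtime.
    fraction_after:  the routine's share of the new total runtime.
    Assumption: all other work takes the same time in both runs.
    """
    if not (0.0 <= fraction_before < 1.0 and 0.0 <= fraction_after < 1.0):
        raise ValueError("fractions must be in [0, 1)")
    # Time spent outside the routine, in units of the original total runtime.
    rest = 1.0 - fraction_before
    # In the new run, that same 'rest' makes up (1 - fraction_after) of the total.
    new_total = rest / (1.0 - fraction_after)
    return 1.0 / new_total


# Shares quoted in the summary: 27% before, 1% after.
speedup = overall_speedup(0.27, 0.01)
print(f"estimated overall speedup: {speedup:.2f}x")
```

This is only an estimate from profile shares; a measured wall-clock comparison on the same case is the more reliable number.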

Pull request type

Please check the type of change introduced:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Checklist

The following is included:

  • new unit-test(s)
  • new regression test(s)
  • documentation for new capability

This PR was tested by running:

  • the unit tests
    • on GPU
    • on CPU
  • the regression tests
    • on GPU
    • on CPU

Additional background

Issue Number:

@marchdf marchdf changed the title from "More granular profiling of the ABLStats" to "Speedup ABLStats" on Sep 2, 2025

marchdf commented Sep 2, 2025

Before change in compute_zi:

Time spent in Evolve():      69.93182354

After:

Time spent in Evolve():      47.23603398

Roughly a 1.5x speedup for the total simulation (69.93 / 47.24 ≈ 1.48).
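
For reference, the speedup can be reproduced from the two Evolve() timings quoted above with a few lines of Python. The variable names are illustrative only, and the timings are copied as reported without assuming a unit.

```python
# Evolve() timings as reported in this comment.
evolve_before = 69.93182354
evolve_after = 47.23603398

# Speedup is the ratio of old to new wall time.
speedup = evolve_before / evolve_after

# Fraction of the original Evolve() time that was saved.
time_saved_fraction = 1.0 - evolve_after / evolve_before

print(f"speedup: {speedup:.2f}x")
print(f"runtime reduction: {time_saved_fraction:.1%}")
```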
