@VladiKrapp-Arm VladiKrapp-Arm commented Oct 15, 2024

Some workloads need a specific sequence of optimizations to happen
before they fully simplify. This adds an extra full unrolling pass to help
these cases on cores with branch predictors. It helps produce simplified
loops, which can then be SROA'd, allowing further simplification that
can be important for performance.
The feature trades extra compile time for extra performance and
is enabled by the opt flag 'extra-LTO-loop-unroll' (off by default).
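
To illustrate the unroll-then-SROA interaction, here is a minimal hand-written C++ sketch. It is not taken from the patch, and the function names `sum_rolled` and `sum_unrolled` are hypothetical. In the rolled form the local array is indexed by the loop variable, which generally blocks SROA from splitting it into scalars. The second function shows by hand roughly what full unrolling followed by SROA could produce. Whether a given compiler performs this transformation depends on its pass pipeline and cost model.

```cpp
// Hypothetical illustration only; not code from the patch.

// Rolled form: acc[] is indexed by the loop variable i, so while the
// loops remain the array is accessed at non-constant offsets and
// typically has to stay in memory.
int sum_rolled(const int *in) {
    int acc[4] = {0, 0, 0, 0};
    for (int i = 0; i < 4; ++i)
        acc[i] = in[i] * (i + 1);
    int total = 0;
    for (int i = 0; i < 4; ++i)
        total += acc[i];
    return total;
}

// Hand-written equivalent of what full unrolling plus SROA could yield:
// every index is a constant, so each array element becomes its own
// scalar and the array disappears.
int sum_unrolled(const int *in) {
    int acc0 = in[0] * 1;
    int acc1 = in[1] * 2;
    int acc2 = in[2] * 3;
    int acc3 = in[3] * 4;
    return acc0 + acc1 + acc2 + acc3;
}
```

Both functions compute the same result for any four-element input. The difference is only in how much work is left for later passes once the loops are gone.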

Original patch by David Green ([email protected])

Some workloads need a specific sequence of optimizations to happen
before they fully simplify. This adds an extra full unrolling pass to help
these cases on cores with branch predictors. It helps produce simplified
loops, which can then be SROA'd, allowing further simplification that
can be important for performance.
This is added under its own flag, spending extra compile time to get extra
performance at the user's specific request.

Original patch by David Green ([email protected])
@dcandler
Collaborator

Very minor nit: the first patch should start 0001, i.e. 0001-LTOpasses-add-loop-unroll.patch to match the other folder and the likely output of git format-patch
The placeholder patch was numbered 0 since it's not a real patch file.

@VladiKrapp-Arm
Collaborator Author

@stuij, @dcandler, I have made the suggested changes.
Any other issues?

Member

@stuij stuij left a comment

LGTM, but when squash-merging it'd be good to have the commit message mirror the changes you just made to the commit message in the patch file:

Some workloads require specific sequences of events to happen
to fully simplify. This adds an extra full unrolling pass to help these
cases on the cores with branch predictors. It helps produce simplified
loops, which can then be SROA'd allowing further simplification, which
can be important for performance.

The feature adds extra compile time to get extra performance and
is enabled by the opt flag 'extra-LTO-loop-unroll' (off by default).

@VladiKrapp-Arm VladiKrapp-Arm merged commit 41e8b9f into main Oct 16, 2024
1 check passed
@VladiKrapp-Arm VladiKrapp-Arm deleted the more_lto_unroll branch November 5, 2024 12:56

4 participants