@devreal
Contributor

devreal commented Nov 2, 2022

The switch from tree to bruck between 512 and 1023 processes leads to unexpected latency changes in benchmarks of other collectives. We should be consistent here. There is no good reason why bruck would perform better in that range but not beyond it.

Fixes #11022 and #10947

Signed-off-by: Joseph Schuchart [email protected]
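
The change described above can be pictured with a minimal sketch of a size-based selection function. This is not Open MPI's actual decision code: the enum, the function names, and the assumption that tree is used both below 512 and above 1023 processes are hypothetical and only model the range named in the description.

```c
#include <assert.h>

/* Hypothetical algorithm identifiers, not Open MPI's actual constants. */
enum barrier_alg { BARRIER_TREE, BARRIER_BRUCK };

/* Selection as described in the PR text: bruck is chosen only for
 * communicators of 512 to 1023 processes. Using tree everywhere else
 * is an assumption made for this sketch. */
enum barrier_alg pick_barrier_before(int comm_size)
{
    assert(comm_size > 0);
    if (comm_size >= 512 && comm_size <= 1023) {
        return BARRIER_BRUCK;
    }
    return BARRIER_TREE;
}

/* Selection after the change: the 512..1023 window no longer switches
 * to bruck, so the choice stays the same across that boundary. */
enum barrier_alg pick_barrier_after(int comm_size)
{
    assert(comm_size > 0);
    return BARRIER_TREE;
}
```

In this sketch, a benchmark that sweeps the communicator size across 512 or 1024 would see the algorithm change under the first function but not under the second, which is the kind of discontinuity the description refers to.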

@jjhursey
Member

jjhursey commented Nov 2, 2022

bot:ibm:retest

@bwbarrett
Member

@devreal should we have committed this at some point? Should we now?

devreal force-pushed the tuned-barrier-tree-instead-of-bruck branch from 3b44649 to 9bd7757 on October 25, 2025, 23:10
@devreal
Contributor Author

devreal commented Oct 25, 2025

I had totally forgotten about this one. I rebased it and think it should still go in.

Successfully merging this pull request may close these issues.

How to define and measure the performance of MPI_Bcast exactly?
