Shmem ucx: fix missing variable declaration in segment_create #7140
Conversation
Signed-off-by: Joseph Schuchart <[email protected]>
Note: it actually looks like
It was this PR for reference: Note that I am not quite sure how that passed the build checks...
It requires a version of UCX to be built with devel headers, see openucx/ucx#4388. Maybe such a UCX version can be added to the build checks?
@devreal I'm not sure I understand your comment. How are we building against UCX if we don't have the UCX devel headers installed / available?
@devreal Are you saying that we can't build Open MPI without internal UCX header files? (i.e., header files that are not normally installed) That seems... odd...
@jsquyres The particular feature that was introduced with #6641 (
Indeed, in order to get a performance benefit from shmemx_malloc_with_hint() running over the spml/ucx component, one needs to configure UCX to install its internal header files.
@yosefe You're right, my description was not exact. The function I am not sure what's the best place to document this dependency. As a start, maybe a comment in the source code is sufficient for developers? That won't help users, though. Printing a warning for unsupported hints might be too intrusive. Adding the status of UCX devel headers to the configure summary could help point out that this dependency exists. Is there precedent in Open MPI, @jsquyres?
Not to be snarky, but it seems a trifle uneven to criticize/complain about the btl/uct component's reliance on UCX internals (which it uses to gain performance) and then turn around and do the exact same thing here for the same reason. To a somewhat disinterested party, it feels like the UCX team has an issue that it should address regarding what it needs to publicly expose for adopters to optimize performance. Perhaps that is where this PR needs to start (i.e., delay adoption here until UCX resolves this recurring problem)?
@devreal No one reads Meaning: even if you put something there, no one will read it unless they realize they have a problem and (likely) we instructed them to go back and look at their
@rhc54 is right, though -- if this is an undocumented / unpublished interface from UCX, then it's pretty much in exactly the same situation as BTL/UCT. If this is something that the UCX community feels is an important optimization, then it should be part of the public interface and then this discussion becomes moot (i.e., there should be no need for internal UCX headers to be used by anyone).
Apologies if things got out of bounds here. I'm aware that the internal UCX functions used here are not required for the interface implemented in #6641 to work; they only offer a potential optimization. This PR merely fixes an issue that occurs if these UCX internal interfaces are available. I'm not arguing for or against using them; I think that is a broader argument to have elsewhere.
I believe the question we are raising is: should the correct fix for the issue be to remove the use of the internal UCX functions? I don't know where else in our code we rely on internal interfaces of a software package - certainly not in libevent, hwloc, PMIx, or libfabric. It feels a tad uncomfortable to be introducing such dependencies. I'm not sure if UCT also falls in that category - that is a separate issue that should be investigated but is outside the scope of this discussion.
Should we create an issue and migrate this discussion over there?
Adds a variable whose declaration was missing in #6641.
Signed-off-by: Joseph Schuchart [email protected]
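For readers wondering how a missing declaration could get past the build checks: a minimal C sketch of the pattern discussed above, assuming the optimized path is wrapped in a preprocessor guard that is only enabled when the optional headers are found. Every name here (`HAVE_INTERNAL_HEADERS`, `segment_create_sketch`, `used_fast_path`) is hypothetical and chosen for illustration; none of it is taken from the Open MPI or UCX sources, and the "optimized" allocation is just a placeholder `malloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* HYPOTHETICAL macro: stands in for whatever a configure script would
 * define when the optional internal headers are present. It defaults to 0,
 * which mirrors a build machine without those headers. */
#ifndef HAVE_INTERNAL_HEADERS
#define HAVE_INTERNAL_HEADERS 0
#endif

/* Hypothetical helper. Allocates `size` bytes into *out.
 * Returns 1 if the guarded (optimized) path was taken, 0 for the fallback. */
static int segment_create_sketch(size_t size, long hint, void **out)
{
    assert(out != NULL);

#if HAVE_INTERNAL_HEADERS
    /* This declaration only exists inside the guarded region. If it were
     * missing, a build with HAVE_INTERNAL_HEADERS == 0 would still compile,
     * because the preprocessor drops this whole block before the compiler
     * ever sees the undeclared name. */
    int used_fast_path = 0;

    if (hint != 0) {
        *out = malloc(size);            /* placeholder for the optimized path */
        used_fast_path = (*out != NULL);
        if (used_fast_path) {
            return 1;
        }
    }
#else
    (void)hint;                         /* hint is ignored without the headers */
#endif

    /* Fallback: the hint is only an optimization, so a plain allocation
     * keeps the interface working either way. */
    *out = malloc(size);
    return 0;
}
```

Compiling the sketch with `-DHAVE_INTERNAL_HEADERS=1` exercises the guarded path; without the flag only the fallback is built. A build check that covers both configurations is what would catch an error confined to the guarded block.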