@jsquyres jsquyres commented Oct 2, 2018

Per
#3035 (comment),
it looks like the IP address for a given interface is being stashed in
two places: on the endpoint and on the module.

  1. On the endpoint, it is storing the moral equivalent of a
    (struct sockaddr_in.sin_addr).
  2. On the module, it is storing a full (struct sockaddr_storage).

The call to opal_net_get_hostname() expects a full (struct sockaddr*)
-- not just the stripped-down (struct sockaddr_in.sin_addr). Hence,
when the original code was passing in the endpoint's (struct
sockaddr_in.sin_addr) and opal_net_get_hostname() was treating it
like a (struct sockaddr), hilarity ensued (i.e., we got the wrong
output).

This commit eliminates the call to opal_net_get_hostname() and just
calls inet_ntop() directly to convert the (struct
sockaddr_in.sin_addr) to a string.
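
For illustration, a minimal sketch of that conversion. The helper name `ipv4_to_string` is hypothetical and not part of Open MPI; the sketch assumes the endpoint's cached value is an IPv4 `struct in_addr`, which is what `inet_ntop()` expects for `AF_INET`:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not an Open MPI function): format a bare IPv4
 * address, i.e. the sin_addr member on its own, as dotted-quad text.
 * Returns buf on success, or NULL on failure (inet_ntop sets errno). */
static const char *ipv4_to_string(const struct in_addr *addr,
                                  char *buf, size_t len)
{
    /* With AF_INET, inet_ntop() takes a pointer to a struct in_addr,
     * not a struct sockaddr, so it never tries to read a family or
     * port field from the input. */
    return inet_ntop(AF_INET, addr, buf, (socklen_t) len);
}
```

A caller would pass a buffer of at least `INET_ADDRSTRLEN` bytes, e.g. `char buf[INET_ADDRSTRLEN]; ipv4_to_string(&addr, buf, sizeof(buf));`.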

NOTE: Per the GitHub comment cited above, there can be a disparity
between the IP address cached on the endpoint vs. the IP address
cached on the module. This only happens with interfaces that have
more than one IP address. This commit does not fix that issue.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 5dae086)

@jsquyres jsquyres added this to the v3.1.3 milestone Oct 2, 2018
@jsquyres jsquyres requested a review from bosilca October 2, 2018 14:51
@bwbarrett bwbarrett merged commit 4957a66 into open-mpi:v3.1.x Oct 2, 2018
@jsquyres jsquyres deleted the pr/v3.1.x/fix-tcp-btl-show-help-ip-address branch December 7, 2021 22:24