@jjhursey
Member

@jjhursey jjhursey commented Aug 25, 2016

  • Expand the use of the orte_keep_fqdn_hostnames MCA parameter when it is set to false.
  • If that parameter is set to false (default) then short hostnames (e.g., node01) will match with the long hostnames (e.g., node01.mycluster.org). This allows a user (or resource manager) to mix the use of short and long hostnames.
    • Note that this mechanism does not perform a DNS lookup, but instead strips off the FQDN by truncating the hostname string at the first . character (when not an IP address).
      • By default (false) the following is true: node01 == node01.mycluster.org == node01.bogus.com, since we use node01 as the hostname.
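
The truncation rule described above can be sketched as follows. This is a minimal illustration, not the ORTE implementation: `looks_like_ip` and `shorten_hostname` are hypothetical helper names, and using `inet_pton` to detect IP addresses is an assumption made for the example.

```c
#include <stdbool.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper (not an ORTE function): true if `name` parses as an
 * IPv4 or IPv6 literal. */
static bool looks_like_ip(const char *name)
{
    unsigned char buf[sizeof(struct in6_addr)];
    return inet_pton(AF_INET, name, buf) == 1 ||
           inet_pton(AF_INET6, name, buf) == 1;
}

/* Hypothetical helper (not an ORTE function): copy `name` into `out`,
 * dropping everything from the first '.' onward unless `name` is an IP
 * address. No DNS lookup is performed. The result is always
 * NUL-terminated and is cut off if `out` is too small. */
static void shorten_hostname(const char *name, char *out, size_t outlen)
{
    if (outlen == 0) {
        return;
    }
    size_t len = strlen(name);
    if (!looks_like_ip(name)) {
        len = strcspn(name, ".");   /* length up to the first '.' */
    }
    if (len >= outlen) {
        len = outlen - 1;
    }
    memcpy(out, name, len);
    out[len] = '\0';
}
```

With this sketch, `node01.mycluster.org` and `node01.bogus.com` both shorten to `node01`, while an address such as `192.168.1.10` is copied unchanged.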

@jjhursey jjhursey added this to the v2.0.2 milestone Aug 25, 2016
@jjhursey jjhursey self-assigned this Aug 25, 2016
@jjhursey
Member Author

Related to Issue #1614 but a more general solution.

@jjhursey
Member Author

@rhc54 I don't know if you want to take a look at this PR before I merge it in. I'll let it sit here for a little while, but it would be nice to get it into 2.0.2, as it fixes a problem found in an internal test environment.

@rhc54
Contributor

rhc54 commented Aug 25, 2016

scratching my head...it looks to me like this does exactly what happens when you set orte_keep_fqdn_hostnames=0, doesn't it? I mean, you did fill in a few places where we weren't checking it, but other than that - why a new param?

@jjhursey
Member Author

You made a comment in Issue #1614 about some environments where the difference between short versus long hostnames is important. The orte_keep_fqdn_hostnames=0 behavior would make the mpirun/HNP hostname shortened. So I wasn't sure if there was a user depending on this mixing of hostnames behavior.

That being said, if the intention of orte_keep_fqdn_hostnames=0 was to do exactly what orte_use_mixed_hostnames=t is supposed to do then we can reuse that MCA variable. That would reduce the number of options, and future confusion about the difference between these two options.

So I'm good with just reusing orte_keep_fqdn_hostnames=0 if you are.

I think I got all of the places in the code that needed to be modified to make this work. Certainly it worked for our use cases with LSF, rankfiles, hostfiles, and -host.

@rhc54
Contributor

rhc54 commented Aug 25, 2016

Hmmm...perhaps I should clarify that comment. What I was saying was that we don't support mixing short and long hostnames because we've had problems in the past with host confusion when we tried to do so. Thus, the param dictates that you use either all short names, or all long names.

The user was asking us to support mixing the two, and I refused. 😄

@jjhursey
Member Author

Ah. In that case then:

  • orte_keep_fqdn_hostnames=true : use only long names (as available - I did not add code to do a DNS lookup on the hostnames to fill out this path.)
    • node01 != node01.mycluster.org != node01.bogus.com
  • orte_keep_fqdn_hostnames=false: (default) use only short names. If a long hostname is passed then we shorten it to the first section (before the .) before processing it internally.
    • node01 == node01.mycluster.org == node01.bogus.com : since all shorten to node01
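
As a rough sketch of the two modes listed above, assuming both inputs are hostnames rather than IP addresses: `hosts_match` is a hypothetical helper, not an ORTE function, and its `keep_fqdn` argument stands in for the value of `orte_keep_fqdn_hostnames`.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not an ORTE function). Assumes `a` and `b` are
 * hostnames, not IP addresses.
 *   keep_fqdn == true  : names must be identical as given.
 *   keep_fqdn == false : only the part before the first '.' is compared. */
static bool hosts_match(const char *a, const char *b, bool keep_fqdn)
{
    if (keep_fqdn) {
        return strcmp(a, b) == 0;
    }
    size_t la = strcspn(a, ".");    /* length of the short name of a */
    size_t lb = strcspn(b, ".");    /* length of the short name of b */
    return la == lb && strncmp(a, b, la) == 0;
}
```

Under these assumptions, `hosts_match("node01", "node01.mycluster.org", false)` is true, and the same call with `keep_fqdn` set to `true` is false.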

@rhc54
Contributor

rhc54 commented Aug 25, 2016

Yep, that is correct!

@jjhursey
Member Author

OK. I'll update the PR tomorrow to bring those options together.

@jjhursey jjhursey changed the title from "orte: Add option for mixing of short and long hostnames" to "orte: Expand use of !orte_keep_fqdn_hostnames MCA parameter" Aug 26, 2016
@jjhursey
Member Author

@rhc54 I updated the PR with a new commit (I'll squash them together before merge). This should match what we discussed.

@ibm-ompi

Build Failed with XL compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/c34c2ef0c667c7d2968dc69e75dfa48e

@ibm-ompi

Build Failed with GNU compiler! Please review the log, and get in touch if you have questions.

Gist: https://gist.github.com/50fb5793fcad3937828ba301bf17ccd7

@jjhursey
Member Author

I always find it funny when the IBM CI fails on one of my PRs 😄 This is due to a temporary file system issue on the test machine. I've deactivated our CI until it comes back.

@rhc54
Contributor

rhc54 commented Aug 26, 2016

👍
Thanks!

@jjhursey
Member Author

bot:ibm:retest

 * Expand the use of the `orte_keep_fqdn_hostnames` MCA parameter when
   it is set to false.
 * If that parameter is set to false (default) then short hostnames
   (e.g., `node01`) will match with the long hostnames (e.g.,
   `node01.mycluster.org`). This allows a user (or resource manager)
   to mix the use of short and long hostnames.
  - Note that this mechanism does _not_ perform a DNS lookup, but
    instead strips off the FQDN by truncating the hostname string at
    the first `.` character (when not an IP address).
     - By default (`false`) the following is true:
       `node01 == node01.mycluster.org == node01.bogus.com`
       since we use `node01` as the hostname.
@jjhursey jjhursey force-pushed the topic/mixed-hostnames branch from cff8fae to d26dd2c August 26, 2016 21:10
@jjhursey jjhursey merged commit b0d8638 into open-mpi:master Aug 29, 2016
@jjhursey jjhursey deleted the topic/mixed-hostnames branch September 30, 2016 03:17
markalle pushed a commit to markalle/ompi that referenced this pull request Sep 12, 2020
 * Related to c0038eded3544db94f68f3d5b58c89739834eb96
 * See discussion on Open MPI community PR:
   - open-mpi#2015
 * After broader discussion it was decided to expand the use of the
   !orte_keep_fqdn_hostnames MCA parameter to shorten all hostnames.
   This was exactly what the orte_use_mixed_hostnames MCA parameter
   was doing.
   - This also means that the LSF folks will get the behavior they
     want by default in Open MPI.
 * Upstream will see one commit that combines this commit and
   c0038eded3544db94f68f3d5b58c89739834eb96

(cherry picked from commit a33a2308ca80766fe6cf1f217b5a467687669603)
@edvinas31

Hi, I know this is a very old thread, but is there any way to set the orte_keep_fqdn_hostnames value globally on a system instead of passing it on the command line?

@rhc54
Contributor

rhc54 commented Dec 2, 2023

Put the MCA param in the default param file where OMPI was installed.
