@@ -36,24 +36,33 @@ places. If you have:
3636 there.
3737
3838 .. note:: Because of spam, only subscribers to the mailing list are
39- allowed to post to the mailing list. Specifically: you must
40- subscribe to the mailing list before posting.
39+ allowed to post to the mailing list. Specifically: **you must
40+ subscribe to the mailing list before posting.**
4141
42- * If you have a run-time question or problem, see the :ref:`For
43- run-time problems <getting-help-run-time-label>` section below for
44- the content of what to include in your email.
4542 * If you have a compile-time question or problem, see the :ref:`For
46- compile-time problems <getting-help-compile-time-label>` section
43+ problems building or installing Open MPI
44+ <getting-help-compile-time-label>` section below for the content
45+ of what to include in your email.
46+
47+ * If you have problems launching your MPI or OpenSHMEM application
48+ successfully, see the :ref:`For problems launching MPI or
49+ OpenSHMEM applications <getting-help-launching-label>` section
50+ below for the content of what to include in your email.
51+
52+ * If you have other questions or problems about running your MPI or
53+ OpenSHMEM application, see the :ref:`For problems running MPI or
54+ OpenSHMEM applications <getting-help-running-label>` section
4755 below for the content of what to include in your email.
4856
49- .. note:: The mailing lists have **a 150 KB size limit on
50- messages** (this is a limitation of the mailing list web
51- archives). If attaching your files results in an email larger
52- than this, please try compressing it and/or posting it on the
53- web somewhere for people to download. A `Github Gist
54- <https://gist.github.com/>`_ or a `Pastebin
55- <https://pastebin.com/>`_ might be a good choice for posting
56- large text files.
57+ .. important:: The more information you include in your report, the
58+ better. E-mails/bug reports simply stating, "It doesn't work!"
59+ are not helpful; we need to know as much information about your
60+ environment as possible in order to provide meaningful
61+ assistance.
62+
63+ **The best way to get help** is to provide a "recipe" for
64+ reproducing the problem. This will allow the Open MPI developers
65+ to see the error for themselves, and therefore be able to fix it.
5766
5867 .. important:: Please **use a descriptive "subject" line in your
5968 email!** Some Open MPI question-answering people decide whether
@@ -75,82 +84,152 @@ places. If you have:
7584 there.
7685
7786If you're unsure where to send your question, subscribe and send an
78- email to the user's mailing list.
87+ email to the user's mailing list (i.e., option #1, above).
7988
80- .. _getting-help-run-time-label:
89+ .. _getting-help-compile-time-label:
8190
82- For run-time problems
83- ---------------------
91+ For problems building or installing Open MPI
92+ --------------------------------------------
8493
85- Please provide *all* of the following information:
94+ If you cannot successfully configure, build, or install Open MPI,
95+ please provide *all* of the following information:
8696
87- .. important:: The more information you include in your report, the
88- better. E-mails/bug reports simply stating, "It doesn't work!"
89- are not helpful; we need to know as much information about your
90- environment as possible in order to provide meaningful assistance.
97+ #. The version of Open MPI that you're using.
9198
92- **The best way to get help** is to provide a "recipe" for
93- reproducing the problem. This will allow the Open MPI developers
94- to see the error for themselves, and therefore be able to fix it.
99+ #. The stdout and stderr from running ``configure``.
95100
96- #. The version of Open MPI that you're using.
101+ #. All ``config.log`` files from the Open MPI build tree.
102+
103+ #. Output from when you ran ``make V=1 all`` to build Open MPI.
104+
105+ #. Output from when you ran ``make install`` to install Open MPI.
106+
107+ The script below may be helpful to gather much of the above
108+ information (adjust as necessary for your specific environment):
109+
110+ .. code-block:: bash
111+
112+    #!/usr/bin/env bash
113+
114+    set -euxo pipefail
115+
116+    # Make a directory for the output files
117+    dir="`pwd`/ompi-output"
118+    mkdir $dir
119+
120+    # Fill in the options you want to pass to configure here
121+    options=""
122+    ./configure $options 2>&1 | tee $dir/config.out
123+    tar -cf - `find . -name config.log` | tar -x -C $dir -f -
124+
125+    # Build and install Open MPI
126+    make V=1 all 2>&1 | tee $dir/make.out
127+    make install 2>&1 | tee $dir/make-install.out
128+
129+    # Bundle up all of these files into a tarball
130+    filename="ompi-output.tar.bz2"
131+    tar -jcf $filename `basename $dir`
132+    echo "Tarball $filename created"
133+
134+ Then attach the resulting ``ompi-output.tar.bz2`` file to your report.
135+
136+ .. caution:: The mailing lists have **a 150 KB size limit on
137+ messages** (this is a limitation of the mailing list web archives).
138+ If attaching the tarball makes your message larger than 150 KB, you
139+ may need to post the tarball elsewhere and include a link to that
140+ tarball in your mail to the list.
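A quick size check before mailing can be sketched as follows (a hypothetical helper, not part of the official scripts; the 150 KB figure and the ``ompi-output.tar.bz2`` name come from the text above):

```shell
#!/usr/bin/env bash
# Sketch: warn if the tarball exceeds the mailing list's 150 KB limit.
limit=$((150 * 1024))
file="ompi-output.tar.bz2"
# Treat a missing file as size 0 so the check is safe to run anywhere
size=$(wc -c < "$file" 2>/dev/null || echo 0)
if [ "$size" -gt "$limit" ]; then
    echo "$file is $size bytes: too large to attach; post it elsewhere and mail a link"
else
    echo "$file is $size bytes: small enough to attach"
fi
```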
97141
98- #. The ``config.log`` file from the top-level Open MPI directory, if
99- available (**compress or post to a Github gist or Pastebin**).
142+ .. _getting-help-launching-label:
143+
144+ For problems launching MPI or OpenSHMEM applications
145+ ----------------------------------------------------
146+
147+ If you cannot successfully launch simple applications across multiple
148+ nodes (e.g., the non-MPI ``hostname`` command, or the MPI "hello world"
149+ or "ring" sample applications in the ``examples/`` directory), please
150+ provide *all* of the information from the :ref:`For problems building
151+ or installing Open MPI <getting-help-compile-time-label>` section, and
152+ *all* of the following additional information:
100153
101154#. The output of the ``ompi_info --all`` command from the node where
102- you're invoking ``mpirun``.
103-
104- #. If you have questions or problems about process affinity /
105- binding, send the output from running the ``lstopo -v``
106- command from a recent version of `Hwloc
107- <https://www.open-mpi.org/projects/hwloc/>`_. *The detailed
108- text output is preferable to a graphical output.*
109-
110- #. If running on more than one node |mdash| especially if you're
111- having problems launching Open MPI processes |mdash| also include
112- the output of the ``ompi_info --version`` command **from each node
113- on which you're trying to run**.
114-
115- #. If you are able to launch MPI processes, you can use
116- ``mpirun`` to gather this information. For example, if
117- the file ``my_hostfile.txt`` contains the hostnames of the
118- machines on which you are trying to run Open MPI
119- processes::
120-
121-    shell$ mpirun --map-by node --hostfile my_hostfile.txt --output tag ompi_info --version
122-
123-
124- #. If you cannot launch MPI processes, use some other mechanism
125- |mdash| such as ``ssh`` |mdash| to gather this information. For
126- example, if the file ``my_hostfile.txt`` contains the hostnames
127- of the machines on which you are trying to run Open MPI
128- processes:
129-
130- .. code-block:: sh
131-
132-    # Bourne-style shell (e.g., bash, zsh, sh)
133-    shell$ for h in `cat my_hostfile.txt`
134-    > do
135-    > echo "=== Hostname: $h"
136-    > ssh $h ompi_info --version
137-    > done
138-
139- .. code-block:: sh
140-
141-    # C-style shell (e.g., csh, tcsh)
142-    shell% foreach h (`cat my_hostfile.txt`)
143-    foreach? echo "=== Hostname: $h"
144-    foreach? ssh $h ompi_info --version
145-    foreach? end
146-
147- #. A *detailed* description of what is failing. The more
148- details that you provide, the better. E-mails saying "My
149- application doesn't work!" will inevitably be answered with
150- requests for more information about *exactly what doesn't
151- work*; so please include as much detailed information in your
152- initial e-mail as possible. We strongly recommend that you
153- include the following information:
155+ you are invoking :ref:`mpirun(1) <man1-mpirun>`.
156+
157+ #. If you have questions or problems about process mapping or binding,
158+ send the output from running the ``lstopo -v`` and ``lstopo --of
159+ xml`` commands from a recent version of `Hwloc
160+ <https://www.open-mpi.org/projects/hwloc/>`_.
161+
162+ #. If running on more than one node, also include the output of the
163+ ``ompi_info --version`` command **from each node on which you are
164+ trying to run**.
165+
166+ #. The output of running ``mpirun --map-by ppr:1:node --prtemca
167+ plm_base_verbose 100 --prtemca rmaps_base_verbose 100 --display
168+ alloc hostname``. Add in a ``--hostfile`` argument if needed for
169+ your environment.
170+
171+ The script below may be helpful to gather much of the above
172+ information (adjust as necessary for your specific environment).
173+
174+ .. note:: It is safe to run this script after running the script from
175+ the :ref:`building and installing
176+ <getting-help-compile-time-label>` section.
177+
178+ .. code-block:: bash
179+
180+    #!/usr/bin/env bash
181+
182+    set -euxo pipefail
183+
184+    # Make a directory for the output files
185+    dir="`pwd`/ompi-output"
186+    mkdir -p $dir
187+
188+    # Get installation and system information
189+    ompi_info --all 2>&1 | tee $dir/ompi-info-all.out
190+    lstopo -v | tee $dir/lstopo-v.txt
191+    lstopo --of xml | tee $dir/lstopo.xml
192+
193+    # Have a text file "my_hostfile.txt" containing the hostnames on
194+    # which you are trying to launch
195+    for host in `cat my_hostfile.txt`; do
196+        ssh $host ompi_info --version 2>&1 | tee $dir/ompi_info-version-$host.out
197+        ssh $host lstopo -v | tee $dir/lstopo-v-$host.txt
198+        ssh $host lstopo --of xml | tee $dir/lstopo-$host.xml
199+    done
200+
201+    # Have a my_hostfile.txt file if needed for your environment, or
202+    # remove the --hostfile argument altogether if not needed.
203+    set +e
204+    mpirun \
205+        --hostfile my_hostfile.txt \
206+        --map-by ppr:1:node \
207+        --prtemca plm_base_verbose 100 \
208+        --prtemca rmaps_base_verbose 100 \
209+        --display alloc \
210+        hostname 2>&1 | tee $dir/mpirun-hostname.out
211+
212+    # Bundle up all of these files into a tarball
213+    filename="ompi-output.tar.bz2"
214+    tar -jcf $filename `basename $dir`
215+    echo "Tarball $filename created"
216+
217+ .. _getting-help-running-label:
218+
219+ For problems running MPI or OpenSHMEM applications
220+ --------------------------------------------------
221+
222+ If you can successfully launch parallel MPI or OpenSHMEM applications,
223+ but the jobs fail during the run, please provide *all* of the
224+ information from the :ref:`For problems building or installing Open
225+ MPI <getting-help-compile-time-label>` section, *all* of the
226+ information from the :ref:`For problems launching MPI or OpenSHMEM
227+ applications <getting-help-launching-label>` section, and then *all*
228+ of the following additional information:
229+
230+ #. A *detailed* description of what is failing. *The more details
231+ that you provide, the better.* Please include at least the
232+ following information:
154233
155234 * The exact command used to run your application.
156235
@@ -164,77 +243,21 @@ Please provide *all* of the following information:
164243 any required support libraries, such as libraries required
165244 for high-speed networks such as InfiniBand).
166245
167- #. Detailed information about your network:
246+ #. The source code of a short sample program (preferably in C or
247+ Fortran) that exhibits the problem.
248+
249+ #. If you are experiencing networking problems, include detailed
250+ information about your network.
168251
169252 .. error:: TODO Update link to IB FAQ entry.
170253
171254 #. For RoCE- or InfiniBand-based networks, include the information
172255 :ref:`in this FAQ entry <faq-ib-troubleshoot-label>`.
173256
174- #. For Ethernet-based networks (including RoCE-based networks,
257+ #. For Ethernet-based networks (including RoCE-based networks) ,
175258 include the output of the ``ip addr `` command (or the legacy
176259 ``ifconfig `` command) on all relevant nodes.
177260
178261 .. note:: Some Linux distributions do not put ``ip`` or
179262 ``ifconfig`` in the default ``PATH`` of normal users.
180263 Try looking for it in ``/sbin`` or ``/usr/sbin``.
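In the same spirit as the gathering scripts above, collecting this output from every node can be sketched as follows (a hypothetical helper; the ``my_hostfile.txt`` file name and output directory are assumptions to adjust for your environment):

```shell
#!/usr/bin/env bash
# Sketch: collect interface information from each node listed in
# my_hostfile.txt (assumed file name; adjust for your setup).
dir=ompi-net-output
mkdir -p $dir
for host in $(cat my_hostfile.txt 2>/dev/null); do
    # Extend PATH on the remote side, since some distributions keep
    # ip/ifconfig in /sbin or /usr/sbin, outside a normal user's PATH.
    ssh $host 'PATH=$PATH:/sbin:/usr/sbin; ip addr || ifconfig' \
        2>&1 | tee $dir/ip-addr-$host.out
done
```

The per-host output files can then be added to the same tarball as the rest of the information requested above.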
181-
182- .. _getting-help-compile-time-label:
183-
184- For compile problems
185- --------------------
186-
187- Please provide *all* of the following information:
188-
189- .. important:: The more information you include in your report, the
190- better. E-mails/bug reports simply stating, "It doesn't work!"
191- are not helpful; we need to know as much information about your
192- environment as possible in order to provide meaningful assistance.
193-
194- **The best way to get help** is to provide a "recipe" for
195- reproducing the problem. This will allow the Open MPI developers
196- to see the error for themselves, and therefore be able to fix it.
197-
198- #. The version of Open MPI that you're using.
199-
200- #. All output (both compilation output and run time output, including
201- all error messages).
202-
203- #. Output from when you ran ``./configure`` to configure Open MPI
204- (**compress or post to a GitHub gist or Pastebin!**).
205-
206- #. The ``config.log`` file from the top-level Open MPI directory
207- (**compress or post to a GitHub gist or Pastebin!**).
208-
209- #. Output from when you ran ``make V=1`` to build Open MPI (**compress
210- or post to a GitHub gist or Pastebin!**).
211-
212- #. Output from when you ran ``make install`` to install Open MPI
213- (**compress or post to a GitHub gist or Pastebin!**).
214-
215- To capture the output of the configure and make steps, you can use the
216- script command or the following technique to capture all the files in
217- a unique directory, suitable for tarring and compressing into a single
218- file:
219-
220- .. code-block:: sh
221-
222-    # Bourne-style shell (e.g., bash, zsh, sh)
223-    shell$ mkdir $HOME/ompi-output
224-    shell$ ./configure {options} 2>&1 | tee $HOME/ompi-output/config.out
225-    shell$ make all 2>&1 | tee $HOME/ompi-output/make.out
226-    shell$ make install 2>&1 | tee $HOME/ompi-output/make-install.out
227-    shell$ cd $HOME
228-    shell$ tar jcvf ompi-output.tar.bz2 ompi-output
229-
230- .. code-block:: sh
231-
232-    # C-style shell (e.g., csh, tcsh)
233-    shell% mkdir $HOME/ompi-output
234-    shell% ./configure {options} |& tee $HOME/ompi-output/config.out
235-    shell% make all |& tee $HOME/ompi-output/make.out
236-    shell% make install |& tee $HOME/ompi-output/make-install.out
237-    shell% cd $HOME
238-    shell% tar jcvf ompi-output.tar.bz2 ompi-output
239-
240- Then attach the resulting ``ompi-output.tar.bz2`` file to your report.