1 | 1 | # -*- text -*- |
2 | 2 | # |
3 | | -# Copyright (c) 2012-2020 Cisco Systems, Inc. All rights reserved. |
| 3 | +# Copyright (c) 2012-2022 Cisco Systems, Inc. All rights reserved. |
4 | 4 | # |
5 | 5 | # $COPYRIGHT$ |
6 | 6 | # |
@@ -104,16 +104,6 @@ value will be ignored. |
104 | 104 | Value: %s |
105 | 105 | Message: %s |
106 | 106 | # |
107 | | -[device present but not up] |
108 | | -Open MPI has found a usNIC device that is present / listed in Linux, |
109 | | -but in a "down" state. It will not be used by this MPI job. |
110 | | - |
111 | | -You may wish to check this device, especially if it is unexpectedly |
112 | | -down. |
113 | | - |
114 | | - Local server: %s |
115 | | - Device name: %s |
116 | | -# |
117 | 107 | [MTU mismatch] |
118 | 108 | The MTU does not match on local and remote hosts. All interfaces on |
119 | 109 | all hosts participating in an MPI job must be configured with the same |
@@ -224,22 +214,6 @@ Check the resulting "mymap*" files to see the exact pairing of IP |
224 | 214 | interfaces. Inconsistent results may be indicative of underlying |
225 | 215 | network misconfigurations. |
226 | 216 | # |
227 | | -[fi_av_insert timeout] |
228 | | -The usnic BTL failed to create addresses for remote peers within the |
229 | | -specified timeout. This usually means that ARP requests failed to |
230 | | -resolve in time. You may be able to solve the problem by increasing |
231 | | -the usnic BTL's ARP timeout. If that doesn't work, you should |
232 | | -diagnose why ARP replies are apparently not being delivered in a |
233 | | -timely manner. |
234 | | - |
235 | | -The usNIC interface listed below will be ignored. Your MPI |
236 | | -application will likely either run with degraded performance and/or |
237 | | -abort. |
238 | | - |
239 | | - Server: %s |
240 | | - usNIC interface: %s |
241 | | - Current ARP timeout: %d (btl_usnic_arp_timeout MCA param) |
242 | | -# |
243 | 217 | [fi_av_eq too small] |
244 | 218 | The usnic BTL was told to create an address resolution queue that was |
245 | 219 | too small via the mca_btl_usnic_av_eq_num MCA parameter. This |
@@ -283,56 +257,6 @@ connectivity map file will not be written. |
283 | 257 | Working directory: %s |
284 | 258 | Error: %s (%d) |
285 | 259 | # |
286 | | -[received too many short packets] |
287 | | -WARNING: The usnic BTL received a significant number of abnormally |
288 | | -short packets on a single network interface. This may be due to |
289 | | -corruption or congestion in the network fabric. It may be useful to |
290 | | -run a physical/layer 0 diagnostic. |
291 | | - |
292 | | -Your job will continue, but if this poor network behavior continues, |
293 | | -you may experience lower-than-expected performance due to overheads |
294 | | -caused by higher-than-usual retransmission rates (to compensate for |
295 | | -the corrupted received packets). |
296 | | - |
297 | | - Local server: %s |
298 | | - usNIC interface: %s |
299 | | - # of short packets |
300 | | - received so far: %d |
301 | | - |
302 | | -You will only receive this warning once per MPI process per job. |
303 | | - |
304 | | -If you know that your network environment is lossy/heavily congested |
305 | | -such that short/corrupted packets are expected, you can disable this |
306 | | -warning by setting the btl_usnic_max_short_packets MCA parameter to 0. |
307 | | -# |
308 | | -[non-receive completion error] |
309 | | -WARNING: The usnic BTL has detected an error in the completion of a |
310 | | -non-receive event. This is highly unusual, and may indicate an error |
311 | | -in the usNIC subsystem on this server. |
312 | | - |
313 | | -Your MPI job will continue, but you should monitor the job and ensure |
314 | | -that it behaves correctly. |
315 | | - |
316 | | - Local server: %s |
317 | | - usNIC interface: %s |
318 | | - Channel index: %d |
319 | | - Completion status: %s (%d) |
320 | | - Work request ID: %p |
321 | | - Opcode: %s (%d) |
322 | | - |
323 | | -If this error keeps happening, you should contact Cisco technical |
324 | | -support. |
325 | | -# |
326 | | -[device present but not up] |
327 | | -Open MPI has found a usNIC device that is present / listed in Linux, |
328 | | -but in a "down" state. It will not be used by this MPI job. |
329 | | - |
330 | | -You may wish to check this device, especially if it is unexpectedly |
331 | | -down. |
332 | | - |
333 | | - Local server: %s |
334 | | - Device name: %s |
335 | | -# |
336 | 260 | [transport mismatch] |
337 | 261 | Open MPI has found two servers with different underlying usNIC |
338 | 262 | transports. This is an unsupported configuration; all usNIC devices |
|