Commit 8ff7b1c

orte/pmix: Do not set orted exit status to one from proc abort
An application proc calling Abort (i.e. the application failed) doesn't mean that the ORTE subsystem has failed. On the contrary, ORTE is doing its job by gracefully shutting down the whole application.

An orted exiting with a non-zero status is a problem at least in plm/slurm environments, where orteds are launched via `srun` with the "--kill-on-bad-exit" flag. If one of the orteds exits with a non-zero status, Slurm immediately kills all the other orteds before they can clean up after themselves. As a result, we see a lot of leftover files in the `/tmp` directory.

(ported from 4af7a08)

Signed-off-by: Artem Polyakov <[email protected]>
1 parent dcd7cf8 commit 8ff7b1c

1 file changed: +1 -2 lines changed

orte/orted/pmix/pmix_server_gen.c

Lines changed: 1 addition & 2 deletions
@@ -14,7 +14,7 @@
  * Copyright (c) 2009 Cisco Systems, Inc. All rights reserved.
  * Copyright (c) 2011 Oak Ridge National Labs. All rights reserved.
  * Copyright (c) 2013-2016 Intel, Inc. All rights reserved.
- * Copyright (c) 2014 Mellanox Technologies, Inc.
+ * Copyright (c) 2014-2017 Mellanox Technologies, Inc.
  * All rights reserved.
  * Copyright (c) 2014 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
@@ -102,7 +102,6 @@ int pmix_server_abort_fn(opal_process_name_t *proc, void *server_object,
         p->exit_code = status;
     }
 
-    ORTE_UPDATE_EXIT_STATUS(status);
     ORTE_ACTIVATE_PROC_STATE(proc, ORTE_PROC_STATE_CALLED_ABORT);
 
     /* release the caller */
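
The behavioural change in the second hunk can be illustrated in isolation. The following is a minimal sketch, not ORTE code: every type and function name in it (`sketch_proc_t`, `sketch_daemon_t`, `sketch_abort_old`, `sketch_abort_new`) is hypothetical, and the daemon-wide status is modelled as a plain assignment, which may not match what `ORTE_UPDATE_EXIT_STATUS` actually does. It only shows the idea of recording the abort status on the process while leaving the daemon's own exit status untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these names exist in ORTE. */
typedef struct {
    int exit_code;     /* status reported by the application process */
    int called_abort;  /* non-zero once the process has requested an abort */
} sketch_proc_t;

typedef struct {
    int exit_status;   /* status the daemon itself will exit with */
} sketch_daemon_t;

/* Before the change (as modelled here): the abort status is recorded on the
 * process AND copied into the daemon's own exit status. */
void sketch_abort_old(sketch_daemon_t *d, sketch_proc_t *p, int status)
{
    assert(d != NULL && p != NULL);
    p->exit_code = status;
    d->exit_status = status;   /* assumption: simple overwrite */
    p->called_abort = 1;
}

/* After the change: the abort status stays with the process; the daemon's
 * exit status is not modified by the abort handler. */
void sketch_abort_new(sketch_daemon_t *d, sketch_proc_t *p, int status)
{
    assert(d != NULL && p != NULL);
    p->exit_code = status;
    p->called_abort = 1;
}
```

In this model, a daemon that handles an abort through `sketch_abort_new` can still exit with status 0, so a launcher that reacts to non-zero daemon exits would have no reason to kill the remaining daemons.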
