Commit ab2bf8e

PERF: SGE Graph submission was potentially very inefficient
Given an experiment where we have a simple 3-node workflow that we wish to run independently on 4 subjects, we would expect the following dependencies (where dependencies are given in parentheses):

S1          S2          S3          S4
J1          J1          J1          J1
J2(S1.J1)   J2(S2.J1)   J2(S3.J1)   J2(S4.J1)
J3(S1.J2)   J3(S2.J2)   J3(S3.J2)   J3(S4.J2)

The problem is that the dependencies were created based on the assumption that job names are unique, which resulted in the following dependencies:

S1          S2                S3                      S4
J1          J1                J1                      J1
J2(S1.J1)   J2(S1.J1,S2.J1)   J2(S1.J1,S2.J1,S3.J1)   J2(S1.J1,S2.J1,S3.J1,S4.J1)
J3(S1.J2)   J3(S1.J2,S2.J2)   J3(S1.J2,S2.J2,S3.J2)   J3(S1.J2,S2.J2,S3.J2,S4.J2)

As the number of subjects grew, the interdependencies grew, which enforced unnecessary constraints on when subsequent jobs could be run.

To overcome this problem, job dependencies are now created based on the unique job IDs that are returned during qsub job submission, rather than the a priori known (but often duplicated) job name.

The effective changes to the "submit_jobs.sh" file are:

<<<<<<<<
#!/usr/bin/env bash
- job00000=$(qsub -N job00000 /tmp/batch/node1.sh)
- job00001=$(qsub -hold_jid job00000 -N job00001 /tmp/batch/node2.sh)
================
#!/usr/bin/env bash
+ job00000=$(qsub -N job00000 /tmp/batch/node1.sh | awk '{print $3}')
+ job00001=$(qsub -hold_jid ${job00000} -N job00001 /tmp/batch/node2.sh | awk '{print $3}')
>>>>>>>>

NOTE: In the second case, we can guarantee uniqueness of the dependencies no matter how many subjects are simultaneously run!
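The way name-based holds accumulate across subjects can be illustrated with a small simulation. This is only a sketch and not nipype code: it assumes subjects are submitted one after another, that all earlier jobs are still queued, and that holding on a job name matches every queued job carrying that name. The function `simulate` and the `S<n>.J<m>` labels are hypothetical and exist only for this illustration.

```python
def simulate(n_subjects, n_nodes, by_name):
    """Return {job label: jobs it waits on} for a linear workflow.

    Hypothetical model: with by_name=True a hold matches every job
    submitted so far that shares the name; with by_name=False it matches
    only the unique ID captured from the same subject's previous job.
    """
    submitted = []  # (label, name) of every job submitted so far, in order
    holds = {}
    for s in range(1, n_subjects + 1):
        prev_label = None
        for j in range(1, n_nodes + 1):
            label = 'S%d.J%d' % (s, j)
            name = 'job%05d' % (j - 1)  # same name in every subject's script
            if j == 1:
                deps = []  # first node has nothing to wait on
            elif by_name:
                wanted = 'job%05d' % (j - 2)
                deps = [lbl for lbl, nm in submitted if nm == wanted]
            else:
                deps = [prev_label]
            holds[label] = deps
            submitted.append((label, name))
            prev_label = label
    return holds


by_name = simulate(4, 3, by_name=True)
by_id = simulate(4, 3, by_name=False)
print(by_name['S4.J2'])  # ['S1.J1', 'S2.J1', 'S3.J1', 'S4.J1']
print(by_id['S4.J2'])    # ['S4.J1']
```

Under these assumptions the name-based hold list for subject k has k entries, while the ID-based list always has exactly one.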
1 parent e0537ae commit ab2bf8e

1 file changed, +3 -3 lines changed

nipype/pipeline/plugins/sgegraph.py

Lines changed: 3 additions & 3 deletions
@@ -65,8 +65,8 @@ def _submit_graph(self, pyfiles, dependencies, nodes):
 if idx in dependencies:
     values = ' '
     for jobid in dependencies[idx]:
-        values += 'job%05d,' % jobid
-    if 'job' in values:
+        values += '${job%05d},' % jobid
+    if values != ' ':  # i.e. if some jobs were added to dependency list
         values = values.rstrip(',')
         deps = '-hold_jid%s' % values
 jobname = 'job%05d' % (idx)
@@ -79,7 +79,7 @@ def _submit_graph(self, pyfiles, dependencies, nodes):
 if self._qsub_args.count('-o ') == 0:
     stdoutFile = '-o {outFile}'.format(
         outFile=batchscriptoutfile)
-full_line = '{jobNm}=$(qsub {outFileOption} {errFileOption} {extraQSubArgs} {dependantIndex} -N {jobNm} {batchscript})\n'.format(
+full_line = '{jobNm}=$(qsub {outFileOption} {errFileOption} {extraQSubArgs} {dependantIndex} -N {jobNm} {batchscript} | awk \'{{print $3}}\')\n'.format(
     jobNm=jobname,
     outFileOption=stdoutFile,
     errFileOption=stderrFile,
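
As a rough, standalone illustration of what the two hunks produce together, the sketch below builds script lines in the same shape as the "submit_jobs.sh" example from the commit message. The helpers `hold_option`, `submit_line`, and `third_field` are hypothetical names, not part of nipype, and the output and error file options are left out for brevity. The sample qsub message is an assumption about the usual SGE submission text, which is what makes the third whitespace-separated field the job ID.

```python
def hold_option(dep_indices):
    """Build the -hold_jid option from dependency indices (hypothetical helper)."""
    # Reference the shell variables that hold the captured job IDs,
    # rather than the (possibly duplicated) job names.
    values = ','.join('${job%05d}' % i for i in dep_indices)
    return '-hold_jid %s' % values if values else ''


def submit_line(idx, dep_indices, batchscript):
    """Compose one line of the submission script (hypothetical helper)."""
    jobname = 'job%05d' % idx
    parts = ['qsub', hold_option(dep_indices), '-N', jobname, batchscript]
    cmd = ' '.join(p for p in parts if p)
    # Keep only the third field of qsub's output, i.e. the job ID.
    return "%s=$(%s | awk '{print $3}')\n" % (jobname, cmd)


def third_field(line):
    """Python equivalent of awk '{print $3}' for a single line."""
    return line.split()[2]


print(submit_line(0, [], '/tmp/batch/node1.sh'), end='')
print(submit_line(1, [0], '/tmp/batch/node2.sh'), end='')
# Assumed SGE message format: Your job <id> ("<name>") has been submitted
print(third_field('Your job 4242 ("job00001") has been submitted'))
```

Because each `${jobNNNNN}` expands to an ID captured in the same script run, the hold list cannot pick up identically named jobs submitted for other subjects.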
