Environment variables detected with 1 task/cpu but fails with >1 cpu #3972
-
It sounds like you are using a C program that you built yourself. In this case, you should install it to your host environment so that you can use it like any other program. If it expects some shared libraries, then you should install those libraries to the system path or add them to LD_LIBRARY_PATH in your launch environment.

This is the best practice when using Nextflow with locally-built programs. While you could have Nextflow take care of these environment variables, they should really be provided by the launching environment. That way you can use the pipeline regardless of whether the program is built from source or installed via a package manager.

By the way, Nextflow now supports Spack, so if your software can be installed via Spack then that's probably the easiest way to do it. If you can give me a more concrete example, I might be able to give some better advice.
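If the missing dependency is a shared library, one quick way to check whether a given environment can actually resolve it is to ask the dynamic loader directly from the same shell that launches the pipeline. This is a minimal sketch, not from the thread, and the library name is hypothetical:

```python
import ctypes
import os

# Hypothetical name of the shared library the C program depends on.
LIB_NAME = "libmydep.so"

print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH"))
try:
    ctypes.CDLL(LIB_NAME)  # dlopen() searches LD_LIBRARY_PATH, among other places
    print(f"{LIB_NAME} resolved OK")
except OSError as exc:
    print(f"{LIB_NAME} NOT found: {exc}")
```

Running this from the interactive shell and again from inside a submitted job would show whether the launching environment is the one that loses the path.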
-
So a concrete example would be:

The hack that seems to allow for successfully running on multiple CPUs involves:

I haven't heard of Spack, but it seems quite useful for standardizing HPC workflows; I will check that out too.
-
Hi there,
Setup:
I am running nextflow (on an sge cluster with Ubuntu linux) for a very simple workflow consisting of:
- a .nf file
- a .config file
Issue:
I am using a specific C program that has certain path requirements, which means the path has to be exported for every run/task. Notably, I have no issue with the pipeline when only using a single cpu. However, with more than 1 cpu the job fails every time. When it fails, it returns a message saying "[dependency] cannot be found", where [dependency] is the code related to the specific C program.

Attempted failed solutions:
- An export statement in the script block
- An export statement in the beforeScript statement
- An export statement in .bashrc, with the .bashrc sourced within and before the workflow
- An export statement in the bash script used for job submission on the target cluster (sge)
- Adding the path to PATH
Other diagnostics:
- The error arises at the point where the joblib Parallel function is called

Likely problem and questions I need help with:
- The likely problem seems to be the export being dropped, and path information being lost, when a parallel task is executed by the Parallel function in the joblib library.
- Does executing parallel tasks repeatedly under the hood cause problems in nextflow?
- Is there a better place to put the export?

Thanks for the help. Please let me know if you need any specific outputs and I can add them in the follow-up.
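One way to check whether the exported value actually survives into the joblib workers is a minimal diagnostic, separate from the real pipeline (this sketch only assumes joblib is installed):

```python
import os
from joblib import Parallel, delayed

def report_env(i):
    # With the default "loky" backend each worker is a separate process that
    # inherits the parent's environment at the time it is spawned.
    return i, os.getpid(), os.environ.get("LD_LIBRARY_PATH")

if __name__ == "__main__":
    print("parent LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH"))
    for i, pid, path in Parallel(n_jobs=4)(delayed(report_env)(i) for i in range(4)):
        print(f"task {i} (pid {pid}): LD_LIBRARY_PATH={path}")
```

Running this from the same script block (or with the same beforeScript) would show whether the value exported there is visible inside the parallel workers.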
Updates
May 25, 2023: Adding an update to this - I have found a workaround: exporting my path to both (i) LD_LIBRARY_PATH in beforeScript and (ii) os.environ['LD_LIBRARY_PATH'] in Python resolves the issue. Ideally, it seems like incorrect functionality to have to set os.environ across all classes/functions that execute a Parallel joblib call just to make the workflow run - so I will leave this issue open.
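For reference, a minimal sketch of part (ii) of that workaround; the library directory and tool name are hypothetical, since the real pipeline code is not shown in the thread. Setting os.environ once in the parent process, before the Parallel call, means the spawned workers and any subprocess they launch inherit the value. A plausible reason this matters is that the dynamic linker typically reads LD_LIBRARY_PATH once at process start-up, so changing it inside an already running process mainly affects child processes launched afterwards:

```python
import os
import subprocess
from joblib import Parallel, delayed

# Hypothetical directory holding the C program's shared libraries.
LIB_DIR = "/opt/mytool/lib"

# Set once in the parent, before Parallel spawns its workers; the workers and
# any subprocess they launch inherit this environment.
os.environ["LD_LIBRARY_PATH"] = LIB_DIR + os.pathsep + os.environ.get("LD_LIBRARY_PATH", "")

def run_chunk(chunk):
    # Hypothetical invocation of the locally-built C program on one chunk of work.
    subprocess.run(["mytool", "--input", chunk], check=True)
    return chunk

if __name__ == "__main__":
    done = Parallel(n_jobs=4)(delayed(run_chunk)(c) for c in ["a", "b", "c", "d"])
    print("finished:", done)
```

Setting it in one place like this avoids having to touch os.environ in every class/function that makes a Parallel call, though it still duplicates what beforeScript already exports.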