Bug report
When a large number of files (>5000) need to be moved or copied by publishDir, Nextflow fails with the following error:
error [java.lang.OutOfMemoryError]: unable to create native thread: possibly out of memory or process/resource limits reached
Feb-01 14:38:05.816 [Task monitor] ERROR nextflow.processor.TaskProcessor - Execution aborted due to an unexpected error
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
From what I can see, the files are created in the workDir, and the failure only happens while the publishDir directive is being applied.
I have already tried changing several parameters via NXF_OPTS and _JAVA_OPTIONS, among others, but none of them fixes the issue.
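Since the message mentions "process/resource limits reached", it may help to record the limits of the shell that launches Nextflow. The following is only a diagnostic sketch, assuming a Linux host (where threads count against the per-user process limit); the function names `describe_limit` and `thread_related_limits` are made up for this illustration and are not part of Nextflow.

```python
import resource  # Unix-only standard library module


def describe_limit(name, limit_id):
    """Return a one-line summary of the soft and hard value of one limit."""
    soft, hard = resource.getrlimit(limit_id)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


def thread_related_limits():
    """Summarize the limits most likely to matter when threads cannot be created."""
    limits = {
        "max user processes (RLIMIT_NPROC)": resource.RLIMIT_NPROC,
        "max open files (RLIMIT_NOFILE)": resource.RLIMIT_NOFILE,
        "stack size in bytes (RLIMIT_STACK)": resource.RLIMIT_STACK,
    }
    return [describe_limit(name, limit_id) for name, limit_id in limits.items()]


if __name__ == "__main__":
    for line in thread_related_limits():
        print(line)
```

Running this in the same environment (same user, same scheduler job, same container) as the failing pipeline shows whether the soft process limit is low enough to be a plausible cause.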
Expected behavior and actual behavior
Expected behavior: output files are copied/moved to the publishDir.
Actual behavior: Pipeline fails with
ERROR ~ Execution aborted due to an unexpected error
-- Check '.nextflow.log' file for details
with the above detailed error message printed in the log file.
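To quantify how far publishing got before the failure, the task work directory can be compared with the publish directory. This is a minimal sketch, not something Nextflow provides: `unpublished_files` is a hypothetical helper, and both directory paths are placeholders to be replaced with the real task directory and the value of `params.output_dir`.

```python
from pathlib import Path


def unpublished_files(task_dir, publish_dir, pattern="*.txt"):
    """Return sorted file names matching `pattern` that exist in task_dir
    but are missing from publish_dir."""
    produced = {p.name for p in Path(task_dir).glob(pattern)}
    published = {p.name for p in Path(publish_dir).glob(pattern)}
    return sorted(produced - published)


if __name__ == "__main__":
    # Placeholder paths (assumptions): substitute the actual directories.
    missing = unpublished_files("work/ab/cdef", "results")
    print(f"{len(missing)} file(s) were not published")
```

An empty result would mean every output file reached the publish directory; a non-empty one lists the files that were never copied.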
Steps to reproduce the problem
To simulate the scenario, I created the following dummy script and dummy pipeline, which just create 20,000 files and copy them to another directory using publishDir. I get the same error as above when running it. Dummy script and Nextflow pipeline below:
import os
for i in range(20000):
    filename = f"file_{i}.txt"
    with open(filename, "w") as f:
        pass
#!/usr/bin/env nextflow
params.output_dir = ''
process create_files {
    publishDir "${params.output_dir}", mode:'copy'
    output:
    path("*.txt")
    script:
    """
    python /path_to_python_script/test_20000_files/create_20k_files.py
    """
}
workflow {
    create_files()
}
Program output
Environment
- Nextflow version: 23.10.1.5891
- Java version: openjdk 21-internal 2023-09-19
- Operating system: Linux
- Bash version: GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)