Speed up submission of Google Batch array jobs #6847

Open
thalassemia wants to merge 2 commits into nextflow-io:master from thalassemia:parallel-gcp-array

@thalassemia thalassemia commented Feb 19, 2026

I noticed that Nextflow was submitting my Google Batch array jobs very slowly, at roughly one submission every 3 seconds, even though my Nextflow config sets no explicit submission throttling. While investigating, I identified two sources of latency.

First, the Google Batch executor uses TaskPollingMonitor to submit jobs. It blocks on each submission, which greatly limits the achievable throughput compared to the ParallelPollingMonitor used by the AWS Batch executor. To fix this, I essentially copied the AwsBatchExecutor logic into GoogleBatchExecutor. The retry exceptions were sourced from BatchClient and documented here.
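
To illustrate the general idea (not the actual ParallelPollingMonitor code), here is a minimal Groovy sketch of submitting jobs from a thread pool with retries limited to one exception type. Everything in it is hypothetical: `TransientSubmitException`, `withRetry`, and `submitJob` are stand-ins invented for the example, and the sleep only imitates API latency.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.Future

// Hypothetical stand-in for a transient API error that is worth retrying.
class TransientSubmitException extends RuntimeException {
    TransientSubmitException(String msg) { super(msg) }
}

// Run `action` up to maxAttempts times, retrying only on TransientSubmitException.
def withRetry(int maxAttempts, Closure action) {
    int attempt = 0
    while (true) {
        attempt++
        try {
            return action.call(attempt)
        }
        catch (TransientSubmitException e) {
            if (attempt >= maxAttempts) throw e
        }
    }
}

// Fake submit call: sleeps to imitate network latency and fails once for job 3.
def submitJob = { int jobId, int attempt ->
    Thread.sleep(50)
    if (jobId == 3 && attempt == 1) throw new TransientSubmitException("try again")
    return "job-${jobId}".toString()
}

// Submissions run concurrently instead of blocking one after another.
ExecutorService pool = Executors.newFixedThreadPool(5)
List<String> names
try {
    List<Future<String>> futures = (1..5).collect { int id ->
        pool.submit({ withRetry(3) { int attempt -> submitJob.call(id, attempt) } } as Callable<String>)
    }
    names = futures.collect { it.get() }
}
finally {
    pool.shutdown()
}
println names
```

With a blocking loop, the five simulated submissions (plus one retry) would run back to back; with the pool, the total wait is bounded by the slowest single job, including its retry.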

Second, creation of task runs for job arrays happens sequentially in a for loop. I parallelized that.
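
As a rough sketch of that kind of change (not the code in this PR), the Groovy snippet below replaces a sequential for loop with `ExecutorService.invokeAll`, which returns futures in the same order as the input list, so the resulting collection keeps its original ordering. `createRun` and the task names are hypothetical placeholders, and the sleep stands in for whatever per-task work dominates the loop.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

// Hypothetical per-task setup step; the sleep imitates slow I/O.
def createRun = { String taskName ->
    Thread.sleep(50)
    return "run:${taskName}".toString()
}

// Five children, matching `array 5` in the test script.
List<String> tasks = (1..5).collect { "task-${it}".toString() }

// Sequential baseline: each iteration waits for the previous one.
List<String> sequential = []
for (String t : tasks) {
    sequential << createRun.call(t)
}

// Parallel version: one callable per task, results collected in input order.
ExecutorService pool = Executors.newFixedThreadPool(tasks.size())
List<String> parallel
try {
    List<Callable<String>> jobs = tasks.collect { String t ->
        ({ createRun.call(t) } as Callable<String>)
    }
    parallel = pool.invokeAll(jobs).collect { it.get() }
}
finally {
    pool.shutdown()
}
println parallel
```

Both lists hold the same elements in the same order; only the wall-clock time differs.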

I tested these changes using the following config and Nextflow script:

process {
    executor = 'google-batch'
    container = 'ubuntu:22.04'
    machineType = 'e2-medium'
    cpus = 1
    memory = '4 GB'
}

google {
    // Set your project ID
    project = 'PROJECT_ID'
    location = 'us-central1'
    batch {
        spot = true
        maxSpotAttempts = 3
    }
}

params.num_tasks = 50

process sayHello {
    tag "$task_id"

    array 5
    
    input:
    val task_id

    output:
    stdout

    script:
    """
    echo "Hello from task $task_id on \$(hostname)"
    sleep 5
    echo "Task $task_id completed"
    """
}

workflow {
    task_ids = Channel.of(1..params.num_tasks)
    sayHello(task_ids) | view
}

By reading .nextflow.log, I got the following times from starting Nextflow to submission of the final job:

| No changes | ParallelPollingMonitor | Parallel createTaskArray | Both |
| --- | --- | --- | --- |
| 27 s | 28 s | 17 s | 8 s |

Modeled off implementation in AWS Batch executor

Signed-off-by: Sean Cheah <cheah_sean@yahoo.com>

netlify bot commented Feb 19, 2026

Deploy Preview for nextflow-docs-staging canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | a52f860 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/nextflow-docs-staging/deploys/69970c5e3eda1c0008bfa989 |
