Merged
25 changes: 25 additions & 0 deletions fusion_docs/faq.md
@@ -48,6 +48,31 @@ If you didn’t notice any performance improvement with Fusion, the bottleneck m
- [Amazon Elastic Kubernetes Service](https://docs.seqera.io/platform-cloud/compute-envs/eks)
- [Google Kubernetes Engine](https://docs.seqera.io/platform-cloud/compute-envs/gke)

### How does the scratch process directive interact with Fusion?

The Nextflow [`scratch`](https://www.nextflow.io/docs/latest/reference/process.html#scratch) process directive controls where a task runs:

- `process.scratch = false`: Tasks read and write directly through the Fusion-mounted work directory in cloud object storage.
- `process.scratch = true`: Nextflow stages task inputs to a local scratch directory (the path set by `$TMPDIR`, or `/tmp` if unset), runs the task there, and copies outputs back to the work directory. This bypasses Fusion for the task body and runs the workload on local instance storage.

For most workloads, `process.scratch = false` is faster and is the recommended default. Consider `process.scratch = true` for tasks that perform heavy small-file I/O, such as processes that read or write many thousands of small files.

Apply `scratch = true` selectively to the affected processes rather than globally:

```groovy
process {
    // Default: tasks run directly on the Fusion-mounted work directory
    scratch = false

    // Use local scratch for processes with heavy small-file I/O
    withName: 'PROCESS_NAME' {
        scratch = true
    }
}
```

Ensure the compute environment provides enough fast local storage for the staged inputs and outputs.
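
If the fast local storage is mounted somewhere other than `$TMPDIR`, the `scratch` directive also accepts an explicit path. The following is a minimal sketch, not a tested recommendation: the `small_file_io` label and the `/scratch/nvme` mount point are hypothetical placeholders, so substitute a label used by your pipeline and a path that exists on your instances.

```groovy
process {
    // Hypothetical label: attach it with `label 'small_file_io'` in the process definition
    withLabel: 'small_file_io' {
        // Assumed mount point for fast local storage; replace with a real path on your instances
        scratch = '/scratch/nvme'
    }
}
```

Selecting by label rather than by name lets several processes with the same I/O pattern share one setting.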

### Can I pin a specific Fusion version to use with Nextflow?

Yes. Add the Fusion version's config URL using the `containerConfigUrl` option in the Fusion block of your Nextflow configuration (replace `v2.4.2` with the version of your choice):
2 changes: 1 addition & 1 deletion platform-cloud/docs/data/data-lineage.md
@@ -138,7 +138,7 @@ If data lineage is defined for a workspace, only that data is displayed in Platf

## Costs associated with data lineage

Monthly S3 object storage bucket and SQS costs will scale based on the number of pipeline runs launched with lineage enabled.
Monthly S3 object storage bucket and SQS costs will scale based on the number of pipeline runs launched with lineage enabled.

SQS queue costs for a single rnaseq pipeline run launched once per day are typically less than $10 USD per month.
