Overview
This issue addresses parameter precedence in the pipeline's multi-configuration system. After extensive discussion, we've reached consensus on a dual-mode approach that separates single-run production use (via config profiles) from multi-run benchmarking use (via paramsheet).
Final Decision (2026-02-05)
Two-Mode Architecture
1. Single-Run Mode (Config Profiles) - For production/GxP use
nextflow run nf-core/differentialabundance \
    -profile rnaseq_deseq2 \
    --limma_use_voom=false
- Standard Nextflow precedence: CLI > profile > defaults
- CLI flags override profile parameters
- Covers majority of production users
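As a rough illustration of the single-run precedence (CLI > profile > defaults), the resolution can be modelled as a layered dictionary merge. This is a Python sketch of the ordering only, not Nextflow code; the function name `resolve_single_run`, the parameter `example_cutoff`, and all sample values are assumptions made for illustration.

```python
def resolve_single_run(defaults, profile, cli):
    """Merge parameter layers so that CLI > profile > defaults."""
    resolved = dict(defaults)   # lowest priority: pipeline defaults
    resolved.update(profile)    # profile values override defaults
    resolved.update(cli)        # CLI flags override both
    return resolved


# Hypothetical values, chosen only to show the ordering.
defaults = {"limma_use_voom": True, "example_cutoff": 0.05}
profile = {"limma_use_voom": True, "example_cutoff": 0.01}
cli = {"limma_use_voom": False}

single_run_params = resolve_single_run(defaults, profile, cli)
print(single_run_params)
# {'limma_use_voom': False, 'example_cutoff': 0.01}
```

The CLI value wins for `limma_use_voom`, while `example_cutoff` falls through to the profile because no flag was given for it.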
2. Multi-Run Mode (Paramsheet) - For benchmarking/exploration
nextflow run nf-core/differentialabundance \
    --paramsheet my_configs.yaml
- Paramsheet precedence: paramsheet > CLI > defaults
- Each row defines parameter overrides
- Runs multiple configurations in parallel
- CLI params generally don't override paramsheet (by design)
- No default paramsheet in pipeline (users provide their own)
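The multi-run precedence (paramsheet > CLI > defaults) can be sketched the same way, with one resolved parameter set per row. This is an illustrative Python model, not the pipeline's implementation; `resolve_paramsheet_runs`, the list-of-dicts row layout, and the keys `id`, `method`, and `outdir` are assumptions standing in for whatever the parsed paramsheet actually contains.

```python
def resolve_paramsheet_runs(defaults, cli, rows):
    """Return one resolved parameter set per paramsheet row.

    Within each run the precedence is paramsheet row > CLI > defaults.
    """
    # Later mappings win in a dict merge, so the row goes last.
    return [{**defaults, **cli, **row} for row in rows]


# Hypothetical inputs: two rows, as if parsed from a user-supplied YAML file.
defaults = {"method": "deseq2", "outdir": "results"}
cli = {"method": "limma", "outdir": "bench"}
rows = [
    {"id": "run_a", "method": "deseq2"},
    {"id": "run_b"},
]

runs = resolve_paramsheet_runs(defaults, cli, rows)
for run in runs:
    print(run["id"], run["method"], run["outdir"])
# run_a deseq2 bench
# run_b limma bench
```

In this model a CLI value only takes effect for parameters a row leaves unset, which matches the statement that CLI params do not override the paramsheet.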
Key Design Principles
- Separation of concerns: Profiles handle production paths, paramsheet handles benchmarking
- No functionality overlap: Profile-related configurations (e.g., deseq2+rnaseq+gsea) stay in config profiles, NOT in paramsheet
- Minimal maintenance: No default paramsheet to maintain
- Clear precedence: Each mode has well-defined, documented precedence rules
Timeline
- Target: End of March 2026
- Goal: Move standard configurations back to profile approach for meaningful end-to-end testing
Problem Background
Original Issue
Paramsheet parameters had highest priority, preventing CLI overrides:
Current (problematic):
paramsheet > CLI flags / -params-file > defaults
Initially desired:
CLI flags / -params-file > paramsheet > defaults
However, achieving this in Nextflow required distinguishing CLI params from config defaults, which proved technically challenging.
Discussion Summary
Solutions Explored
Option 1: session.cliParams (Initially Preferred)
Approach: Use Nextflow's internal session.cliParams API
Verdict: Rejected - Internal API, incompatible with strict syntax (26.04+)
Key feedback from @bentsherman:
The Session class is internal to the Nextflow codebase, so it isn't meant to be accessed from a Nextflow script. It won't be accessible in the strict syntax, which will be enabled by default in 26.04
Option 2: Schema-Based Detection
Approach: Compare params against nextflow_schema.json defaults to detect user-specified params
Verdict: Rejected - Edge case with bad failure mode
Critical edge case: If user explicitly sets param to schema default to override paramsheet, it won't be detected
- Example: schema default param: "foo", paramsheet sets "bar", user runs --param=foo → still gets "bar"
- Failure mode: Silent - user won't know override was ignored
- Unacceptable for GxP environments
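The edge case is easy to reproduce in a small model. The Python sketch below is an assumption about how schema-based detection would behave, not code from the pipeline or from any PR; `detect_user_params` and `resolve_with_detection` are hypothetical names, and the dictionaries stand in for the resolved params, the schema defaults, and one paramsheet row.

```python
def detect_user_params(resolved, schema_defaults):
    """Guess which params the user set: any value differing from its schema default."""
    return {k: v for k, v in resolved.items() if schema_defaults.get(k) != v}


def resolve_with_detection(resolved, schema_defaults, paramsheet_row):
    """Apply detected user params on top of the paramsheet row."""
    user_set = detect_user_params(resolved, schema_defaults)
    return {**schema_defaults, **paramsheet_row, **user_set}


schema_defaults = {"param": "foo"}
paramsheet_row = {"param": "bar"}

# User runs --param=baz: the value differs from the default, so it is detected.
detected = resolve_with_detection({"param": "baz"}, schema_defaults, paramsheet_row)
print(detected["param"])  # baz

# User runs --param=foo: indistinguishable from "not set", so the row wins.
missed = resolve_with_detection({"param": "foo"}, schema_defaults, paramsheet_row)
print(missed["param"])  # bar
```

The second call raises no error and emits no warning, which is the silent failure mode described above.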
Option 3: Null-Based Approach
Approach: Set all params to null in nextflow.config, handle defaults via paramsheet
Verdict: Rejected - Diverges too far from nf-core conventions, unclear benefit
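One possible reading of the null-based approach, sketched in Python: if every config default is null, any non-null value must have come from the user, so detection becomes unambiguous. This is an interpretation for illustration only; `resolve_null_based` and `sheet_defaults` (a stand-in for defaults carried by the paramsheet) are hypothetical names.

```python
def resolve_null_based(config_params, sheet_defaults, paramsheet_row):
    """Resolve params when every config default is None.

    Any non-None value in config_params must have come from the user,
    so it can be given the highest priority without guessing.
    """
    user_set = {k: v for k, v in config_params.items() if v is not None}
    return {**sheet_defaults, **paramsheet_row, **user_set}


sheet_defaults = {"param": "foo"}   # real defaults live outside the config
paramsheet_row = {"param": "bar"}

# Nothing passed by the user: the paramsheet row wins.
unset_result = resolve_null_based({"param": None}, sheet_defaults, paramsheet_row)
print(unset_result)  # {'param': 'bar'}

# User passes --param=foo: detected, because the config default was None.
explicit_result = resolve_null_based({"param": "foo"}, sheet_defaults, paramsheet_row)
print(explicit_result)  # {'param': 'foo'}
```

The sketch also shows the cost: the config no longer carries usable defaults, which is the divergence from nf-core conventions cited in the verdict.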
Option 4: CLI Parsing
Approach: Parse workflow.commandLine with regex
Verdict: Rejected - Fragile, high maintenance burden
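To give a sense of the fragility, here is a deliberately naive Python sketch. The pattern `PARAM_RE` and the helper `parse_cli_params` are made up for this example and are not taken from any PR; the command lines are hypothetical strings, not captured `workflow.commandLine` values.

```python
import re

# Naive pattern: matches only the --name=value form, value without whitespace.
PARAM_RE = re.compile(r"--(\w+)=(\S+)")


def parse_cli_params(command_line):
    """Extract --name=value pairs from a command-line string."""
    return dict(PARAM_RE.findall(command_line))


# Works for the simplest form.
print(parse_cli_params("nextflow run main.nf --param=foo"))
# {'param': 'foo'}

# Misses the space-separated form entirely.
print(parse_cli_params("nextflow run main.nf --param foo"))
# {}

# Truncates a quoted value that contains a space.
print(parse_cli_params('nextflow run main.nf --title="my study"'))
# {'title': '"my'}
```

Each gap can be patched with a more elaborate pattern, but every patch adds another case to maintain, which is the burden the verdict refers to.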
Core Design Philosophy (from @bentsherman)
I think there are two core assumptions in Nextflow that you will run up against:
- Params define what is computed, config defines how it is computed
- The pipeline script only knows the final resolved params, it does not know whether a given param came from CLI / params file / config
This insight led to reconsidering the entire approach and recognizing that trying to force CLI overrides on paramsheet violated Nextflow's design principles.
Implementation Plan
Phase 1: Revert to Profile + Paramsheet Separation
- Move standard configurations (deseq2+rnaseq+gsea, etc.) back to config profiles
- Remove default paramsheet from pipeline
- Document two-mode usage patterns clearly
- Paramsheet remains available for multi-run benchmarking, but as an advanced feature
Phase 2: Testing & Validation
- End-to-end testing with profile-based approach
- Validate GxP-compatible single-run mode
- Test multi-run benchmarking scenarios
Phase 3: Documentation
- Clear separation of single-run vs multi-run modes
- Document precedence rules for each mode
- Usage examples for both patterns
Related
- Issue "unification of paramsheet and profile" #465 - Initial discussion on favoring paramset over profile
- PR "Fix parameter priority using Nextflow session variables" #623 - Implementation attempt using session.cliParams
- PR "[Luna] Fix CLI parameter priority over paramsheet values" #624 - Follow-up implementation with filtering
- Module interfaces (nextflow#6737) - Future Nextflow feature that may enable cleaner solutions
- Module params (nextflow#6769) - Future Nextflow feature
Long-Term Vision
Once Nextflow natively supports iterating over parameter configurations (via module interfaces, module params, or similar features), we may revisit unifying the two approaches. Until then, the dual-mode architecture provides a pragmatic solution that:
- Supports production/GxP use cases with standard precedence
- Enables advanced benchmarking workflows
- Maintains nf-core standards compliance
- Avoids technical edge cases and silent failures