Merge pull request #42 from seqeralabs/seqera-ai/20251121-150947-fix-documentation-links-and-mermaid

FloWuenne · web-flow · commit c89704a0e863 · 2025-11-21T10:40:36.000-05:00
Fix documentation: remove non-existent modes references and fix Mermaid diagram
diff --git a/README.md b/README.md
@@ -105,14 +105,13 @@ nextflow run seqeralabs/nf-proteindesign \
 
 ### Key Documentation Pages
 
-- **[Quick Start Guide](https://flouwuenne.github.io/nf-proteindesign-2025/quick-start/)** - Get started in minutes
-- **[Installation](https://flouwuenne.github.io/nf-proteindesign-2025/getting-started/installation/)** - Setup and requirements
-- **[Usage Guide](https://flouwuenne.github.io/nf-proteindesign-2025/getting-started/usage/)** - Detailed usage instructions
-- **[Pipeline Modes](https://flouwuenne.github.io/nf-proteindesign-2025/modes/overview/)** - Design, Target modes
-- **[Analysis Tools](https://flouwuenne.github.io/nf-proteindesign-2025/analysis/ipsae/)** - Optional analysis modules
-- **[Parameters Reference](https://flouwuenne.github.io/nf-proteindesign-2025/reference/parameters/)** - Complete parameter list
-- **[Output Files](https://flouwuenne.github.io/nf-proteindesign-2025/reference/outputs/)** - Understanding results
-- **[Examples](https://flouwuenne.github.io/nf-proteindesign-2025/reference/examples/)** - Real-world use cases
+- **[Quick Start Guide](https://seqeralabs.github.io/nf-proteindesign/quick-start/)** - Get started in minutes
+- **[Installation](https://seqeralabs.github.io/nf-proteindesign/getting-started/installation/)** - Setup and requirements
+- **[Usage Guide](https://seqeralabs.github.io/nf-proteindesign/getting-started/usage/)** - Detailed usage instructions
+- **[Analysis Modules](https://seqeralabs.github.io/nf-proteindesign/analysis/proteinmpnn-protenix/)** - Optional analysis tools (ProteinMPNN, Protenix, ipSAE, PRODIGY, Foldseek)
+- **[Parameters Reference](https://seqeralabs.github.io/nf-proteindesign/reference/parameters/)** - Complete parameter list
+- **[Output Files](https://seqeralabs.github.io/nf-proteindesign/reference/outputs/)** - Understanding results
+- **[Examples](https://seqeralabs.github.io/nf-proteindesign/reference/examples/)** - Real-world use cases
 
 ## Samplesheet Format
 
diff --git a/docs/getting-started/installation.md b/docs/getting-started/installation.md
@@ -234,7 +234,7 @@ Once installed, check out:
 
 - [Quick Start Guide](../quick-start.md)
 - [Basic Usage](usage.md)
-- [Pipeline Modes](../modes/overview.md)
+- [Analysis Modules](../analysis/proteinmpnn-protenix.md)
 
 ---
 
diff --git a/docs/getting-started/quick-reference.md b/docs/getting-started/quick-reference.md
@@ -309,9 +309,10 @@ grep "Succeeded" results/pipeline_info/execution_trace.txt | wc -l
 ## :material-link: Quick Links
 
 - [Full Documentation](../index.md)
-- [Pipeline Modes](../modes/overview.md)
+- [Basic Usage](usage.md)
 - [Parameter Reference](../reference/parameters.md)
 - [Example Workflows](../reference/examples.md)
+- [Analysis Modules](../analysis/proteinmpnn-protenix.md)
 - [GitHub Repository](https://github.com/seqeralabs/nf-proteindesign)
 
 ---
diff --git a/docs/getting-started/usage.md b/docs/getting-started/usage.md
@@ -21,42 +21,32 @@ nextflow run seqeralabs/nf-proteindesign \
 
 ## :material-file-table: Samplesheet Format
 
-The samplesheet determines which mode the pipeline runs in.
-
-### Mode Auto-Detection
-
-The pipeline automatically detects the mode based on column headers:
-
-| Column Present | Mode | Description |
-|----------------|------|-------------|
-| `design_yaml` | Design | Use pre-made YAML files |
-| `target_structure` | Target/Binder | Generate design variants |
-### Required Columns by Mode
-
-=== "Design Mode"
-    | Column | Required | Description |
-    |--------|----------|-------------|
-    | `sample` | ✅ | Unique sample identifier |
-    | `design_yaml` | ✅ | Path to design YAML file |
-
-=== "Target Mode"
-    | Column | Required | Description |
-    |--------|----------|-------------|
-    | `sample` | ✅ | Unique sample identifier |
-    | `target_structure` | ✅ | Path to target structure (PDB/CIF) |
-    | `target_residues` | Optional | Binding site residues (comma-separated) |
-    | `chain_type` | Optional | Type: `protein`, `peptide`, `nanobody` |
-    | `min_length` | Optional | Minimum binder length |
-    | `max_length` | Optional | Maximum binder length |
-
-=== "Binder Mode"
-    | Column | Required | Description |
-    |--------|----------|-------------|
-    | `sample` | ✅ | Unique sample identifier |
-    | `target_structure` | ✅ | Path to target structure (PDB/CIF) |
-    | `chain_type` | Optional | Type: `protein`, `peptide`, `nanobody` |
-    | `min_length` | Optional | Minimum binder length |
-    | `max_length` | Optional | Maximum binder length |
+The pipeline uses a CSV samplesheet to specify design jobs. Each row represents a separate design run.
+
+### Required Columns
+
+| Column | Required | Description |
+|--------|----------|-------------|
+| `sample` | ✅ | Unique sample identifier |
+| `design_yaml` | ✅ | Path to design YAML file (see below) |
+
+### Optional Columns
+
+Additional columns can override default parameters per sample:
+
+| Column | Type | Description |
+|--------|------|-------------|
+| `num_designs` | Integer | Number of designs to generate (overrides `--num_designs`) |
+| `budget` | Integer | Number of final designs to keep (overrides `--budget`) |
+
+### Example Samplesheet
+
+```csv
+sample,design_yaml,num_designs,budget
+protein_binder,designs/egfr_binder.yaml,10000,50
+nanobody_design,designs/spike_nanobody.yaml,5000,20
+peptide_binder,designs/il6_peptide.yaml,3000,10
+```
 
 ## :material-file-document: Design YAML Format
 
@@ -149,14 +139,14 @@ results/
 
 ## :material-play-circle: Example Workflows
 
-### Example 1: Simple Design Mode
+### Example 1: Basic Protein Design
 
 ```bash
 # 1. Create design YAML
-cat > my_design.yaml << EOF
-name: antibody_target
+cat > protein_design.yaml << EOF
+name: egfr_binder
 target:
-  structure: data/target.pdb
+  structure: data/egfr.pdb
   residues: [10, 11, 12, 45, 46]
 designed:
   chain_type: protein
@@ -168,7 +158,7 @@ EOF
 # 2. Create samplesheet
 cat > samples.csv << EOF
 sample,design_yaml
-design1,my_design.yaml
+egfr_binder,protein_design.yaml
 EOF
 
 # 3. Run pipeline
@@ -178,43 +168,54 @@ nextflow run seqeralabs/nf-proteindesign \
     --outdir results
 ```
 
-### Example 2: Target Mode with Analysis
+### Example 2: Multiple Designs with Analysis
 
 ```bash
-# 1. Create samplesheet
-cat > targets.csv << EOF
-sample,target_structure,target_residues,chain_type,min_length,max_length
-egfr,data/egfr.pdb,"10,11,12,45,46",protein,60,120
-spike,data/spike.cif,"417,484,501",nanobody,110,130
+# 1. Create design YAMLs for different targets
+cat > egfr_design.yaml << EOF
+name: egfr_binder
+target:
+  structure: data/egfr.pdb
+  residues: [10, 11, 12, 45, 46]
+designed:
+  chain_type: protein
+  length: [60, 120]
 EOF
 
-# 2. Run with affinity prediction
+cat > spike_design.yaml << EOF
+name: spike_nanobody
+target:
+  structure: data/spike.cif
+  residues: [417, 484, 501]
+designed:
+  chain_type: nanobody
+  length: [110, 130]
+EOF
+
+# 2. Create samplesheet
+cat > samples.csv << EOF
+sample,design_yaml,num_designs,budget
+egfr_binder,egfr_design.yaml,10000,50
+spike_nanobody,spike_design.yaml,5000,20
+EOF
+
+# 3. Run with analysis modules
 nextflow run seqeralabs/nf-proteindesign \
     -profile docker \
-    --mode target \
-    --input targets.csv \
+    --input samples.csv \
     --outdir results \
-    --n_samples 30 \
-    --run_prodigy
+    --run_proteinmpnn \
+    --run_protenix_refold \
+    --run_prodigy \
+    --run_consolidation
 ```
 
-### Example 3: Binder Mode (No Binding Site)
+### Example 3: Test Run
 
 ```bash
-# 1. Create samplesheet
-cat > binders.csv << EOF
-sample,target_structure,chain_type,min_length,max_length
-binder1,data/target1.pdb,protein,50,100
-binder2,data/target2.pdb,nanobody,110,130
-EOF
-
-# 2. Run pipeline
+# Use built-in test profile
 nextflow run seqeralabs/nf-proteindesign \
-    -profile docker \
-    --mode binder \
-    --input binders.csv \
-    --outdir results \
-    --n_samples 20
+    -profile test_design_protein,docker
 ```
 
 ## :material-refresh: Resume Failed Runs
@@ -348,9 +349,9 @@ nextflow run ... --n_samples 10  # Reduce batch size
 
 ## :material-arrow-right: Next Steps
 
-- Learn about [Pipeline Modes](../modes/overview.md) in detail
 - Check the [Quick Reference](quick-reference.md) for common commands
 - Explore [Analysis Tools](../analysis/prodigy.md) integration
+- Review [Pipeline Parameters](../reference/parameters.md) for advanced configuration
 
 ---
 
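Review note on the samplesheet format documented in the `usage.md` hunks above: the documented rules (required `sample` and `design_yaml` columns, unique sample identifiers, integer `num_designs` and `budget` overrides) could be checked before launching a run. The following is a minimal sketch, not the pipeline's own validation, which this diff does not show. The function name `check_samplesheet` is hypothetical, and treating a blank optional cell as "use the default" is an assumption.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample", "design_yaml")
INTEGER_COLUMNS = ("num_designs", "budget")  # optional per-sample overrides


def check_samplesheet(text):
    """Return a list of human-readable problems found in samplesheet CSV text.

    Hypothetical helper: it mirrors the documented column rules only and is
    not taken from the pipeline's code.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = [f"missing required column: {col}"
                for col in REQUIRED_COLUMNS if col not in header]
    if problems:
        return problems  # row checks are meaningless without the required columns

    seen = set()
    for row in reader:
        line_no = reader.line_num  # physical line in the CSV, header is line 1
        sample = (row["sample"] or "").strip()
        if not sample:
            problems.append(f"line {line_no}: empty sample name")
        elif sample in seen:
            problems.append(f"line {line_no}: duplicate sample '{sample}'")
        seen.add(sample)
        if not (row["design_yaml"] or "").strip():
            problems.append(f"line {line_no}: empty design_yaml path")
        for col in INTEGER_COLUMNS:
            value = (row.get(col) or "").strip()
            # Assumption: a blank optional cell means "use the pipeline default"
            if value and not (value.isascii() and value.isdigit()):
                problems.append(
                    f"line {line_no}: {col} is not a non-negative integer: {value!r}"
                )
    return problems
```

Calling `check_samplesheet(open("samples.csv").read())` on the example samplesheet from the diff should return an empty list. The sketch does not check that the YAML paths exist, since those may be resolved relative to the launch directory.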
diff --git a/docs/index.md b/docs/index.md
@@ -67,34 +67,34 @@
 ## :material-pipeline: Pipeline Workflow
 
 ```mermaid
-flowchart LR
-    A[Samplesheet] --> B[Boltzgen]
+graph LR
+    A[Samplesheet] --> B[Boltzgen Design]
     B --> C[Budget Designs]
     
     C --> D{ProteinMPNN?}
-    D -->|Yes| E[Optimize]
+    D -->|Yes| E[Sequence Optimization]
     E --> F{Protenix?}
-    F -->|Yes| G[Refold]
+    F -->|Yes| G[Structure Refold]
     
-    C --> H[Analysis]
+    C --> H[Analysis Modules]
     G --> H
     
-    H --> I[ipSAE]
-    H --> J[PRODIGY]
-    H --> K[Foldseek]
+    H --> I[ipSAE Scoring]
+    H --> J[PRODIGY Affinity]
+    H --> K[Foldseek Search]
     
     I --> L{Consolidate?}
     J --> L
     K --> L
-    L -->|Yes| M[Report]
+    L -->|Yes| M[Unified Report]
     
-    M --> N[Results]
+    M --> N[Final Results]
     C --> N
     
-    style B fill:#9C27B0,color:#fff
-    style E fill:#8E24AA,color:#fff
-    style G fill:#7B1FA2,color:#fff
-    style M fill:#6A1B9A,color:#fff
+    style B fill:#9C27B0,stroke:#9C27B0,color:#fff
+    style E fill:#8E24AA,stroke:#8E24AA,color:#fff
+    style G fill:#7B1FA2,stroke:#7B1FA2,color:#fff
+    style M fill:#6A1B9A,stroke:#6A1B9A,color:#fff
 ```
 
 ## :material-rocket-launch: Quick Start
diff --git a/docs/quick-start.md b/docs/quick-start.md
@@ -247,9 +247,9 @@ cat covid_binders/prodigy/spike_nb1_prodigy_predictions.csv
 
 Now that you're up and running:
 
-1. **Learn More About Modes**: Check the [Pipeline Modes](modes/overview.md) documentation
+1. **Learn Basic Usage**: Check the [Usage Guide](getting-started/usage.md) for detailed instructions
 2. **Optimize Parameters**: See the [Parameters Reference](reference/parameters.md)
-3. **Analyze Results**: Learn about [PRODIGY](analysis/prodigy.md) and [ipSAE](analysis/ipsae.md)
+3. **Enable Analysis Modules**: Learn about [ProteinMPNN/Protenix](analysis/proteinmpnn-protenix.md), [PRODIGY](analysis/prodigy.md), and [ipSAE](analysis/ipsae.md)
 4. **Advanced Usage**: Explore [Architecture](architecture/design.md) details
 
 ---
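Review note: most of this change replaces links to `modes/overview.md`, a page that does not exist. A small check run over `docs/` could catch that class of problem before it is merged. The sketch below assumes relative links resolve against the directory of the Markdown file that contains them; the helper name `find_broken_links` is hypothetical and nothing like it appears in this diff. It only handles inline `[text](target)` links and will also inspect links that appear inside code fences.

```python
import re
from pathlib import Path

# Matches inline Markdown links and captures the target: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_links(docs_dir):
    """Yield (markdown_file, target) for relative links whose target is missing."""
    docs_dir = Path(docs_dir)
    for md_file in sorted(docs_dir.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue  # skip absolute URLs and in-page anchors
            path_part = target.split("#", 1)[0]  # drop any "#section" suffix
            if not (md_file.parent / path_part).exists():
                yield md_file, target


if __name__ == "__main__":
    for md_file, target in find_broken_links("docs"):
        print(f"{md_file}: broken link -> {target}")
```

Run from the repository root, this would be expected to flag the old `../modes/overview.md` references. The absolute links in `README.md` are out of scope because they point at the published site rather than at files in the repository.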