Add ecology "pool of amplicons map to reference" workflow #1053
Youhou! Thank you, Molène! If you need any help, don't hesitate to ask! Don't hesitate, Marius; we are trying to contribute to IWC thanks to Molène and Pauline's work! We have around 10 workflows to push!
Pull request overview
This PR adds a new workflow for processing amplicon pool sequencing data with a reference sequence. The workflow performs quality control, read pairing, filtering, and reference-based mapping to generate a consensus sequence and associated metadata.
Key changes:
- Introduces a complete map-to-reference workflow for amplicon pool sequencing
- Includes comprehensive documentation and testing infrastructure
- Implements metadata generation pipeline tracking coverage and mapping statistics
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| Map-to-reference.ga | Main workflow file defining the complete amplicon pool processing pipeline |
| Map-to-reference-tests.yml | Test configuration with input files and output assertions |
| README.md | Detailed workflow documentation explaining purpose, inputs, steps, and outputs |
| .dockstore.yml | Dockstore configuration for workflow publication |
| CHANGELOG.md | Version history tracking initial release |
| metadata.tabular | Test data file containing expected metadata output |
    "label": "fastqc_forward",
    "output_name": "html_file",
Copilot AI, Dec 16, 2025:
The workflow_outputs label "fastqc_forward" uses underscores. According to IWC guidelines, workflow output labels should be human-readable with spaces. Consider "FastQC forward" or "FastQC forward reads".
    "label": "fastqc_reverse",
    "output_name": "html_file",
Copilot AI, Dec 16, 2025:
The workflow_outputs label "fastqc_reverse" uses underscores. According to IWC guidelines, workflow output labels should be human-readable with spaces. Consider "FastQC reverse" or "FastQC reverse reads".
    "label": "visualize_consensus",
    "output_name": "output",
Copilot AI, Dec 16, 2025:
The workflow_outputs label "visualize_consensus" uses underscores. According to IWC guidelines, workflow output labels should be human-readable with spaces. Consider "Consensus visualization" or "Visualize consensus".
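The three label comments above describe the same check, which could be automated roughly as follows. This is only a sketch: it assumes the `.ga` file is JSON with a top-level `steps` mapping whose entries carry a `workflow_outputs` list, and the helper names `collect_output_labels` and `find_underscore_labels`, as well as the inline sample (including the already-renamed "FastQC reverse" label), are hypothetical and not part of this PR or of any IWC tooling.

```python
import json


def collect_output_labels(workflow: dict) -> list:
    """Return every workflow output label found in a parsed .ga document."""
    labels = []
    for step in workflow.get("steps", {}).values():
        for output in step.get("workflow_outputs", []):
            label = output.get("label")
            if label:  # outputs without a label are skipped
                labels.append(label)
    return labels


def find_underscore_labels(workflow: dict) -> list:
    """Return labels that still contain underscores (assumed to need renaming)."""
    return [label for label in collect_output_labels(workflow) if "_" in label]


# Hypothetical excerpt shaped like the snippets quoted in the review comments.
sample = json.loads("""
{
  "steps": {
    "0": {"workflow_outputs": [{"label": "fastqc_forward", "output_name": "html_file"}]},
    "1": {"workflow_outputs": [{"label": "FastQC reverse", "output_name": "html_file"}]},
    "2": {"workflow_outputs": [{"label": "visualize_consensus", "output_name": "output"}]}
  }
}
""")

flagged = find_underscore_labels(sample)
print(flagged)
```

On a real workflow, `sample` would instead come from `json.load` on the `.ga` file; an empty list would mean no label contains an underscore.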
        Forward primer: AGTGAGTTTCAACAAAACAYAAGGNCATNGG
        Reverse primer: AGTGAGTAAACTTCAGGGTGTCCRAARAATCA
      outputs:
        fastqc_forward:
Copilot AI, Dec 16, 2025:
Test file output names must match the workflow_outputs labels exactly. If the workflow labels are updated to human-readable format (e.g., "FastQC forward"), these test output names must be updated accordingly.
          asserts:
            has_text:
              text: html
        fastqc_reverse:
Copilot AI, Dec 16, 2025:
Test file output names must match the workflow_outputs labels exactly. If the workflow labels are updated to human-readable format (e.g., "FastQC forward"), these test output names must be updated accordingly.
          asserts:
            has_text:
              text: html
        fastqc_paired:
Copilot AI, Dec 16, 2025:
Test file output names must match the workflow_outputs labels exactly. If the workflow labels are updated to human-readable format (e.g., "FastQC forward"), these test output names must be updated accordingly.
          asserts:
            has_text:
              text: html
        visualize_consensus:
Copilot AI, Dec 16, 2025:
Test file output names must match the workflow_outputs labels exactly. If the workflow labels are updated to human-readable format (e.g., "FastQC forward"), these test output names must be updated accordingly.
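The requirement repeated in the comments above (test output names must match the workflow output labels exactly) can be illustrated with a small comparison. This is a sketch under assumptions: the labels and the test `outputs` mapping are taken as already parsed into Python objects, the function name `compare_outputs` is hypothetical, and the example data imagines a state where only some names were renamed.

```python
def compare_outputs(workflow_labels, test_outputs):
    """Report names that appear on one side only; matching is exact and case-sensitive."""
    labels = set(workflow_labels)
    names = set(test_outputs)  # iterating a dict yields its keys
    return {
        "missing_in_workflow": sorted(names - labels),
        "untested_labels": sorted(labels - names),
    }


# Hypothetical state: both workflow labels renamed, but only one test key updated.
labels_after_rename = ["FastQC forward", "FastQC reverse"]
test_outputs = {
    "fastqc_forward": {"asserts": {"has_text": {"text": "html"}}},
    "FastQC reverse": {"asserts": {"has_text": {"text": "html"}}},
}

report = compare_outputs(labels_after_rename, test_outputs)
print(report)
```

A non-empty `missing_in_workflow` list is the situation the review comments warn about: the test refers to an output name that the workflow no longer exposes.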
Test Results (powered by Planemo)
Test Summary
Failed Tests
Co-authored-by: Copilot <[email protected]>
Change the name of the workflow. Line 111.
Change the name of the workflow. Line 3.
Workflow for processing amplicon pool sequencing data with reference.
This workflow allows you to reconstruct a sequence from an amplicon pool using a reference sequence. To run this workflow, you need the reads from the pool library you want to analyse in FASTQ format, separated into two files: forward and reverse. You will also need your reference sequence in FASTA format. This workflow creates a consensus sequence and a metadata file containing the length of the consensus sequence, the number of reads mapped to it, and the average, minimum, and maximum coverage depth. You can also retrieve a file containing unmapped reads.
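To make the metadata fields concrete, here is a minimal sketch of how such a summary could be derived from per-position coverage depths. It does not reproduce the tools the workflow actually runs: the function name `summarize_coverage`, the field names, and the assumption that the consensus length equals the number of positions with a depth value are all illustrative.

```python
from statistics import mean


def summarize_coverage(depths, mapped_reads):
    """Summarize per-position depths into the kind of fields the metadata file describes."""
    if not depths:
        raise ValueError("at least one per-position depth value is required")
    return {
        "consensus_length": len(depths),  # assumption: one depth value per consensus position
        "mapped_reads": mapped_reads,
        "mean_depth": mean(depths),
        "min_depth": min(depths),
        "max_depth": max(depths),
    }


# Hypothetical depths for a 4-base consensus and a made-up mapped read count.
summary = summarize_coverage([10, 12, 8, 10], mapped_reads=25)
print(summary)
```

In practice the depth values would come from the mapping step's output rather than a hard-coded list.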