Replies: 1 comment
-
Hi @LPerlaza, perhaps this could be solved using the common patterns that the community has come up with. It may also be worth asking in the nf-core Slack.
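One such pattern is to carry a shared ID through every step as the first element of a tuple, then use `join` to re-pair each polished assembly with the short reads that produced it. Below is a minimal DSL2 sketch of that idea, not a drop-in fix. The process names `ASSEMBLE`, `POLISH`, and `MAP_READS` are hypothetical, the scripts only echo commands as in the original example, and the glob assumes the pairs are named like `test.R1.fq` / `test.R2.fq` and that there is a single long-read file.

```nextflow
nextflow.enable.dsl = 2

// Hypothetical placeholder processes: each one only echoes a command.
// Replace the script bodies with the real assembler, polisher, and mapper.

process ASSEMBLE {
    input:
    path long_reads

    output:
    path 'assembly.fasta'

    script:
    """
    echo "assemble ${long_reads}" > assembly.fasta
    """
}

process POLISH {
    input:
    tuple val(sample_id), path(reads), path(assembly)

    output:
    // The ID travels with the polished assembly.
    tuple val(sample_id), path("${sample_id}_polished.fasta")

    script:
    """
    echo "polish ${assembly} with ${reads}" > ${sample_id}_polished.fasta
    """
}

process MAP_READS {
    input:
    tuple val(sample_id), path(polished), path(reads)

    output:
    tuple val(sample_id), path("${sample_id}_map.sam")

    script:
    """
    echo "map ${reads} to ${polished}" > ${sample_id}_map.sam
    """
}

workflow {
    long_reads  = Channel.fromPath('nanopore.gz')
    // Emits [id, [R1, R2]], e.g. ['test', [test.R1.fq, test.R2.fq]]
    short_reads = Channel.fromFilePairs('*.R{1,2}.fq')

    ASSEMBLE(long_reads)

    // Every short-read pair is combined with the assembly: [id, [R1, R2], assembly]
    polish_in = short_reads.combine(ASSEMBLE.out)
    POLISH(polish_in)

    // join matches on the first element (the ID), so each polished assembly
    // is paired only with the reads used to polish it: [id, polished, [R1, R2]]
    map_in = POLISH.out.join(short_reads)
    MAP_READS(map_in)

    MAP_READS.out.view()
}
```

With this layout, the `data` reads can never be mapped against the assembly polished with the `test` reads, because `join` only emits tuples whose IDs match. If several long-read files are involved, the key could be extended to include a long-read ID as well.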
-
Hi!
I'm just starting to code in Nextflow and couldn't find an answer or example for my specific problem, so apologies if I missed it. I want to write a pipeline for metagenome binning (and some other features). I have long reads and short reads. The pipeline should assemble the long reads and use the short reads to polish/improve the long-read assembly. Note that for each long-read assembly I have several pairs of short reads (different sequencing platforms or library preps), so I do several mapping rounds. I want to do this in one workflow because I need to do some merging at the end.
Roughly it is something like:
The 3rd step is giving me problems because each set of short reads must be mapped only to the long-read assembly that was polished with those same reads. Is there a way to make the pipeline create a channel/variable structured like a hash, where the polished long-read assembly shares an ID with the corresponding short reads used for that polish?
The problematic step produces files like, for example:
CORRECT:
nanopore.gz_test_merged_map.sam_assemblyracon2correction.fasta_catalogue.mmi_test_map.sam
(1) nanopore.gz_(2) test_merged_map.sam (3) _assemblyracon2correction.fasta_catalogue.mmi (4)_test_map.sam
1. long reads file
2. reads used for polishing
3. arbitrary prefix
4. short reads used for mapping
INCORRECT:
nanopore.gz_test_merged_map.sam_assemblyracon2correction.fasta_catalogue.mmi_data_map.sam
(1) nanopore.gz_(2) test_merged_map.sam (3) _assemblyracon2correction.fasta_catalogue.mmi (4)_data_map.sam
examplePipeline.nf
The pipeline is just echoing commands at the moment, so it is easy to reproduce. The input files could even be empty text files like:
The input files should be:
a long read file = nanopore.gz
short reads = test.R1.fq, test.R2.fq, data.R1.fq, data.R2.fq
examplePipeline_modules.nf
The processes needed to execute the pipeline code (imported with include)
Thanks!!
Laura