Allow generic definition of reference databases in config file #74
timrozday-mgnify wants to merge 65 commits into main
…e databases for SSU, PR2, LSU, ITS and UNITE analysis.
…s and databases don't get muddled. Change modules so that reads/seqs and database files are combined in a single channel. Small edits of channels throughout to align with good nextflow style. Modify resultsDir to use db_label variable from input channels.
Merge with main
…ess duplication of modules and SWFs for different databases, instead done by maintaining and filtering a databases channel.
…and set up test config
…ey can be ignored by later steps
…paths and read assignment counts.
… to handle groovy json writer error.
I've updated the test snapshot because a useful output file has been added (the truncation point). Some file hashes have changed, but the final output files are the same size as before. Please check the diff of the test snapshot. One puzzling thing: previously the tests didn't produce an output for SILVA-LSU, but now they do. I'm not sure what the desired behavior is, or whether I've removed a filter that was previously in place. @chrisAta?
Ah, still need to update module tests
mberacochea left a comment
I haven't finished, but the bbmap reformat module caught my attention.
@@ -0,0 +1,65 @@
process BBMAP_REFORMAT_STANDARDISE {
This one should be pushed into nf-core, or at least nf-modules. Also, it has several "TODO nf-core" comments to clean up.
itsonedb_mapseq_krona_tuple,
)
ch_versions = ch_versions.mix(MAPSEQ_OTU_KRONA_ITSONEDB.out.versions)
if (!params.skip_asv) {
Just one quick comment before I do a more in-depth review: why are we adding this parameter? Do we have cases in production where we would want to skip ASVs completely for entire studies/samplesheets?
It is to make it more flexible for other non-production uses.
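For illustration, a non-production run could switch ASV calling off from a custom config. This is only a minimal sketch: `params.skip_asv` is the name used in the diff above, while the file name and the idea that the parameter is unset or `false` by default are assumptions.

```groovy
// custom.config (hypothetical file name)
params {
    // Parameter name taken from the diff above. When true, the
    // `if (!params.skip_asv) { ... }` block in the workflow is not entered.
    skip_asv = true
}
```

It would be passed with `nextflow run <pipeline> -c custom.config`, or the same effect could be had with `--skip_asv` on the command line.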
…ons to validate complicated channel operations in the pipeline
To support the MIMICC project, allow custom databases to be used in the pipeline.
Also capture the DADA2 truncation position.
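As a rough illustration of what a generic database definition in the config file might look like, here is a hypothetical fragment. The key names (`databases`, `silva_lsu`, `custom_db`, `fasta`, `tax`) and all paths are assumptions made for the sketch and are not the schema used in this PR; only `db_label` and the SILVA-LSU name appear in the conversation above.

```groovy
// Hypothetical nextflow.config fragment: key names and paths are assumptions
params {
    databases {
        silva_lsu {
            // The commits mention resultsDir being built from db_label
            db_label = 'SILVA-LSU'
            fasta    = '/path/to/silva_lsu/db.fasta'
            tax      = '/path/to/silva_lsu/db.tax'
        }
        custom_db {
            // A user-supplied database would be one more entry of the same shape
            db_label = 'CUSTOM'
            fasta    = '/path/to/custom/db.fasta'
            tax      = '/path/to/custom/db.tax'
        }
    }
}
```

With a layout like this, adding or removing a database would be a config-only change, and the workflow could build one databases channel from `params.databases` instead of keeping a separate module copy per database, which is the direction the commit messages describe.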