calling all output files together in next process for the creation of database folder #3704
Hi, below is my script:
process panelnormalsone {
publishDir params.trim, mode : 'copy'
input:
tuple val(pair_id), file(normal_samplefile)
tuple val(pair_id), file(normal_samplefile_bai)
output:
tuple val(pair_id),file(panelvcfnormal_samplefile), emit : vcf_normal
tuple val(pair_id),file(panelvcfnormal_samplefile_stats), emit : vcf_normal_stats
tuple val(pair_id),file(panelvcfnormal_samplefile_tbi), emit : vcf_normal_tbi
script:
panelvcfnormal_samplefile = pair_id + '_recal_somatic.vcf.gz'
panelvcfnormal_samplefile_stats = pair_id + '_recal_somatic.vcf.gz.stats'
panelvcfnormal_samplefile_tbi = pair_id + '_recal_somatic.vcf.gz.tbi'
"""
/mnt/raid1/gatk_latest/gatk-4.1.9.0/./gatk --java-options "-Xmx24G" Mutect2 -R ${params.reference} -I $normal_samplefile --max-mnp-distance 0 -O $panelvcfnormal_samplefile
"""
}
process Genomics_DB_creation{
publishDir params.pondb, mode : 'copy'
input:
tuple val(pair_id),file(panelvcfnormal_samplefile)
tuple val(pair_id),file(panelvcfnormal_samplefile_stats)
tuple val(pair_id),file(panelvcfnormal_samplefile_tbi)
output:
"""
./gatk --java-options "-Xmx24G" GenomicsDBImport -R ${params.reference} -L ${params.intervalfile} --genomicsdb-workspace-path !{params.pondb} $ {input_vcf} --merge-input-intervals true --reader-threads 30
"""
}
Genomics_DB_creation(panelnormalsone.out)
| collect
I have to run a command like this:
/gatk --java-options "-Xmx24G" GenomicsDBImport -R ${params.reference} -L ${params.intervalfile} --genomicsdb-workspace-path !{params.pondb} -V file 1 -V file2 -V file3 -V file 4 -V file 4 -V file 4
But the process takes one sample at a time and generates a database for every input sample. All I want is a single database folder built from all of the input sample files. Please suggest.
Replies: 2 comments 4 replies
Hi @awast, I converted your issue into a Q&A and edited your post for formatting.

If I understand correctly, you want to collect all of the samples produced by the panelnormalsone process and feed them into a single run of the Genomics_DB_creation process. In that case, you should use the collect operator before the db creation process. Something like:

(samples, stats, tbi) = panelnormalsone( /* ... */ )
Genomics_DB_creation( samples.collect(), stats.collect(), tbi.collect() )

That will run your db creation process only once with all of the samples. You can construct that command line with some custom Groovy code:

script:
sampleOpts = samples.collect { p -> "-V ${p.name}" }.join(' ')
"""
/gatk --java-options "-Xmx24G" GenomicsDBImport -R ${params.reference} -L ${params.intervalfile} --genomicsdb-workspace-path ${params.pondb} ${sampleOpts}
"""
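To make the suggestion above more concrete, here is a minimal sketch of how the pieces could fit together in DSL2. It is not a tested drop-in replacement, and it rests on a few assumptions of mine that are not in the original post: gatk is on the PATH, the workspace is written to a task-local folder called pon_db (a hypothetical name) and then published via publishDir, and bam_ch and bai_ch are placeholder channels for whatever currently feeds panelnormalsone. Because panelnormalsone emits tuples of (pair_id, file), the sketch drops pair_id with map before calling collect, since collect flattens tuples by default and would otherwise mix the IDs in with the files.

```groovy
nextflow.enable.dsl = 2

process Genomics_DB_creation {
    publishDir params.pondb, mode: 'copy'

    input:
    path vcfs   // all per-sample VCFs, staged together in one task
    path tbis   // matching .tbi indexes, staged next to the VCFs

    output:
    path 'pon_db'   // hypothetical workspace folder name

    script:
    // A single staged file arrives as a Path rather than a list, so normalise first
    def vcfList = vcfs instanceof java.nio.file.Path ? [vcfs] : vcfs
    // Build one "-V <file>" option per staged VCF
    def sampleOpts = vcfList.collect { v -> "-V ${v.name}" }.join(' ')
    """
    gatk --java-options "-Xmx24G" GenomicsDBImport \
        -R ${params.reference} \
        -L ${params.intervalfile} \
        --genomicsdb-workspace-path pon_db \
        --merge-input-intervals true \
        ${sampleOpts}
    """
}

workflow {
    // bam_ch and bai_ch are placeholders for the existing input channels
    panelnormalsone(bam_ch, bai_ch)

    // Drop pair_id so that collect() gathers only the files into one list
    vcfs = panelnormalsone.out.vcf_normal.map { pair_id, vcf -> vcf }.collect()
    tbis = panelnormalsone.out.vcf_normal_tbi.map { pair_id, tbi -> tbi }.collect()

    // Runs exactly once, with every sample's VCF and index
    Genomics_DB_creation(vcfs, tbis)
}
```

Writing the workspace inside the task directory and letting publishDir copy it out keeps the task self-contained. If you would rather write straight to params.pondb as in your original command, remove the output declaration and the publishDir line instead.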
Hi Sir,