Hi @MikkelSchubert, I decided to make a dedicated issue for general feedback on AdapterRemoval v3 testing, as I may find other points to discuss:
Version v3.0.0pre 344591c
- Leaving in single reads with Ns. The help text for `--combined-output` reads: "If set, all reads are written to the same file(s), specified by `--output1` and `--output2` (`--output1` only if `--interleaved-output` is not set). Discarded reads are replaced with a single 'N' with Phred score 0 [default: off]." While I used to do this, @ashildv was recently informed by the ENA that including 'discarded reads' with a single 'N' will not be accepted by their pipeline (it breaks, and the data gets rejected). Maybe it would be worth having e.g. 5 Ns or something (or removing them entirely)? <- I realise I could just use the custom output instead and make sure discarded reads go in a separate file.
- `--singleton` flag: would it make sense, for consistency, to have `--outputsingleton`, as the other output flags (1, 2, merged) start with `--output`?
- `--settings FILE`: could maybe be renamed, as the bulk of the contents of the JSON is stats rather than the settings themselves.
- JSON output:
  - It would be nice for this to also include the physical number of entries that are in the resulting output files when reads are also merged (as a separate value), sort of equivalent to `retained` reads in v2.3.2. Currently the JSON only reports the number of output (passed) reads as it would be if everything was unmerged. So in addition to the `passed`, `discarded` and `unidentified` sections of the output JSON, something like `in_files` or `output_file` would be nice to have, as it helps match the expectations of an (unfamiliar) end user between the file itself and the JSON report. However, I recognise that this could be complicated given the very flexible output system now.
  - It would be nice to have some documentation for what each value means. I've tried playing around, but I still can't work out how the various `reads` entries in the JSON relate to each other and to what is in the final output FASTQ files.
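For anyone hitting the ENA rejection from the first point in the meantime, here is a rough user-side sketch (not part of AdapterRemoval) that drops the placeholder reads from a single combined FASTQ file and counts how many records physically remain. It assumes plain 4-line FASTQ records and that a placeholder is exactly one `N` with quality `!` (Phred 0 in Phred+33 encoding); the function names are made up for illustration. Note that for paired-end data, removing a placeholder from only one file would break the pairing between the two files, so this is only safe as written for single-end or already-merged output.

```python
from typing import Iterator, TextIO, Tuple

# (header, sequence, separator, qualities)
Record = Tuple[str, str, str, str]


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a plain 4-line FASTQ stream (no line-wrapped sequences)."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:  # end of file (or a trailing blank line)
            return
        seq = handle.readline().rstrip("\r\n")
        sep = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        yield header, seq, sep, qual


def is_placeholder(record: Record) -> bool:
    """Assumption: a discarded read is a single 'N' with Phred 0 ('!' in Phred+33)."""
    _, seq, _, qual = record
    return seq == "N" and qual == "!"


def filter_placeholders(src: TextIO, dst: TextIO) -> Tuple[int, int]:
    """Copy non-placeholder records from src to dst; return (kept, dropped) counts."""
    kept = dropped = 0
    for record in read_fastq(src):
        if is_placeholder(record):
            dropped += 1
        else:
            dst.write("\n".join(record) + "\n")
            kept += 1
    return kept, dropped
```

The `kept` count is the physical number of entries left in the file, which is the figure one would want to compare against the JSON report. For gzip-compressed files, the handles can be opened with `gzip.open(path, "rt")` and `gzip.open(path, "wt")` from the standard library.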
Initial tests completed. Most of the above are more quality-of-life issues; otherwise everything is working as expected 👍