Every so often the GOAT or ENA taxon search seems to crash, preventing the pipeline from running.
Either lookup can be bypassed by manually adding certain fields to the input.
To bypass ENA Taxsearch one needs:
- sample.tax_id
- sample.genetic_code
- sample.mito_code
- sample.domain
- sample.lineage
To bypass GOAT one needs:
- sample.genome_size
- sample.haploid_number
- sample.ploidy
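
As a rough illustration of the two field lists above, the following Python sketch reports which bypass fields are still missing from an input. It is not the pipeline's own code: the function name `missing_bypass_fields` is hypothetical, and it assumes the `sample` section of the input has already been parsed into a plain dictionary.

```python
# Fields listed in this description; the "sample." prefix is assumed to map
# to keys of a parsed `sample` dictionary (an assumption, not pipeline code).
ENA_FIELDS = ("tax_id", "genetic_code", "mito_code", "domain", "lineage")
GOAT_FIELDS = ("genome_size", "haploid_number", "ploidy")


def missing_bypass_fields(sample):
    """Return the bypass fields that are absent (or None) for each lookup.

    An empty list for a lookup means that lookup could be skipped.
    """
    def missing(fields):
        return [name for name in fields if sample.get(name) is None]

    return {"ena": missing(ENA_FIELDS), "goat": missing(GOAT_FIELDS)}
```

An empty list under `"ena"` or `"goat"` would mean the corresponding lookup has everything it needs to be skipped; a non-empty list names the fields still to be supplied by hand.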
The code needs to be refactored so that if these processes fail, they don't break the rest of the pipeline. At the moment everything downstream depends on these processes either working or being bypassed. When a failure is caused by a server outage or an API change, it is impossible to tell what went wrong at the time. A more robust structure is needed to make the workflow more reliable.
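
One possible shape for that more robust structure, sketched in Python purely for illustration: prefer manually supplied values, otherwise attempt the remote lookup, and on failure log the reason and return nothing rather than raising. The names `lookup_with_fallback`, `fetch`, and `manual` are hypothetical and do not come from the pipeline; how downstream steps should treat a missing result is left open.

```python
import logging

log = logging.getLogger("taxon_lookup")


def lookup_with_fallback(name, fetch, manual=None):
    """Resolve metadata for one lookup without letting a failure propagate.

    name:   label used in log messages, e.g. "GOAT" or "ENA"
    fetch:  zero-argument callable performing the remote lookup (hypothetical)
    manual: manually supplied fields that bypass the lookup, or None
    """
    if manual is not None:
        log.info("%s lookup bypassed with manual values", name)
        return dict(manual)
    try:
        return dict(fetch())
    except Exception as exc:
        # Record what went wrong so an outage or API change can be
        # diagnosed later, then let the caller carry on without the data.
        log.warning("%s lookup failed (%s: %s); continuing without it",
                    name, type(exc).__name__, exc)
        return None
```

Catching a broad `Exception` here is a deliberate simplification for the sketch; a real implementation would likely narrow it to network and parsing errors so that genuine bugs still surface.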
Related issues
Closes #353