Improve input block when using augur subsample #31

@victorlin

Background

When a filepath is defined in Snakemake's input block, the mtime rerun trigger allows Snakemake to intelligently¹ rerun the workflow upon modifications to the file.

¹ The mtime trigger is also conditional on the file contents being changed (src). Note that this content check only applies to small files (i.e. under --max-checksum-file-size, default 1MB).

Description

Currently, input blocks for subsampling rules in repos using augur subsample are problematic in two ways:

  1. Every config change, even to unrelated config sections, triggers a re-run of augur subsample.
    • This is because the entire config dump is specified as an input.
  2. Changes to files referenced within the subsampling config do not trigger re-runs of augur subsample.
    • This is because the referenced files are not specified as inputs.

Possible solutions

  1. Dump a subset of config (the section used for augur subsample) per invocation and specify it as input.
  2. Add a helper function to return a list of referenced files in augur subsample config.
    • This should consider filepath resolution for compatibility with nextstrain run.
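
A rough sketch of how the two ideas could fit together, not a proposed implementation. Everything named here is hypothetical: `dump_config_section`, `referenced_files`, and especially `FILE_OPTIONS`, which is only a guess at which option names hold file paths and is not taken from the augur subsample schema. It writes JSON to stay stdlib-only; a real workflow would probably write YAML. Path resolution is reduced to joining against a single base directory, which is likely too simple for nextstrain run.

```python
import json
from pathlib import Path

# ASSUMPTION: option names whose values are file paths. Illustrative only;
# the real list would have to come from the augur subsample config schema.
FILE_OPTIONS = {"include", "exclude", "group_by_weights"}


def dump_config_section(config, section, out_path):
    """Write only config[section] to out_path (hypothetical helper).

    The file is left untouched when the serialized section is unchanged,
    so edits to unrelated config sections do not modify it.
    """
    text = json.dumps(config[section], indent=2, sort_keys=True) + "\n"
    out = Path(out_path)
    if not out.exists() or out.read_text() != text:
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(text)
    return str(out)


def referenced_files(node, base_dir="."):
    """Collect file paths found under FILE_OPTIONS keys (hypothetical helper).

    Walks nested dicts and lists; relative paths are joined onto base_dir.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in FILE_OPTIONS:
                values = value if isinstance(value, list) else [value]
                found.extend(
                    str(Path(base_dir) / v) for v in values if isinstance(v, str)
                )
            else:
                found.extend(referenced_files(value, base_dir))
    elif isinstance(node, list):
        for item in node:
            found.extend(referenced_files(item, base_dir))
    return sorted(set(found))
```

In a subsampling rule, the input block could then list the dumped section file plus the result of `referenced_files(...)` instead of the full config dump, so that both problems described above are addressed by the same pair of inputs.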
