Commit 0dd4d40

Merge branch 'trs/run/unmanaged-pathogens'
2 parents 7fb86eb + acd485d commit 0dd4d40

5 files changed: +121 -13 lines changed

CHANGES.md

Lines changed: 19 additions & 0 deletions
@@ -39,6 +39,15 @@ supported Python version is always bundled with `nextstrain`.
   Batch job.
   ([#460](https://github.com/nextstrain/cli/pull/460))
 
+* `nextstrain run` now supports an alternative invocation where a path to a
+  local directory that is a pathogen repository can be given instead of a
+  pathogen name (and optionally version). This allows `nextstrain run` to be
+  used with pathogen repos managed outside of Nextstrain CLI (i.e. not by
+  `nextstrain setup` and `nextstrain update`), which can be useful for the
+  analysis directory support and local testing. The workflow to run is still
+  given separately by name (not path).
+  ([#476](https://github.com/nextstrain/cli/issues/476))
+
 ## Bug fixes
 
 * `nextstrain setup <pathogen>@<version>` and `nextstrain update <pathogen>@<version>`
@@ -54,6 +63,16 @@ supported Python version is always bundled with `nextstrain`.
   provide.
   ([#478](https://github.com/nextstrain/cli/issues/478))
 
+* `nextstrain run` now overrides (i.e. suppresses) any ["workdir:"
+  directives](https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html)
+  in a workflow by explicitly setting the working directory when it invokes
+  Snakemake. This avoids writing files into the pathogen/workflow source
+  directories when non-compatible (or broken) workflows are used with
+  `nextstrain run` despite the warnings issued. Such workflows are more likely
+  to error and fail now early on rather than "succeed" but produce output files
+  in the wrong location.
+  ([#476](https://github.com/nextstrain/cli/issues/476))
+
 # 10.2.1.post1 (1 July 2025)
 
 _See also changes in 10.2.1 which was an unreleased version._

doc/changes.md

Lines changed: 19 additions & 0 deletions
@@ -43,6 +43,15 @@ supported Python version is always bundled with `nextstrain`.
   Batch job.
   ([#460](https://github.com/nextstrain/cli/pull/460))
 
+* `nextstrain run` now supports an alternative invocation where a path to a
+  local directory that is a pathogen repository can be given instead of a
+  pathogen name (and optionally version). This allows `nextstrain run` to be
+  used with pathogen repos managed outside of Nextstrain CLI (i.e. not by
+  `nextstrain setup` and `nextstrain update`), which can be useful for the
+  analysis directory support and local testing. The workflow to run is still
+  given separately by name (not path).
+  ([#476](https://github.com/nextstrain/cli/issues/476))
+
 (v-next-bug-fixes)=
 ### Bug fixes

@@ -59,6 +68,16 @@ supported Python version is always bundled with `nextstrain`.
   provide.
   ([#478](https://github.com/nextstrain/cli/issues/478))
 
+* `nextstrain run` now overrides (i.e. suppresses) any ["workdir:"
+  directives](https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html)
+  in a workflow by explicitly setting the working directory when it invokes
+  Snakemake. This avoids writing files into the pathogen/workflow source
+  directories when non-compatible (or broken) workflows are used with
+  `nextstrain run` despite the warnings issued. Such workflows are more likely
+  to error and fail now early on rather than "succeed" but produce output files
+  in the wrong location.
+  ([#476](https://github.com/nextstrain/cli/issues/476))
+
 (v10-2-1-post1)=
 ## 10.2.1.post1 (1 July 2025)

doc/commands/run.rst

Lines changed: 7 additions & 2 deletions
@@ -12,7 +12,7 @@ nextstrain run
 
 .. code-block:: none
 
-    usage: nextstrain run [options] <pathogen-name>[@<version>] <workflow-name> <analysis-directory> [<target> [<target> [...]]]
+    usage: nextstrain run [options] <pathogen-name>[@<version>]|<pathogen-path> <workflow-name> <analysis-directory> [<target> [<target> [...]]]
            nextstrain run --help
 
 
@@ -44,12 +44,17 @@ positional arguments
 
 
 
-.. option:: <pathogen-name>[@<version>]
+.. option:: <pathogen-name>[@<version>]|<pathogen-path>
 
     The name (and optionally, version) of a previously set up pathogen.
     See :command-reference:`nextstrain setup`. If no version is
     specified, then the default version (if any) will be used.
 
+    Alternatively, the local path to a directory that is a pathogen
+    repository. For this case to be recognized as such, the path must
+    contain a separator (/) or consist entirely of the current
+    directory (.) or parent directory (..) specifier.
+
     Required.
 
 .. option:: <workflow-name>

nextstrain/cli/command/run.py

Lines changed: 36 additions & 11 deletions
@@ -23,14 +23,15 @@
 manage the run and download results after completion.
 """
 
+import os.path
 from inspect import cleandoc
 from shlex import quote as shquote
 from textwrap import dedent
 from .. import runner
 from ..argparse import add_extended_help_flags, MkDirectoryPath, SKIP_AUTO_DEFAULT_IN_HELP
-from ..debug import DEBUGGING
+from ..debug import DEBUGGING, debug
 from ..errors import UserError
-from ..pathogens import PathogenVersion
+from ..pathogens import PathogenVersion, UnmanagedPathogen
 from ..runner import aws_batch, docker, singularity
 from ..util import byte_quantity, split_image_name
 from ..volume import NamedVolume
@@ -39,7 +40,7 @@
 
 def register_parser(subparser):
     """
-    %(prog)s [options] <pathogen-name>[@<version>] <workflow-name> <analysis-directory> [<target> [<target> [...]]]
+    %(prog)s [options] <pathogen-name>[@<version>]|<pathogen-path> <workflow-name> <analysis-directory> [<target> [<target> [...]]]
     %(prog)s --help
     """

@@ -48,14 +49,19 @@ def register_parser(subparser):
     # Positional parameters
     parser.add_argument(
         "pathogen",
-        metavar = "<pathogen-name>[@<version>]",
+        metavar = "<pathogen-name>[@<version>]|<pathogen-path>",
         help = cleandoc(f"""
             The name (and optionally, version) of a previously set up pathogen.
             See :command-reference:`nextstrain setup`. If no version is
             specified, then the default version (if any) will be used.
 
+            Alternatively, the local path to a directory that is a pathogen
+            repository. For this case to be recognized as such, the path must
+            contain a separator ({{path_sep}}) or consist entirely of the current
+            directory ({os.path.curdir}) or parent directory ({os.path.pardir}) specifier.
+
             Required.
-            """))
+            """.format(path_sep = " or ".join(sorted(set([os.path.sep, os.path.altsep or os.path.sep]))))))
 
     parser.add_argument(
         "workflow",
@@ -226,21 +232,32 @@ def run(opts):
             """)
 
     # Resolve pathogen and workflow names to a local workflow directory.
-    pathogen = PathogenVersion(opts.pathogen)
+    try:
+        pathogen = UnmanagedPathogen(opts.pathogen)
+    except ValueError:
+        debug(f"Treating {opts.pathogen!r} as managed pathogen version")
+        pathogen = PathogenVersion(opts.pathogen)
+    else:
+        debug(f"Treating {opts.pathogen!r} as unmanaged pathogen directory")
 
     if opts.workflow not in pathogen.registered_workflows():
         print(f"The {opts.workflow!r} workflow is not registered as a compatible workflow, but trying to run anyways.")
 
     workflow_directory = pathogen.workflow_path(opts.workflow)
 
     if not workflow_directory.is_dir() or not (workflow_directory / "Snakefile").is_file():
-        raise UserError(f"""
-            No {opts.workflow!r} workflow for pathogen {opts.pathogen!r} found {f"in {str(workflow_directory)!r}" if DEBUGGING else "locally"}.
+        if isinstance(pathogen, UnmanagedPathogen):
+            raise UserError(f"""
+                No {opts.workflow!r} workflow for pathogen {opts.pathogen!r} found {f"in {str(workflow_directory)!r}" if DEBUGGING else "locally"}.
+                """)
+        else:
+            raise UserError(f"""
+                No {opts.workflow!r} workflow for pathogen {opts.pathogen!r} found {f"in {str(workflow_directory)!r}" if DEBUGGING else "locally"}.
 
-            Maybe you need to update to a newer version of the pathogen?
+                Maybe you need to update to a newer version of the pathogen?
 
-            Hint: to update the pathogen, run `nextstrain update {shquote(pathogen.name)}`.
-            """)
+                Hint: to update the pathogen, run `nextstrain update {shquote(pathogen.name)}`.
+                """)
 
     # The pathogen volume is the pathogen directory (i.e. repo).
     # The workflow volume is the workflow directory within the pathogen directory.
@@ -274,6 +291,14 @@ def run(opts):
         *(["--forceall"]
             if opts.force else []),
 
+        # Explicitly use Snakemake's current working directory as the
+        # workflow's workdir, overriding any "workdir:" directive the workflow
+        # may include. Snakemake uses the cwd by default in the absence of any
+        # "workdir:" directive, but we want to _always_ use it to avoid writing
+        # into the pathogen/workflow source directories if a non-compatible
+        # workflow is run.
+        "--directory=.",
+
         # Workdir will be the analysis volume (/nextstrain/build in a
         # containerized runtime), so explicitly point to the Snakefile.
         "--snakefile=%s/Snakefile" % (
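
Note on the hunk above: a minimal sketch of how these Snakemake arguments fit together. The helper name `snakemake_args` and its `workflow_dir` parameter are hypothetical and do not appear in the commit; only the `--forceall`, `--directory=.` and `--snakefile=…/Snakefile` arguments are taken from the diff.

```python
from typing import List


def snakemake_args(workflow_dir: str, force: bool = False) -> List[str]:
    """Build an illustrative Snakemake argument list (hypothetical helper)."""
    return [
        # Only force re-running everything when asked to.
        *(["--forceall"] if force else []),

        # Always pin the working directory to the current directory, so a
        # "workdir:" directive in the workflow cannot redirect outputs.
        "--directory=.",

        # The Snakefile lives in the workflow directory, not the workdir,
        # so it has to be named explicitly.
        "--snakefile=%s/Snakefile" % workflow_dir,
    ]
```

Because `--directory=.` is passed unconditionally, the working directory stays the analysis directory whether or not the workflow declares its own `workdir:`.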

nextstrain/cli/pathogens.py

Lines changed: 40 additions & 0 deletions
@@ -640,6 +640,46 @@ def __eq__(self, other) -> bool:
             and (self.url == other.url or self.url is None or other.url is None)
 
 
+class UnmanagedPathogen:
+    """
+    A local directory that's a pathogen repo, not managed by Nextstrain CLI.
+
+    Used by ``nextstrain run``. Includes only the :cls:`PathogenVersion` API
+    surface that ``nextstrain run`` requires.
+    """
+    path: Path
+    registration_path: Path
+
+    registration: Optional[dict] = None
+
+    def __init__(self, path: str):
+        spec = PathogenSpec.parse(path)
+
+        if not spec.name or (spec.name not in set([os.path.curdir, os.path.pardir]) and not (set(spec.name) & set([os.path.sep, os.path.altsep or os.path.sep]))):
+            raise ValueError(f"the {spec.name!r} part of {path!r} does not look like a path")
+
+        self.path = Path(path)
+
+        if not self.path.is_dir():
+            raise UserError(f"""
+                Path {str(self.path)!r} is not a directory (or does not exist).
+                """)
+
+        self.registration_path = self.path / "nextstrain-pathogen.yaml"
+
+        if self.registration_path.exists():
+            self.registration = read_pathogen_registration(self.registration_path)
+
+    registered_workflows = PathogenVersion.registered_workflows
+    workflow_path = PathogenVersion.workflow_path
+
+    def __str__(self) -> str:
+        return str(self.path)
+
+    def __repr__(self) -> str:
+        return f"<UnmanagedPathogen path={str(self.path)!r}>"
+
+
 def every_pathogen_default_by_name(pathogens: Dict[str, Dict[str, PathogenVersion]] = None) -> Dict[str, PathogenVersion]:
     """
     Scans file system to return a dict of :cls:`PathogenVersion` objects,
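
Note on the hunk above: the "does this look like a path?" rule in `UnmanagedPathogen.__init__` can be sketched in isolation. The helper `looks_like_path` is hypothetical and not part of Nextstrain CLI. It also skips the `PathogenSpec.parse` step that the real constructor applies first, so it only approximates the behavior for plain names and paths.

```python
import os.path


def looks_like_path(name: str) -> bool:
    """Approximate the path-vs-name rule (hypothetical helper)."""
    # An empty string is never a path.
    if not name:
        return False

    # "." and ".." are accepted on their own, without any separator.
    if name in {os.path.curdir, os.path.pardir}:
        return True

    # Otherwise at least one path separator must be present. altsep is None
    # on POSIX, so fall back to sep in that case.
    separators = {os.path.sep, os.path.altsep or os.path.sep}
    return any(sep in name for sep in separators)
```

Under this rule a bare name such as `measles` is treated as a managed pathogen, while `./measles` is treated as a local directory. This matches the `try`/`except ValueError` fallback to `PathogenVersion` in `run.py`.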
