Skip to content

Conversation

@rohan-ibn-tariq
Copy link
Contributor

@rohan-ibn-tariq rohan-ibn-tariq commented Nov 27, 2025

Add individual Snakemake-Wrappers for pytrf sub-commands (findstr, findgtr, findatr, extract) to enable modular execution and parameter handling within Snakemake workflows. Each wrapper handles optional parameters, output files, and logging in a clean and minimalistic way.

QC

While the contributions guidelines are more extensive, please particularly ensure that:

  • test.py was updated to call any added or updated example rules in a Snakefile -> test_wrappers.py
  • input: and output: file paths in the rules can be chosen arbitrarily
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:)
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to -> wrapper wasn't creating temp file, as for underlying tool I didn't changed the default behaviour
  • the meta.yaml contains a link to the documentation of the respective tool or command under url:
  • conda environments use a minimal amount of channels and packages, in recommended ordering

Summary by CodeRabbit

  • New Features
    • Added PyTRF support: four CLI tools with Snakemake wrappers, metadata/manifests and demo input data.
  • Tests
    • Added end-to-end tests for findstr, findgtr and findatr; extract test marked to skip.
  • Chores
    • Added Conda environment manifests and pinned explicit environment snapshots for reproducible runs.

✏️ Tip: You can customize this high-level summary in your review settings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@bio/pytrf/findgtr/wrapper.py`:
- Around line 37-44: The wrapper currently only blocks -f/--out-format but must
also forbid output-path flags in snakemake.params.extra to avoid duplicate
outputs; update the check that inspects extra (the variable extra using is_arg)
to also detect "-o" and "--out-file" and raise a ValueError with a message
similar to the existing one (mention that output path flags should not be
provided in params.extra because the wrapper controls the output). Ensure you
reference the same is_arg and extra variables in wrapper.py when adding the
additional flag checks.
🧹 Nitpick comments (2)
bio/pytrf/findatr/wrapper.py (1)

46-56: Remove unnecessary try/except around shell execution.

The try/except block wrapping the shell command is unnecessary—Snakemake handles execution failures automatically. Based on learnings from this project, this pattern overcomplicates simple wrapper scripts.

Suggested simplification
 # Execute
-try:
-    shell("pytrf findatr"
-        " {input_file}"
-        " {extra}"
-        " -f {out_format}"
-        " -o {output_file}"
-        " {log}"
-    )
-except Exception as e:
-    raise RuntimeError(f"PyTRF findatr execution failed: {e}") from e
+shell(
+    "pytrf findatr"
+    " {input_file}"
+    " {extra}"
+    " -f {out_format}"
+    " -o {output_file}"
+    " {log}"
+)
bio/pytrf/extract/wrapper.py (1)

56-67: Remove unnecessary try/except around shell execution.

Per project conventions, Snakemake handles exceptions when the wrapper fails, so this try/except block adds unnecessary complexity without benefit. The original exception from shell() would actually provide more useful context than the wrapped RuntimeError.

Based on learnings from this repository, a try/except block is unnecessary because Snakemake will handle exceptions if the wrapper fails. This was also flagged in past review comments.

♻️ Suggested simplification
 # Execute
-try:
-    shell("pytrf extract"
-        " {input_file}"
-        " -r {repeat_file}"
-        " {extra}"
-        " -f {out_format}"
-        " -o {output_file}"
-        " {log}"
-    )
-except Exception as e:
-    raise RuntimeError(f"PyTRF extract execution failed: {e}") from e
+shell(
+    "pytrf extract"
+    " {input_file}"
+    " -r {repeat_file}"
+    " {extra}"
+    " -f {out_format}"
+    " -o {output_file}"
+    " {log}"
+)

@fgvieira
Copy link
Collaborator

The wrappers are quite similar (as far as I can see, there is only an extra input on extract) and, if so, I'd prefer a single wrapper, but I'll leave that to @johanneskoester .

Other than that, I think it looks good. Thanks for your work @rohan-ibn-tariq!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@bio/pytrf/extract/wrapper.py`:
- Around line 31-39: Normalize extensions returned by get_format(output_file) by
mapping "fa" and "fna" to "fasta" before validation: call
get_format(output_file) into out_format, if out_format is "fa" or "fna" set
out_format = "fasta", then validate against supported_formats =
{"tsv","csv","fasta"} and raise the existing ValueError if not supported; update
the check around out_format and supported_formats used by the pytrf extract
wrapper.
♻️ Duplicate comments (2)
bio/pytrf/findgtr/wrapper.py (1)

37-44: Also block output-path flags in params.extra.
Otherwise users can pass -o/--out-file and conflict with the wrapper’s output handling.

♻️ Proposed fix
-if is_arg("-f", extra) or is_arg("--out-format", extra):
+if (
+    is_arg("-f", extra)
+    or is_arg("--out-format", extra)
+    or is_arg("-o", extra)
+    or is_arg("--out-file", extra)
+):
     raise ValueError(
-        "Output format is automatically inferred from output file extension. "
-        "Do not specify -f/--out-format in params.extra. "
+        "Output format is inferred and output path is provided by Snakemake. "
+        "Do not specify -f/--out-format or -o/--out-file in params.extra. "
     )
PyTRF findgtr CLI options -o --out-file
bio/pytrf/extract/wrapper.py (1)

57-68: Drop the redundant try/except around shell().

Snakemake already surfaces command failures; re-wrapping exceptions adds noise and can obscure original failure context.

♻️ Suggested simplification
-try:
-    shell(
-        "pytrf extract"
-        " {input_file}"
-        " -r {repeat_file}"
-        " {extra}"
-        " -f {out_format}"
-        " -o {output_file}"
-        " {log}"
-    )
-except Exception as e:
-    raise RuntimeError(f"PyTRF extract execution failed: {e}") from e
+shell(
+    "pytrf extract"
+    " {input_file}"
+    " -r {repeat_file}"
+    " {extra}"
+    " -f {out_format}"
+    " -o {output_file}"
+    " {log}"
+)

Based on learnings, Snakemake handles wrapper failures without extra try/except.

@rohan-ibn-tariq
Copy link
Contributor Author

rohan-ibn-tariq commented Jan 26, 2026

The wrappers are quite similar (as far as I can see, there is only an extra input on extract) and, if so, I'd prefer a single wrapper, but I'll leave that to @johanneskoester .

Other than that, I think it looks good. Thanks for your work @rohan-ibn-tariq!

First of all thank you very much for your immense time and assistance.

@fgvieira After the changes requested by you and your review, I was in the similar doubt. As less thing(s) became applicable from my initital comment . Therefore, I checked with you here and thought you endorsed the idea of having multiple wrappers still.

I still find this is more consistent because:

  • Seems like more used or already in place Snakemake's convetional pattern for many wrappers if not all;
  • Separation of Concerns;
    • For e.g: Two wrappers have critical bugs like extract command is not working, and findstr have specific bed output format error, they go cleanly in their own meta.
    • A command like extract has special input although minimal code change if all combined to one wrapper, but the code remains cleaner with a single responsibility rather mixing them up.
    • The snakemake test file remains decoupled for the 4 wrappers, which is better as user can clearly see the context of the command to use and its usage like extract is bit different then other three without mixing concepts. Tight coupling would then mean information overload within a single view from what is not required if the user is to use one particular command. This might be not that much problem at the moment but for the future sake argument decoupled is clearer and better and also from UX perspective it is the right choice as the information view holds single responsibility and is crystal clear.
    • Also I purposed verbose meta for extra params from User experience perspective as it helps user, here, at the same time I understand this might be bit tough to keep the docs in-sync with main docs, So minimalism here might be right choice unless there is an automatic sync, right?
    • Future extensibility;

Although, one thing that I find bit extra is change at multiple places, but even if we combine them and later things pop up, it might make meta overload and more cluttered, therefore I prefer cleaner view.

Still, can change if required.

@snakemake snakemake deleted a comment from rohan-ibn-tariq Jan 26, 2026
@fgvieira
Copy link
Collaborator

Initially I could se the argument, since a lot of the parameters were not shared, but now it seems that the wrappers are nearly identical. So, I'd go for one wrapper, but let's hear @johanneskoester .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants