Merge new zsh-dependent config revamp into master #82

Merged: agillen merged 35 commits into master from config_update_zsh on Jan 28, 2026

agillen (Collaborator) commented Jan 28, 2026

No description provided.

agillen and others added 23 commits August 28, 2024 13:50
Limit snakemake version to < 8.0
correct interpretation of basename in _get_fq_paths
add bam indexing to filter rules
correct indexing in bam_filter rules
protect against relative paths to `ln -s` using `readlink`
Copilot AI (Contributor) left a comment

Pull request overview

Revamps the pipeline configuration to be zsh-based and to use hierarchical YAML configuration for per-sample/per-chemistry/per-platform settings.

Changes:

  • Switches the workflow shell to zsh and adds hierarchical config lookup via SAMPLES + chemistry.yaml.
  • Refactors cutadapt/STAR rules to pull parameters from config instead of TSV/JSON-driven logic.
  • Removes legacy sample/chemistry artifacts (sample_fastqs.tsv, chemistry.json, generator script) and updates docs/developer guide.
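
As an illustration of the hierarchical lookup described above, here is a minimal Python sketch. It is not the PR's `_get_config()`: the function name `lookup_setting`, the `platforms` nesting, the example keys, and the precedence order (sample, then platform, then chemistry, then DEFAULTS) are all assumptions made for the example. Only the `DEFAULTS` and `SAMPLES` top-level names come from the summary.

```python
# Hypothetical stand-ins for config.yaml and chemistry.yaml, already parsed into dicts.
CONFIG = {
    "DEFAULTS": {"chemistry": "chemA", "platform": "illumina", "threads": 4},
    "SAMPLES": {
        "s1": {"bc_cut": "sample-level"},
        "s2": {"platform": "other"},
    },
}
CHEMISTRY = {
    "chemA": {
        "bc_cut": "chemistry-level",
        "star_opts": "chemistry-level",
        "platforms": {"illumina": {"star_opts": "platform-level"}},
    },
}


def lookup_setting(sample, key, config=CONFIG, chemistry=CHEMISTRY):
    """Return the most specific value for `key` (assumed precedence order)."""
    defaults = config.get("DEFAULTS", {})
    sample_cfg = config.get("SAMPLES", {}).get(sample, {})

    # Resolve which chemistry and platform apply to this sample.
    chem_name = sample_cfg.get("chemistry", defaults.get("chemistry"))
    platform = sample_cfg.get("platform", defaults.get("platform"))
    chem_cfg = chemistry.get(chem_name, {})
    platform_cfg = chem_cfg.get("platforms", {}).get(platform, {})

    # Walk from most specific to least specific and stop at the first hit.
    for level in (sample_cfg, platform_cfg, chem_cfg, defaults):
        if key in level:
            return level[key]
    raise KeyError(f"{key!r} is not set for sample {sample!r}")
```

With this layout a sample only lists the settings it overrides; everything else falls through to the chemistry or the defaults.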

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 5 comments.

File                              Description
scraps_conda.yml                  Updates environment deps (STAR constraint loosening; adds zsh).
sample_fastqs.tsv                 Removes legacy sample sheet.
Snakefile                         Switches to zsh and adds hierarchical _get_config() + new rule-all output assembly.
config.yaml                       Introduces DEFAULTS + SAMPLES structure replacing TSV-driven sample config.
chemistry.yaml                    Adds new hierarchical chemistry/platform parameter definitions.
rules/cutadapt_star.snake         Refactors trimming/alignment params to come from _get_config()/chemistry.yaml.
rules/count.snake                 Updates inputs to match new alignment outputs and adds pre-featureCounts filtering.
rules/qc.snake                    Adjusts QC inputs to use SAMPLES keys.
inst/scripts/cut_paste_fastq.py   Adds optional length-based trimming support for stitched FASTQ reconstruction.
README.md                         Updates setup/config documentation to match new YAML-based configuration.
AGENTS.md                         Adds developer guide and conventions for working in this repo.
chemistry_to_json.py              Removes obsolete JSON generator.
chemistry.json                    Removes obsolete chemistry JSON.
Comments suppressed due to low confidence (1)

rules/count.snake:45

  • assign_sites_R1 writes an intermediate filtered BAM to {params.temp} but never removes it. This will leave large files behind (unlike assign_sites_paired, which cleans up). Add cleanup for {params.temp} once it's no longer needed (and consider declaring it as a temp() output instead of a params path).
    samtools sort \
    {params.out_bam} \
    -o {output.bam}

    samtools index {output.bam}
    rm -rf {params.out_bam}
    """

Comment on lines +7 to 9
    def paste_fastq(file_in, file_out, length1):
        print(length1)
        with open(file_in) as file, gzip.open(file_out, 'wt', compresslevel = 1) as file2:
Copilot AI Jan 28, 2026

There are leftover debug prints (print(length1)) in both paste_fastq and main(). This will add noisy stdout in Snakemake logs and can interfere with callers that expect the script to be quiet on success. Remove these prints or gate them behind an explicit --verbose flag.
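
One way to gate such diagnostics, sketched under assumptions: the script's real argument list is not shown in this review, so the `--length1` option, the logger name, and the function names below are hypothetical. The sketch uses the standard `argparse` and `logging` modules and sends debug output to stderr, so stdout stays quiet unless `--verbose` is passed.

```python
import argparse
import logging

# Hypothetical logger name; messages go to stderr via StreamHandler's default stream.
log = logging.getLogger("cut_paste_fastq")


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Sketch of a quiet-by-default CLI.")
    # --length1 is an illustrative option, not necessarily the script's real interface.
    parser.add_argument("--length1", type=int, default=None,
                        help="optional trim length (hypothetical)")
    parser.add_argument("--verbose", action="store_true",
                        help="emit debug messages on stderr")
    return parser.parse_args(argv)


def configure_logging(verbose):
    # Attach the handler once so repeated calls do not duplicate messages.
    if not log.handlers:
        log.addHandler(logging.StreamHandler())
    log.setLevel(logging.DEBUG if verbose else logging.WARNING)


def main(argv=None):
    args = parse_args(argv)
    configure_logging(args.verbose)
    # Replaces a bare print(length1): silent unless --verbose was given.
    log.debug("length1=%s", args.length1)
    return args
```

Calling `main(["--length1", "28"])` produces no output, while `main(["--length1", "28", "--verbose"])` writes the debug line to stderr.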

Comment on lines +120 to +122
    if [ -z {params.bc_cut} ] ; then
        echo "no additional trimming"
        cutadapt -j 24 \
Copilot AI Jan 28, 2026

In the paired trimming branch, [ -z {params.bc_cut} ] is unquoted. When bc_cut is empty, the test collapses to [ -z ], which succeeds only because the one-argument form of [ checks that the literal string -z is non-empty; when bc_cut contains whitespace or shell special characters, it is split into several words and the test can fail with a syntax error. Quote the substitution (e.g. "{params.bc_cut}") or otherwise ensure a safe test expression.
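
The effect of quoting can be modelled in Python with `shlex`, which tokenizes a string roughly the way a POSIX shell does. This is only a sketch of the word counts involved, not the rule's code: `render_test` and `shell_words` are hypothetical helpers, and `shlex.quote` stands in for whatever quoting the rule actually uses.

```python
import shlex


def render_test(bc_cut, quoted):
    """Render the test expression as a template substitution would (hypothetical)."""
    value = shlex.quote(bc_cut) if quoted else bc_cut
    return f"[ -z {value} ]"


def shell_words(command):
    """Split a command line into words using POSIX-style rules."""
    return shlex.split(command)


if __name__ == "__main__":
    for value in ("", "AAGC", "a b"):
        for quoted in (False, True):
            command = render_test(value, quoted)
            print(f"{command!r:20} -> {shell_words(command)}")
```

Unquoted, the number of words handed to `[` depends on the value: an empty value leaves no operand at all and a value containing a space yields two. Quoted, `-z` always receives exactly one operand.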

agillen (Collaborator, Author) replied:

@copilot open a new pull request to apply changes based on this feedback

agillen and others added 3 commits January 28, 2026 00:17
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI (Contributor) commented Jan 28, 2026

@agillen I've opened a new pull request, #83, to work on those changes. Once the pull request is ready, I'll request review from you.

agillen and others added 9 commits January 28, 2026 00:19
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: agillen <4809242+agillen@users.noreply.github.com>
Fix unquoted shell parameter substitutions in cutadapt_paired rule
Add step to create Conda environment before running Snakemake.
Removed the step to create the Conda environment using mamba.
Activate the 'scraps_conda' environment before running Snakemake.
agillen merged commit 2e9ee7a into master on Jan 28, 2026
2 checks passed
agillen deleted the config_update_zsh branch on January 28, 2026 07:51

4 participants